Fairness-Aware Machine Learning and Data Mining

The goal of fairness-aware machine learning, or fairness-aware data mining, is to analyze data while taking into account potential issues of fairness, discrimination, neutrality, and/or independence. Pedreschi, Ruggieri, and Turini first posed this problem at KDD2008, and a body of literature on the topic has since emerged.

Tutorial on Fairness-aware Machine Learning

This tutorial first shows how algorithms have made unfair decisions and how notions of fairness from ethics and economics are expressed formally. The slides then present algorithms for detecting unfair decisions and methods designed for making fair decisions.

Fairness-Aware Machine Learning and Data Mining (200+ pages, regularly updated)

Table of Contents:

Backgrounds
- Types of biases that cause unfairness, and instances of each type of bias
Formal Fairness
- What formal fairness is, and the types of formal fairness
- Association-based fairness, Counterfactual fairness, and Economics-based fairness.
Fairness-Aware Machine Learning
- Tasks of Fairness-aware Machine Learning
- Unfairness discovery: detecting unfairness from algorithms or databases
- Unfairness prevention: ML methods under the constraints of formal fairness
Other topics
- Mitigation of a sample selection bias, disclosure, and so on

Links to Related Sites

Tutorials and Courses

Grounding and Evaluation for Large Language Models KDD2024 Tutorial
Socially Responsible Machine Learning: A Causal Perspective KDD2023 Tutorial
Responsible AI in Industry AAAI2021, ICML2021 Tutorial
Fairness and Discrimination in Recommendation and Retrieval RecSys2019 Tutorial
Challenges of Incorporating Algorithmic Fairness into Practice FAT*2019 Tutorial
Anti-discrimination Learning: From Association to Causation KDD2018 Tutorial
Defining and Designing Fair Algorithms ICML2018
Fairness in Machine Learning NIPS2017 Tutorial
Algorithmic bias: from discrimination discovery to fairness-aware data mining KDD2016 Tutorial
A Course on Fairness, Accountability and Transparency in Machine Learning

Books

“Causal Fairness Analysis: A Causal Toolkit for Fair Machine Learning” Drago Plečko and Elias Bareinboim (2024)
“Fairness in Information Access Systems” Michael D. Ekstrand, Anubrata Das, Robin Burke, and Fernando Diaz (2022)
“Fairness and machine learning” Solon Barocas, Moritz Hardt, and Arvind Narayanan (2023)

Conferences & Workshops

Software

Fairlearn
AI Fairness 360 by IBM
audit-AI
FairBench
ML-fairness-gym by Google
Algorithmic Fairness (black-box auditing / fairness-comparison) by the Algorithmic Fairness group
delax
TensorFlow fairness-indicators
FAT-forensics
fairml Sample implementations of FADM algorithms by Žliobaitė’s group
Conditional non discrimination by Žliobaitė
Learning Fair Classifiers by Zafar

Projects

My Research Topics

Fairness-aware Classification

Figure: Unfairness prevention approaches

A horizontal plane depicts a model sub-space of distributions represented by a parametric model. In a standard classification task, the goal is to find the best parameter, such that the resulting distribution, \(\Pr[\hat{Y},\mathbf{X},S]\), approximates a true distribution, \(\Pr[Y,\mathbf{X},S]\). The estimated distribution is chosen so as to minimize the divergence between \(\Pr[Y,\mathbf{X},S]\) and \(\Pr[\hat{Y},\mathbf{X},S]\) ((4) in the figure).
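As a hedged sketch, this objective can be written as follows. The parameter symbol \(\boldsymbol{\theta}\), the parameter set \(\Theta\), and the divergence \(D(\cdot\,\|\,\cdot)\) are notation assumed here for illustration; the text does not fix a particular divergence.

\[
\hat{\boldsymbol{\theta}} = \arg\min_{\boldsymbol{\theta} \in \Theta} D\bigl(\Pr[Y,\mathbf{X},S] \,\big\|\, \Pr[\hat{Y},\mathbf{X},S;\boldsymbol{\theta}]\bigr)
\]

If \(D\) is taken to be the Kullback-Leibler divergence and the true distribution is replaced by the empirical distribution of a training set, this corresponds to maximum likelihood estimation.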

We now turn to fairness-aware classification. The goal of a fairness-aware classification task is to find a fair estimated model, \(\Pr[\hat{Y}^\circ,\mathbf{X},S]\), that best approximates a fair true distribution, \(\Pr[Y^\circ,\mathbf{X},S]\). A vertical plane depicts a fair sub-space of distributions that satisfy a pre-specified fairness constraint. A fair true distribution, \(\Pr[Y^\circ,\mathbf{X},S]\), must be in this fair sub-space. A parametric model of fair estimated distributions, \(\Pr[\hat{Y}^\circ,\mathbf{X},S]\), must be in the intersection of the fair and model sub-spaces, depicted by a thick line in the figure.
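As an illustration, suppose the pre-specified constraint is statistical independence between the prediction and the sensitive feature. This choice, together with the symbols \(\boldsymbol{\theta}\), \(\Theta\), \(\Theta^\circ\), and \(D\), is an assumption made here and is not fixed by the text. The search is then restricted to parameters that lie in both the model and the fair sub-spaces:

\[
\Theta^\circ = \bigl\{\boldsymbol{\theta} \in \Theta : \hat{Y}^\circ \perp\!\!\!\perp S \ \text{under}\ \Pr[\hat{Y}^\circ,\mathbf{X},S;\boldsymbol{\theta}]\bigr\},
\qquad
\hat{\boldsymbol{\theta}}^\circ = \arg\min_{\boldsymbol{\theta} \in \Theta^\circ} D\bigl(\Pr[Y^\circ,\mathbf{X},S] \,\big\|\, \Pr[\hat{Y}^\circ,\mathbf{X},S;\boldsymbol{\theta}]\bigr)
\]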

The goal of fairness-aware predictors is to find the best predictor to minimize the divergence between a fair true distribution and a fair estimated distribution. Unfortunately, we cannot sample from a fair true distribution due to the potential unfairness of actual decisions in the real world. Therefore, the following three types of approaches have been developed.

Pre-Process — Potentially unfair data are transformed into fair data (1), and a standard classifier is applied (2)
In-Process — A fair model is learned directly from a potentially unfair dataset (3)
Post-Process — A standard classifier is first learned (4), and then the learned classifier is modified to satisfy a fairness constraint (5)
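To make the pre-process route concrete, the following is a minimal sketch of reweighing, a known pre-process technique that assigns each sample the weight P(s)P(y)/P(s, y) so that the label and the sensitive feature become independent in the weighted data. It is swapped in here purely for illustration and is not necessarily the method used in the work described on this page; the function names and the toy data are hypothetical.

```python
from collections import Counter


def reweigh(labels, groups):
    """Return one weight per sample: w(s, y) = P(s) * P(y) / P(s, y)."""
    n = len(labels)
    count_y = Counter(labels)
    count_s = Counter(groups)
    count_sy = Counter(zip(groups, labels))
    return [
        (count_s[s] / n) * (count_y[y] / n) / (count_sy[(s, y)] / n)
        for s, y in zip(groups, labels)
    ]


def weighted_positive_rate(labels, groups, weights, group):
    """Weighted fraction of positive labels inside one group."""
    num = sum(w for y, s, w in zip(labels, groups, weights) if s == group and y == 1)
    den = sum(w for s, w in zip(groups, weights) if s == group)
    return num / den


# Hypothetical toy data: group 0 receives positive labels more often than group 1.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
groups = [0, 0, 0, 0, 1, 1, 1, 1]

weights = reweigh(labels, groups)
rate_0 = weighted_positive_rate(labels, groups, weights, 0)
rate_1 = weighted_positive_rate(labels, groups, weights, 1)
print(rate_0, rate_1)  # the two weighted positive rates should coincide
```

A standard classifier that accepts per-sample weights could then be trained on the reweighted data, which corresponds to steps (1) and (2) in the list above.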

Program Codes

Fairness-Aware Classification (Soft & Data)

Analysis of the influence of biases on fairness

We analyzed the influence of a model bias and a decision rule on fairness.

Publications

“Model-Based and Actual Independence for Fairness-Aware Classification” Data Mining and Knowledge Discovery (2018)
“The Independence of the Fairness-aware Classifiers” ICDM2013 Workshop (PADM) [presentation]

Logistic Regression with Prejudice Remover Regularizer

We formulated fairness-aware classification as an optimization problem. A penalty term that enhances the statistical independence between a target variable and a sensitive feature is adopted as a regularizer.
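The shape of such an objective can be sketched as follows. This is a simplified illustration under stated assumptions, not the published implementation: the independence penalty is taken to be a plug-in estimate of the mutual information between the predicted label and a binary sensitive feature, and the names `independence_penalty`, `objective`, `eta`, and `lam` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Probability of the positive class under a logistic regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def independence_penalty(probs, groups):
    """Mutual information between the predicted label and the sensitive
    feature, estimated from predicted probabilities (assumed penalty)."""
    n = len(probs)
    p_y1 = sum(probs) / n
    mi = 0.0
    for s in set(groups):
        member = [p for p, g in zip(probs, groups) if g == s]
        p_s = len(member) / n
        p_y1_s = sum(member) / len(member)
        for p_y_s, p_y in ((p_y1_s, p_y1), (1.0 - p_y1_s, 1.0 - p_y1)):
            if p_y_s > 0.0:
                mi += p_s * p_y_s * math.log(p_y_s / p_y)
    return mi


def objective(weights, bias, xs, ys, groups, eta=1.0, lam=0.01):
    """Negative log-likelihood + eta * independence penalty + lam * L2 term."""
    probs = [predict_proba(weights, bias, x) for x in xs]
    nll = -sum(y * math.log(p) + (1 - y) * math.log(1.0 - p)
               for y, p in zip(ys, probs))
    l2 = sum(w * w for w in weights)
    return nll + eta * independence_penalty(probs, groups) + lam * l2


# Hypothetical toy data in which the single feature is aligned with the group.
xs = [[1.0], [1.0], [-1.0], [-1.0]]
ys = [1, 1, 0, 0]
groups = [0, 0, 1, 1]

plain = objective([2.0], 0.0, xs, ys, groups, eta=0.0)
regularized = objective([2.0], 0.0, xs, ys, groups, eta=1.0)
print(plain, regularized)
```

In this sketch, increasing `eta` trades predictive fit for independence: the penalty is zero when every group receives the same average predicted probability and grows as the predictions become more informative about the sensitive feature.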

Publications

“Fairness-Aware Classifier with Prejudice Remover Regularizer” ECMLPKDD2012 [presentation, ECMLPKDD 2022 Test of Time Award]
“Fairness-aware Learning through Regularization Approach” ICDM2011 Workshop (PADM) [presentation]

Independence-Enhanced Recommender System

Recommendation independence is defined as unconditional statistical independence between a recommendation result and a sensitive feature. An independence-enhanced recommender system (IERS) is a recommender system that maintains recommendation independence. Examples of IERS applications include adherence to laws and regulations by a recommendation service, fair treatment of content providers, and exclusion of unwanted information.
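Under this definition, the distribution of recommendation results should look the same whichever value the sensitive feature takes. One way to check this on predicted ratings is sketched below, using the two-sample Kolmogorov-Smirnov statistic as the measure of dependence. The choice of statistic, the function name, and the toy ratings are assumptions made for illustration.

```python
def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical CDFs.

    A value of 0 means the two samples have identical empirical
    distributions; a value of 1 means they do not overlap at all.
    """
    points = sorted(set(sample_a) | set(sample_b))

    def ecdf(sample, t):
        return sum(1 for v in sample if v <= t) / len(sample)

    return max(abs(ecdf(sample_a, t) - ecdf(sample_b, t)) for t in points)


# Hypothetical predicted ratings, split by a binary sensitive feature.
ratings_s0 = [4.5, 4.0, 3.5, 4.2]
ratings_s1 = [2.0, 2.5, 3.0, 3.6]

gap = ks_statistic(ratings_s0, ratings_s1)
print(gap)  # a large gap indicates that ratings depend on the sensitive feature
```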

Program Codes

Independence-Enhanced Recommender System (Soft & Data)

A Regularization Approach for a Find-Good-Items Task

A regularization approach is applied to a find-good-items task, whose aim is to rank items according to the degree of preference.

Publications

“Considerations on Recommendation Independence for a Find-Good-Items Task” RecSys2017 Workshop (FATREC) [presentation]

A Model-based Approach for a Predicting-Ratings Task

We modified a probabilistic latent semantic analysis model that is designed for predicting the ratings given by a user. A sensitive feature is added to the model in such a way that it is independent of the rating variable.

Publications

“Model-Based Approaches for Independence-Enhanced Recommendation” ICDM2016 Workshop (PDDM) [presentation]

A Regularization Approach for a Predicting-Ratings Task

We introduce a penalty term into a probabilistic matrix factorization model to enhance independence. The penalty term is designed to quantify the degree of independence between a binary sensitive feature and a predicted rating score.
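A minimal sketch of such a penalized objective follows. The specific penalty, the squared difference between the mean predicted ratings of the two sensitive groups, is an assumption chosen for simplicity. The bare dot-product predictor and the names `mean_difference_penalty`, `objective`, `eta`, and `lam` are likewise hypothetical and simplify whatever the published model uses.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mean_difference_penalty(predictions, sensitive):
    """Squared gap between mean predicted ratings of the two groups
    (assumed penalty; it is 0 when the group means match)."""
    group0 = [r for r, s in zip(predictions, sensitive) if s == 0]
    group1 = [r for r, s in zip(predictions, sensitive) if s == 1]
    gap = sum(group0) / len(group0) - sum(group1) / len(group1)
    return gap * gap


def objective(user_factors, item_factors, ratings, eta=1.0, lam=0.01):
    """Squared error + eta * independence penalty + lam * L2 term.

    ratings: list of (user, item, sensitive, observed_rating) tuples.
    """
    predictions = [dot(user_factors[u], item_factors[i]) for u, i, _, _ in ratings]
    sensitive = [s for _, _, s, _ in ratings]
    squared_error = sum((r - p) ** 2
                        for (_, _, _, r), p in zip(ratings, predictions))
    l2 = (sum(dot(v, v) for v in user_factors)
          + sum(dot(v, v) for v in item_factors))
    penalty = mean_difference_penalty(predictions, sensitive)
    return squared_error + eta * penalty + lam * l2


# Hypothetical toy setup: two users, two items, a binary sensitive feature.
user_factors = [[1.0, 0.0], [0.0, 1.0]]
item_factors = [[4.0, 2.0], [3.0, 1.0]]
ratings = [(0, 0, 0, 4.0), (0, 1, 0, 3.0), (1, 0, 1, 2.0), (1, 1, 1, 1.0)]

without_penalty = objective(user_factors, item_factors, ratings, eta=0.0)
with_penalty = objective(user_factors, item_factors, ratings, eta=1.0)
print(without_penalty, with_penalty)
```

Here the factors reproduce the observed ratings exactly, so the whole difference between the two objective values comes from the independence penalty. A learner minimizing the penalized objective would be pushed toward predictions whose group means agree.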

Publications

“Recommendation Independence” FAT*2018 [presentation]
“Correcting Popularity Bias by Enhancing Recommendation Neutrality” RecSys2014 Poster [poster]
“Efficiency Improvement of Neutrality-Enhanced Recommendation” RecSys2013 Workshop (Decisions) [presentation]
“Enhancement of the Neutrality in Recommendation” RecSys2012 Workshop (Decisions) [presentation]

Other General Topics

Other survey presentations.

Publications

“Re-formalization of Individual Fairness” RecSys2023 Workshop (FAccTRec) [presentation]
“Future Directions of Fairness-aware Data Mining: Recommendation, Causality, and Theoretical Aspects” ICML2015 Workshop (FATML) [presentation]
“Considerations on Fairness-aware Data Mining” ICDM2012 Workshop (DPADM) [presentation]