The goal of fairness-aware data mining is to analyze data while taking into account potential issues of fairness, discrimination, neutrality, and/or independence. Pedreschi, Ruggieri, and Turini first posed this problem at KDD2008, and a body of literature on the topic has since emerged.

### General discussion about fairness-aware data mining

The two major tasks of fairness-aware data mining (FADM) are unfairness detection and unfairness prevention. An unfairness detection task aims to find unfair treatment in a database. An unfairness prevention task aims to learn a statistical model from a potentially unfair data set so that a sensitive feature does not influence the model’s outcomes. Here, a sensitive feature represents information that should not influence outcomes, such as socially sensitive information or information that a user wants to ignore.
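As a rough illustration of unfairness detection, the sketch below compares positive-outcome rates between two groups defined by a binary sensitive feature. The source does not prescribe a particular measure; this rate gap (often called a statistical parity difference) is one common choice, and the function names and toy records are hypothetical.

```python
def positive_rate(records, sensitive_value):
    """Fraction of positive outcomes among records with the given sensitive value."""
    group = [y for s, y in records if s == sensitive_value]
    return sum(group) / len(group)


def parity_gap(records):
    """Difference in positive-outcome rates between sensitive groups 0 and 1."""
    return positive_rate(records, 0) - positive_rate(records, 1)


# Hypothetical toy data: (sensitive feature, binary outcome) pairs.
records = [(0, 1), (0, 1), (0, 1), (0, 0), (1, 1), (1, 0), (1, 0), (1, 0)]

gap = parity_gap(records)
print(gap)  # 0.5: group 0 receives positive outcomes at rate 0.75, group 1 at 0.25
```

A gap near zero suggests that outcomes are distributed similarly across the two groups; a large gap flags a data set worth closer inspection, though it does not by itself establish unfair treatment.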

- “Fairness-Aware Data Mining” My survey slides from a technical viewpoint, regularly updated

#### Publications

- “Future Directions of Fairness-aware Data Mining: Recommendation, Causality, and Theoretical Aspects” ICML2015 Workshop (FATML) [presentation]
- “Considerations on Fairness-aware Data Mining” ICDM2012 Workshop (DPADM) [presentation]

## Fairness-aware Classification

In the figure, a horizontal plane depicts the model sub-space of distributions that can be represented by a parametric model. In a standard classification task, the goal is to find the best parameter, \(\boldsymbol{\Theta}^{\ast}\), such that the resulting distribution, \(\hat{\Pr}[Y,\mathbf{X},S;\boldsymbol{\Theta}^{\ast}]\), best approximates the true distribution, \(\Pr[Y,\mathbf{X},S]\). The best estimated distribution is chosen so as to minimize the divergence between \(\Pr[Y,\mathbf{X},S]\) and \(\hat{\Pr}[Y,\mathbf{X},S;\boldsymbol{\Theta}^{\ast}]\) ((4) in the figure).

We now turn to fairness-aware classification. The goal of a fairness-aware classification task is to find a fair estimated model, \(\hat{\Pr}^\circ[Y,\mathbf{X},S;\boldsymbol{\Theta}^{\ast}]\), that best approximates a fair true distribution, \(\Pr^\circ[Y,\mathbf{X},S]\). A vertical plane depicts the fair sub-space of distributions that satisfy a pre-specified fairness constraint. A fair true distribution, \(\Pr^\circ[Y,\mathbf{X},S]\), must lie in this fair sub-space. A parametric model of fair estimated distributions, \({\hat{\Pr}}^\circ[Y,\mathbf{X},S;\boldsymbol{\Theta}^{\ast}]\), must lie in the product sub-space of the fair and model sub-spaces, depicted by a thick line in the figure.
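The two goals can be written side by side. This is a sketch under assumptions: the source does not name the divergence, so \(D\) stands for an unspecified divergence, and \(\mathcal{F}\) is a label introduced here for the fair sub-space.

\[
\begin{aligned}
\text{standard:}\quad & \boldsymbol{\Theta}^{\ast} = \arg\min_{\boldsymbol{\Theta}} D\bigl(\Pr[Y,\mathbf{X},S] \,\big\|\, \hat{\Pr}[Y,\mathbf{X},S;\boldsymbol{\Theta}]\bigr), \\
\text{fairness-aware:}\quad & \boldsymbol{\Theta}^{\ast} = \arg\min_{\boldsymbol{\Theta}\,:\,\hat{\Pr}^\circ[\,\cdot\,;\boldsymbol{\Theta}] \in \mathcal{F}} D\bigl(\Pr^\circ[Y,\mathbf{X},S] \,\big\|\, \hat{\Pr}^\circ[Y,\mathbf{X},S;\boldsymbol{\Theta}]\bigr).
\end{aligned}
\]

For instance, if \(D\) is taken to be the Kullback-Leibler divergence and the true distribution is replaced by the empirical distribution of a sample, the first line reduces to maximum likelihood estimation.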

The goal of a fairness-aware predictor is to minimize the divergence between a fair true distribution and a fair estimated distribution. Unfortunately, we cannot sample from a fair true distribution because actual decisions in the real world are potentially unfair. Therefore, the following three types of approaches have been developed.

- *Pre-Process*: Potentially unfair data are transformed into fair data (1), and a standard classifier is applied (2).
- *In-Process*: A fair model is learned directly from a potentially unfair dataset (3).
- *Post-Process*: A standard classifier is first learned (4), and then the learned classifier is modified to satisfy a fairness constraint (5).
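To make the post-process route concrete, here is a minimal sketch that takes scores from an already-trained standard classifier and adjusts one group's decision threshold so that both groups receive positive decisions at the same rate. Group-specific thresholding is only one simple post-processing option and is not claimed to be the method of any publication listed on this page; the scores, threshold, and function names are hypothetical.

```python
def positive_rate(scores, threshold):
    """Fraction of scores at or above the decision threshold."""
    return sum(p >= threshold for p in scores) / len(scores)


def matching_threshold(scores, target_rate):
    """Among the observed scores, return the threshold whose positive
    rate is closest to target_rate (smallest threshold wins ties)."""
    candidates = sorted(set(scores))
    return min(candidates, key=lambda t: abs(positive_rate(scores, t) - target_rate))


# Hypothetical scores from a standard classifier, keyed by sensitive value.
scores_by_group = {
    0: [0.9, 0.8, 0.7, 0.4],
    1: [0.6, 0.45, 0.3, 0.2],
}
default_threshold = 0.5

# Step (4): decisions of the standard classifier with one shared threshold.
rates_before = {s: positive_rate(p, default_threshold) for s, p in scores_by_group.items()}

# Step (5): keep group 0's rule and move group 1's threshold to match its rate.
thresholds = {
    0: default_threshold,
    1: matching_threshold(scores_by_group[1], rates_before[0]),
}
rates_after = {s: positive_rate(p, thresholds[s]) for s, p in scores_by_group.items()}

print(rates_before, thresholds, rates_after)
```

The design choice here is to leave the learned scores untouched and change only the decision rule, which is what distinguishes post-processing from the pre-process and in-process routes.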

#### Program Codes

### Analysis of the influence of biases on fairness

We analyzed the influence of a model bias and a decision rule on fairness.

#### Publications

- “Model-Based and Actual Independence for Fairness-Aware Classification” Data Mining and Knowledge Discovery (2018)
- “The Independence of the Fairness-aware Classifiers” ICDM2013 Workshop (PADM) [presentation]

### Logistic Regression with Prejudice Remover Regularizer

We formulated fairness-aware classification as an optimization problem. A penalty term that enhances the statistical independence between a target variable and a sensitive feature is adopted as a regularizer.
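The shape of such an objective can be sketched as a logistic loss plus an independence penalty plus an L2 term. Several details below are assumptions made for illustration rather than a statement of the published method: the penalty is a plug-in estimate of the mutual information between the predicted target and the sensitive feature, the model keeps one weight vector per sensitive value, and the names `eta` and `lam` for the two regularization strengths are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, x, s):
    """Pr[Y=1 | x, s] under a logistic model with one weight vector per sensitive value."""
    return sigmoid(sum(w * xi for w, xi in zip(weights[s], x)))


def log_loss(weights, data):
    """Negative log-likelihood of the observed labels."""
    total = 0.0
    for x, s, y in data:
        p = predict_proba(weights, x, s)
        total -= math.log(p if y == 1 else 1.0 - p)
    return total


def independence_penalty(weights, data):
    """Plug-in mutual information between the predicted target and S."""
    probs = [(s, predict_proba(weights, x, s)) for x, s, _ in data]
    n = len(probs)
    p_y1 = sum(p for _, p in probs) / n  # estimate of Pr[Y=1]
    penalty = 0.0
    for group in sorted({s for s, _ in probs}):
        group_probs = [p for s, p in probs if s == group]
        p_y1_s = sum(group_probs) / len(group_probs)  # estimate of Pr[Y=1 | S=group]
        p_s = len(group_probs) / n
        for cond, marg in ((p_y1_s, p_y1), (1.0 - p_y1_s, 1.0 - p_y1)):
            if cond > 0.0:
                penalty += p_s * cond * math.log(cond / marg)
    return penalty


def objective(weights, data, eta, lam):
    """Loss + eta * independence penalty + (lam / 2) * squared L2 norm."""
    l2 = sum(w * w for ws in weights.values() for w in ws)
    return log_loss(weights, data) + eta * independence_penalty(weights, data) + 0.5 * lam * l2


# Hypothetical toy data: (features with a leading bias term, sensitive value, label).
data = [([1.0, 0.5], 0, 1), ([1.0, -0.5], 0, 0), ([1.0, 0.5], 1, 1), ([1.0, -0.5], 1, 0)]
weights_neutral = {0: [0.0, 1.0], 1: [0.0, 1.0]}  # same rule for both groups
weights_biased = {0: [2.0, 1.0], 1: [-2.0, 1.0]}  # intercept depends on S

print(independence_penalty(weights_neutral, data))
print(independence_penalty(weights_biased, data))
```

With `eta = 0` the objective is ordinary regularized logistic regression; increasing `eta` penalizes parameter settings such as `weights_biased`, whose predictions differ systematically between the sensitive groups.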

#### Publications

- “Fairness-Aware Classifier with Prejudice Remover Regularizer” ECMLPKDD2012 [presentation]
- “Fairness-aware Learning through Regularization Approach” ICDM2011 Workshop (PADM) [presentation]

## Independence-Enhanced Recommender System

Recommendation independence is defined as unconditional statistical independence between a recommendation result and a sensitive feature. An *independence-enhanced recommender system* (IERS) is a recommender system that maintains recommendation independence. Examples of IERS applications include adherence to laws and regulations by the recommendation service, fair treatment of content providers, and exclusion of unwanted information.
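Written as a formula, with \(R\) denoting a recommendation result and \(S\) a sensitive feature (both symbols are introduced here for illustration), the definition reads:

\[
\Pr[R \mid S] = \Pr[R] \quad\Longleftrightarrow\quad \Pr[R, S] = \Pr[R]\,\Pr[S].
\]

In words, knowing the value of the sensitive feature tells us nothing about which result is recommended.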

#### Program Codes

### A Regularization Approach for a Find-Good-Items Task

A regularization approach is applied to a find-good-items task, whose aim is to rank items according to the degree of preference.

- “Considerations on Recommendation Independence for a Find-Good-Items Task” RecSys2017 Workshop (FATREC) [presentation]

### A Model-based Approach for a Predicting-Ratings Task

We modified a probabilistic latent semantic analysis model designed for predicting the ratings given by a user. A sensitive feature is added to the model so that it is independent of the rating variable.

#### Publications

- “Model-Based Approaches for Independence-Enhanced Recommendation” ICDM2016 Workshop (PDDM) [presentation]

### A Regularization Approach for a Predicting-Ratings Task

We introduce a penalty term into a probabilistic matrix factorization model to enhance independence. The penalty term is designed to quantify the degree of independence between a binary sensitive feature and a predicted rating score.
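A minimal sketch of such an objective follows. It assumes one simple form of penalty, the squared gap between the mean predicted ratings of the two sensitive groups; the source states only that the penalty quantifies independence for a binary sensitive feature, so this choice, the parameter layout, and the names `eta` and `lam` are hypothetical.

```python
def predict(params, user, item):
    """Predicted rating: global bias plus the inner product of latent vectors."""
    u, v = params["users"][user], params["items"][item]
    return params["bias"] + sum(a * b for a, b in zip(u, v))


def squared_error(params, ratings):
    """Sum of squared differences between observed and predicted ratings."""
    return sum((r - predict(params, u, i)) ** 2 for u, i, _, r in ratings)


def mean_gap_penalty(params, ratings):
    """Squared gap between the mean predicted ratings for S=0 and S=1."""
    means = []
    for group in (0, 1):
        preds = [predict(params, u, i) for u, i, s, _ in ratings if s == group]
        means.append(sum(preds) / len(preds))
    return (means[0] - means[1]) ** 2


def objective(params, ratings, eta, lam):
    """Squared error + eta * independence penalty + lam * squared L2 norm."""
    l2 = sum(x * x for table in (params["users"], params["items"])
             for vec in table.values() for x in vec)
    return squared_error(params, ratings) + eta * mean_gap_penalty(params, ratings) + lam * l2


# Hypothetical toy data: (user, item, sensitive value, observed rating).
ratings = [("u1", "a", 0, 4.0), ("u2", "a", 0, 4.0), ("u1", "b", 1, 2.0), ("u2", "b", 1, 2.0)]

# Fits the ratings exactly, but predictions differ by sensitive group.
accurate = {"users": {"u1": [1.0], "u2": [1.0]}, "items": {"a": [1.0], "b": [-1.0]}, "bias": 3.0}
# Predicts 3.0 everywhere: independent of S, but less accurate.
flat = {"users": {"u1": [0.0], "u2": [0.0]}, "items": {"a": [0.0], "b": [0.0]}, "bias": 3.0}

print(squared_error(accurate, ratings), mean_gap_penalty(accurate, ratings))
print(squared_error(flat, ratings), mean_gap_penalty(flat, ratings))
```

The two parameter settings show the trade-off that `eta` controls: one has zero prediction error but a large gap between groups, and the other has no gap but a larger error.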

#### Publications

- “Recommendation Independence” FAT*2018 [presentation]
- “Correcting Popularity Bias by Enhancing Recommendation Neutrality” RecSys2014 Poster [poster]
- “Efficiency Improvement of Neutrality-Enhanced Recommendation” RecSys2013 Workshop (Decisions) [presentation]
- “Enhancement of the Neutrality in Recommendation” RecSys2012 Workshop (Decisions) [presentation]

## Links to Related Sites

### Tutorials and Courses

- Book: “Fairness and machine learning” by Solon Barocas, Moritz Hardt, and Arvind Narayanan
- Fairness and Discrimination in Recommendation and Retrieval RecSys2019 Tutorial
- Fairness-Aware Machine Learning: Practical Challenges and Lessons Learned KDD2019 Tutorial
- Anti-discrimination Learning: From Association to Causation KDD2018 Tutorial
- Defining and Designing Fair Algorithms ICML2018
- Fairness in Machine Learning NIPS2017 Tutorial
- Algorithmic bias: from discrimination discovery to fairness-aware data mining KDD2016 Tutorial
- A Course on Fairness, Accountability and Transparency in Machine Learning

### Conferences

- Conference on Fairness, Accountability, and Transparency (FAT*)
- AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society

### Workshops

- Fairness, Accountability, and Transparency in Machine Learning
- Workshop on Responsible Recommendation
- Privacy and Discrimination in Data Mining ICDM2016
- Machine Learning and the Law NIPS2016
- Discrimination and Privacy-Aware Data Mining ICDM2012

### Program codes

- AI Fairness 360 by IBM
- ML-fairness-gym by Google
- Algorithmic Fairness (black-box auditing / fairness-comparison) by the Algorithmic Fairness group
- fairlearn
- TensorFlow fairness-indicators
- FAT-forensics
- fairml Sample implementations of FADM algorithms by Žliobaitė’s group
- Conditional non discrimination by Žliobaitė
- DCUBE: Discrimination Discovery in Databases by Pedreschi, Ruggieri, and Turini
- Learning Fair Classifiers by Zafar