# Fairness-Aware Machine Learning and Data Mining


The goal of fairness-aware machine learning, or fairness-aware data mining, is to analyze data while taking into account potential issues of fairness, discrimination, neutrality, and/or independence. Pedreschi, Ruggieri, and Turini first posed this problem at KDD 2008, and a body of literature on the topic has since emerged.

### General discussion about fairness-aware machine learning

The two major tasks of fairness-aware machine learning (FAML) are unfairness detection and unfairness prevention. An unfairness detection task aims to find unfair treatment in a database. An unfairness prevention task aims to learn a statistical model from a potentially unfair data set so that the sensitive feature does not influence the model’s outcomes. Here, a sensitive feature represents information that should not influence outcomes, such as socially sensitive information or information that a user wants to ignore.
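As a rough illustration of an unfairness detection task, the sketch below compares the rate of positive outcomes across the values of a sensitive feature. The gap between group rates is only one simple indicator, chosen here as an assumption; the function names `positive_rates` and `parity_gap` and the toy records are hypothetical and do not come from the work described on this page.

```python
from collections import defaultdict

def positive_rates(records):
    """Return the fraction of positive outcomes for each sensitive value.

    `records` is an iterable of (sensitive_value, outcome) pairs,
    where outcome is 0 or 1.
    """
    counts = defaultdict(int)
    positives = defaultdict(int)
    for s, y in records:
        counts[s] += 1
        positives[s] += y
    return {s: positives[s] / counts[s] for s in counts}

def parity_gap(records):
    """Largest difference in positive-outcome rate between any two groups."""
    rates = positive_rates(records)
    return max(rates.values()) - min(rates.values())

# Hypothetical toy records: (sensitive feature, decision).
toy = [("a", 1), ("a", 1), ("a", 0), ("a", 1),
       ("b", 0), ("b", 1), ("b", 0), ("b", 0)]
rates = positive_rates(toy)
gap = parity_gap(toy)
```

A gap near zero suggests that decisions are distributed similarly across groups; a large gap flags the data for closer inspection but does not by itself establish unfair treatment.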

## Fairness-aware Classification

A horizontal plane depicts a model sub-space of distributions represented by a parametric model. In a standard classification task, the goal is to find the best parameter, such that the resulting distribution, $$\hat{\Pr}[Y,\mathbf{X},S;\boldsymbol{\Theta}^{\ast}]$$, best approximates a true distribution, $$\Pr[Y,\mathbf{X},S]$$. The best estimated distribution is chosen so as to minimize the divergence between $$\Pr[Y,\mathbf{X},S]$$ and $$\hat{\Pr}[Y,\mathbf{X},S;\boldsymbol{\Theta}^{\ast}]$$ ((4) in the figure).

We now turn to the case of fairness-aware classification. The goal of a fairness-aware classification task is to find a fair estimated model, $$\hat{\Pr}^\circ[Y,\mathbf{X},S;\boldsymbol{\Theta}^{\ast}]$$, that best approximates a fair true distribution, $$\Pr^\circ[Y,\mathbf{X},S]$$. A vertical plane depicts a fair sub-space of distributions that satisfy a pre-specified fairness constraint. A fair true distribution, $$\Pr^\circ[Y,\mathbf{X},S]$$, must be in this fair sub-space. A parametric model of fair estimated distributions, $${\hat{\Pr}}^\circ[Y,\mathbf{X},S;\boldsymbol{\Theta}^{\ast}]$$, must be in the product sub-space of the fair and model sub-spaces, depicted by a thick line in the figure.

The goal of fairness-aware prediction is to find the predictor that minimizes the divergence between a fair true distribution and a fair estimated distribution. Unfortunately, we cannot sample from a fair true distribution, because actual decisions in the real world are potentially unfair. Therefore, the following three types of approaches have been developed.

• Pre-Process — Potentially unfair data are transformed into fair data (1), and a standard classifier is applied (2)
• In-Process — A fair model is learned directly from a potentially unfair dataset (3)
• Post-Process — A standard classifier is first learned (4), and then the learned classifier is modified to satisfy a fairness constraint (5)
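To make the pre-process route concrete, here is a minimal sketch of one simple reweighing scheme: each sample receives the weight Pr[s] Pr[y] / Pr[s, y], so that the weighted data show no dependence between the label and the sensitive feature. This is an illustrative assumption and not necessarily the transformation used in the work summarized here; `reweigh`, `weighted_positive_rate`, and the toy data are hypothetical names and values.

```python
from collections import Counter

def reweigh(sensitive, labels):
    """Return one weight per sample: Pr[s] * Pr[y] / Pr[s, y] (empirical)."""
    n = len(labels)
    count_s = Counter(sensitive)
    count_y = Counter(labels)
    count_sy = Counter(zip(sensitive, labels))
    return [
        count_s[s] * count_y[y] / (n * count_sy[(s, y)])
        for s, y in zip(sensitive, labels)
    ]

def weighted_positive_rate(sensitive, labels, weights, group):
    """Weighted fraction of positive labels within one sensitive group."""
    total = sum(w for s, w in zip(sensitive, weights) if s == group)
    positive = sum(
        w * y for s, y, w in zip(sensitive, labels, weights) if s == group
    )
    return positive / total

# Hypothetical toy data: group "a" receives positive labels more often.
sensitive = ["a", "a", "a", "a", "b", "b", "b", "b"]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
weights = reweigh(sensitive, labels)
rate_a = weighted_positive_rate(sensitive, labels, weights, "a")
rate_b = weighted_positive_rate(sensitive, labels, weights, "b")
```

After reweighing, both groups have the same weighted positive rate. The weights could then be passed as sample weights to a standard classifier, which corresponds to steps (1) and (2) of the pre-process approach.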

### Analysis of the influence of biases on fairness

We analyzed the influence of a model bias and a decision rule on fairness.

### Logistic Regression with Prejudice Remover Regularizer

We formulated a fairness-aware classification model as an optimization problem. A penalty term that enhances the statistical independence between a target variable and a sensitive feature is adopted as a regularizer.
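The sketch below shows how such an objective might be evaluated for a logistic model: a mean log loss, plus an independence penalty scaled by `eta`, plus an L2 term. The penalty here is a plug-in estimate of the mutual information between the predicted label and the sensitive feature. This is an assumed form chosen for illustration; the exact regularizer, its scaling, and the parameter names `eta` and `lam` are not taken from this page.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, x):
    """Pr[Y=1 | x] under a logistic model with a linear score."""
    return sigmoid(sum(w * v for w, v in zip(weights, x)))

def independence_penalty(probs, sensitive):
    """Plug-in mutual information between the predicted label and S.

    probs[i] is the model's Pr[Y=1 | x_i, s_i]; sensitive[i] is s_i.
    """
    n = len(probs)
    p_y1 = sum(probs) / n  # marginal Pr[Y=1]
    groups = {}
    for p, s in zip(probs, sensitive):
        groups.setdefault(s, []).append(p)
    penalty = 0.0
    for ps in groups.values():
        p_y1_s = sum(ps) / len(ps)  # Pr[Y=1 | S=s]
        for p_ys, p_y in ((p_y1_s, p_y1), (1.0 - p_y1_s, 1.0 - p_y1)):
            if p_ys > 0.0:
                penalty += (len(ps) / n) * p_ys * math.log(p_ys / p_y)
    return penalty

def objective(weights, X, y, sensitive, eta=1.0, lam=0.01):
    """Mean log loss + eta * independence penalty + L2 regularizer."""
    probs = [predict_proba(weights, x) for x in X]
    eps = 1e-12  # guards against log(0)
    nll = -sum(
        t * math.log(p + eps) + (1 - t) * math.log(1.0 - p + eps)
        for p, t in zip(probs, y)
    ) / len(y)
    l2 = 0.5 * lam * sum(w * w for w in weights)
    return nll + eta * independence_penalty(probs, sensitive) + l2

# Hypothetical toy data: each row is [bias term, sensitive feature].
X = [[1, 0], [1, 0], [1, 1], [1, 1]]
y = [0, 0, 1, 1]
s = [0, 0, 1, 1]
dependent = independence_penalty([predict_proba([0.0, 2.0], x) for x in X], s)
neutral = independence_penalty([predict_proba([0.0, 0.0], x) for x in X], s)
```

The penalty is zero when the predicted probabilities do not vary with the sensitive feature and positive otherwise, so raising `eta` trades accuracy for independence. In practice the objective would be minimized with a gradient-based optimizer, which is omitted here.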

## Independence-Enhanced Recommender System

Recommendation independence is defined as unconditional statistical independence between a recommendation result and a sensitive feature. An independence-enhanced recommender system (IERS) is a recommender system that maintains recommendation independence. Examples of IERS applications include adherence to laws and regulations by the recommendation service, fair treatment of content providers, and exclusion of unwanted information.

### A Regularization Approach for a Find-Good-Items Task

A regularization approach is applied to a find-good-items task, whose aim is to rank items according to the degree of preference.

### A Model-based Approach for a Predicting-Ratings Task

We modified a probabilistic latent semantic analysis model that is designed for predicting the ratings given by a user. A sensitive feature is added to the model so that it is independent of the rating variable.

### A Regularization Approach for a Predicting-Ratings Task

We introduce a penalty term into a probabilistic matrix factorization model to enhance independence. The penalty term is designed to quantify the degree of independence between a binary sensitive feature and a predicted rating score.
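One plausible form of such a penalty, shown here purely as an assumption, is the squared difference between the mean predicted ratings of the two sensitive groups. The sketch adds it to a squared-error loss; `predict`, `mean_match_penalty`, `regularized_loss`, and the weight `eta` are hypothetical names, and the actual penalty used in the model may differ.

```python
def predict(user_vec, item_vec, bias=0.0):
    """Predicted rating: bias plus the inner product of latent factors."""
    return bias + sum(u * v for u, v in zip(user_vec, item_vec))

def mean_match_penalty(predictions, sensitive):
    """Squared gap between mean predictions of groups s=0 and s=1."""
    g0 = [r for r, s in zip(predictions, sensitive) if s == 0]
    g1 = [r for r, s in zip(predictions, sensitive) if s == 1]
    if not g0 or not g1:
        return 0.0  # the penalty is undefined with a single group
    return (sum(g0) / len(g0) - sum(g1) / len(g1)) ** 2

def regularized_loss(ratings, predictions, sensitive, eta=1.0):
    """Squared error plus eta times the independence penalty."""
    squared_error = sum((r - p) ** 2 for r, p in zip(ratings, predictions))
    return squared_error + eta * mean_match_penalty(predictions, sensitive)

# Hypothetical predictions for four (user, item) pairs.
balanced = mean_match_penalty([4.0, 2.0, 3.0, 3.0], [0, 0, 1, 1])
skewed = mean_match_penalty([5.0, 5.0, 1.0, 1.0], [0, 0, 1, 1])
```

Matching group means captures only first-order dependence; two groups can share a mean and still differ in distribution, so this penalty is a coarse proxy for statistical independence.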