[1801.05398] On the Direction of Discrimination: An Information-Theoretic Analysis of Disparate Impact in Machine Learning

amarashar's bookmarks 2018-02-05


In the context of machine learning, disparate impact refers to a form of systematic discrimination whereby the output distribution of a model depends on the value of a sensitive attribute (e.g., race or gender). In this paper, we present an information-theoretic framework to analyze the disparate impact of a binary classification model. We view the model as a fixed channel, and quantify disparate impact as the divergence in output distributions over two groups. We then aim to find a correction function that can be used to perturb the input distributions of each group in order to align their output distributions. We present an optimization problem that can be solved to obtain a correction function that will make the output distributions statistically indistinguishable. We derive a closed-form expression for the correction function that allows it to be computed efficiently. We illustrate the use of the correction function on a recidivism prediction application derived from the ProPublica COMPAS dataset.
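
As a rough illustration of measuring disparate impact as a divergence between the output distributions of two groups, the sketch below compares the positive-prediction rates of a binary classifier across two groups. The abstract does not say which divergence the paper uses, so the Kullback-Leibler divergence here is an assumption, and the names `positive_rate`, `bernoulli_kl`, `group_a`, and `group_b` and the toy predictions are hypothetical.

```python
import math


def positive_rate(predictions):
    """Fraction of binary predictions (0 or 1) that equal 1."""
    return sum(predictions) / len(predictions)


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence D(Bernoulli(p) || Bernoulli(q)) in nats.

    Probabilities are clipped to [eps, 1 - eps] to avoid log(0).
    KL is an assumed choice of divergence, not taken from the paper.
    """
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


# Hypothetical model outputs for two groups defined by a sensitive attribute.
group_a = [1, 1, 0, 1, 0, 1, 1, 0]
group_b = [0, 1, 0, 0, 0, 1, 0, 0]

rate_a = positive_rate(group_a)
rate_b = positive_rate(group_b)

# A value of zero means the two output distributions coincide;
# larger values mean the model's output depends more on group membership.
gap = bernoulli_kl(rate_a, rate_b)
print(f"rate_a={rate_a:.3f} rate_b={rate_b:.3f} divergence={gap:.4f}")
```

This only measures the gap. The correction function described in the abstract, which perturbs each group's input distribution to close that gap, is not reproduced here.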



From feeds:

Ethics/Gov of AI » amarashar's bookmarks



Date tagged:

02/05/2018, 11:23

Date published:

02/05/2018, 06:23