
Prediction-Based Decisions and Fairness:

A Catalogue of Choices, Assumptions, and Definitions

Shira Mitchell, Civis Analytics

[email protected]

Eric Potash, University of Chicago

[email protected]

Solon Barocas, Microsoft Research and Cornell University

[email protected]

Alexander D’Amour, Google Research

[email protected]

Kristian Lum, University of Pennsylvania

[email protected]

April 28, 2020

Abstract

A recent flurry of research activity has attempted to quantitatively define “fairness” for decisions based on statistical and machine learning (ML) predictions. The rapid growth of this new field has led to wildly inconsistent terminology and notation, presenting a serious challenge for cataloguing and comparing definitions. This paper attempts to bring much-needed order.

First, we explicate the various choices and assumptions made—often implicitly—to justify the use of prediction-based decisions. Next, we show how such choices and assumptions can raise concerns about fairness and we present a notationally consistent catalogue of fairness definitions from the ML literature. In doing so, we offer a concise reference for thinking through the choices, assumptions, and fairness considerations of prediction-based decision systems.

1 Introduction

Prediction-based decision-making has swept through industry and is quickly making its way into government. These techniques are already common in lending [52, 84, 42], hiring [88, 89, 55], and online advertising [119], and increasingly figure into decisions regarding pretrial detention [3, 80, 27], immigration detention [77], child maltreatment screening [122, 34, 13], public health [95, 105], and welfare eligibility [34]. Across these domains, decisions are based on predictions of an outcome deemed relevant to the decision. In recent years, attention has focused on how consequential prediction models may be “biased” – a now overloaded word that in popular media has come to mean that the model’s predictive performance (however defined) unjustifiably differs across disadvantaged groups along social axes such as race, gender, and class. Uncovering and rectifying such “biases” via alterations to standard statistical and machine learning models has motivated a field of research we will call Fairness in Machine Learning (ML).

Fairness in ML has been explored in popular books [98, 34] and a White House report [35], surveyed in technical review papers [8, 40, 15, 126] and an in-progress textbook [5], and has inspired a number of software packages [132, 38, 2, 43, 47]. Though the ML fairness conversation is somewhat new, it resembles older work. For example, since the 1960s, psychometricians have studied the fairness of educational tests based on their ability to predict performance (at school or work) [16, 120, 24, 33, 104, 57, 81, 58]. More recently, Dorans and Cook review broader notions of fairness, including in test design and administration [30].


Importantly, the Fairness in ML field is not purely mathematical. Any definition of fairness necessarily encodes social goals in mathematical formalism. Thus, the formalism itself is not meaningful outside of the particular social context in which a model is built and deployed. For this reason, the goal of this paper is not to advance axiomatic definitions of fairness, but to summarize the definitions and results in this area that have been formalized to date. Our hope with this paper is to contribute a concise, cautious, and reasonably comprehensive catalogue of the important fairness-relevant choices that are made in designing a predictive model, the assumptions that underlie many models where fairness is a concern, and some metrics and methods for evaluating the fairness of models. Alongside this summary, we point out gaps between mathematically convenient formalism and the larger social goals that many of these concepts were introduced to address.

Throughout this article, we ground our theoretical and conceptual discussion of Fairness in ML in real-world example cases that are prevalent in the literature: pretrial risk assessment and lending models. In a typical pretrial risk assessment model, information about a person who has been arrested is used to predict whether they will commit a (violent) crime in the future or whether they will fail to appear for court if released. These predictions are often based on demographic information, the individual’s criminal history, and sometimes their responses to more in-depth interview questions. These predictions are then used to inform a judge, who must make an extremely consequential decision: whether the person who has been arrested should be released before their case has concluded, and if so, under what conditions. Although the options available to the judge or magistrate are many, including simply releasing the person, setting bail, requiring participation in a supervised release program, etc., in this literature, this decision is often reduced to a binary decision: release or detain. In lending, the task is to predict whether an individual will repay a loan if one is granted. These predictions are based on employment, credit history, and other covariates. The loan officer then incorporates this prediction into their decision-making to decide whether the applicant should be granted a loan, and if so, under what terms. In the Fairness in ML literature, the decision space is also often reduced to the decision to grant or deny the loan.

The paper proceeds as follows. Section 2 outlines common choices and assumptions that are often considered outside the scope of the model but have material consequences for the fairness of the model’s performance in practice. In Section 3, as a segue to the slightly more mathematical parts of the paper, we introduce our main setup and notation. Section 4 begins the discussion of the various mathematical notions of fairness that have been introduced, including tensions and impossibilities among them. In Section 5 we explore causal frameworks for reasoning about algorithmic fairness. Section 6 offers some suggestions and ideas for future work in this area, and Section 7 concludes.

2 Choices, assumptions, and considerations

Several recent papers have demonstrated how social goals are, sometimes clumsily, abstracted and formulated to fit into a prediction task [23, 114, 48, 49, 94, 29, 113, 32, 100]. In this section, we link these socio-technical concerns to choices and assumptions made in the policy design process. Broadly, these assumptions take the problem of evaluating the desirability of a policy, and reduce it to the simpler problem of evaluating the characteristics of a model that predicts a single outcome.

2.1 The policy question

Much of the technical discussion in Fairness in ML takes as given the social objective of deploying a model, the set of individuals subject to the decision, and the decision space available to decision-makers who will interact with the model’s predictions. Each of these is a choice that—although sometimes prescribed by policies or people external to the immediate model building process—is fundamental to whether the model will ultimately advance fairness in society, however defined.


2.1.1 The over-arching goal

Models for which fairness is a concern are typically deployed in the service of achieving some larger goal. For a benevolent social planner, this may be some notion of justice or social welfare [56]. For a criminal justice actor, this goal may be reducing the number of people who are detained before their trial while simultaneously protecting public safety. For a bank making lending decisions, the goal may be maximizing profits. Often there are genuinely different and conflicting goals, which are not resolved by more data [34, 99]. If one disagrees with the over-arching goal, a model that successfully advances this goal—regardless of whether it attains any of the mathematical notions of fairness discussed here—will not be acceptable [100].

Prediction-based decision systems often implicitly assume the pursuit of the over-arching social goal will be served by better predicting some small number of outcomes. For example, in pretrial decisions, outcomes of interest are typically crime (measured as arrest or arrest for a violent crime) and non-appearance in court. In contrast, human decision-makers may consider several outcomes, including impacts on a defendant’s well-being or caretaker status [9]. In making decisions about college admissions, it may be tempting to narrow the larger goals of higher education to simply the future GPAs of admitted students [73]. Narrowing focus to a chosen, measured outcome can fall short of larger goals (this is sometimes called omitted payoff bias [72]).

Furthermore, prediction-based decision systems usually focus only on outcomes under one decision (e.g. crime if released) and assume the outcome under an alternative decision is known (e.g. crime if detained). More recent work has formalized this oversight using the language of counterfactuals and potential outcomes [20]. Finally, prediction-based decisions are formulated by assuming progress towards the over-arching goal can be expressed as a scalar utility function that depends only on decisions and outcomes; see Section 3.

2.1.2 The population

A model’s predictions are not applied to all people, but typically to a specific sub-population. In some cases, individuals choose to enter this population; in other cases, they do not. In pretrial decisions, the population is people who have been arrested. In lending decisions, the population is loan applicants. These populations are sampled from a larger population by some mechanism, e.g. people are arrested because a police officer determines that the individual’s observed or reported behavior is sufficiently unlawful to warrant an arrest; creditors target potential applicants with offers or applicants independently decide to apply to a particular lending company for a loan. The mechanism of entry into this population may reflect objectionable social structures, e.g. policing that targets racial minorities for arrest [1] and discrimination in loan pre-application screening [21]. A model that satisfies fairness criteria when evaluated only on the population to which the model is applied may overlook unfairness in the process by which individuals came to be subject to the model in the first place.

2.1.3 The decision space

The decision space is the set of actions available to a decision maker. For example, in a simplified pre-trial context, the decision space might consist of three options: release the arrested person on recognizance, set bail that must be paid to secure the individual’s release, or detain the individual. In the lending example, the decision space might only consist of the options to grant or deny the loan application. Both of these decision spaces leave out many other possible interventions or options. For example, a different pre-trial decision space might include the option that the judge recommend the person who has been arrested make use of expanded pretrial services [75, 125, 86], including offering funds to help defendants with transportation to court, paying for childcare, text message reminders of court dates [116, 17, 121, 93], drug counseling, or locating job opportunities. In lending, a broader decision space could include providing longer-term, lower-interest loans. In several contexts, one could consider replacing individual-level with community-level policies. While mathematical definitions of the ML fairness literature discussed below may be able to certify the fair allocation of decisions across a population, they have nothing to say about whether any of the available actions are acceptable in the first place.


2.2 The statistical learning problem

Mathematical definitions of fairness generally treat the statistical learning problem that is used to formulate a predictive model as external to the fairness evaluation. Here, too, there are a number of choices that can have larger social implications but go unmeasured by the fairness evaluation. We focus specifically on the choices of training data, model, and predictive evaluation.

2.2.1 The data

The foundation of a predictive model is the data on which it is trained. The data available for any consequential prediction task, especially data measuring and categorizing people, can induce undesirable properties when used as a basis for decision-making. In the Fairness in ML literature, data with these undesirable properties are often labeled informally as “biased” (e.g., [64, 6, 12, 82]). Here, we decompose this notion of bias into more precise notions of statistical bias—i.e. concerns about non-representative sampling and measurement error—and societal bias—i.e. concerns about objectionable social structures that are represented in the data (Figure 1). We treat each of these notions in turn.

Figure 1: A cartoon showing two components of “biased data”: societal bias (retrospective injustice separating the world as it should and could be from the world as it is) and statistical bias (non-representative sampling and measurement error separating the world as it is from the world according to the data).

Statistical bias Here we consider statistical bias to be a systematic mismatch between the sample used to train a predictive model and the world as it currently is. Specifically, we consider how sampling bias and measurement error can induce fairness implications that are usually unmeasured by mathematical fairness definitions.

Sampling bias occurs when a dataset is not representative of the full population to which the resulting model will be applied. For example, in pretrial decisions, the sample used to train the model is typically drawn from the population of people who were released pre-trial. For people who were not released before their case concluded, it is not possible to directly measure typical prediction outcomes, like re-arrest, as they do not have the opportunity to be re-arrested. These people are typically excluded from training data sets. In the lending example, models may be based only on those individuals who were granted a loan, as it is not possible to measure the outcome variable – whether they would have repaid the loan had it been granted. This problem is sometimes called selective labels [72, 79, 13, 18, 26]. Incorrectly assuming the sample is representative can lead to biased estimation of conditional probabilities (e.g. probability of crime given covariates), biased estimation of utility, and inadequate fairness adjustments [63].
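A toy simulation (an entirely hypothetical setup, numpy only) of the selective labels problem, showing how a sample restricted to released defendants can misrepresent the full arrested population:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 100_000

    # Full population of arrested people: x is a risk-related covariate,
    # y = 1 indicates re-arrest if released (a potential outcome).
    x = rng.normal(size=n)
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(x - 1.0))))

    # Historical releases favored low-x individuals, so outcome labels are "selective".
    released = rng.binomial(1, 1.0 / (1.0 + np.exp(2.0 * x))).astype(bool)

    print("re-arrest rate, full population:    ", round(y.mean(), 3))
    print("re-arrest rate, released (observed):", round(y[released].mean(), 3))
    # Any model trained only on the released sample inherits this mismatch between
    # the observed sample and the population to which the model will be applied.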


Under a missing at random assumption, modeling could hope to avoid this selection bias [46]. But there can be regions of the covariates where no data exist to fit a model; there may be values of a covariate for which no defendants were ever released, and hence the outcome in that region is unobserved [15]. One option is extrapolation: fit the model to released defendants, then apply the model to all defendants even if this includes new regions of the variables. However, ML models can perform poorly with such a shift in covariates [106]. Another option is to trust previous decisions: assume that regions of the covariates where no defendants were released are regions where all defendants would have committed crimes if released [26]. Though convenient from a modeling perspective, these types of assumptions are unlikely to hold.

Similarly, systematic measurement error, particularly when the error is greater for some groups than others (known as differential measurement error [123]), can have profound consequences. In lending, some measures of past success in loan repayment (measures that may be used to predict future loan repayment) only account for repayment of loans through formal banking institutions. At least historically, immigrant communities were more likely to engage in informal lending, and so measurements of past success in loan repayment may systematically understate the value of past loans repaid for people who participate in informal lending relative to those who go through formal channels [25]. Similar issues exist in our running example of pretrial risk assessment, where commission of a crime is often measured by re-arrest during the pre-trial period. Individuals and groups that are more likely to be re-arrested following the commission of a crime will be measured in the data as more criminal, regardless of whether those differences are the result of differences in true underlying rates of commission of the crime or of bias in the process by which the decision to arrest is made.

Additionally, the perceived fairness of a model may hinge on measurement choices that incorporate moral or normative arguments. For example, in one author’s experience, recent updates to a pre-trial risk assessment tool have incorporated changes such that past failures to appear for court appointments have a sunset window for inclusion in the model. In the name of fairness, after the sunset window they can no longer be “counted against” the defendant. In lending, it is already the case that bankruptcy is erased from one’s credit report after seven years (Chapter 13 bankruptcy) or 10 years (Chapter 7 bankruptcy). These measurement considerations have less to do with the accuracy or bias of the measurement and more to do with normative decisions about how a person ought to be evaluated.

Societal bias Even if training data are representative and accurate, they may still record objectionable social structures that, if encoded in a policy, run counter to the decision-maker’s goals. This corresponds to a non-statistical notion of societal bias [118]. There is overlap between the two concepts, e.g. using arrests as a measure of crime can introduce statistical bias from measurement error that is differential by race because of a racist policing system [1, 85]. But suppose we could perfectly measure crime: does this make the data free from “bias”? In a statistical sense, yes.1 In a normative sense, no, because crime rates reflect societal bias, including how crime is defined [107]. In general, addressing issues of societal bias may require adjusting data collection processes, or may not have technical solutions at all.

2.2.2 The predictive model

Statistical and machine learning models are designed to identify patterns in the data used to train them. As such, they will reproduce, to the extent that they are predictive, unfair patterns encoded in the data or unfairness in the problem formulation described above.

Furthermore, there are many choices for building a model. Indeed, this is the subject of much of statistics and machine learning research. Choices include a class of model, a functional form, and model parameters, among others. These choices, too, have ramifications for fairness. For example, it has been shown that different functional forms can differ in their estimates for individuals and disparities between groups of individuals [14].

1 In statistics, “bias” refers to properties of an estimator, not data. Here we mean bias in the estimation of conditional probabilities or fairness metrics that could result from non-representative data, measurement error, or model misspecification.


Models also differ in their interpretability. For example, some use point systems, where the presence or absence of characteristics is assigned integer point values. The individual’s final score or prediction is then the sum of those integer-valued points. These and other simple models can facilitate ease of understanding of how perturbations to the inputs might impact the model’s predictions. Human-interpretable models allow insight into cases in which the model will produce counter-intuitive or non-sensical results, and can help humans catch errors when they occur [112].

Finally, there is the choice of which covariates to include in the model. Outcome predictions can change depending on what we condition on. Person A can have a higher predicted value than Person B with a choice of covariates V, but a lower predicted value with a choice of covariates V′.

2.2.3 Evaluation

Fairness concerns aside, predictive models are typically built, evaluated, and selected using various measures of their predictive performance, such as (for continuous predictions) mean squared error or (for binary predictions) positive predictive value, sensitivity, specificity, etc.

We note that such evaluations generally make three important assumptions. First, they assume that decisions can be evaluated as an aggregation of separately evaluated individual decisions. This includes assuming that outcomes are not affected by the decisions for others, an assumption known as no interference [59]. In the loan setting, denying one family member a loan may directly impact another family member’s ability to repay their own loan. If both are evaluated by the same model, this dependence is in conflict with the “separately” assumption.

This assumption resembles beliefs from utilitarianism, which represents social welfare as a sum of individual utilities [108]. With binary predictions and decisions, each individual decision’s utility can in turn be expressed in terms of four utilities, one for each possibility of true positive, false positive, false negative, and true negative.

Second, these evaluations assume that all individuals can be considered symmetrically, i.e. identically. This assumes, for example, that the harm of denying a loan to someone who could repay is equal across people. Denying someone a loan for education will likely have a very different impact on their life than denying a different person a similarly sized loan for a vacation home.

Third, these evaluations assume that decisions are evaluated simultaneously. That is, they are evaluated in a batch (as opposed to serially [15]) and so do not consider potentially important temporal dynamics. For example, Harcourt shows that if the population changes their behavior in response to changing decision probabilities, prediction-based decisions can backfire and diminish social welfare [51].

A fundamental question of Fairness in ML is to what extent prediction measures making these assumptionsare relevant to fairness.

2.3 Axes of fairness and protected groups

A final choice in mathematical formulations of fairness is the axis (or axes) along which fairness is measured. For example, much of the ML fairness literature considers the simple case of two groups (advantaged and disadvantaged). In this setting, typically only one variable is selected as a “sensitive attribute,” which defines the mapping to the advantaged and disadvantaged groups. Deciding how attributes map individuals to these groups is important and highly context-specific. Does race determine advantaged group membership? Does gender? In regulated domains such as employment, credit, and housing, these so-called “protected characteristics” are specified in the relevant discrimination laws. Even in the absence of formal regulation, though, certain attributes might be viewed as sensitive, given specific histories of oppression, the task at hand, and the context of a model’s use. Differential treatment or outcomes by race is often concerning, even when it does not violate a specific law, but which racial groups are salient will often vary by cultural context.

Though most Fairness in ML work considers only one sensitive attribute at a time, discrimination might affect members at the intersection of two groups, even if neither group experiences discrimination in isolation [22]. This fact was most famously highlighted in the influential work of Crenshaw, who analyzed failed employment discrimination lawsuits involving black women, who could seek recourse only against discrimination as black women, a claim they were unable to establish simply as sex discrimination (since it did not apply to white women) or as race discrimination (since it did not apply to black men) [22]. More recently, the importance of considering combinations of attributes to define advantage and disadvantage in Fairness in ML applications is seen in [10], which evaluated commercial gender classification systems and found that darker-skinned females are the most misclassified group.

3 Setup and notation

We now turn to mathematical formulations of fairness, having presented a number of key choices, assumptions, and considerations that make this abstraction possible. Here, we introduce notation for some canonical problem formulations considered in the Fairness in ML literature. We follow much of the recent discourse in this literature and focus on fairness in the context of binary decisions that are made on the basis of predictions of binary outcomes.

We consider a population about whom we want to make decisions. We index people by i = 1, ..., n. We assume this finite population (of size n) is large enough to approximate the “superpopulation” [46] distribution from which they were drawn, and refer to both as the “population”. Each person has covariates (i.e. features, variables, or inputs) vi ∈ V that are known at decision time. In some cases, we can separate these into sensitive variable(s) ai (e.g. race, gender, or class) and other variables xi, writing vi = (ai, xi).

A binary decision di is made for each person. We restrict decisions to be functions of variables known at decision time, δ : V → {0, 1}, where di = δ(vi). We define random variables V, Y, D = δ(V) as the values of a person randomly drawn from the population.

In prediction-based decisions, decisions are made based on a prediction of an outcome, yi, that is unknown at decision time. Specifically, decisions are made by first estimating the conditional probability

P [Y = 1|V = vi].

The decision system does not know the true conditional probability; instead it uses ψ : V → [0, 1], where si = ψ(vi) is a score intended to estimate P [Y = 1|V = vi]. Let S = ψ(V ) be a random score from the population. A prediction-based decision system then has a decision function δ that is a function of the score alone, i.e. δ(v) = f(ψ(v)) for some function f. Both ψ and δ are functions of a sample of {(vi, yi)} that we hope resembles the population.

For example, in pre-trial risk assessment we predict whether an individual i will be re-arrested (yi = 1) or not (yi = 0) using vi, summaries of criminal history and basic demographic information. A decision is then made to detain (di = 1) or release (di = 0) the individual on the basis of the prediction. In the lending setting, the goal is to predict whether an individual will default on (yi = 0) or repay (yi = 1) a loan as a function of vi, their credit history. The decision space consists only of the decision to deny (di = 0) or grant (di = 1) the loan. In both examples, we choose notation such that when yi = di, the “correct” decision has been made.
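To make the notation concrete, here is a minimal Python sketch (synthetic data and hypothetical names such as psi and delta, not from the paper) encoding the lending example, with si = ψ(vi), di = δ(vi), and yi = di marking a “correct” decision:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 1000

    # Covariates v_i = (a_i, x_i): a_i a binary sensitive attribute,
    # x_i a crude credit-history summary (both synthetic).
    a = rng.integers(0, 2, size=n)
    x = rng.normal(loc=0.5 * a, scale=1.0, size=n)

    def psi(a, x):
        # A hypothetical score estimating P[Y = 1 | V = v] (repayment);
        # this particular psi happens to ignore a.
        return 1.0 / (1.0 + np.exp(-(0.2 + x)))

    s = psi(a, x)                       # scores s_i = psi(v_i)
    y = rng.binomial(1, s)              # outcomes y_i (repay = 1, default = 0)

    def delta(score, c=0.5):
        # A decision rule depending on v only through the score: grant = 1, deny = 0.
        return (score >= c).astype(int)

    d = delta(s)
    print("fraction of 'correct' decisions (d_i == y_i):", np.mean(d == y))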

4 Flavors of fairness definitions from data alone

We begin our exposition of formal fairness definitions with so-called “oblivious” definitions of fairness [52] that depend on the observed data alone. These definitions equate fairness with certain parities that can be derived from the distributions of the observed features V, outcomes Y, scores S, and decisions D, without reference to additional structure or context. These stand in contrast to non-oblivious fairness definitions, presented in Section 5.

4.1 Unconstrained utility maximization and single-threshold fairness

As a default, we consider a definition of fairness, which we call single-threshold fairness, that is fully compatible with simply maximizing a specific kind of utility function without treating fairness as a separate consideration. Here, a decision is considered to be fair if individuals with the same score si = ψ(vi) are treated equally, regardless of group membership [18].

This notion of fairness is connected to utility maximization by a set of results showing that, for a certain set of utility functions and scores ψ(v), the utility-maximizing decision rule δ is necessarily a single-threshold rule [68, 7, 19, 82]. Rules of this form select a threshold c and apply one decision to individuals with scores below c and another to individuals with scores above c. Formally, we have

δ(v) = I(ψ(v) ≥ c).

For example, if the outcome is loan repayment, individuals with scores above the threshold would be granted a loan while those with scores below would be denied. A key condition for the optimality of single-threshold rules is that the score ψ(v) is a good approximation to the true conditional probability P [Y = 1|V = v].

Moreover, since the optimal threshold depends only on the utility function, the single-threshold rule maximizes utility within any subgroup as well. In this sense the single-threshold rule is a viable group-sensitive definition of fairness, as it is optimal for both groups under the outlined assumptions.
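As a brief worked illustration of how a threshold can follow from a utility function of the kind assumed in Section 2.2.3, consider this sketch (the utility values are illustrative, not from the paper):

    import numpy as np

    # Per-decision utilities for true positive, false positive, false negative,
    # and true negative outcomes (illustrative values only), in the lending
    # framing of Section 3 where d = 1 grants the loan and y = 1 means repayment.
    u_tp, u_fp, u_fn, u_tn = 1.0, -2.0, 0.0, 0.0

    # Granting is preferred when p*u_tp + (1-p)*u_fp >= p*u_fn + (1-p)*u_tn,
    # which (for u_tp > u_fn and u_tn > u_fp) reduces to a single threshold on p:
    c = (u_tn - u_fp) / ((u_tn - u_fp) + (u_tp - u_fn))
    print("threshold c =", c)            # 2/3 with these utilities

    def delta(score):
        # Single-threshold rule: the same cutoff for every individual,
        # regardless of group membership.
        return (np.asarray(score) >= c).astype(int)

    print(delta([0.5, 0.7, 0.9]))        # -> [0 1 1]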

The desirability of single-threshold rules is sensitive to a number of the choices outlined in Section 2. First, these rules are a direct function of the score ψ(v) and the utility function used to evaluate the decision δ(v). These functions are specified by the decision-maker, and are sensitive to data collection, measurement, and modeling choices (Section 2.2). In addition, the optimality results here make strong assumptions about the form of the utility function used to evaluate the decision δ(v). In particular, they only hold for utility functions which satisfy the separate, symmetric, and simultaneous assumptions of Section 2.2.3.

These sensitivities motivate notions of fairness that are external to the utility maximization problem, and which can be evaluated without taking scoring models or utility functions for granted.

4.2 Equal prediction measures

If the impacts of decisions are contained within groups, their utilities can be considered group-specific. A notion of fairness might ask these to be equal.

When false positives and false negatives have equal cost, this corresponds to the fairness definition of Equal accuracy: P [D = Y |A = a] = P [D = Y |A = a′]. This definition is based on the understanding that fairness is embodied by the predictions being “correct” at the same rate among groups. For example, we might want a medical diagnostic tool to be equally accurate for people of all races or genders.

Instead of comparing overall accuracies, we could restrict the comparison to subsets of our advantaged and disadvantaged groups defined by their predictions or their outcomes. All such possible subsets are summarized by a confusion matrix, which illustrates match and mismatch between Y and D, with margins expressing conditioning on subsets; see Figure 2, which defines common terminology for the quantities contained in this table that will be discussed in this manuscript.

The cells of the confusion matrix and its margins are:

Cells: D = 1, Y = 1: True Positive (TP); D = 1, Y = 0: False Positive (FP); D = 0, Y = 1: False Negative (FN); D = 0, Y = 0: True Negative (TN).

Conditional on the decision (row margins): P(Y=1|D=1): Positive Predictive Value (PPV); P(Y=0|D=1): False Discovery Rate (FDR); P(Y=1|D=0): False Omission Rate (FOR); P(Y=0|D=0): Negative Predictive Value (NPV).

Conditional on the outcome (column margins): P(D=1|Y=1): True Positive Rate (TPR); P(D=0|Y=1): False Negative Rate (FNR); P(D=1|Y=0): False Positive Rate (FPR); P(D=0|Y=0): True Negative Rate (TNR).

Overall: P(D=Y): Accuracy.

Figure 2: Confusion matrix defining terminology used in this article for relationships between Y and D.
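As an illustration, a small helper (a hypothetical function, numpy only) that computes the quantities of Figure 2 from observed outcomes and decisions:

    import numpy as np

    def confusion_rates(y, d):
        # Quantities of Figure 2 for binary outcomes y and decisions d.
        y, d = np.asarray(y), np.asarray(d)
        tp = np.sum((d == 1) & (y == 1))
        fp = np.sum((d == 1) & (y == 0))
        fn = np.sum((d == 0) & (y == 1))
        tn = np.sum((d == 0) & (y == 0))
        return {
            "PPV": tp / (tp + fp),        # P(Y=1 | D=1)
            "FDR": fp / (tp + fp),        # P(Y=0 | D=1)
            "FOR": fn / (fn + tn),        # P(Y=1 | D=0)
            "NPV": tn / (fn + tn),        # P(Y=0 | D=0)
            "TPR": tp / (tp + fn),        # P(D=1 | Y=1)
            "FNR": fn / (tp + fn),        # P(D=0 | Y=1)
            "FPR": fp / (fp + tn),        # P(D=1 | Y=0)
            "TNR": tn / (fp + tn),        # P(D=0 | Y=0)
            "accuracy": (tp + tn) / len(y),
        }

    # Group-specific versions follow by subsetting on a sensitive attribute a,
    # e.g. confusion_rates(y[a == 0], d[a == 0]) vs. confusion_rates(y[a == 1], d[a == 1]).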


4.2.1 Definitions from the confusion matrix

For any box in the confusion matrix involving the decision D, we can require equality across groups. For example, we could define fairness by Equality of False Positive Rates by requiring that the model satisfy P [D = 1|Y = 0, A = a] = P [D = 1|Y = 0, A = a′]. All other cells of the confusion matrix can analogously define fairness by adding the conditioning on the sensitive group attribute A. We list common definitions of fairness that have been proposed in the literature from the margins of the confusion matrix, grouped by pairs that sum to one. Equality of one member of the pair immediately implies equality of the other.

Conditional on outcome First consider conditioning on the outcome Y. This leads to two pairs of fairness definitions. The first pair, Equality of False Positive Rates and Equality of True Negative Rates, conditions on Y = 0 and is equivalent to requiring D ⊥ A | Y = 0. This definition of fairness demands equality from the perspective of those individuals with the outcome defined by zero. For example, this reflects the perspective of “innocent” defendants, in requiring that all individuals who do not go on to be re-arrested have the same likelihood of being released, regardless of whether they are in the advantaged or disadvantaged group [92].

The second pair, Equality of True Positive Rates and Equality of False Negative Rates, is equivalent to D ⊥ A | Y = 1. This definition of fairness takes the perspective of those individuals with the outcome defined by one. For example, this is equivalent to the requirement that all individuals who will go on to repay a loan have the same likelihood of receiving a loan, regardless of their sensitive group membership. This condition alone has been called equal opportunity [52]. Taken together with Equality of FPR/TNR, these notions of fairness are called error rate balance [12], separation [5], or equalized odds [52].

These two pairs reflect a fairness notion that people with the same outcome are treated the same, regardless of sensitive group membership. We posit that this notion of fairness is more closely aligned with the perspective of the population evaluated by the model, as it demands that people who are actually similar (with respect to their outcomes) be treated similarly. [52] gives an algorithm for model fitting that is optimal with respect to this notion of fairness.
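A sketch of how error rate balance / equalized odds might be checked empirically (toy simulated data and hypothetical names, not from the paper); here decisions depend on A only through Y, so the check passes by construction:

    import numpy as np

    rng = np.random.default_rng(2)
    n = 50_000
    a = rng.integers(0, 2, size=n)
    y = rng.binomial(1, 0.3 + 0.2 * a)       # unequal base rates across groups
    d = rng.binomial(1, 0.2 + 0.6 * y)       # decisions depend on A only through Y

    def rate(d, y, a, group, y_val):
        # Estimate P(D = 1 | Y = y_val, A = group) from the sample.
        mask = (a == group) & (y == y_val)
        return d[mask].mean()

    for g in (0, 1):
        print(f"group {g}: TPR = {rate(d, y, a, g, 1):.3f}, FPR = {rate(d, y, a, g, 0):.3f}")
    # TPRs and FPRs match (approximately) across groups here; real decision
    # rules rarely satisfy this by construction.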

Conditional on decision Turning to the other margin of the confusion matrix, Equality of Negative Predictive Values and Equality of False Omission Rates are defined by the statement Y ⊥ A | D = 0. This notion of fairness conditions on having received the decision defined by zero. For example, this definition requires that all individuals who were denied a loan would have been equally likely to repay it had it been granted. The other pair of definitions that appear on this margin are Equality of Positive Predictive Values and Equality of False Discovery Rates, defined by the conditional independence relationship Y ⊥ A | D = 1. This definition of fairness is also sometimes called predictive parity [12], and it is assessed by an outcome test [115]. It is met, for example, when people from both the advantaged and disadvantaged groups who are granted loans repay them at the same rate. These two pairs of definitions taken together have been called sufficiency [5]. They reflect a fairness notion that people with the same decision would have had similar outcomes, regardless of group.

These two pairs of definitions of fairness reflect the viewpoint of the decision-maker or modeler, as individuals are grouped with respect to the decision or the model’s prediction, not with respect to their actual outcome. [28] argues that this definition of fairness is more appropriate because at the time of the decision, the decision-maker knows only the prediction, not the eventual outcome for the individual, and so individuals should be grouped by the characteristic that is known at the time of the decision.

Zafar et al. call all four pairs of definitions (as well as equal accuracy) avoiding disparate mistreatment [130]. Berk et al. [8] also consider a definition based on several of the above elements of the confusion matrix. They define Treatment Equality to be equality across groups of the ratio of false negatives to false positives:

P [Y = 1, D = 0|A = a] / P [Y = 0, D = 1|A = a] = P [Y = 1, D = 0|A = a′] / P [Y = 0, D = 1|A = a′].
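A sketch (hypothetical helper functions, assuming numpy arrays d, y, a as in the earlier sketches) for checking predictive parity and Berk et al.'s treatment equality:

    import numpy as np

    def ppv(d, y, a, group):
        # P(Y = 1 | D = 1, A = group): repayment rate among those granted loans.
        mask = (a == group) & (d == 1)
        return y[mask].mean()

    def treatment_equality_ratio(d, y, a, group):
        # Ratio of false negatives to false positives within the group.
        mask = a == group
        fn = np.sum((d[mask] == 0) & (y[mask] == 1))
        fp = np.sum((d[mask] == 1) & (y[mask] == 0))
        return fn / fp

    # Predictive parity asks ppv(d, y, a, 0) == ppv(d, y, a, 1) (approximately);
    # treatment equality asks the FN/FP ratios to match across groups.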


4.2.2 Analogues with scores

In the previous section, we focused on fairness definitions defined by the relationship between Y and D, a binary decision. As discussed in Section 4.1, the ultimate decision is typically arrived at by thresholding a score S = ψ(V) that is intended to estimate P [Y = 1|V = v]. In this section, we consider definitions based on the relationship between S and Y and draw connections to the confusion matrix-based definitions.

AUC parity is the requirement that the area under the receiver operating characteristic (ROC) curve is the same across groups. This is analogous to Equality of Accuracy in the previous section, as the AUC is a measure of model accuracy.

Balance for the Negative Class is defined by E(S|Y = 0, A = a) = E(S|Y = 0, A = a′). This is similar to Equality of False Positive Rates in that if the score function is the same as the binary decision, i.e. S = D, then achieving Equality of False Positive Rates implies Balance for the Negative Class. This is easily seen because the conditional expectation of a binary variable is, by definition, the same as the conditional probability that the variable takes value one. Balance for the Positive Class is similarly defined as E(S|Y = 1, A = a) = E(S|Y = 1, A = a′). By an analogous argument to that above, this definition of fairness is closely related to Equality of True Positive Rates. Separation denotes conditional independence of the protected group variable with the score or decision given Y and covers all definitions discussed in this paragraph.

Calibration within groups is satisfied when P [Y = 1|S, A = a] = S. That is, when the score function accurately reflects the conditional probability of Y | V. Barocas et al. point out that calibration within groups is satisfied without a fairness-specific effort [5]. With enough (representative, well-measured) data and model flexibility, a score S can be very close to E(Y |A,X). With many X variables, A may be well-predicted by them, i.e. there is a function a(X) that is approximately A. Then we can get calibration within groups even without using A because E(Y |A,X) = E(Y |X).

Calibration within groups is the multi-valued analogue of Equality of Positive Predictive Values and Equality of Negative Predictive Values. Sufficiency, Y ⊥ A | D or Y ⊥ A | S, is closely related to both of these cases. In terms of S, calibration within groups implies sufficiency. Conversely, if S satisfies sufficiency then there exists a function l such that l(S) satisfies calibration within groups [5].
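A toy illustration (synthetic scores, not from the paper) of checking balance for the positive/negative class and calibration within groups; because the outcomes are drawn from the scores themselves, calibration holds here without any fairness-specific effort, echoing the point above:

    import numpy as np

    rng = np.random.default_rng(3)
    n = 100_000
    a = rng.integers(0, 2, size=n)
    s = rng.beta(2 + a, 3)                  # score distributions differ by group
    y = rng.binomial(1, s)                  # outcomes drawn from the scores themselves

    # Balance for the positive/negative class: compare E[S | Y = y_val, A = g].
    for y_val, name in [(1, "positive"), (0, "negative")]:
        means = [s[(y == y_val) & (a == g)].mean() for g in (0, 1)]
        print(f"balance for the {name} class, E[S|Y={y_val},A=g]:", np.round(means, 3))

    # Calibration within groups: within score bins, P[Y = 1 | S, A = g] should track S.
    bins = np.linspace(0, 1, 11)
    for g in (0, 1):
        s_g, y_g = s[a == g], y[a == g]
        idx = np.digitize(s_g, bins) - 1
        observed = [y_g[idx == b].mean() for b in range(10)]
        print(f"group {g} bin-wise outcome rates:", np.round(observed, 2))
    # Calibration within groups holds in this construction even though the score
    # distributions (and hence class balance) differ by group.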

4.3 Equal decision measures

We now turn to fairness notions that focus on decisions D without consideration of Y. These can be motivated in a few ways. Suppose that, from the perspective of the population about whom we make decisions, one decision is always preferable to another, regardless of Y (e.g. non-detention, admission into college2) [92]. In other words, the allocation of benefits and harms across groups can be examined by looking at the decision (D) alone. Furthermore, while the decisions (e.g. detentions) are observed, the outcome being predicted (e.g. crime if released) may be unobserved or poorly measured, making error rates unknown. Therefore, disparity in decisions (e.g. racial disparity in detention rates) may be more publicly visible than disparity in error rates (e.g. racial disparity in detention rates among those who would not have committed a crime).

Yet another motivation to consider fairness constraints without the outcome Y is measurement error (see Section 2.2.1). For example, if Y suffers from differential measurement error, fairness constraints based on Y may be unappealing [61]. One might believe that all group differences in Y are a result of measurement error, and that the true outcomes on which we want to base decisions are actually similar across groups [39].

Even more broadly, we might consider the relationship between A and Y to be unfair, even if the observed relationship in the data accurately captures a real-world phenomenon. These considerations can all motivate requiring Demographic Parity (also called Statistical Parity or Group Fairness [31]): equal decision rates across groups, regardless of the outcome Y. This fairness definition can be thought of in terms of (unconditional) independence: S ⊥ A or D ⊥ A. [65, 11, 36, 61] give algorithms to build models achieving this notion of fairness.

2 In contrast, lending to someone unable to repay could hurt their credit score [84]. Of course, the ability to repay may strongly depend on the terms of the loan.


A related definition considers parity within strata: Conditional demographic parity is defined by the condition that D ⊥ A | Data. When Data = Y, conditional demographic parity is equivalent to separation. When Data = X (the insensitive variables), it is equivalent to Fairness through Unawareness [78], anti-classification [18], or treatment parity [82]. This is easily achieved by not allowing a model to directly access information about A. Unawareness implies that people with the same x are treated the same, i.e. δ(vi) = δ(vi′) if xi = xi′. Note the opposite is not true, since it is possible that no two people have the same x. A related idea requires people who are similar in x to be treated similarly. More generally, we could define a similarity metric between people that is aware of the sensitive variables, motivating the next flavor of fairness definitions [31].
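A sketch (hypothetical helpers, numpy only) of checking demographic parity and its conditional variant:

    import numpy as np

    def demographic_parity_gap(d, a):
        # Gap in decision rates P(D = 1 | A = a) between the two groups.
        return abs(d[a == 1].mean() - d[a == 0].mean())

    def conditional_parity_gaps(d, a, strata):
        # Decision-rate gaps within each stratum; with strata = y this checks a
        # separation-style parity, with strata built from the insensitive x it
        # checks conditional demographic parity.
        return {v: abs(d[(strata == v) & (a == 1)].mean()
                       - d[(strata == v) & (a == 0)].mean())
                for v in np.unique(strata)}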

4.4 Impossibilities

Although each flavor of fairness definition presented in this section formalizes an intuitive notion of fairness, these definitions are not mathematically or morally compatible in general. In this section, we review several impossibility results about definitions of fairness, providing context for how this discussion unfolded in the Fairness in ML literature.

4.4.1 The COMPAS debate

In the Fairness in ML literature, incompatibilities between fairness definitions were brought to the fore in a public debate over a tool called COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) deployed in criminal justice settings.

In 2016, ProPublica published a highly influential analysis based on data obtained through public records requests [3, 80]. Their most discussed finding was that COMPAS does not satisfy equal FPRs by race: among defendants who did not get rearrested, black defendants were twice as likely to be misclassified as high risk. Based largely on this and other similar findings, they described the tool as “biased against blacks.”

Northpointe (now Equivant), the developers of COMPAS, critiqued ProPublica’s work and pointed out that COMPAS satisfies equal PPVs: among those called higher risk, the proportion of defendants who got rearrested is approximately the same regardless of race [27]. COMPAS also satisfies calibration within groups [37]. Much of the subsequent conversation consisted of either trying to harmonize these definitions of fairness or asserting that one or the other is correct. As it turns out, there can be no harmony among the definitions in a world where inequality and imperfect prediction are the reality.

4.4.2 Separation and sufficiency

Tension between the margins of the confusion matrix is expressed in three very similar results. Barocas et al. [5] and Wasserman [127] show that if both separation (S ⊥ A|Y) and sufficiency (Y ⊥ A|S) hold, then either (Y, S) ⊥ A or an event in the joint distribution has probability zero.

Putting this in context and recalling that sufficiency implies equality of PPVs and separation implies equality of FPRs, this result shows that both supported definitions of fairness in the COMPAS debate can only be met when (1) the rate of recidivism and the distribution of scores are the same for all racial groups or (2) there are some groups that never experience some of the outcomes (e.g. white people are never re-arrested).

A related result was given by Kleinberg et al. [74]. They showed that if a model satisfies balance for the negative class, balance for the positive class, and calibration within groups, then either base rates are equal (Y ⊥ A) or prediction is perfect (P [Y = 1|V = v] = 0 or 1 for all v ∈ V). A very similar result was shown by Chouldechova [12]. In the context of the COMPAS debate, this requires either that reality be fair (i.e. there is no racial disparity in recidivism rates) or that the model is perfectly able to predict recidivism (a reality that is as of now unattainable). Equal base rates and perfect prediction can be called trivial, degenerate, or even utopian (representing two very different utopias). Regardless of description, these conditions were not met in ProPublica’s data on COMPAS, and so the definitions of fairness championed by the different sides of the debate cannot simultaneously be achieved.
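The tension can be made concrete with the identity FPR = p/(1 − p) · (1 − PPV)/PPV · (1 − FNR) relating the error rates, the PPV, and the base rate p within a group (as in Chouldechova [12]); a short numeric sketch with illustrative values:

    # If two groups share PPV and FNR but have different base rates p,
    # the identity FPR = p/(1-p) * (1-PPV)/PPV * (1-FNR) forces their FPRs apart.
    def implied_fpr(p, ppv, fnr):
        return (p / (1 - p)) * ((1 - ppv) / ppv) * (1 - fnr)

    ppv, fnr = 0.6, 0.3          # shared across groups (illustrative values)
    for p in (0.3, 0.5):         # unequal base rates by group
        print(f"base rate {p}: implied FPR = {implied_fpr(p, ppv, fnr):.3f}")
    # Equal FPRs, FNRs, and PPVs can only coexist with equal base rates or with
    # perfect prediction, matching the impossibility results above.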


4.4.3 Incompatibilities with demographic parity

Here we describe several impossibility results involving demographic parity, though they have not played so prominent a role in the public debate about fairness. Barocas et al. [5] showed that when Y is binary and the scoring function exhibits separation (S ⊥ A|Y) and demographic parity (S ⊥ A), then there must be at least one of equal base rates (Y ⊥ A) or a useless predictor (Y ⊥ S). That is, the only way both separation and demographic parity are jointly possible is if equal base rates hold exactly (the WAE assumption of Section 5.1) or the scoring function has no predictive utility for predicting Y.

Similarly, Barocas et al. [5] also showed that if a model satisfies sufficiency (Y ⊥ A|S) and demographic parity (S ⊥ A), then there must also be equal base rates: Y ⊥ A. Taken together, in the context of the COMPAS debate, even if we could decide that Equality of PPVs or Equality of NPVs was the relevant notion of fairness, if we also want to constrain the model to avoid disparate impact by requiring demographic parity, this would only be possible if we lived in a world in which there are no racial disparities in re-arrest and/or we have a completely useless predictive model.

Finally, Corbett-Davies et al. and Lipton et al. both note that a decision rule δ that maximizes utility under a demographic parity constraint (in general) uses the sensitive variables a both in estimating the conditional probabilities and for determining their thresholds [19, 82]. Therefore, solutions such as disparate learning processes (DLPs), which allow the use of sensitive variables during model building but not prediction, are either sub- or equi-optimal [102, 64, 66, 67, 131].

5 Flavors of fairness definitions incorporating additional context

So far, we have discussed “oblivious” fairness based on summaries of the joint and marginal distributions of Y, V, S, and D. In this section, we consider fairness definitions that incorporate additional context, in the form of metrics and causal models, to inform fairness considerations. This external context provides additional degrees of freedom for mapping social goals onto mathematical formalism.

5.1 Metric fairness

Assume there is a metric that defines similarity based on all variables, m : V × V → R. Then, Metric Fairness is defined such that for every v, v′ ∈ V, their closeness implies closeness in decisions: |δ(v) − δ(v′)| ≤ m(v, v′). This is also known as individual fairness, the m-Lipschitz property [31], and perfect metric fairness [109]. In cases where the metric only considers insensitive variables, m(x, x′), metric fairness implies unawareness.
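A sketch (hypothetical names; the metric shown is purely illustrative) of auditing the Lipschitz condition over sampled pairs of individuals:

    import numpy as np

    def metric_fairness_violations(v, decisions, m, pairs):
        # Report index pairs where |delta(v_i) - delta(v_j)| > m(v_i, v_j).
        return [(i, j) for i, j in pairs
                if abs(decisions[i] - decisions[j]) > m(v[i], v[j])]

    # A purely illustrative metric over insensitive covariates only (so metric
    # fairness with this m would imply unawareness); choosing a defensible
    # metric is the hard part in practice.
    m = lambda vi, vj: float(np.abs(np.asarray(vi) - np.asarray(vj)).sum())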

Definitions of the metric differ. In the original work, Dwork et al. consider a similarity metric over individuals [31]. But in subsequent research, the metric is often defined over the variables input to the classifier [109, 71].

Either way, the metric is meant to “capture ground truth” [31]. This inspired Friedler et al. to define the construct space, the variables on which we want to base decisions [39]. For example, suppose we want to base decisions on (the probability of) the outcome for an individual i. Let I be a random individual drawn from the population. We can express the construct space as CS = {ti} where ti = P [Y = 1|I = i]. But we cannot estimate P [Y = 1|I = i] because we only have one individual i and we do not observe their outcome in time to make the decision. Instead, we calculate scores si = ψ(vi) intended to estimate P [Y = 1|V = vi]. As noted above, the conditional probabilities P [Y = 1|V = vi] are sometimes misleadingly called an individual’s “true risk” [19, 18]. But the probabilities do not condition on the individual, only on some measured variables. (The computer science literature sometimes conflates an individual with their variables, see e.g. [69].)

Friedler et al. introduce an assumption they call WYSIWYG (what you see is what you get), i.e. that we can define a metric in the observed space that approximates a metric in the construct space: m(vi, vi′) ≈ mCS(ti, ti′). To satisfy WYSIWYG, m may need to be aware of the sensitive variables [31]. One reason is that the insensitive variables X may predict Y differently for different groups. For example, suppose we want to predict who likes math so we can recruit them to the school’s math team. Let Y = 1 be liking math and X be choice of major. Suppose in one group, students who like math are steered towards economics, and in the other group towards engineering. To predict liking math, we should use group membership in addition to X. Using this terminology, Dwork et al.’s metric can be defined as aligning differences in the construct space with differences in the observed space [31].

Friedler et al. also introduce an alternate assumption called WAE (we’re all equal), i.e. that the groups on average have small distance in the construct space [39]. On this basis, we could adjust a metric in the observed space so that the groups have small distance [31]. Several papers describe methods for adjusting the insensitive variables X so that they are independent of group, which is consistent with adjusting the observed space so that it is consistent with a WAE understanding of the construct space [61].

Though conceptually appealing, one major difficulty of implementing metric fairness is that it can be difficult to define the metric itself, especially in high dimensions. Recent work bypasses explicit elicitation of m and instead queries individuals only on whether |δ(v) − δ(v′)| is small for many pairs of individuals i and i′ [62]. For example, it requires data on whether individuals i and i′ ought to be treated similarly without explicitly requiring the decision-maker to give an analytical expression for m. The objective function for model fitting then incorporates this notion of fairness by enforcing the elicited pairwise constraints.

5.2 Causal definitions

In this section, we discuss an alternative framework for conceiving of model fairness: causality. Causal definitions are a flavor of fairness, but we set them apart from Section 4 given their different orientation.

Importantly, framing fairness issues with causal language can make value judgments more explicit. In particular, this framing allows practitioners to designate which causal pathways from sensitive attributes to decisions constitute “acceptable” or “unacceptable” sources of dependence between sensitive attributes and decisions. A number of the key questions involved in mapping social goals to mathematical formalism can thus be addressed by examining a causal graph and discussing these value judgments.

We have already touched on causal notions, considering the potential or counterfactual values under different decisions in Section 2.1.1 [111, 59, 53]. Causal fairness definitions consider instead counterfactuals under different settings of a sensitive variable. Let vi(a) = (a, xi(a)) be the covariates if the individual had their sensitive variable set to a. We write di(a) = δ(vi(a)) = δ(a, xi(a)) for the corresponding decision, e.g. what would the hiring decision be if a black job candidate had been white? We define random variables V (a), D(a) as values randomly drawn from the population [76].

There is debate over whether these counterfactuals are well-defined. Pearl allows counterfactuals under conditions without specifying how those conditions are established, e.g. “if they had been white”. In contrast, Hernan and Robins introduce counterfactuals only under well-defined interventions, e.g. the intervention studied by Greiner and Rubin: “if the name on their resume were set to be atypical among black people” [50, 53].

Putting these issues to the side, we can proceed to define fairness in terms of counterfactual decisions under different settings of a sensitive variable. Ordered from strongest to weakest, we have Individual Counterfactual Fairness (di(a) = di(a′) for all i), Conditional Counterfactual Fairness [78] (E[D(a) | Data] = E[D(a′) | Data]), and Counterfactual Parity (E[D(a)] = E[D(a′)]). These first three causal definitions consider the total effect of A (e.g. race) on D (e.g. hiring). However, it is possible to consider some causal pathways from the sensitive variable to be fair. For example, suppose race affects education obtained. If hiring decisions are based on the applicant’s education, then race affects hiring decisions. Perhaps one considers this path from race to hiring through education to be fair. It often helps to visualize causal relationships graphically; see Figure 3.3

In this cartoon version of the world, a complex historical process creates an individual’s race and socioeconomic status at birth [124, 60]. These both affect the hiring decision, including through education. Let xi = (ci, mi), where mi are variables possibly affected by race, so xi(a) = (ci, mi(a)). We can define effects along paths by defining fancier counterfactuals.

3 See Pearl [101] section 1.3.1 for a definition of causal graphs, which encode conditional independence statements for counterfactuals.


Figure 3: A causal graph with nodes History, A: Race, C: SES, M: Education, and D: Hiring, showing direct (red), indirect (blue), and back-door (green) paths from race to hiring.

To disallow the red path in Figure 3, we can define No Direct Effect Fairness: di(a) = di(a′, mi(a)) for all i. However, mi(a) is only observed when ai = a, so it is not possible to confirm No Direct Effect Fairness without direct access to the model’s inner workings. If we are willing to make stronger assumptions and assume ignorability, M(a) ⊥ A | C, we can, however, check No Average Direct Effect Fairness: E[D(a)] = E[D(a′, M(a))].

Beyond direct effects, one could consider other directed paths from race to be fair or unfair. For example, No Unfair Path-Specific Effects specifies no average effects along unfair (user-specified) directed paths from A to D [91]. Relatedly, No Unresolved Discrimination states that there should exist no directed path from A to D unless through a resolving variable (a variable that is influenced by A in a manner that we accept as nondiscriminatory) [70].
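As a worked contrast between total and direct effects, the sketch below (ours; the structural equations, coefficients, and hiring rule are invented to loosely match Figure 3) simulates SES, education M(a), and a hiring rule that uses race both directly and through education, then compares the total disparity E[D(a′)] − E[D(a)] with the average direct effect E[D(a′, M(a))] − E[D(a)]. In a simulation the nested counterfactual M(a) is available by construction; with observational data it would have to be identified under assumptions such as the ignorability condition above.

import numpy as np

# Minimal sketch (illustrative assumptions throughout), loosely matching Figure 3:
# C (SES), M (education, affected by race A and by C), and D (hiring, affected by
# A directly and through M and C).
rng = np.random.default_rng(1)
n = 200_000
c = rng.normal(size=n)     # SES; its common cause with A is irrelevant for this comparison
u_m = rng.normal(size=n)   # exogenous noise for education, held fixed across worlds

def m_of(a):
    # m_i(a): education if race were set to a.
    return 0.8 * c - 0.6 * a + u_m

def delta(a, c, m):
    # A hiring rule that uses race directly (red path) and education (blue path).
    return ((1.0 * m + 0.5 * c - 0.4 * a) > 0).astype(int)

d_a0      = delta(0, c, m_of(0))   # D(0)
d_a1      = delta(1, c, m_of(1))   # D(1)
d_a1_m_a0 = delta(1, c, m_of(0))   # D(1, M(0)): race switched, education as under a = 0

print("total effect  E[D(1)] - E[D(0)]       =", d_a1.mean() - d_a0.mean())
print("direct effect E[D(1, M(0))] - E[D(0)] =", d_a1_m_a0.mean() - d_a0.mean())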

All of the above causal definitions consider only directed paths from race. In Figure 3, these include the red and blue paths. But what about the green paths? Known as back-door paths [101], these do not represent causal effects of race, and therefore are permitted under causal definitions of fairness. However, back-door paths can contribute to the association between race and hiring. Indeed, they are why we say “correlation is not causation.”⁴ Zhang and Bareinboim decompose the total disparity into disparities from each type of path (direct, indirect, back-door) [133]. In contrast to the causal fairness definitions, health disparities are defined to include contributions from back-door paths (e.g. through socioeconomics at birth) [96, 60].
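The sketch below (ours, with invented equations) makes this concrete: there is no directed path from A to D at all, so every causal definition above is satisfied trivially, yet A and D are associated through the back-door path A ← History → C → D.

import numpy as np

# Minimal sketch (illustrative assumptions throughout): no directed path from A
# to D; A and C share the common cause "History", and D depends only on C.
rng = np.random.default_rng(2)
n = 200_000
history = rng.normal(size=n)
a = (history + rng.normal(size=n) > 0).astype(int)   # History -> A
c = history + rng.normal(size=n)                     # History -> C
d = (c > 0).astype(int)                              # C -> D; A has no causal effect on D

# Setting A counterfactually would not change D, but the back-door path still
# induces an association between A and D in the observed data:
print("P(D=1 | A=1) =", d[a == 1].mean(), " P(D=1 | A=0) =", d[a == 0].mean())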

Causal definitions of fairness focus our attention on how to compensate for causal influences at decision time. Causal reasoning can be used instead to design interventions (to reduce disparities and improve overall outcomes), rather than to define fairness. In particular, causal graphs can be used to develop interventions at earlier points, prior to decision-making [60, 4].

6 Ways Forward

In this paper, we have been careful to identify assumptions, choices, and considerations in prediction-based decision-making that are and are not challenged by various mathematically-defined notions of fairness.

⁴ If C satisfies the back-door criterion (C includes no descendants of A and blocks all back-door paths between A and D) in a causal graph, then unconfoundedness (D(a) ⊥ A | C for all a) holds [101, 103]. The converse is not true in general.


This paper does not, however, discuss what to do with a flavor of fairness. The dominant focus of the Fairness in ML literature has been to constrain decision functions to satisfy particular fairness flavors, and to treat fair decision-making as a constrained optimization problem (see, e.g., [52]).

Another way forward is to address the choices and assumptions outlined in Section 2 directly. Here we sketch that approach, including pointing to some of the relevant statistics literature.

Starting with clearly articulated goals can improve both fairness and accountability. To best serve those goals, consider whether interventions should be made at the individual or aggregate level. Carefully describe the eligible population to clarify who is impacted by the possible interventions. Expanding the decision space to include less harmful and more supportive interventions can benefit all groups and mitigate fairness concerns.

To build a decision system aligned with the stated goals, choose outcomes carefully, considering data limitations. Using prior information [46] can help specify a realistic utility function. For example, instead of assuming benefits and harms are constant across decisions (the “symmetric” assumption), prior data can inform a more realistic distribution.
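As a small, purely illustrative sketch of that last point (ours; the benefit and harm magnitudes and the prior distributions are invented), one can compare the expected utility of an intervention under the symmetric constant-benefit/harm assumption with a version whose benefit and harm magnitudes are drawn from distributions fit to prior data.

import numpy as np

# Minimal sketch (all numbers are invented for illustration): expected utility of
# intervening on one case, under a symmetric utility versus a prior-informed one.
rng = np.random.default_rng(3)
p_benefit = 0.3   # estimated chance the intervention helps this case

# Symmetric assumption: benefit and harm are constants of equal magnitude.
u_symmetric = p_benefit * 1.0 - (1 - p_benefit) * 1.0

# Prior-informed alternative: benefit and harm magnitudes drawn from distributions
# (hypothetically) fit to data on earlier interventions.
benefit_draws = rng.lognormal(mean=0.5, sigma=0.5, size=10_000)
harm_draws = rng.lognormal(mean=-0.7, sigma=0.5, size=10_000)
u_prior = p_benefit * benefit_draws.mean() - (1 - p_benefit) * harm_draws.mean()

print("expected utility, symmetric assumption:", u_symmetric)
print("expected utility, prior-informed      :", u_prior)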

Instead of assuming one potential outcome is known, causal methods can be used to estimate effects of decisions. Furthermore, these effects may not be separate and constant across the population. As such, causal methods can be used to study interference [97, 87] and heterogeneous effects [45].

Documenting data collection (e.g. sampling and measurement) [44, 54] enables modeling that appropriately accounts for the specific conditions under which data has been collected (Chapter 8 of [46], [110, 83]). Combining all choices in one expanded model [129] can mitigate sensitivity of decisions to model selection. Documenting model performance [117, 128, 90], both overall and within subgroups, is crucial to effective evaluation and can also help check decision systems against some of the “fairness” definitions from Section 4.
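A minimal sketch of such subgroup documentation (ours; the column names group, label, and decision and the toy data are hypothetical) might compute error rates overall and within subgroups, the kind of table a model card [90] could report.

import numpy as np
import pandas as pd

# Minimal sketch (hypothetical column names and toy data): document error rates
# overall and within subgroups, in the spirit of model cards [90].
df = pd.DataFrame({
    "group":    ["a", "a", "a", "a", "b", "b", "b", "b"],
    "label":    [1,   0,   1,   0,   0,   1,   0,   1],
    "decision": [1,   0,   0,   1,   0,   1,   1,   1],
})

def rates(sub):
    tp = ((sub["decision"] == 1) & (sub["label"] == 1)).sum()
    fp = ((sub["decision"] == 1) & (sub["label"] == 0)).sum()
    fn = ((sub["decision"] == 0) & (sub["label"] == 1)).sum()
    tn = ((sub["decision"] == 0) & (sub["label"] == 0)).sum()
    return pd.Series({
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else np.nan,
        "true_positive_rate":  tp / (tp + fn) if (tp + fn) else np.nan,
    })

print("overall:\n", rates(df), sep="")
print("\nby subgroup:\n", df.groupby("group")[["label", "decision"]].apply(rates), sep="")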

7 Conclusion

The fact that much of the work on fairness in prediction-based decision making has emerged from computer science and statistics has sometimes led to the mistaken impression that these concerns only arise in cases that involve predictive models and automated decision making. Yet many of the issues identified in the scholarship apply to human decision making as well, to the extent that human decision making also rests on predictions. Notably, scholars have emphasized that the tradeoff between different notions of fairness would apply even if a human were the one making the prediction [74]. Rejecting model-driven or automated decision making is not a way to avoid these problems.

At best, formal fairness metrics can instead illustrate when changes to prediction-based decision making are insufficient to achieve different outcomes, and when interventions are necessary to bring about a different world. Recognizing the trade-offs involved in prediction-based decision making does not mean that we have to accept them as a given; doing so can also spur us to think more creatively about the range of options we have beyond making predictions to realize our policy or normative goals.

Recent criticisms of this line of work have rightly pointed out that quantitative notions of fairness can funnel our thinking into narrow silos where we aim to make adjustments to a decision-making process, rather than to address the structural conditions that sustain inequality in society [49, 94]. While algorithmic thinking runs such risks, quantitative modeling and quantitative measures can also force us to make our assumptions more explicit and clarify what we’re treating as background conditions (and thus not the target of intervention). In doing so, we have the opportunity to foster more meaningful deliberation and debate about the difficult policy issues that we might otherwise hand-wave away: what is our objective and how do we want to go about achieving it?

Used with care and humility, the recent work on fairness can play a helpful part in revealing problems with prediction-based decision making and in working through how to address them. While mathematical formalism cannot solve these problems on its own, it should not be dismissed as necessarily preserving the status quo. The opportunity exists to employ quantitative methods to make progress on policy goals [41, 105].


The fairness literature that we have reviewed facilitates a critical reflection on the way we go about choosing those goals as well as procedures for realizing them.

Acknowledgements

Jackie Shadlen provided substantial insights that contributed greatly to this paper. We are grateful to all cited authors for their work and for answering our many questions. Funding for this research was provided by the Harvard-MIT Ethics and Governance of Artificial Intelligence Initiative.

References

[1] Michelle Alexander. The New Jim Crow. New Press, 2012.
[2] Rico Angell, Brittany Johnson, Yuriy Brun, and Alexandra Meliou. Themis: Automatically testing software for discrimination. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, pages 871–875. ACM, 2018.
[3] Julia Angwin, Jeff Larson, Surya Mattu, and Lauren Kirchner. Machine bias: There’s software used across the country to predict future criminals. And it’s biased against blacks. https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing, 2016.
[4] Chelsea Barabas, Karthik Dinakar, Joichi Ito, Madars Virza, and Jonathan Zittrain. Interventions over predictions: Reframing the ethical debate for actuarial risk assessment. In Conference on Fairness, Accountability and Transparency (FAT*), 2018.
[5] Solon Barocas, Moritz Hardt, and Arvind Narayanan. Fairness and Machine Learning. 2018. http://www.fairmlbook.org.
[6] Solon Barocas and Andrew D. Selbst. Big data’s disparate impact. Cal. L. Rev., 104:671, 2016.
[7] James O. Berger. Statistical Decision Theory and Bayesian Analysis. Springer Series in Statistics. Springer New York, 1985.
[8] Richard Berk, Hoda Heidari, Shahin Jabbari, Michael Kearns, and Aaron Roth. Fairness in criminal justice risk assessments: The state of the art. 2017. https://arxiv.org/abs/1703.09207.
[9] William N. Brownsberger. Bill S.770: An act providing community-based sentencing alternatives for primary caretakers of dependent children who have been convicted of non-violent crimes. https://malegislature.gov/Bills/190/S770, 2017.
[10] Joy Buolamwini and Timnit Gebru. Gender shades: Intersectional accuracy disparities in commercial gender classification. In Conference on Fairness, Accountability and Transparency (FAT*), pages 77–91, 2018.
[11] Toon Calders and Sicco Verwer. Three naive Bayes approaches for discrimination-free classification. Data Mining and Knowledge Discovery, 21(2):277–292, 2010.
[12] Alexandra Chouldechova. Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. Big Data, 5(2):153–163, 2017. https://www.andrew.cmu.edu/user/achoulde/files/disparate_impact.pdf.
[13] Alexandra Chouldechova, Diana Benavides-Prado, Oleksandr Fialko, and Rhema Vaithianathan. A case study of algorithm-assisted decision making in child maltreatment hotline screening decisions. In Conference on Fairness, Accountability and Transparency (FAT*), pages 134–148, 2018.
[14] Alexandra Chouldechova and Max G’Sell. Fairer and more accurate, but for whom? 2017. https://arxiv.org/abs/1707.00046.
[15] Alexandra Chouldechova and Aaron Roth. The frontiers of fairness in machine learning. 2018. https://arxiv.org/abs/1810.08810.
[16] Anne T. Cleary. Test bias: Validity of the scholastic aptitude test for negro and white students in integrated colleges. ETS Research Bulletin Series, 1966(2):i–23, 1966.
[17] Brice Cooke, Binta Zahra Diop, Alissa Fishbane, Jonathan Hayes, Aurelie Ouss, and Anuj Shah. Using behavioral science to improve criminal justice outcomes. 2018. https://urbanlabs.uchicago.edu/attachments/store/9c86b123e3b00a5da58318f438a6e787dd01d66d0efad54d66aa232a6473/I42-954_NYCSummonsPaper_Final_Mar2018.pdf.
[18] Sam Corbett-Davies and Sharad Goel. The measure and mismeasure of fairness: A critical review of fair machine learning, 2018. https://arxiv.org/abs/1808.00023.
[19] Sam Corbett-Davies, Emma Pierson, Avi Feller, Sharad Goel, and Aziz Huq. Algorithmic decision making and the cost of fairness. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 797–806. ACM, 2017.
[20] Amanda Coston, Alan Mishler, Edward H. Kennedy, and Alexandra Chouldechova. Counterfactual risk assessments, evaluation, and fairness. In Proceedings of the 2020 Conference on Fairness, Accountability, and Transparency, pages 582–593, 2020.
[21] Marsha Courchane, David Nebhut, and David Nickerson. Lessons learned: Statistical techniques and fair lending. Journal of Housing Research, pages 277–295, 2000.
[22] Kimberle Crenshaw. Demarginalizing the intersection of race and sex: A black feminist critique of antidiscrimination doctrine, feminist theory and antiracist politics. U. Chi. Legal F., page 139, 1989. https://chicagounbound.uchicago.edu/uclf/vol1989/iss1/8/.
[23] David Danks and Alex John London. Algorithmic bias in autonomous systems. In IJCAI, pages 4691–4697, 2017.
[24] Richard B. Darlington. Another look at ‘cultural fairness’. Journal of Educational Measurement, 8(2):71–82, 1971.
[25] Jared N. Day. Credit, capital and community: Informal banking in immigrant communities in the United States, 1880–1924. Financial History Review, 9(1):65–78, 2002.
[26] Maria De-Arteaga, Artur Dubrawski, and Alexandra Chouldechova. Learning under selective labels in the presence of expert consistency. 2018. Presented at the 5th Workshop on Fairness, Accountability, and Transparency in Machine Learning (FAT/ML 2018), Stockholm, Sweden.
[27] William Dieterich, Christina Mendoza, and Tim Brennan. COMPAS risk scales: Demonstrating accuracy equity and predictive parity. Northpointe Inc, 2016. http://go.volarisgroup.com/rs/430-MBX-989/images/ProPublica_Commentary_Final_070616.pdf.
[28] William Dieterich, Christina Mendoza, and Tim Brennan. COMPAS risk scales: Demonstrating accuracy equity and predictive parity. Northpointe Inc, 2016.
[29] Roel Dobbe, Sarah Dean, Thomas Gilbert, and Nitin Kohli. A broader view on bias in automated decision-making: Reflecting on epistemology and dynamics. 2018. Presented at the 5th Workshop on Fairness, Accountability, and Transparency in Machine Learning (FAT/ML 2018), Stockholm, Sweden.
[30] Neil J. Dorans and Linda L. Cook. Fairness in Educational Assessment and Measurement. Taylor & Francis, 2016.
[31] Cynthia Dwork, Moritz Hardt, Toniann Pitassi, Omer Reingold, and Richard Zemel. Fairness through awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference, pages 214–226. ACM, 2012.
[32] Laurel Eckhouse, Kristian Lum, Cynthia Conti-Cook, and Julie Ciccolini. Layers of bias: A unified approach for understanding problems with risk assessment. Criminal Justice and Behavior, page 0093854818811379, 2018.
[33] Hillel J. Einhorn and Alan R. Bass. Methodological considerations relevant to discrimination in employment testing. Psychological Bulletin, 75(4):261, 1971.
[34] Virginia Eubanks. Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor. St. Martin’s Press, 2018.
[35] Executive Office of the President and Cecilia Munoz (Director, Domestic Policy Council) and Megan Smith (U.S. Chief Technology Officer, Office of Science and Technology Policy) and DJ Patil (Deputy Chief Technology Officer for Data Policy and Chief Data Scientist, Office of Science and Technology Policy). Big data: A report on algorithmic systems, opportunity, and civil rights. 2016. https://obamawhitehouse.archives.gov/sites/default/files/microsites/ostp/2016_0504_data_discrimination.pdf.
[36] Michael Feldman, Sorelle A. Friedler, John Moeller, Carlos Scheidegger, and Suresh Venkatasubramanian. Certifying and removing disparate impact. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 259–268, 2015.
[37] Anthony W. Flores, Kristin Bechtel, and Christopher T. Lowenkamp. False positives, false negatives, and false analyses: A rejoinder to Machine bias: There’s software used across the country to predict future criminals. And it’s biased against blacks. Fed. Probation, 80:38, 2016.
[38] Center for Data Science and Public Policy, University of Chicago. Aequitas. https://dsapp.uchicago.edu/aequitas/, 2018.
[39] Sorelle A. Friedler, Carlos Scheidegger, and Suresh Venkatasubramanian. On the (im)possibility of fairness. 2016. https://arxiv.org/abs/1609.07236.
[40] Sorelle A. Friedler, Carlos Scheidegger, Suresh Venkatasubramanian, Sonam Choudhary, Evan P. Hamilton, and Derek Roth. A comparative study of fairness-enhancing interventions in machine learning. 2018. https://arxiv.org/abs/1802.04422.
[41] Sidney Fussell. The algorithm that could save vulnerable New Yorkers from being forced out of their homes. https://gizmodo.com/the-algorithm-that-could-save-vulnerable-new-yorkers-fr-1826807459, August 2018.
[42] Andreas Fuster, Paul Goldsmith-Pinkham, Tarun Ramadorai, and Ansgar Walther. Predictably unequal? The effects of machine learning on credit markets. 2018. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3072038.
[43] Sainyam Galhotra, Yuriy Brun, and Alexandra Meliou. Fairness testing: Testing software for discrimination. In Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, pages 498–510. ACM, 2017.
[44] Timnit Gebru, Jamie Morgenstern, Briana Vecchione, Jennifer Wortman Vaughan, Hanna Wallach, Hal Daumé III, and Kate Crawford. Datasheets for datasets. 2018. https://arxiv.org/abs/1803.09010.
[45] Andrew Gelman. The connection between varying treatment effects and the crisis of unreplicable research: A Bayesian perspective, 2015.
[46] Andrew Gelman, John B. Carlin, Hal S. Stern, David B. Dunson, Aki Vehtari, and Donald B. Rubin. Bayesian Data Analysis, Third Edition. Chapman & Hall/CRC Texts in Statistical Science. Taylor & Francis, 2013.
[47] Google. The What-If Tool: Code-free probing of machine learning models. https://ai.googleblog.com/2018/09/the-what-if-tool-code-free-probing-of.html, 2018. Accessed: 2019-01-02.
[48] Ben Green. “Fair” risk assessments: A precarious approach for criminal justice reform. 2018. Presented at the 5th Workshop on Fairness, Accountability, and Transparency in Machine Learning (FAT/ML 2018), Stockholm, Sweden.
[49] Ben Green and Lily Hu. The myth in the methodology: Towards a recontextualization of fairness in machine learning. 2018. Presented at the Machine Learning: The Debates workshop at the 35th International Conference on Machine Learning (ICML).
[50] James D. Greiner and Donald B. Rubin. Causal effects of perceived immutable characteristics. Review of Economics and Statistics, 93(3):775–785, 2011.
[51] Bernard E. Harcourt. Against Prediction: Profiling, Policing, and Punishing in an Actuarial Age. University of Chicago Press, 2008.
[52] Moritz Hardt, Eric Price, Nati Srebro, et al. Equality of opportunity in supervised learning. In Advances in Neural Information Processing Systems (NeurIPS), pages 3315–3323, 2016.
[53] Miguel A. Hernan and James M. Robins. Causal Inference. Chapman & Hall/CRC, 2018.
[54] Sarah Holland, Ahmed Hosny, Sarah Newman, Joshua Joseph, and Kasia Chmielinski. The dataset nutrition label: A framework to drive higher data quality standards. 2018. https://arxiv.org/abs/1805.03677.
[55] Lily Hu and Yiling Chen. A short-term intervention for long-term fairness in the labor market. 2018. https://arxiv.org/abs/1712.00064.
[56] Lily Hu and Yiling Chen. Welfare and distributional impacts of fair classification. 2018. Presented at the 5th Workshop on Fairness, Accountability, and Transparency in Machine Learning (FAT/ML 2018), Stockholm, Sweden.
[57] John E. Hunter and Frank L. Schmidt. Critical analysis of the statistical and ethical implications of various definitions of test bias. Psychological Bulletin, 83(6):1053, 1976.
[58] Ben Hutchinson and Margaret Mitchell. 50 years of test (un)fairness: Lessons for machine learning. In Conference on Fairness, Accountability and Transparency (FAT*), 2019.
[59] Guido W. Imbens and Donald B. Rubin. Causal Inference for Statistics, Social, and Biomedical Sciences: An Introduction. Cambridge University Press, 2015.
[60] John W. Jackson and Tyler J. VanderWeele. Decomposition analysis to identify intervention targets for reducing disparities. Epidemiology, 29(6):825–835, 2018.
[61] James E. Johndrow and Kristian Lum. An algorithm for removing sensitive information: Application to race-independent recidivism prediction. 2017. https://arxiv.org/abs/1703.04957.
[62] Christopher Jung, Michael Kearns, Seth Neel, Aaron Roth, Logan Stapleton, and Zhiwei Steven Wu. Eliciting and enforcing subjective individual fairness, 2019.
[63] Nathan Kallus and Angela Zhou. Residual unfairness in fair machine learning from prejudiced data. In Proceedings of the 35th International Conference on Machine Learning (ICML), 2018.
[64] Faisal Kamiran and Toon Calders. Classifying without discriminating. In 2009 2nd International Conference on Computer, Control and Communication (IC4), pages 1–6. IEEE, 2009.
[65] Faisal Kamiran and Toon Calders. Classifying without discriminating. In 2009 2nd International Conference on Computer, Control and Communication, pages 1–6. IEEE, 2009.
[66] Faisal Kamiran, Toon Calders, and Mykola Pechenizkiy. Discrimination aware decision tree learning. In 2010 IEEE 10th International Conference on Data Mining (ICDM), pages 869–874. IEEE, 2010.
[67] Toshihiro Kamishima, Shotaro Akaho, and Jun Sakuma. Fairness-aware learning through regularization approach. In 2011 IEEE 11th International Conference on Data Mining Workshops (ICDMW), pages 643–650. IEEE, 2011.
[68] Samuel Karlin and Herman Rubin. The theory of decision procedures for distributions with monotone likelihood ratio. The Annals of Mathematical Statistics, pages 272–299, 1956.
[69] Michael Kearns, Seth Neel, Aaron Roth, and Zhiwei Steven Wu. Preventing fairness gerrymandering: Auditing and learning for subgroup fairness. In Proceedings of the 35th International Conference on Machine Learning (ICML), 2018.
[70] Niki Kilbertus, Mateo Rojas Carulla, Giambattista Parascandolo, Moritz Hardt, Dominik Janzing, and Bernhard Scholkopf. Avoiding discrimination through causal reasoning. In Advances in Neural Information Processing Systems (NeurIPS), pages 656–666, 2017.
[71] Michael P. Kim, Omer Reingold, and Guy N. Rothblum. Fairness through computationally-bounded awareness. In Advances in Neural Information Processing Systems (NeurIPS), 2018.
[72] Jon Kleinberg, Himabindu Lakkaraju, Jure Leskovec, Jens Ludwig, and Sendhil Mullainathan. Human decisions and machine predictions. The Quarterly Journal of Economics, 133(1):237–293, 2017.
[73] Jon Kleinberg, Jens Ludwig, Sendhil Mullainathan, and Ashesh Rambachan. Algorithmic fairness. In AEA Papers and Proceedings, volume 108, pages 22–27, 2018.
[74] Jon Kleinberg, Sendhil Mullainathan, and Manish Raghavan. Inherent trade-offs in the fair determination of risk scores. 2016. https://arxiv.org/abs/1609.05807.
[75] John Logan Koepke and David G. Robinson. Danger ahead: Risk assessment and the future of bail reform. Washington Law Review, 2018.
[76] Issa Kohler-Hausmann. Eddie Murphy and the dangers of counterfactual causal thinking about detecting racial discrimination. Nw. UL Rev., 113:1163, 2018.
[77] Robert Koulish. Immigration detention in the risk classification assessment era. Connecticut Public Interest Law Journal, 16(1), November 2016. https://ssrn.com/abstract=2865972.
[78] Matt J. Kusner, Joshua Loftus, Chris Russell, and Ricardo Silva. Counterfactual fairness. In Advances in Neural Information Processing Systems (NeurIPS), pages 4066–4076, 2017.
[79] Himabindu Lakkaraju, Jon Kleinberg, Jure Leskovec, Jens Ludwig, and Sendhil Mullainathan. The selective labels problem: Evaluating algorithmic predictions in the presence of unobservables. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 275–284. ACM, 2017.
[80] Jeff Larson, Surya Mattu, Lauren Kirchner, and Julia Angwin. How we analyzed the COMPAS recidivism algorithm. https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm, 2016.
[81] Mary A. Lewis. A comparison of three models for determining test fairness. Technical report, Federal Aviation Administration Washington DC Office of Aviation Medicine, 1978.
[82] Zachary Lipton, Julian McAuley, and Alexandra Chouldechova. Does mitigating ML’s impact disparity require treatment disparity? In Advances in Neural Information Processing Systems (NeurIPS), pages 8136–8146, 2018.
[83] Roderick J. Little. To model or not to model? Competing modes of inference for finite population sampling. Journal of the American Statistical Association, 99(466):546–556, 2004.
[84] Lydia T. Liu, Sarah Dean, Esther Rolf, Max Simchowitz, and Moritz Hardt. Delayed impact of fair machine learning. In Proceedings of the 35th International Conference on Machine Learning (ICML), 2018.
[85] Kristian Lum and William Isaac. To predict and serve? Significance, 13(5):14–19, 2016.
[86] Barry Mahoney, Bruce D. Beaudin, John A. Carver III, Daniel B. Ryan, and Richard B. Hoffman. Pretrial services programs: Responsibilities and potential. Diane Publishing Company, 2004.
[87] Caleb H. Miles, Maya Petersen, and Mark J. van der Laan. Causal inference for a single group of causally-connected units under stratified interference. 2017. https://arxiv.org/abs/1710.09588.
[88] Claire Cain Miller. Can an algorithm hire better than a human? https://www.nytimes.com/2015/06/26/upshot/can-an-algorithm-hire-better-than-a-human.html, June 2015.
[89] Claire Cain Miller. When algorithms discriminate. https://www.nytimes.com/2015/06/26/upshot/can-an-algorithm-hire-better-than-a-human.html, July 2015.
[90] Margaret Mitchell, Simone Wu, Andrew Zaldivar, Parker Barnes, Lucy Vasserman, Ben Hutchinson, Elena Spitzer, Inioluwa Deborah Raji, and Timnit Gebru. Model cards for model reporting. 2018. https://arxiv.org/abs/1810.03993.
[91] Razieh Nabi and Ilya Shpitser. Fair inference on outcomes. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 2018, page 1931. NIH Public Access, 2018.
[92] Arvind Narayanan. 21 fairness definitions and their politics, 2018. Presented at FAT* 2018, February 2018, New York, NY, USA.
[93] NYC. New text message reminders for summons recipients improves attendance in court and dramatically cuts warrants. http://www1.nyc.gov/office-of-the-mayor/news/058-18/new-text-message-reminders-summons-recipients-improves-attendance-court-dramatically.
[94] Rodrigo Ochigame, Chelsea Barabas, Karthik Dinakar, Madars Virza, and Joichi Ito. Beyond legitimation: Rethinking fairness, interpretability, and accuracy in machine learning. 2018. Presented at the Machine Learning: The Debates workshop at the 35th International Conference on Machine Learning (ICML).
[95] The Mayor’s Office of Data Analytics. Legionnaires’ disease response: MODA assisted in a city-wide response effort after an outbreak of Legionnaires’ disease. https://moda-nyc.github.io/Project-Library/projects/cooling-towers/, 2018.
[96] Institute of Medicine (IOM). Unequal Treatment: Confronting Racial and Ethnic Disparities in Health Care. Washington, DC: National Academies Press, 2003.
[97] Elizabeth L. Ogburn, Tyler J. VanderWeele, et al. Causal diagrams for interference. Statistical Science, 29(4):559–578, 2014.
[98] Cathy O’Neil. Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy. Crown/Archetype, 2016.
[99] Cathy O’Neil and Hanna Gunn. Near term artificial intelligence and the ethical matrix, 2018. Book chapter, to appear.
[100] Samir Passi and Solon Barocas. Problem formulation and fairness. In Conference on Fairness, Accountability and Transparency (FAT*), 2019.
[101] Judea Pearl. Causality. Cambridge University Press, 2009.
[102] Dino Pedreshi, Salvatore Ruggieri, and Franco Turini. Discrimination-aware data mining. In Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 560–568. ACM, 2008.
[103] Emilija Perkovic, Johannes Textor, Markus Kalisch, and Marloes H. Maathuis. A complete generalized adjustment criterion. 2015. https://arxiv.org/abs/1507.01524.
[104] Nancy S. Petersen and Melvin R. Novick. An evaluation of some models for culture-fair selection. Journal of Educational Measurement, 13(1):3–29, 1976.
[105] Eric Potash, Joe Brew, Alexander Loewi, Subhabrata Majumdar, Andrew Reece, Joe Walsh, Eric Rozier, Emile Jorgenson, Raed Mansour, and Rayid Ghani. Predictive modeling for public health: Preventing childhood lead poisoning. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 2039–2047. ACM, 2015.
[106] Stephan Rabanser, Stephan Gunnemann, and Zachary C. Lipton. Failing loudly: An empirical study of methods for detecting dataset shift. 2018. https://arxiv.org/abs/1810.11953.
[107] John Raphling. Criminalizing homelessness violates basic human rights. https://www.thenation.com/article/criminalizing-homelessness-violates-basic-human-rights/, 2018.
[108] John Rawls. A Theory of Justice. Harvard University Press, 1971.
[109] Guy N. Rothblum and Gal Yona. Probably approximately metric-fair learning. In Proceedings of the 35th International Conference on Machine Learning (ICML), 2018. Appearing in the 5th Workshop on Fairness, Accountability, and Transparency in Machine Learning (FAT/ML 2018), Stockholm, Sweden.
[110] Donald B. Rubin. Inference and missing data. Biometrika, 63(3):581–592, 1976.
[111] Donald B. Rubin. Causal inference using potential outcomes: Design, modeling, decisions. Journal of the American Statistical Association, 100(469):322–331, 2005.
[112] Cynthia Rudin. Please stop explaining black box models for high stakes decisions. arXiv, abs/1811.10154, 2018.
[113] Andrew D. Selbst, danah boyd, Sorelle Friedler, Suresh Venkatasubramanian, and Janet Vertesi. Fairness and abstraction in sociotechnical systems. In Conference on Fairness, Accountability and Transparency (FAT*), 2019.
[114] Selena Silva and Martin Kenney. Algorithms, platforms, and ethnic bias: An integrative essay. Phylon (1960-), 55(1 & 2):9–37, 2018.
[115] Camelia Simoiu, Sam Corbett-Davies, and Sharad Goel. The problem of infra-marginality in outcome tests for discrimination. The Annals of Applied Statistics, 11(3):1193–1216, 2017.
[116] Stanford Legal Design Lab. The court messaging project. http://www.legaltechdesign.com/CourtMessagingProject/.
[117] Julia Stoyanovich. Refining the concept of a nutritional label for data and models. https://freedom-to-tinker.com/2018/05/03/refining-the-concept-of-a-nutritional-label-for-data-and-models/, 2018.
[118] Harini Suresh and John V. Guttag. A framework for understanding unintended consequences of machine learning. 2019. https://arxiv.org/abs/1901.10002.
[119] Latanya Sweeney. Discrimination in online ad delivery. Queue, 11(3):10, 2013.
[120] Robert L. Thorndike. Concepts of culture-fairness. Journal of Educational Measurement, 8(2):63–70, 1971.
[121] Uptrust. What we do - our results. http://www.uptrust.co/what-we-do#our-results-section.
[122] Rhema Vaithianathan, Tim Maloney, Emily Putnam-Hornstein, and Nan Jiang. Children in the public benefit system at risk of maltreatment: Identification via predictive modeling. American Journal of Preventive Medicine, 45(3):354–359, 2013.
[123] Tyler J. VanderWeele and Miguel A. Hernan. Results on differential and dependent measurement error of the exposure and the outcome using signed directed acyclic graphs. American Journal of Epidemiology, 175(12):1303–1310, 2012.
[124] Tyler J. VanderWeele and Whitney R. Robinson. On causal interpretation of race in regressions adjusting for confounding and mediating variables. Epidemiology (Cambridge, Mass.), 25(4):473, 2014.
[125] Marie VanNostrand, Kenneth Rose, and Kimberly Weibrecht. State of the science of pretrial release recommendations and supervision. Pretrial Justice Institute, 2011.
[126] Sahil Verma and Julia Rubin. Fairness definitions explained. In 2018 IEEE/ACM International Workshop on Software Fairness (FairWare), pages 1–7. IEEE, 2018.
[127] Larry Wasserman. All of Statistics: A Concise Course in Statistical Inference. Springer Texts in Statistics. Springer New York, 2010.
[128] Ke Yang, Julia Stoyanovich, Abolfazl Asudeh, Bill Howe, HV Jagadish, and Gerome Miklau. A nutritional label for rankings. In Proceedings of the 2018 International Conference on Management of Data, pages 1773–1776. ACM, 2018.
[129] Yuling Yao, Aki Vehtari, Daniel Simpson, Andrew Gelman, et al. Using stacking to average Bayesian predictive distributions. Bayesian Analysis, 2018.
[130] Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rodriguez, and Krishna P. Gummadi. Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. In Proceedings of the 26th International Conference on World Wide Web, pages 1171–1180. International World Wide Web Conferences Steering Committee, 2017.
[131] Muhammad Bilal Zafar, Isabel Valera, Manuel Gomez Rodriguez, and Krishna P. Gummadi. Fairness constraints: Mechanisms for fair classification. AISTATS, 2017.
[132] Meike Zehlike, Carlos Castillo, Francesco Bonchi, Ricardo Baeza-Yates, Sara Hajian, and Mohamed Megahed. Fairness measures: A platform for data collection and benchmarking in discrimination-aware ML. http://fairness-measures.org, June 2017.
[133] Junzhe Zhang and Elias Bareinboim. Fairness in decision-making: The causal explanation formula. In 32nd AAAI Conference on Artificial Intelligence, 2018.
