
Page 1: Fairness, Part II

Fairness, Part II
Giulia Fanti
Fall 2019
Based in part on slides by Anupam Datta

Page 2: Fairness, Part II

Administrative

• HW4 out today
  • Fairness + anonymous communication (next unit)
  • You will have ~3 weeks
• Presentations starting on Wednesday
  • Upload your slides to Canvas by midnight the night before so we can download them in the morning
  • Sign up for groups on Canvas so that we can assign group grades
  • Presentation rubric on Canvas!
  • Volunteer in SV to share their laptop on Wednesday?

Page 3: Fairness, Part II

In-class Quiz

• On Canvas

Page 4: Fairness, Part II

Last time

• Group fairness
  • Statistical parity
  • Demographic parity
  • Ensures that the same ratio of people from each group gets the "desirable" outcome

• Individual fairness
  • Ensures that similar individuals are treated similarly
  • Can learn a fair classifier by solving a linear program

Page 5: Fairness, Part II

Today

• When does individual fairness imply group fairness?

• Connections to differential privacy

• How do we take already-trained classifiers and make them fair?

Page 6: Fairness, Part II

Paper from last time: Dwork, Hardt, Pitassi, Reingold, and Zemel, "Fairness Through Awareness" (ITCS 2012)

Page 7: Fairness, Part II

[Figure: setup]
V: individuals, O: outcomes, A: actions
Classifier (e.g., tracking network): $M : V \to O$ maps an individual $x$ to $M(x)$
Vendor (e.g., Capital One): $f : O \to A$
$x \in V \subseteq \mathbb{R}^n$
$S$, $T$: two groups of individuals
Define distributions over each set: $S(x) = P_S(x)$, $T(x) = P_T(x)$

Page 8: Fairness, Part II

Individual fairness formulation:

$$\max_{M}\; \mathbb{E}_{x \sim V}\, \mathbb{E}_{o \sim M_x}\, U(x, o) \qquad \text{(maximize utility)}$$

$$\text{s.t.}\quad \|M_x - M_y\| \le d(x, y) \;\;\forall\, x, y \in V \qquad \text{(fairness constraint)}$$

Q: What are the downsides to this formulation?
- Need a similarity metric between users
- Very high-dimensional LP may be difficult to solve
- Classifier must be trained with fairness built in a priori
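The LP is concrete enough to sketch on a toy instance. Below is a minimal version in Python; the choice of cvxpy as solver and all of the toy values (U, X, d) are our own assumptions, not from the slides. Variables M[x, o] give each individual's outcome distribution, and the Lipschitz fairness constraint (with TV distance) is imposed for every pair of individuals.

```python
# Minimal sketch of the individual-fairness LP on a toy instance.
# Solver (cvxpy) and toy data are assumptions; the formulation follows the slide.
import numpy as np
import cvxpy as cp

n, k = 5, 2                       # 5 individuals, 2 outcomes
rng = np.random.default_rng(0)
U = rng.uniform(size=(n, k))      # U[x, o]: vendor's utility (toy values)
X = rng.uniform(size=(n, 3))      # individuals as points in R^3
# d(x, y): similarity metric between individuals (here, Euclidean distance)
d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)

M = cp.Variable((n, k), nonneg=True)     # M[x, o] = P(outcome o | individual x)
constraints = [cp.sum(M, axis=1) == 1]   # each row is a distribution
for x in range(n):
    for y in range(x + 1, n):
        # Fairness constraint with TV distance:
        # TV(M_x, M_y) = 0.5 * sum_o |M[x,o] - M[y,o]| <= d(x, y)
        constraints.append(0.5 * cp.sum(cp.abs(M[x] - M[y])) <= d[x, y])

# Maximize expected utility E_{x~V} E_{o~M_x} U(x, o), with x uniform over V
objective = cp.Maximize(cp.sum(cp.multiply(M, U)) / n)
cp.Problem(objective, constraints).solve()
print(M.value)
```

Note how the instance size makes the second downside above tangible: the number of pairwise constraints grows quadratically in the number of individuals.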

Page 9: Fairness, Part II

When does Individual Fairness imply Group Fairness?

Suppose we enforce a metric d.

Question: Which groups of individuals receive (approximately) equal outcomes?

Answer is given by the Earthmover distance (w.r.t. d) between the two groups.

Page 10: Fairness, Part II

How different are S and T?
Earthmover Distance: "cost" of transforming one distribution into another by "moving" probability mass ("earth").

[Figure: distributions S and T over the metric space (V, d)]

$h(x, y)$: how much of the probability mass of $x$ in $S$ to move to $y$ in $T$, i.e.,

$$d_{EM}(S, T) = \min_{h \ge 0} \sum_{x, y} h(x, y)\, d(x, y) \quad \text{s.t.} \quad \sum_y h(x, y) = S(x), \;\; \sum_x h(x, y) = T(y)$$

Page 11: Fairness, Part II

Example: Compute Earth-Mover's Distance

• On document cam
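Since the worked example was done on the document cam, here is a stand-in: a minimal transportation-LP computation of the Earthmover distance, again with cvxpy. The distributions S, T and the ground metric D are made-up toy values.

```python
# Earthmover distance as a transportation LP (toy values, our assumptions).
import numpy as np
import cvxpy as cp

S = np.array([0.5, 0.3, 0.2])           # distribution of group S over 3 points
T = np.array([0.1, 0.4, 0.5])           # distribution of group T
D = np.array([[0.0, 1.0, 2.0],          # D[x, y] = d(x, y), the ground metric
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])

h = cp.Variable((3, 3), nonneg=True)    # h[x, y]: mass moved from x in S to y in T
prob = cp.Problem(
    cp.Minimize(cp.sum(cp.multiply(h, D))),  # total moving cost
    [cp.sum(h, axis=1) == S,                 # all of S's mass at x is moved out
     cp.sum(h, axis=0) == T])                # exactly T's mass arrives at each y
prob.solve()
print("EMD(S, T) =", prob.value)
```

For the special case of distributions on the real line, scipy.stats.wasserstein_distance computes the same quantity without setting up the LP.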

Page 12: Fairness, Part II

bias ๐‘‘, ๐‘†, ๐‘‡ = maxS:TUVWXYZ[W\] ^_T`a

P ๐‘€ ๐‘ฅ = ๐‘œ|๐‘ฅ โˆˆ ๐‘† โˆ’ P ๐‘€ ๐‘ฅ = ๐‘œ|๐‘ฅ โˆˆ ๐‘‡

Theorem: Any Lipschitz mapping M satisfies group fairness up to bias ๐‘‘, ๐‘†, ๐‘‡ .

Further, bias(๐‘‘, ๐‘†, ๐‘‡) โ‰ค ๐‘‘๐ธ๐‘€(๐‘†, ๐‘‡)

Page 13: Fairness, Part II

Some observations

$$\mathrm{bias}(d, S, T) = \max_{M\ \text{Lipschitz w.r.t.}\ d,\; o \in O} \Big| P\big(M(x) = o \mid x \in S\big) - P\big(M(x) = o \mid x \in T\big) \Big|$$

Theorem: Any Lipschitz mapping $M$ satisfies group fairness up to $\mathrm{bias}(d, S, T)$.

• By definition, the bias is the maximum deviation from group fairness that can be achieved!

• Indeed, for TV distance between distributions and binary classification, $\mathrm{bias}(d, S, T) = d_{EM}(S, T)$

• Takeaway message: if your groups are very far apart (in EMD), the Lipschitz condition can only get you so far in terms of group fairness!

Page 14: Fairness, Part II

Connections to Differential Privacy

$$\max_{M}\; \mathbb{E}_{x \sim V}\, \mathbb{E}_{o \sim M_x}\, U(x, o) \quad \text{s.t.} \quad \|M_x - M_y\| \le d(x, y) \;\;\forall\, x, y \in V$$

What if we don't use TV distance for $\|M_x - M_y\|$? Instead, take

$$\|P - Q\|_\infty \triangleq \sup_{a \in A} \log \max\!\left( \frac{P(a)}{Q(a)},\; \frac{Q(a)}{P(a)} \right)$$

A mapping $M$ satisfies $\epsilon$-differential privacy iff it satisfies the Lipschitz property (with this metric, taking $d(x, y) = \epsilon$ for neighboring databases $x$, $y$)!
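As a quick sanity check of the metric, here is a small computation on two toy distributions of our choosing; randomized response is just an illustrative mechanism.

```python
# Computing the D_infinity metric from the slide for two toy distributions.
# A mechanism is eps-DP iff its output distributions on any pair of
# neighboring inputs are within eps in this metric.
import numpy as np

def d_infinity(P, Q):
    """sup_a log max(P(a)/Q(a), Q(a)/P(a)) over a finite set.
    Assumes both distributions are strictly positive everywhere."""
    P, Q = np.asarray(P, float), np.asarray(Q, float)
    return np.log(np.maximum(P / Q, Q / P).max())

P = np.array([0.6, 0.4])   # e.g., randomized response output given input 1
Q = np.array([0.4, 0.6])   # ... given the neighboring input 0
print(d_infinity(P, Q))    # log(1.5) ~ 0.405, so this mechanism is ~0.405-DP
```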

Page 15: Fairness, Part II

Summary: Individual Fairness

• Formalized fairness property based on treating similar individuals similarly
  • Incorporated vendor's utility

• Explored relationship between individual fairness and group fairness
  • Earthmover distance

Page 16: Fairness, Part II

Individual fairness formulation:

$$\max_{M}\; \mathbb{E}_{x \sim V}\, \mathbb{E}_{o \sim M_x}\, U(x, o) \qquad \text{(maximize utility)}$$

$$\text{s.t.}\quad \|M_x - M_y\| \le d(x, y) \;\;\forall\, x, y \in V \qquad \text{(fairness constraint)}$$

Q: What are the downsides to this formulation?
- Need a similarity metric between users
- Very high-dimensional LP may be difficult to solve
- Classifier must be trained with fairness built in a priori

Page 17: Fairness, Part II

Lots of open problems/directions

• Metric
  • Social aspects: who will define it?
  • How to generate the metric (semi-)automatically?

• Earthmover characterization when the probability metric is not statistical distance

• Next: How can we compute a fair classifier from an already-computed unfair one?

Page 18: Fairness, Part II

More definitions of fair classifiers

• Hardt, Price, and Srebro, "Equality of Opportunity in Supervised Learning" (NeurIPS 2016)

Page 19: Fairness, Part II

[Figure: post-processing pipeline]
Individual's data: $x = (x_1, \dots, x_n)$
True label: $Y$
Protected attribute: $A$
Learned classifier: outputs $\hat{Y}$
New (fair) classifier: outputs $\tilde{Y}$
Groups: $A = 1$ and $A = 0$

Page 20: Fairness, Part II

Equalized odds

• Consider binary classifiers
• We say a classifier $\hat{Y}$ has equalized odds if for all true labels $y$,

$$P(\hat{Y} = 1 \mid A = 0, Y = y) = P(\hat{Y} = 1 \mid A = 1, Y = y)$$

Q: How would this definition look if we only wanted to enforce group fairness?
A: $P(\hat{Y} = 1 \mid A = 0) = P(\hat{Y} = 1 \mid A = 1)$
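A minimal empirical check of this definition; the data arrays are toy placeholders and `conditional_rates` is our own hypothetical helper, not from the paper.

```python
# Estimate P(Yhat = 1 | A = a, Y = y) from data and compare across groups.
import numpy as np

def conditional_rates(y_hat, y, a):
    """rates[a][y] = empirical P(Yhat = 1 | A = a, Y = y)."""
    y_hat, y, a = map(np.asarray, (y_hat, y, a))
    return {g: {lab: y_hat[(a == g) & (y == lab)].mean() for lab in (0, 1)}
            for g in (0, 1)}

rates = conditional_rates(y_hat=[1, 0, 1, 1, 0, 1, 0, 0],
                          y=    [1, 0, 1, 0, 0, 1, 1, 0],
                          a=    [0, 0, 0, 0, 1, 1, 1, 1])
# Equalized odds: the rates must match across groups for both y = 0 and y = 1.
# Equal opportunity (next slide) only needs the y = 1 entries (TPRs) to match.
print(rates)
```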

Page 21: Fairness, Part II

Equal opportunity

• Suppose $Y = 1$ is the desirable outcome
  • E.g., getting a loan

• We say a classifier $\hat{Y}$ has equal opportunity if

$$P(\hat{Y} = 1 \mid A = 0, Y = 1) = P(\hat{Y} = 1 \mid A = 1, Y = 1)$$

Interpretation: the true positive rate is the same for both groups

Weaker notion of fairness → can enable better utility

Page 22: Fairness, Part II

How can we create a predictor that meets these definitions?
• Key property: it should be oblivious

• A property of a predictor $\tilde{Y}$ is oblivious if it depends only on the joint distribution of $(Y, A, \hat{Y})$

• What does this mean?
  • It does not depend on the training data $X$

Page 23: Fairness, Part II

Need 4 parameters to define $\tilde{Y}$ from $(\hat{Y}, A)$

(rows: predicted label $\hat{Y}$; columns: protected attribute $A$)

$p_{00} = P(\tilde{Y} = 1 \mid A = 0, \hat{Y} = 0)$    $p_{01} = P(\tilde{Y} = 1 \mid A = 1, \hat{Y} = 0)$
$p_{10} = P(\tilde{Y} = 1 \mid A = 0, \hat{Y} = 1)$    $p_{11} = P(\tilde{Y} = 1 \mid A = 1, \hat{Y} = 1)$
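A sketch of how the derived predictor acts once the four parameters are fixed. The p values below are arbitrary toy numbers, not ones chosen by the optimization on the later slides.

```python
# Derived predictor: given (Yhat, A), output Ytilde = 1 with prob. p[yhat, a].
import numpy as np

p = np.array([[0.1, 0.2],    # p[0, a] = P(Ytilde = 1 | Yhat = 0, A = a)
              [0.8, 0.9]])   # p[1, a] = P(Ytilde = 1 | Yhat = 1, A = a)

def derived_predictor(y_hat, a, rng=np.random.default_rng(0)):
    y_hat, a = np.asarray(y_hat), np.asarray(a)
    # flip a biased coin per example, with bias looked up from the p table
    return (rng.uniform(size=y_hat.shape) < p[y_hat, a]).astype(int)

print(derived_predictor(y_hat=[0, 1, 1, 0], a=[0, 0, 1, 1]))
```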

Page 24: Fairness, Part II

Once our $p$'s are defined...

• How do we check that equalized odds is satisfied?

$$\gamma_a(\tilde{Y}) \triangleq \Big( P(\tilde{Y} = 1 \mid A = a, Y = 0),\;\; P(\tilde{Y} = 1 \mid A = a, Y = 1) \Big)$$

Compute $\gamma_1(\tilde{Y})$ and $\gamma_0(\tilde{Y})$. (Depends on the joint distribution of $(Y, A, \hat{Y})$.) They should be equal to satisfy equalized odds.

Q: What condition do we need for an equal opportunity classifier?

A: The 2nd entries of $\gamma_1(\tilde{Y})$ and $\gamma_0(\tilde{Y})$ should match

Page 25: Fairness, Part II

Geometric Interpretation via ROC curves

Page 26: Fairness, Part II

Write equalized odds as an optimization

โ€ข Letโ€™s define a loss function โ„“( u๐‘Œ๏ฟฝ, ๐‘Œ) describing loss of utility from using u๐‘Œ๏ฟฝ instead of ๐‘Œ

โ€ข Now we can optimize:

โ€ข Objective and constraints are both linear in vector of ๐‘ values!
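A minimal sketch of that LP with cvxpy, under an assumed toy joint distribution of $(A, Y, \hat{Y})$. Since the derived predictor is oblivious, this joint is all the code needs; the loss here is the plain 0/1 loss, another assumption.

```python
# Equalized-odds LP over the four p values (toy joint distribution assumed).
import numpy as np
import cvxpy as cp

# joint[a, y, yhat] = P(A = a, Y = y, Yhat = yhat); toy numbers, sum to 1
joint = np.array([[[0.10, 0.05], [0.05, 0.20]],
                  [[0.15, 0.10], [0.10, 0.25]]])

p = cp.Variable((2, 2))   # p[yhat, a] = P(Ytilde = 1 | Yhat = yhat, A = a)

def gamma(a):
    """(P(Ytilde=1 | A=a, Y=0), P(Ytilde=1 | A=a, Y=1)); both linear in p."""
    out = []
    for y in (0, 1):
        cond = joint[a, y] / joint[a, y].sum()   # P(Yhat = . | A = a, Y = y)
        out.append(cond[0] * p[0, a] + cond[1] * p[1, a])
    return out

# Expected 0/1 loss: pay when (Ytilde=1, Y=0) or (Ytilde=0, Y=1)
loss = sum(joint[a, 0, yh] * p[yh, a] + joint[a, 1, yh] * (1 - p[yh, a])
           for a in (0, 1) for yh in (0, 1))

g0, g1 = gamma(0), gamma(1)
cp.Problem(cp.Minimize(loss),
           [g0[0] == g1[0], g0[1] == g1[1],   # equalized odds constraint
            p >= 0, p <= 1]).solve()
print(p.value)
```

Dropping the `g0[0] == g1[0]` constraint and keeping only the second equality gives the equal opportunity variant, matching the Q/A on the previous slide.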

Page 27: Fairness, Part II

What about continuous values?

• E.g., suppose we use a numeric credit score $R$ to predict binary value $Y$

• You can threshold the score to get a comparable definition of equalized odds

• If $R$ satisfies equalized odds, then so does any predictor $\hat{Y} = \mathbb{I}(R > t)$, where $t$ is some threshold

Page 28: Fairness, Part II

Case study: FICO Scores

• Credit scores $R$ range from 300 to 850
• Binary variable $Y$ = whether someone will default on a loan

Page 29: Fairness, Part II

Experiment

• False positive: giving a loan to someone who will default
• False negative: not giving a loan to someone who will not default
• Loss function: false positives are 4.5x as expensive as false negatives

Baselines:
• Max profit: no fairness constraint
• Race blind: uses the same FICO threshold for all groups
• Group fairness: picks for each group a threshold such that the fraction of group members who qualify for loans is the same
• Equal opportunity: picks a threshold for each group such that the fraction of non-defaulting group members who qualify for loans is the same
• Equalized odds: requires both the fraction of non-defaulters who qualify for loans and the fraction of defaulters who qualify for loans to be constant across groups
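A sketch of just the equal-opportunity baseline from this list, on made-up data rather than the FICO dataset: pick a per-group score threshold so the true positive rate among non-defaulters hits a common target.

```python
# Per-group thresholds equalizing the true positive rate (toy data assumed).
import numpy as np

def equal_opportunity_thresholds(scores, repaid, group, target_tpr=0.8):
    """For each group g, the threshold t_g with P(R > t_g | repaid, g) ~ target."""
    thresholds = {}
    for g in np.unique(group):
        good = scores[(group == g) & (repaid == 1)]   # non-defaulters in group g
        # the (1 - target_tpr) quantile of non-defaulters' scores in this group
        thresholds[g] = np.quantile(good, 1 - target_tpr)
    return thresholds

rng = np.random.default_rng(0)
group = rng.integers(0, 2, size=2000)
repaid = rng.integers(0, 2, size=2000)
# toy scores: repayers score higher on average; group shifts the distribution
scores = rng.normal(550 + 60 * repaid + 30 * group, 50)
print(equal_opportunity_thresholds(scores, repaid, group))
```

Because the two groups' score distributions differ, the resulting thresholds differ too, which is exactly why this baseline diverges from the race-blind one.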

Page 30: Fairness, Part II

ROC Curve Results

Page 31: Fairness, Part II

Profit Results

Method                                   Profit (% relative to max profit)
Max profit                               100
Race blind                               99.3
Equal opportunity                        92.8
Equalized odds                           80.2
Group fairness (demographic parity)      69.8

(Each method other than max profit is fair by some definition.)