
Page 1: Differential Privacy

Differential Privacy

18739A: Foundations of Security and Privacy

Anupam Datta

Fall 2009

Page 2: Differential Privacy

Sanitization of Databases

[Diagram] A real database (RDB), e.g., health records or census data, is transformed by sanitization (add noise, delete names, etc.) into a sanitized database (SDB).

Goals: protect privacy and provide useful information (utility).

Page 3: Differential Privacy

Recall Dalenius’ Desideratum

Nothing about an individual should be learnable from the database that could not be learned without access to the database (Dalenius 1977).

Page 4: Differential Privacy

Impossibility Result

Privacy: For some definition of “privacy breach,” for all distributions on databases and all adversaries A, there is an adversary A′ (who does not see the sanitized output) such that

Pr(A(San(DB)) = breach) − Pr(A′() = breach) ≤ ε

Result: For any reasonable notion of “breach,” if San(DB) contains information about DB, then some adversary breaks this definition.

Example:

Terry Gross is two inches shorter than the average Lithuanian woman.

The DB allows computing the average height of a Lithuanian woman.

This DB breaks Terry Gross’s privacy according to this definition… even if her record is not in the database!

[Dwork, Naor 2006]

Page 5: Differential Privacy

(Very Informal) Proof Sketch

Suppose DB is uniformly random and San(DB) contains information about DB, i.e., the mutual information I(DB; San(DB)) > 0.

“Breach” means predicting a predicate g(DB).

The adversary knows r and H(r; San(DB)) ⊕ g(DB), where H is a suitable hash function and r = H(DB).

By itself, this auxiliary information does not leak anything about DB; together with San(DB), it reveals g(DB).
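Spelling this out, under the assumption (mine, since the operator is garbled in the transcript) that the masked value is an XOR, the auxiliary information is

$$
\mathrm{aux} \;=\; (r,\; c), \qquad c \;=\; H\big(r;\ \mathrm{San}(DB)\big) \oplus g(DB).
$$

Given San(DB), the adversary recomputes the pad H(r; San(DB)) and recovers g(DB) = c ⊕ H(r; San(DB)), which is a breach; without San(DB), the pad looks random, so aux by itself predicts g(DB) no better than guessing.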

Page 6: Differential Privacy

Relaxed Semantic Security for Databases?

Example:

Terry Gross is two inches shorter than the average Lithuanian woman

DB allows computing average height of a Lithuanian woman

This DB breaks Terry Gross’s privacy according to this definition… even if her record is not in the database!

This suggests a new notion of privacy: the risk incurred by joining the database.

Page 7: Differential Privacy

Differential Privacy (informal)

Output is similar whether any single individual’s record is included in the database or not

C is no worse off because her record is included in the computation

If there is already some risk of revealing a secret of C by combining auxiliary information with something learned from DB, then that risk is still there, but it is not increased by C’s participation in the database.

Page 8: Differential Privacy

Differential Privacy [Dwork et al 2006]

A randomized sanitization function κ satisfies ε-differential privacy if, for all data sets D1 and D2 differing in at most one element and all subsets S of the range of κ,

Pr[κ(D1) ∈ S] ≤ e^ε · Pr[κ(D2) ∈ S]
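As a sanity check on this definition, here is a minimal sketch (not from the slides) that estimates both probabilities by Monte Carlo for a toy mechanism: a count query with Laplace noise, the construction the later slides build toward. The databases, threshold, set S, and ε below are made-up illustration values.

```python
# Hypothetical illustration of the epsilon-DP inequality for a toy mechanism:
# kappa(D) = count(D) + Lap(1/epsilon), a count query with Laplace noise.
import numpy as np

rng = np.random.default_rng(0)
epsilon = 0.5

def count_tall(db, threshold=180):
    # f(D): number of records above a threshold; sensitivity 1
    return sum(1 for x in db if x > threshold)

def mechanism(db):
    return count_tall(db) + rng.laplace(scale=1.0 / epsilon)

# Neighboring databases: D2 is D1 with one record removed
D1 = [170, 182, 190, 165, 178]
D2 = [170, 182, 165, 178]

# Estimate Pr[kappa(D) in S] for S = (-inf, 1.5] by sampling
in_S = lambda y: y <= 1.5
n = 200_000
p1 = np.mean([in_S(mechanism(D1)) for _ in range(n)])
p2 = np.mean([in_S(mechanism(D2)) for _ in range(n)])

# Both directions of the bound should hold (up to sampling error)
print(p1 <= np.exp(epsilon) * p2, p2 <= np.exp(epsilon) * p1)
```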


Page 9: Differential Privacy

Achieving Differential Privacy

How much noise should be added?

Intuition: f(D) can be released accurately when f is insensitive to individual entries x1, …, xn (the more sensitive f is, the higher the noise added).

[Figure: A user asks the curator of database D = (x1, …, xn), “Tell me f(D)”; the curator responds with f(D) + noise.]
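A minimal sketch of this interaction, assuming the Laplace-noise calibration the next slides introduce; answer_query, the sample data, and the ε values are hypothetical names and numbers, not from the slides.

```python
# Hypothetical sketch of the curator model: the user asks for f(D) and the
# curator answers f(D) plus noise scaled to how sensitive f is.
import numpy as np

rng = np.random.default_rng()

def answer_query(db, f, sensitivity, epsilon):
    # Laplace noise with scale sensitivity/epsilon: the more sensitive f is,
    # the more noise is added.
    return f(db) + rng.laplace(scale=sensitivity / epsilon)

db = [170, 182, 190, 165, 178]                                     # records x1, ..., xn
noisy_count = answer_query(db, len, sensitivity=1, epsilon=0.5)    # a count changes by at most 1
noisy_sum = answer_query(db, sum, sensitivity=200, epsilon=0.5)    # a sum changes by up to 200 if each xi <= 200
print(noisy_count, noisy_sum)
```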

Page 10: Differential Privacy

Sensitivity of a function

Examples: a count has sensitivity 1; a histogram has sensitivity 1.

Note: the sensitivity of f is a property of f alone; it does not depend on the actual database.
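The definition itself appears to have been an image on the original slide; the standard global-sensitivity definition these examples refer to (Dwork et al. 2006) is:

$$
\Delta f \;=\; \max_{D_1, D_2 \text{ differing in at most one element}} \big\lVert f(D_1) - f(D_2) \big\rVert_1 .
$$

Adding or removing one record changes a count by at most 1, and changes at most one histogram cell by 1, so both have Δf = 1.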


Page 11: Differential Privacy

Achieving Differential Privacy

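The body of this slide is missing from the transcript. The result it presumably states (and that “Theorem 4” below refers to) is the standard Laplace-mechanism guarantee from Dwork et al. 2006:

$$
\kappa(D) \;=\; f(D) + (Y_1, \dots, Y_d), \qquad Y_i \sim \mathrm{Lap}\!\big(\Delta f / \varepsilon\big) \ \text{i.i.d.},
$$

satisfies ε-differential privacy, where Lap(b) is the Laplace distribution with density (1/2b)·exp(−|y|/b) and d is the dimension of f’s output.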

Page 12: Differential Privacy

Proof of Theorem 4

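The proof body is also missing from the transcript; assuming Theorem 4 is the Laplace-mechanism guarantee sketched above, the standard argument compares the output densities on neighboring databases:

$$
\frac{\Pr[\kappa(D_1) = z]}{\Pr[\kappa(D_2) = z]}
= \prod_{i=1}^{d}
\exp\!\Big(\tfrac{\varepsilon}{\Delta f}\big(|z_i - f(D_2)_i| - |z_i - f(D_1)_i|\big)\Big)
\;\le\; \exp\!\Big(\tfrac{\varepsilon}{\Delta f}\,\lVert f(D_1) - f(D_2)\rVert_1\Big)
\;\le\; e^{\varepsilon},
$$

using the triangle inequality and the definition of Δf; integrating over any set S of outputs then gives Pr[κ(D1) ∈ S] ≤ e^ε · Pr[κ(D2) ∈ S].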

Page 13: Differential Privacy


Page 14: Differential Privacy

Sensitivity with Laplace Noise
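The body of this final slide is not in the transcript; as an illustration of the calibration the title refers to, here is a hypothetical sketch of a noisy histogram release, using the Δf = 1 sensitivity noted earlier so each cell receives Lap(1/ε) noise.

```python
# Hypothetical illustration: a histogram released with Laplace noise calibrated
# to its sensitivity (one record changes one cell by 1, so delta_f = 1).
import numpy as np

rng = np.random.default_rng()
epsilon = 0.5
delta_f = 1.0

def noisy_histogram(values, bins):
    counts, _ = np.histogram(values, bins=bins)
    # One independent Lap(delta_f/epsilon) draw per cell; the noise level does
    # not grow with the number of records in the database.
    return counts + rng.laplace(scale=delta_f / epsilon, size=counts.shape)

heights = [170, 182, 190, 165, 178, 173, 168]
print(noisy_histogram(heights, bins=[160, 170, 180, 190, 200]))
```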