Introduction
Differential Privacy under Dependent Data
Conclusion and Future Work
Dependence Makes You Vulnerable: Differential Privacy Under Dependent Tuples
Changchang Liu (1), Supriyo Chakraborty (2), Prateek Mittal (1)
(1) Princeton University, (2) IBM T.J. Watson Research Center
Email: (1) {cl12, pmittal}@princeton.edu, (2) [email protected]
February 23, 2016
Data Privacy
• Privacy is important!
- Snowden case
- G20 summit breach
- iCloud photo breach
- · · ·
Direct Release of Data Would Compromise Privacy!
[Figure: individuals give raw data to a data provider; data recipients (applications, researchers) receive query results.]
Obfuscate Data before Release to Protect Privacy
[Figure: the data provider applies data obfuscation to the raw data, so data recipients (applications, researchers) receive perturbed query results.]
Existing Privacy Metrics
– Differential Privacy [ICALP ’06]
– Pufferfish Privacy [PODS ’12]
– Membership Privacy [CCS ’13]
– Blowfish Privacy [SIGMOD ’14]
ε-Differential Privacy (DP)
Neighboring databases: D and D′ differ in a single tuple.
Differential privacy requires, for a randomized mechanism $\mathcal{A}$, any pair of neighboring databases D and D′, and any set S of query outputs:
$P(\mathcal{A}(D) \in S) \le \exp(\varepsilon)\, P(\mathcal{A}(D') \in S)$
[Figure: probability of the query output under D and under D′, compared over an output set S.]
The adversary’s ability to infer the individual’s information is bounded!
Laplace Perturbation Mechanism
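[Figure: the query answer Q(D) computed on the raw data is perturbed with noise before release.]
$\mathcal{A}(D) = Q(D) + \mathrm{Lap}(b)$, with $b = \frac{\Delta Q}{\varepsilon}$ and noise density $p(x) \propto \exp\left(-\frac{|x|}{b}\right)$
• ε is the privacy budget
• Q is the query function
• ∆Q is the global sensitivity of Q: $\Delta Q = \max_{D,D'} \|Q(D) - Q(D')\|_1$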
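The mechanism above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the count query with answer 42, and the choice ε = 0.5 are all assumptions made for the example. It samples Laplace noise with scale ∆Q/ε and then spot-checks that the output density ratio for two neighboring answers stays within exp(ε).

```python
import math
import random


def sample_laplace(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    # The difference of two i.i.d. exponential variables with mean `scale`
    # follows a zero-mean Laplace distribution with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(query_answer, sensitivity, epsilon, rng):
    """Return Q(D) + Lap(sensitivity / epsilon)."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    return query_answer + sample_laplace(sensitivity / epsilon, rng)


def laplace_density(x, scale):
    """Density of the zero-mean Laplace distribution at x."""
    return math.exp(-abs(x) / scale) / (2.0 * scale)


rng = random.Random(0)

# Hypothetical count query: true answer 42, global sensitivity 1, epsilon 0.5.
noisy = laplace_mechanism(42.0, sensitivity=1.0, epsilon=0.5, rng=rng)
print(noisy)

# Neighboring databases give answers 42 and 43. At every probed output s,
# the density ratio should stay at or below exp(epsilon).
scale = 1.0 / 0.5
worst_ratio = max(
    laplace_density(s - 42.0, scale) / laplace_density(s - 43.0, scale)
    for s in [30.0, 41.0, 42.5, 44.0, 60.0]
)
print(worst_ratio <= math.exp(0.5) + 1e-12)
```

The ratio check only probes a handful of outputs; it illustrates the bound rather than proving it.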
Limitations of Differential Privacy (DP) Mechanisms
DP mechanisms implicitly assume independent tuples
In reality, however, tuples are correlated
• large volume
• rich semantics
• complex structure
Data correlation exists almost everywhere
(a) social network data: friendships, interactions
(b) business data: financial transactions
(c) mobility data: communication records
(d) medical data: disease transmission
Our Objective
Incorporate data correlation into differential privacy
Differential Privacy under Dependent Data
• Inference Attack for DP based on Correlated Tuples
• Dependent Differential Privacy (DDP)
• Experimental Results
Correlation in Gowalla Location Dataset
Gowalla location dataset: 6,969 users, 98,802 location records
Gowalla social dataset: 6,969 users, 47,502 edges
[Figure: map of check-in regions labeled Manhattan, Queens, and Brooklyn (New York); San Jose and Oakland (San Francisco); Pasadena, Long Beach, and Beverly Hills (Los Angeles).]
Inference Attack on DP via K-Means Query
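Differentially Private K-means for Gowalla Location Dataset
[Figure: individuals provide raw data to a data provider, which runs K-means clustering with perturbation and releases the differentially private K-means clustering results to data recipients; the inference attack targets the released results.]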
Inference Attack
[Figure: the inference attack; labels: Social Relationships, Check-in, Community.]
Inference results by using correlation
[Figure: leaked information (y-axis, 0 to 8) versus privacy budget ε (x-axis, 0 to 3), with curves for the attack with social relationships, the attack without social relationships, and the security guarantee by DP.]
• Exploiting correlation, one can infer more information!
• Exploiting correlation can break DP security guarantees!
ε-Dependent Differential Privacy (DDP)
Dependent differential privacy requires, for a randomized mechanism $\mathcal{A}$, any pair of neighboring databases D and D′ whose L dependent tuples are related by R, and any set S of query outputs:
$P(\mathcal{A}(D) \in S) \le \exp(\varepsilon)\, P(\mathcal{A}(D') \in S)$
[Figure: probability of the query output under the two neighboring databases, compared over an output set S.]
• R is the probabilistic dependence relationship among the L dependent tuples.
• The adversary’s ability to infer the individual’s information is bounded even if the adversary has access to the data correlation R.
Dependent Perturbation Mechanism
• Augment the conventional Laplace perturbation mechanism (LPM) with additional noise that depends on $\rho_{ij}$
• Dependent coefficient $\rho_{ij}$
− the extent to which $D_j$ depends on a modification of $D_i$
Dependent Coefficient
Laplace noise in the dependent perturbation mechanism:
$\exp\left\{-\frac{\varepsilon}{\mathrm{Sensitivity}_i + \rho_{ij} \times \mathrm{Sensitivity}_j}\right\}$
The dependent coefficient satisfies $0 \le \rho_{ij} \le 1$:
• $\rho_{ij} = 0$: standard differential privacy (independent setting)
• $\rho_{ij} = 1$: fully dependent setting
• $\rho_{ij}$: formulates correlation from a privacy perspective
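To make the role of the dependent coefficient concrete, here is a small Python sketch for a single pair of tuples. It assumes the expression above describes Laplace noise with scale (Sensitivity_i + ρ_ij × Sensitivity_j) / ε; that reading, the function names, and the unit sensitivities are assumptions for illustration, not the authors' code.

```python
import random


def dependent_noise_scale(sens_i, sens_j, rho_ij, epsilon):
    """Assumed Laplace scale (sens_i + rho_ij * sens_j) / epsilon for two tuples."""
    if not 0.0 <= rho_ij <= 1.0:
        raise ValueError("rho_ij must lie in [0, 1]")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return (sens_i + rho_ij * sens_j) / epsilon


def sample_laplace(scale, rng):
    """Zero-mean Laplace sample, built as a difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dependent_perturbation(query_answer, sens_i, sens_j, rho_ij, epsilon, rng):
    """Perturb a query answer with noise scaled by the dependent coefficient."""
    scale = dependent_noise_scale(sens_i, sens_j, rho_ij, epsilon)
    return query_answer + sample_laplace(scale, rng)


# Hypothetical unit sensitivities and epsilon = 1.
independent = dependent_noise_scale(1.0, 1.0, rho_ij=0.0, epsilon=1.0)  # 1.0
partial = dependent_noise_scale(1.0, 1.0, rho_ij=0.5, epsilon=1.0)      # 1.5
full = dependent_noise_scale(1.0, 1.0, rho_ij=1.0, epsilon=1.0)         # 2.0
print(independent, partial, full)

rng = random.Random(0)
print(dependent_perturbation(10.0, 1.0, 1.0, rho_ij=0.5, epsilon=1.0, rng=rng))
```

With ρ_ij = 0 the scale reduces to the standard Laplace mechanism's Sensitivity_i / ε, and it grows linearly as the dependence gets stronger.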
Limitations of Dependent Coefficient
The exact computation of $\rho_{ij}$ relies on knowledge of the data generation model
Resilience to Inference Attack
[Figure: leaked information (y-axis, 0 to 8) versus privacy budget ε (x-axis, 0 to 3), comparing the attack on DP with the attack on DDP.]
DDP is more resilient to inference attacks
Further Analysis and Experiments
• Composition Property
− Sequential/parallel composition property
• Theoretical utility analysis
• Different classes of queries
− Machine learning queries
− Graph queries
Conclusion and Future Work
• Incorporate correlation into differential privacy
− Dependent differential privacy
− More resilient to inference attacks
• Future work: alternative data generation models
Appendix 1: Dependence between tuples can seriously degrade the privacy guarantees provided by existing DP mechanisms
Query: Sum over $[D_i, D_j]$; add noise $\mathrm{Lap}(1/\varepsilon)$ and release the noisy $(D_i + D_j)$.
• Independent ($D_i \perp D_j$): privacy guarantee $\exp(\varepsilon)$
• Dependent ($D_j = 0.5 D_i + 0.5 X$, where $D_i$ and $X$ are i.i.d. in $[0,1]$): privacy guarantee $\exp(1.5\varepsilon)$
A smaller ε means better privacy.
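The gap between exp(ε) and exp(1.5ε) can be checked numerically. The sketch below is an illustration under stated assumptions: X is held fixed while D_i moves across its full range [0, 1], the independent case is modeled as D_j = X, and the function names and ε = 1 are chosen for the example. In the dependent case the true sum then shifts by 1.5 instead of 1, so the same Lap(1/ε) noise bounds the density ratio only by exp(1.5ε).

```python
import math


def laplace_density(x, scale):
    """Density of the zero-mean Laplace distribution at x."""
    return math.exp(-abs(x) / scale) / (2.0 * scale)


def noisy_sum_shift(d_i_old, d_i_new, x, dependent):
    """Change in the true sum D_i + D_j when D_i changes and X stays fixed."""
    def total(d_i):
        # Dependent case: D_j = 0.5 * D_i + 0.5 * X.
        # Independent case (assumed here): D_j = X, unaffected by D_i.
        d_j = 0.5 * d_i + 0.5 * x if dependent else x
        return d_i + d_j
    return total(d_i_new) - total(d_i_old)


def worst_case_ratio(shift, epsilon):
    """Largest density ratio for two true sums `shift` apart under Lap(1/epsilon)."""
    scale = 1.0 / epsilon
    # The ratio peaks at an output equal to one of the two true sums.
    return laplace_density(0.0, scale) / laplace_density(shift, scale)


epsilon = 1.0
x = 0.3  # arbitrary fixed value of X in [0, 1]

shift_independent = noisy_sum_shift(0.0, 1.0, x, dependent=False)
shift_dependent = noisy_sum_shift(0.0, 1.0, x, dependent=True)

independent_bound = worst_case_ratio(shift_independent, epsilon)
dependent_bound = worst_case_ratio(shift_dependent, epsilon)

print(independent_bound, math.exp(epsilon))
print(dependent_bound, math.exp(1.5 * epsilon))
```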
Appendix 2: Model to Compute Dependent Coefficient
Here, we use the friend-based model to compute the probabilistic dependence relationship: a user’s location can be estimated from her friend’s location based on the distance between their locations. Specifically, the probability that user j is located at $d_j$ when her friend i is located at $d_i$ is
$P(D_j = d_j \mid D_i = d_i) = a(\|d_j - d_i\|_1 + b)^{-c}$ (1)
where a > 0, b > 0, c > 0.
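Equation (1) translates directly into code. The following Python sketch evaluates it for two-dimensional locations; the function names and the parameter values a = 1, b = 1, c = 2 are hypothetical and chosen only to make the arithmetic easy to follow. In practice a, b, and c would have to be fitted so that the values form a valid probability distribution.

```python
def l1_distance(p, q):
    """L1 (Manhattan) distance between two locations of equal dimension."""
    return sum(abs(u - v) for u, v in zip(p, q))


def friend_location_probability(d_j, d_i, a, b, c):
    """Evaluate a * (||d_j - d_i||_1 + b) ** (-c), as in Equation (1)."""
    if a <= 0 or b <= 0 or c <= 0:
        raise ValueError("a, b, and c must all be positive")
    return a * (l1_distance(d_j, d_i) + b) ** (-c)


# Hypothetical parameters and locations, for illustration only.
a, b, c = 1.0, 1.0, 2.0
friend_location = (0.0, 0.0)

same_place = friend_location_probability((0.0, 0.0), friend_location, a, b, c)
nearby = friend_location_probability((1.0, 2.0), friend_location, a, b, c)
far_away = friend_location_probability((10.0, 20.0), friend_location, a, b, c)

print(same_place)  # (0 + 1) ** -2 = 1.0
print(nearby)      # (3 + 1) ** -2 = 0.0625
print(far_away)    # decreases further as the friends move apart
```

The value falls off polynomially with distance, which matches the intuition that a user is most likely to be found near her friends.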