privacy data storage humboldt
TRANSCRIPT
-
8/9/2019 Privacy Data Storage Humboldt
1/23
DATA STORAGE
VIEWPOINT OF PRIVACY
Slides from Prof. Johann-Christoph Freytag (Humboldt University, Berlin)
-
Outline
Privacy
Privacy and context
Privacy and mobility
Privacy and context combined with privacy and mobility
PRECIOSA kick off meeting, Paris, 11.04.2008
HU Berlin: Prof. Johann-Christoph Freytag, Dipl.-Inf. Martin Kost
-
Privacy
Privacy of movement
[Figure labels: B-O-333, RFID]
-
Privacy
Is it always obvious?
Is it always obvious that privacy is violated or breached?
Latanya Sweeney's Finding
In Massachusetts, USA, the Group Insurance Commission (GIC) is responsible for purchasing health insurance for state employees.
GIC has to publish the data:
GIC(zip, dob, sex, diagnosis, procedure, ...)
(dob = date of birth)
http://lab.privacy.cs.cmu.edu/people/sweeney/
[Sween01]
-
Privacy
Latanya Sweeney's Finding
Sweeney paid $20 and bought the voter registration list for Cambridge, MA:
William Weld (former governor) lives in Cambridge, hence is in VOTER
6 people in VOTER share his dob
only 3 of them were men (same sex)
Weld was the only one in that zip
Sweeney learned Weld's medical records!
VOTER(name, party, ..., zip, dob, sex)
GIC(zip, dob, sex, diagnosis, procedure, ...)
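The linkage described above can be sketched as a join of the two relations on their shared columns (zip, dob, sex). This is only an illustration: the function name `link`, the row layout as dictionaries, and all sample rows are assumptions made up for this sketch, not real data and not Sweeney's actual procedure.

```python
from collections import defaultdict

# Columns shared by VOTER and GIC, taken from the two schemas above.
QUASI_IDENTIFIERS = ("zip", "dob", "sex")

def link(voter_rows, gic_rows, keys=QUASI_IDENTIFIERS):
    """Join two lists of dict rows on the shared columns.

    A GIC row counts as re-identified only if exactly one voter matches it.
    """
    by_key = defaultdict(list)
    for v in voter_rows:
        by_key[tuple(v[k] for k in keys)].append(v)
    matches = []
    for g in gic_rows:
        candidates = by_key.get(tuple(g[k] for k in keys), [])
        if len(candidates) == 1:
            matches.append((candidates[0]["name"], g["diagnosis"]))
    return matches

# Made-up rows for illustration only.
voter = [
    {"name": "Person A", "zip": "00001", "dob": "1950-01-01", "sex": "M"},
    {"name": "Person B", "zip": "00001", "dob": "1950-01-01", "sex": "F"},
    {"name": "Person C", "zip": "00002", "dob": "1950-01-01", "sex": "M"},
    {"name": "Person D", "zip": "00002", "dob": "1950-01-01", "sex": "M"},
]
gic = [
    {"zip": "00001", "dob": "1950-01-01", "sex": "M", "diagnosis": "diagnosis X"},
    {"zip": "00002", "dob": "1950-01-01", "sex": "M", "diagnosis": "diagnosis Y"},
]

reidentified = link(voter, gic)
print(reidentified)
```

In this toy data the first GIC row matches a single voter and is therefore linked to a name, while the second row matches two voters and stays ambiguous.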
-
Privacy
Latanya Sweeney's Finding
Observation: All systems worked as specified, yet important data has leaked
Information leakage occurred despite the observation that all systems worked as specified
Beyond correctness!
What's missing?
How do we protect against that kind of lack (leakage) of privacy?
-
Privacy
Data Security
Dorothy Denning, 1982:
"Data Security is the science and study of methods of protecting data (...) from unauthorized disclosure and modification"
Data Security = Confidentiality + Integrity (+ Availability)
Distinct from system and network security
-
Privacy
What is Privacy?
Definition 1:
Privacy reflects the ability of a person, organization, government, or entity to control its own space, where the concept of space (or privacy space) takes on different contexts.
Physical space, against invasion
Bodily space, medical consent
Computer space, spam
Web browsing space, Internet privacy
[Sween02]
[Agrawal03] Definition 2:
Privacy is the right of individuals to determine for themselves when, how, and to what extent information about them is communicated to others.
(We shall call this data/information privacy)
-
Privacy
Anonymity and unobservability
[Figure labels: anonymity group, event, message, access]
Everybody could be the originator of an event with equal likelihood
-
Privacy
Approaches for non-observable communication
[Figure labels: anonymity group, events, message, access]
Whom to protect?
sender
(content of message)
Basic approach:
Dummy traffic
Proxies
MIX-Networks
DC-Networks
more
-
Privacy
Maintaining data privacy for accessing databases
k-anonymity & its properties
introduced by Sweeney [Sween02]
-
Privacy
An example: Medical Records
Identifying information: SoSecN, Name
Sensitive information: Disease

SoSecN  Name     Age  Ethnic B  Zipcode  Disease
007     Chris    07   Caucas    12344    Arthritis
009     Jane     77   Caucas    53211    Cold
011     Adam     28   Caucas    70234    Heart problem
023     Charlie  27   Afr-Amer  95505    Flu
034     Eve      27   Afr-Amer  54327    Arthritis
054     Yvonne   44   Hispanic  12007    Diabetes
099     John     65   Hispanic  12007    Flu
[Aggarwal03]
-
Privacy
Medical Records: De-identify & Release
Sensitive: Disease

Age  Ethnic B  Zipcode  Disease
07   Caucas    12344    Arthritis
77   Caucas    53211    Cold
28   Caucas    70234    Heart problem
27   Afr-Amer  95505    Flu
27   Afr-Amer  54327    Arthritis
44   Hispanic  12007    Diabetes
65   Hispanic  12007    Flu
-
Privacy
Not sufficient!
A public database can be used to uniquely identify you!
Sensitive: Disease

Age  Ethnic B  Zipcode  Disease
07   Caucas    12344    Arthritis
77   Caucas    53211    Cold
28   Caucas    70234    Heart problem
27   Afr-Amer  95505    Flu
27   Afr-Amer  54327    Arthritis
44   Hispanic  12007    Diabetes
65   Hispanic  12007    Flu

Quasi-identifiers: reveal less information
k-anonymity model
-
Privacy
k-anonymity Problem Definition
Input: Database consisting of n rows, each with m attributes
Set of domain values for attributes is finite
Goal: Suppress some entries in the table such that each modified row becomes identical to at least k-1 other rows.
Objective: Minimize the number of suppressed entries.
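The goal and objective above can be checked mechanically. The sketch below is a minimal illustration, assuming that only the quasi-identifier columns (Age, Ethnic B, Zipcode) of the medical-records example are compared and that the suppression mask is chosen by hand. The helper names `is_k_anonymous`, `suppress` and `cost` are hypothetical, and the mask is one feasible solution, not a proven minimum.

```python
from collections import Counter

SUPPRESSED = "*"

def is_k_anonymous(rows, k):
    """True if every row is identical to at least k-1 other rows."""
    counts = Counter(rows)
    return all(c >= k for c in counts.values())

def suppress(rows, mask):
    """Replace entry (i, j) by '*' wherever mask[i][j] is True."""
    return [
        tuple(SUPPRESSED if hide else value for value, hide in zip(row, m))
        for row, m in zip(rows, mask)
    ]

def cost(mask):
    """Number of suppressed entries, i.e. the quantity to be minimized."""
    return sum(sum(m) for m in mask)

# Quasi-identifier columns (Age, Ethnic B, Zipcode) of the example table.
rows = [
    ("07", "Caucas", "12344"),
    ("77", "Caucas", "53211"),
    ("28", "Caucas", "70234"),
    ("27", "Afr-Amer", "95505"),
    ("27", "Afr-Amer", "54327"),
    ("44", "Hispanic", "12007"),
    ("65", "Hispanic", "12007"),
]

# Hand-picked mask (an assumption of this sketch): True means "suppress".
T, F = True, False
mask = [
    (T, F, T), (T, F, T), (T, F, T),  # Caucas rows: hide Age and Zipcode
    (F, F, T), (F, F, T),             # Afr-Amer rows: hide Zipcode
    (T, F, F), (T, F, F),             # Hispanic rows: hide Age
]

released = suppress(rows, mask)
print(is_k_anonymous(rows, 2), is_k_anonymous(released, 2), cost(mask))
```

Finding a mask of minimum cost is the actual optimization problem stated above; this sketch only verifies a given candidate.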
-
Privacy
Medical Records: 2-anonymized table
Age  Ethnic B  Zipcode  Disease
*    Caucas    *        Arthritis
*    Caucas    *        Cold
*    Caucas    *        Heart problem
27   Afr-Amer  *        Flu
27   Afr-Amer  *        Arthritis
*    Hispanic  12007    Diabetes
*    Hispanic  12007    Flu
-
Privacy
Accessing databases privately (Access privacy)
[Figure label: Patent-DB]
-
Privacy
First (naive) approach
Problem to solve:
User/Client: no one should know the contents of the query nor the result (not even the server)
Observation:
Encrypting the communication between client and server might not be sufficient (an adversary might access the decrypted query if he can get inside the database system and if he can observe disk access)
Naive solution:
Client downloads the entire DB & executes queries locally; this is an unrealistic solution (size & ownership of data)
[Figure: USER/CLIENT sends a query to the DB SERVER (DB) and receives the result]
-
Privacy
Accessing databases privately (Access privacy)
Simple Solution:
Use a Secure Coprocessor (SC)
Proven hardware properties:
Cannot observe computation from outside
If tampered with, self-destruction occurs
Read entire database per query: O(N)
[Figure: the IBM 4758 Secure Coprocessor (SC) in the Database Server reads the entire database and returns Encrypted (Return record x)]
[Asonov01]
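The "read entire database per query" idea can be sketched as a full scan that touches every record in the same order, whichever record is wanted. This is an illustration under assumptions: the function `private_read` and the `access_log` list (standing in for what an observer of disk accesses sees) are hypothetical, the comparison inside the loop is assumed to run inside the tamper-proof SC, and encryption of the returned record is omitted.

```python
def private_read(database, x, access_log):
    """Return record x while reading every record in a fixed order: O(N)."""
    wanted = None
    for i, record in enumerate(database):
        access_log.append(i)  # the access pattern visible outside the SC
        if i == x:
            wanted = record   # selection happens inside the (assumed) SC
    return wanted

db = ["rec0", "rec1", "rec2", "rec3"]
log_a, log_b = [], []
result_a = private_read(db, 1, log_a)
result_b = private_read(db, 3, log_b)
print(result_a, result_b, log_a == log_b)
```

Because both logs are identical, the observed accesses reveal nothing about which record was requested; the price is the O(N) cost per query noted above.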
-
Privacy
Metric (Probabilistic Privacy)
Using (Shannon's) entropy definition to measure privacy:
E = -\sum_{i=1}^{N} P_i \cdot \mathrm{ld}\, P_i    (ld: logarithm to base 2)
P_i ... probability of query to access record i
E ... uncertainty of adversary's observation
E is maximal if all P_i have the same value, i.e. the adversary cannot give some values stored in the db a higher probability of being accessed than others
Perfect privacy: E does not change by observations
Probabilistic privacy: adversary learns by observation (i.e. increases probability P_i for some records)
Goal: minimize learning (i.e. minimize increase of probabilities P_i)
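A small numeric check of the entropy metric may help. The sketch assumes the standard Shannon entropy with a base-2 logarithm; the database size N = 8 and the skewed distribution are arbitrary illustration values.

```python
import math

def entropy(probabilities):
    """Shannon entropy in bits: E = -sum(P_i * log2(P_i)), taking 0*log2(0) as 0."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

N = 8
uniform = [1 / N] * N               # adversary has learned nothing
skewed = [0.65] + [0.05] * (N - 1)  # adversary suspects record 0

e_uniform = entropy(uniform)
e_skewed = entropy(skewed)
print(e_uniform, e_skewed)
```

The uniform case reaches the maximum of log2(N) bits; any observation that raises some P_i above the others lowers E, which is the "learning" the goal above tries to minimize.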
-
PDA Probabilistic privacy
Security Parameters:
a: # of (sequential) requests to shuffled & encrypted database
b: # of random requests to original database (includes requested record)
reshuffling after N/b queries necessary
[Figure: for each query, the SC in the Database Server reads from the shuffled and encrypted database and from the original database (records 1 to 10)]
[Chart: probability distribution over records 1 to 10, vertical axis from 0 to 0.25]
Probability distribution:
Each record of original database: P = (1 - a/N) / b
Others: P = (a/N) / (N - b)
Therefore, no record can be completely excluded from the query
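The two formulas above can be evaluated directly. The sketch rests on one reading of the slide, namely that the b records read from the original database each receive P = (1 - a/N)/b and the remaining N - b records each receive P = (a/N)/(N - b). The function name `access_distribution`, the parameter values and the set of fetched records are assumptions chosen for illustration; they are not the values behind the chart.

```python
import math

def access_distribution(n, a, b, fetched):
    """Adversary's probability, per record, that it is the queried one."""
    if not (0 < a < n and 0 < b < n and len(set(fetched)) == b):
        raise ValueError("need 0 < a < n, 0 < b < n and b distinct fetched records")
    p_fetched = (1 - a / n) / b    # records read from the original database
    p_other = (a / n) / (n - b)    # all remaining records
    return [p_fetched if i in fetched else p_other for i in range(n)]

# Illustrative values only: N = 10 records, a = 2, b = 4.
dist = access_distribution(n=10, a=2, b=4, fetched={0, 3, 5, 8})
print(dist)
```

The probabilities sum to 1 and every record keeps a non-zero probability, which matches the statement that no record can be completely excluded.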
-
Privacy and context
Combinatorics
Machine learning
Use of background knowledge: linkage attacks
Cancer
Breast cancer
Lung cancer
Male vs. female
-
Privacy and context
Challenges
Modeling the domain of ITS
Ontologies can be used to specify relevant contexts
Combine contexts with probabilities
Preventing
contexts from being identified
contexts from being combined with individuals
Apply methods of anonymization and probabilistic privacy (e.g. shuffle contexts)
Shannon's entropy definition applicable (normalized)
Contexts may change with time (e.g. density of traffic)
Pseudonyms (temporary identifiers)