Privacy Issues in Disclosing Averages, Susmit Sarkar (CMU)


Post on 19-Dec-2015


Privacy Issues in Disclosing Averages

Susmit Sarkar (CMU)

Non-Interference

Non-Interference: observable actions of programs are not influenced by sensitive data

Too restrictive in practice! Think of password security

Safe Relaxation of Non-Interference

Passwords are sensitive data

Checking passwords violates non-interference

This is still okay [Volpano] if:

  passwords are chosen randomly

  the interaction is carefully controlled
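To make the password point concrete, here is a minimal Python sketch. It assumes passwords are drawn uniformly at random from a known number of candidates; the function names are hypothetical and are not taken from [Volpano]. The check's result depends on the secret, so strict non-interference fails, yet a single failed guess reveals very little when the candidate space is large.

```python
import math


def check_password(secret: str, guess: str) -> bool:
    # The observable result depends on the secret,
    # so this function violates strict non-interference.
    return secret == guess


def bits_learned_from_failed_guess(candidates: int) -> float:
    """Bits gained when one guess among `candidates` equally likely
    passwords fails (assumption: uniform random choice)."""
    if candidates < 2:
        raise ValueError("need at least two candidate passwords")
    # Uncertainty drops from lg(candidates) to lg(candidates - 1).
    return math.log2(candidates) - math.log2(candidates - 1)
```

With only two candidates a failed guess reveals the password completely (one full bit); with about a million candidates it reveals a tiny fraction of a bit.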

Generalizing to Averages

Idea: restrict access to allow us to answer interesting queries

Also, we can measure information loss

We want to calculate averages on private data

Generalize the notion of averages

Content Host’s Problem

Content host serving multiple content providers

The number of hits is sensitive information

Often, clients ask for the average hits of specified clients

Example: Sport Site

You want to know how the redesign of your sports portal worked

Complication: it happens to be Superbowl Sunday

We want averages of all sports sites

What if there are only 2 sports sites?
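The two-site case can be worked through in a few lines of Python. This is only an illustration: the hit counts are made up and the function name is hypothetical.

```python
def others_total(average: float, own_hits: float, n_sites: int) -> float:
    """Total hits of all *other* sites, recovered from a disclosed
    average and the asker's own (known) hit count."""
    return average * n_sites - own_hits


# Hypothetical numbers: we had 1200 hits, the disclosed average is 1000.
competitor_hits = others_total(average=1000, own_hits=1200, n_sites=2)
```

With two sites the result is exactly the competitor's hit count (800 here). With five sites the same arithmetic yields only the combined total of the other four, which is far less revealing.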

Formal Model

Data:   D1  D2  D3  D4  D5

Query:   1   0   1   0   1   :=   $d_1 + d_3 + d_5 = ?$

Problem: what about 1 0 1 1 0, and 1 0 1 1 1
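The problem with the pair 1 0 1 1 0 and 1 0 1 1 1 is that their difference isolates a single data item. A small Python sketch, with made-up data values and hypothetical helper names:

```python
def answer(query, data):
    """Answer a 0/1 (or weighted) sum query over the private data."""
    return sum(q * d for q, d in zip(query, data))


def subtract(x, y):
    """Component-wise difference of two query vectors."""
    return tuple(a - b for a, b in zip(x, y))


data = (7, 3, 5, 2, 9)          # hypothetical private values d1..d5
q_a = (1, 0, 1, 1, 0)
q_b = (1, 0, 1, 1, 1)

difference = subtract(q_b, q_a)                  # (0, 0, 0, 0, 1)
leaked = answer(q_b, data) - answer(q_a, data)   # equals d5
```

Each query on its own looks harmless, but subtracting the two answers yields d5 exactly.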

Query Model

Solution: maintain history

Idea: add current query to set, decide if “bad” vectors are derivable

We restrict attention to weighted sums
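A history-keeping auditor along these lines might be sketched as follows. This is a simplification, not the method of the talk: it treats only exact unit vectors (a single data item isolated) as "bad", and it tests derivability by comparing ranks with exact rational Gaussian elimination. The class and function names are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of equal-length rows, using exact arithmetic."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(len(m)):
            if r != rk and m[r][col] != 0:
                factor = m[r][col] / m[rk][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk


def reveals_single_entry(queries):
    """True if some unit vector is a weighted sum of the queries."""
    n = len(queries[0])
    base = rank(queries)
    for k in range(n):
        unit = [1 if j == k else 0 for j in range(n)]
        # Adding a vector already in the span leaves the rank unchanged.
        if rank(queries + [unit]) == base:
            return True
    return False


class QueryAuditor:
    def __init__(self):
        self.history = []

    def ask(self, query):
        """Accept the query (and record it) only if the history plus
        this query does not let anyone derive a single data item."""
        candidate = self.history + [list(query)]
        if reveals_single_entry(candidate):
            return False
        self.history = candidate
        return True
```

For example, after `ask([1, 0, 1, 1, 0])` is accepted, `ask([1, 0, 1, 1, 1])` is refused because the difference of the two queries is a unit vector.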

Issues Ignored in Model

Answers of queries (Right Hand Sides)

Data values

Extraneous information: correlation between data

Some of these are left for future work

Characterizing Bad Vectors

(0 1 0 0 0 0 0 0 0 0 0)

(1 $10^6$ 1 1 1 1 1 1 1 1 1)

We want a measure that indicates when all entries are of similar magnitude

Idea: Entropy

We use the entropy function: $-\sum_i p_i \lg p_i$

Normalize entries so that magnitudes sum to one

Then treat the magnitudes as probabilities in the entropy definition

Entropy is low when data is skewed
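The measure can be sketched in Python as below. The function name is hypothetical, and zero entries are taken to contribute nothing to the sum (the usual convention that 0 lg 0 = 0).

```python
import math


def magnitude_entropy(vector):
    """Entropy (in bits) of the entry magnitudes of `vector`,
    after normalizing the magnitudes to sum to one."""
    magnitudes = [abs(x) for x in vector]
    total = sum(magnitudes)
    if total == 0:
        raise ValueError("the zero vector has no normalized form")
    probabilities = [m / total for m in magnitudes]
    # Convention: terms with p == 0 contribute nothing.
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


unit_like = [0, 1] + [0] * 9        # isolates one entry
skewed = [1, 10 ** 6] + [1] * 9     # dominated by one entry
uniform = [1] * 11                  # all entries equal
```

The unit vector has entropy 0, the skewed vector has entropy very close to 0, and the uniform vector reaches the maximum lg 11, so a threshold on this value separates the "bad" vectors from the balanced one.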

Formal Problem Statement

$m$ query vectors $Q_i = (q_{i1}, q_{i2}, \dots, q_{in})$

Unknown linear combination: $U = c_1 Q_1 + c_2 Q_2 + \dots + c_m Q_m$

Variables $u_i = \sum_j c_j q_{ji}$

Variables $u'_i \ge u_i$ and $u'_i \ge -u_i$

Hence $u'_i \ge |u_i|$
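As an illustration of these definitions, the sketch below forms U for a given choice of coefficients and checks the two linear constraints on u'. The helper names are hypothetical; the example queries are the two problem vectors from the Formal Model slide.

```python
def combine(coeffs, queries):
    """Entry i of U = sum over j of c_j * (entry i of Q_j)."""
    n = len(queries[0])
    return [sum(c * q[i] for c, q in zip(coeffs, queries)) for i in range(n)]


def bounds_hold(u_prime, u):
    """Check u'_i >= u_i and u'_i >= -u_i, i.e. u'_i >= |u_i|."""
    return all(up >= ui and up >= -ui for up, ui in zip(u_prime, u))


queries = [(1, 0, 1, 1, 0), (1, 0, 1, 1, 1)]
u = combine([-1, 1], queries)          # a unit vector
tightest_u_prime = [abs(x) for x in u]
```

Writing |u_i| as a pair of linear inequalities keeps the constraints linear in the unknowns c_j and u'_i; the tightest feasible u' is the vector of absolute values.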

Calculating Entropy

Entropy: $-\sum_i \frac{u'_i}{\sum_j u'_j} \lg \frac{u'_i}{\sum_j u'_j} \ge T$

Minimize: $\sum_i u'_i$

Notice that this is a convex program

Convex Programming

[Vempala] allows us to do convex programming efficiently

His algorithm allows us to solve our problem in polynomial time

Future Work

Extend our measure to take into account the Right Hand Sides

Change the model to maximize queries we can answer

Bibliography

[Volpano] “Verifying Secrets and Relative Secrecy”, Volpano and Smith, POPL ’00

[Vempala] “Solving Convex Programs by Random Walks”, Vempala and Bertsimas, STOC ’02