Near-Optimal Private Approximation Protocols via a Black Box Transformation
![Page 1: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/1.jpg)
Near-Optimal Private Approximation Protocols
via a Black Box Transformation
David Woodruff, IBM Almaden
![Page 2: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/2.jpg)
Outline
1. Communication Protocols and Goals
2. Private Approximation Protocols
3. Previous Work
4. Our Results
5. Proof of our Main Transformation
![Page 3: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/3.jpg)
t-Party Communication Model
[Figure: t parties holding inputs x1, x2, x3, …, xt-1, xt]
What is f(x1, x2, …, xt)?
![Page 4: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/4.jpg)
Application – IP session data
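| Source | Destination | Bytes | Duration | Protocol |
|---|---|---|---|---|
| 18.6.7.1 | 19.7.3.2 | 40K | 28 | http |
| 10.6.2.3 | 12.3.4.8 | 20K | 18 | ftp |
| 11.1.0.6 | 11.6.8.2 | 58K | 22 | http |
| 12.3.1.5 | 14.7.0.1 | 30K | 32 | http |
| … | … | … | … | … |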
AT&T collects 100+ GB of NetFlow data every day
![Page 5: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/5.jpg)
Application – IP Session Data
AT&T needs to process a massive stream of network data
- Traffic estimation: What fraction of network IP addresses are active? (Distinct elements computation)
- Traffic analysis: What are the 100 IP addresses with the most traffic? (Frequent items computation)
- Security/Denial of Service: Are there any IP addresses witnessing a spike in traffic? (Skewness computation)
![Page 6: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/6.jpg)
Application – Secure Datamining
For medical research, hospitals wish to mine their joint data
Patient confidentiality imposes strict laws on what information can be shared. Mining cannot leak anything sensitive
![Page 7: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/7.jpg)
Protocol Goals
Communication Complexity: Minimize total number of bits exchanged between the parties
Round Complexity: Minimize total number of messages exchanged between the parties
Computational Complexity: Minimize workload of the parties
Privacy: No party should learn unnecessary information about another party’s input
![Page 8: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/8.jpg)
Outline
1. Communication Protocols and Goals
2. Private Approximation Protocols
3. Previous Work
4. Our Results
5. Proof of our Main Transformation
![Page 9: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/9.jpg)
Initial Observations
Computing many functions for which the parties are deterministic requires a huge amount of communication
How do we cope?
Allow randomness and a small chance of error
Even if the parties are randomized, unless they output approximate answers, the communication is large
How do we cope?
Settle for an approximation
This helps with communication, round, and computational complexity, but what is a private randomized approximation?
![Page 10: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/10.jpg)
Privacy Definition
First, what does privacy mean for computing a function $f$?
Minimal requirement: $\forall i$: party $i$ does not learn anything about $x_j$, $j \ne i$, other than what follows from $x_i$ and $f(x_1, \dots, x_t)$
What does privacy mean for approximating a function $f$?
Does this work? $\forall i$: party $i$ does not learn anything about $x_j$, $j \ne i$, other than what follows from $x_i$ and the approximation to $f(x_1, \dots, x_t)$
Not sufficient!!
![Page 11: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/11.jpg)
Privacy Definition
Party 1 holds $x_1 \in \{0,1\}^n$; Party 2 holds $x_2 \in \{0,1\}^n$
What is the Hamming distance $f(x_1, x_2)$ between $x_1$ and $x_2$?
Set the LSB of the approximation $f'(x_1, x_2)$ to be the LSB of $x_2$, and the remaining bits of $f'(x_1, x_2)$ to agree with those of $f(x_1, x_2)$
$f'(x_1, x_2)$ is a $\pm 1$ approximation to $f(x_1, x_2)$, but Party 1 learns the LSB of $x_2$, which doesn't follow from $x_1$ and $f(x_1, x_2)$
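The least-significant-bit counterexample on this slide can be checked with a few lines of Python. This is only an illustrative sketch: the helper names `hamming` and `leaky_approx` and the 4-bit inputs are assumptions made here, not part of the talk.

```python
def hamming(x1: int, x2: int) -> int:
    """Exact Hamming distance f(x1, x2) between two bit strings stored as ints."""
    return bin(x1 ^ x2).count("1")


def leaky_approx(x1: int, x2: int) -> int:
    """f'(x1, x2): keep the bits of f, but overwrite the LSB with the LSB of x2."""
    f = hamming(x1, x2)
    return (f & ~1) | (x2 & 1)


x1 = 0b1010
for x2 in (0b0110, 0b1001):
    # Print exact distance, leaky approximation, and the LSB of x2.
    print(hamming(x1, x2), leaky_approx(x1, x2), x2 & 1)
```

Both choices of x2 are at exact distance 2 from x1, yet the approximation comes out as 2 in one case and 3 in the other, so a party who sees f' can tell the two inputs apart even though f alone could not.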
![Page 12: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/12.jpg)
New Privacy Definition [FIMNSW]
What does privacy mean for approximating a function $f$?
New requirement: $\forall i$: party $i$ does not learn anything about $x_j$, $j \ne i$, other than what follows from $x_i$ and $f(x_1, \dots, x_t)$
Implication: $f'(x_1, \dots, x_t)$ is determined by $f(x_1, \dots, x_t)$ and the randomness
So, we allow for approximation to reduce communication, but we define privacy with respect to exact computation
![Page 13: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/13.jpg)
Simplifications for This Talk
- We only consider two parties in the rest of the talk
- Their names are Alice and Bob
- Their inputs are x and y
![Page 14: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/14.jpg)
What Can Alice and Bob do to Breach Privacy?
Alice holds x; Bob holds y
Semi-honest: parties follow their instructions but try to learn more than what is prescribed
Malicious: parties deviate from the protocol arbitrarily
- Use a different input
- Force other party to output wrong answer
- Abort before other party learns answer
Difficult to achieve security in the malicious model…
![Page 15: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/15.jpg)
Reductions – Yao, GMW, NN
Protocol secure in the semi-honest model → Protocol secure in the malicious model
Efficiency of the new protocol = Efficiency of the old protocol
It suffices to design protocols in the semi-honest model
The parties follow the instructions of the protocol. Don’t need to worry about “weird” behavior.
Just make sure neither party learns anything about the other party’s input, other than what follows from the exact function value
![Page 16: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/16.jpg)
More Simplifications
Complicated Protocol
Alice: input x, random string rA
Bob: input y, random string rB
Output f’(x,y)
Using known techniques, just need efficient simulators SA and SB that depend only on x, y, rA, rB and f(x,y)
![Page 17: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/17.jpg)
Simulators
$S_A(x, f(x,y)) =_{\mathrm{negl}(n)} (r_A, x, f'(x,y))$
$S_B(y, f(x,y)) =_{\mathrm{negl}(n)} (r_B, y, f'(x,y))$
![Page 18: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/18.jpg)
Outline
1. Communication Protocols and Goals
2. Private Approximation Protocols
3. Previous Work
4. Our Results
5. Proof of our Main Transformation
![Page 19: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/19.jpg)
Known Private Approximations
| Problem | Communication | Rounds | Computation | Papers |
|---|---|---|---|---|
| Lp-norm, 0 < p ≤ 2 | $O^*(1)$ | $O^*(1)$ | $O^*(n)$ | [IW] [KMSZ] [MM] |
| L2-heavy hitters (reveals L2) | $O^*(1)$ | $O^*(1)$ | $O^*(n)$ | [KMSZ] |

“Even functions that are efficiently computable for moderately sized data sets are often not efficiently computable for massive data sets.” [FIMNSW]
![Page 20: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/20.jpg)
What about all of these problems?
- Lp-norm for p > 2 and p = 0
- Lp-heavy hitters for every p
- Lp-sampling
- Max Dominance Norm
- Distinct Summation
- Empirical Entropy
- Cascaded Moments
- Subspace Approximation
- L2-distance to independence
- Etc.
![Page 21: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/21.jpg)
Other Related Work
- Can privately approximate the permanent of a matrix [FIMNSW]
- Some NP-hard problems can be privately approximated if a few bits are leaked [HKKN]
- Many NP-hard problems cannot be privately approximated even when leaking a large number of bits [BHN]
- If the answer is not unique, e.g., a search problem, private approximations are even harder to come by [BCNW]
![Page 22: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/22.jpg)
Outline
1. Communication Protocols and Goals
2. Private Approximation Protocols
3. Previous Work
4. Our Results
5. Proof of our Main Transformation
![Page 23: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/23.jpg)
Our Main Transformation
• Suppose $f = \sum_{i=1}^{n} g(x_i, y_i)$
• Suppose $g$ is non-negative and efficiently computable
• Let $\Pi$ be an arbitrary non-private protocol for approximating $f$ up to a $(1 \pm 1/\log n)$-factor with probability $\ge 2/3$
• Then there is a private approximation protocol $\Pi'$ for approximating $f$ up to a $(1 \pm \varepsilon)$-factor with probability $\ge 2/3$
• The communication, round, and computational complexity of $\Pi'$ agree with those of $\Pi$ up to a $\mathrm{poly}(\log n / \varepsilon)$ factor
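To make the required shape of f concrete, here is a small sketch of two functions that decompose as a sum of a non-negative per-coordinate g. The helper names (`f_sum`, `hamming_g`, `lp_g`) and the sample vectors are hypothetical, chosen only for illustration; note that the second instance computes the p-th power of the Lp distance, not the distance itself.

```python
from typing import Callable, Sequence


def f_sum(g: Callable[[int, int], float], x: Sequence[int], y: Sequence[int]) -> float:
    """Evaluate f(x, y) = sum_i g(x_i, y_i) exactly (no privacy, no approximation)."""
    return sum(g(a, b) for a, b in zip(x, y))


def hamming_g(a: int, b: int) -> int:
    """Per-coordinate term for Hamming distance: 1 if the entries differ."""
    return int(a != b)


def lp_g(p: float) -> Callable[[int, int], float]:
    """Per-coordinate term |a - b|^p, whose sum is the p-th power of the Lp distance."""
    return lambda a, b: abs(a - b) ** p


x = [1, 0, 3, 2]
y = [1, 2, 0, 2]
print(f_sum(hamming_g, x, y), f_sum(lp_g(3), x, y))
```

In both cases g is non-negative and cheap to evaluate on a single coordinate pair, which is what the hypotheses above ask for.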
![Page 24: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/24.jpg)
Near-Optimal Private Approximation Protocols
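| Problem | Communication | Work |
|---|---|---|
| Lp-distance (p > 2), Lp-heavy hitters, Lp-sampling | $O^*(n^{1-2/p})$ | $O^*(1)$ |
| Max-Dominance Norm | $O^*(1)$ | $O^*(n)$ |
| Distinct Summation | $O^*(1)$ | $O^*(n)$ |
| Empirical Entropy | $O^*(1)$ | $O^*(n)$ |
| Subspace Approximation | $O^*(d)$ | $O^*(nd)$ |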
![Page 25: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/25.jpg)
Other Private Approximations
Also obtain near-optimal bounds for:
- Cascaded frequency moments
- L2-distance to independence

Using [BO], we get $O^*(1)$ communication for any $g(x_i, y_i) = h(x_i - y_i)$ where $h$ has “at most quadratic growth”
![Page 26: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/26.jpg)
Weaker Assumptions
If the non-private protocol $\Pi$ is a “simultaneous protocol”, then it is enough to assume symmetrically private information retrieval with polylog(n) communication [CMS, NP]
![Page 27: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/27.jpg)
Outline
1. Communication Protocols and Goals
2. Private Approximation Protocols
3. Previous Work
4. Our Results
5. Proof of our Main Transformation
![Page 28: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/28.jpg)
Main Transformation
Given a non-private approximation protocol $\Pi$ for approximating $f(x,y) = \sum_{i=1}^{n} g(x_i, y_i)$, we design a private approximation protocol $\Pi'$
Main Theorem: There is a low-communication importance sampling procedure such that, if $B$ is an upper bound on $f(x,y)$, then Alice and Bob sample from a distribution $\mu$ on $[n] \cup \{\bot\}$:
$\forall i \in [n]$, $\mu(i) = g(x_i, y_i)/B$
$\mu(\bot) = 1 - f(x,y)/B$
How do we design $\Pi'$ given such a procedure?
![Page 29: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/29.jpg)
Private Approximation Protocol
The importance sampling procedure obtains samples from $[n] \cup \{\bot\}$, with $1 - \Pr[\text{obtain } \bot] = f(x,y)/B$
Thus, this probability depends only on $f(x,y)$!
1. Let $B$ be an upper bound on $f(x,y)$
2. The protocol outputs a bit $c$
3. Since $c$ is a bit, its distribution is determined by its expectation

$\Pr[c = 1] = 1 - \Pr[\text{obtain } \bot] = f(x,y)/B \le 1$
Repeat a few times to get concentration
If most repetitions return $c = 0$, replace $B$ with $B/2$, and repeat
The process of halving $B$ depends only on $f(x,y)$, which helps for simulation
Once $B < 2f(x,y)$, with very high probability, enough coin tosses are 1
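The halving loop on this slide can be simulated locally. The sketch below is not the two-party protocol: it draws the bit c directly with probability f/B instead of running the importance sampling procedure, and the function name `estimate_by_halving`, the repetition count, and the 1/4 stopping threshold are all assumptions made for illustration (the slide only says to halve when most repetitions return 0; a threshold below 1/2 is used here so that B stays above f with high probability).

```python
import random


def estimate_by_halving(f: float, B: float, reps: int = 2000,
                        threshold: float = 0.25, seed: int = 0):
    """Toy simulation: toss coins with Pr[c = 1] = f/B, halving B until enough are 1.

    Returns (estimate of f, final B). Everything the loop observes is a coin
    whose bias depends only on f and the public bound B.
    """
    if f <= 0 or B < f:
        raise ValueError("need 0 < f <= B")
    rng = random.Random(seed)
    while True:
        p = min(1.0, f / B)                      # Pr[c = 1] = f/B, clamped defensively
        ones = sum(rng.random() < p for _ in range(reps))
        if ones / reps >= threshold:             # enough coin tosses are 1: stop
            return B * ones / reps, B
        B /= 2                                   # too few ones: halve B and repeat


est, final_B = estimate_by_halving(f=37.0, B=1024.0)
print(est, final_B)
```

Because each coin's bias is f/B, the whole transcript of coin outcomes and halvings can be regenerated from f alone, which is the property the simulator argument relies on.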
![Page 30: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/30.jpg)
What’s left?
Need an importance sampling procedure, and need to show our overall approximation protocol is simulatable
We can’t sample exactly from $\mu$ on $[n] \cup \{\bot\}$:
$\forall i \in [n]$, $\mu(i) = g(x_i, y_i)/B$
$\mu(\bot) = 1 - f(x,y)/B$
We can sample from a distribution with negl(n) distance from $\mu$
![Page 31: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/31.jpg)
Notation
For input vectors $x$ and $y$, let $f[a,b] = \sum_{i=a}^{b} g(x_i, y_i)$
![Page 32: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/32.jpg)
Importance Sampling
Alice holds x, rA; Bob holds y, rB
$\Pi$ is a non-private protocol for $(1/\log n, \mathrm{negl}(n))$-approximating $f = \sum_{i=1}^{n} g(x_i, y_i)$
Use $\Pi$ to estimate $f[1, n/2]$, obtaining $f^*[1, n/2]$
Use $\Pi$ to estimate $f[n/2+1, n]$, obtaining $f^*[n/2+1, n]$
Recurse on $[1, n/2]$ with probability $f^*[1,n/2]/(f^*[1,n/2] + f^*[n/2+1,n])$
Else recurse on $[n/2+1, n]$
$f^*[1, n/2]$ is a $(1 \pm 1/\log n)$-approximation to $f[1, n/2]$
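The recursion described on this slide can be sketched as a single-machine simulation. The inner `estimate` function is a hypothetical stand-in for the non-private protocol Π (exact range sum times a random factor in 1 ± delta), indices are 0-based, and the list `g` is assumed to hold the already-evaluated values g(x_i, y_i) with a positive total; none of these details come from the talk.

```python
import random


def sample_index(g, delta, rng):
    """Walk down the halving tree; return (i, rho_i), where rho_i is the
    probability of the path actually taken to reach index i."""

    def estimate(lo, hi):
        # Hypothetical stand-in for Pi: exact sum of g[lo:hi], perturbed by (1 ± delta).
        exact = sum(g[lo:hi])
        if hi - lo == 1:
            return exact                         # single entries are computed exactly
        return exact * (1 + rng.uniform(-delta, delta))

    lo, hi, rho = 0, len(g), 1.0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        left, right = estimate(lo, mid), estimate(mid, hi)
        p_left = left / (left + right)
        if rng.random() < p_left:                # recurse on the left half
            hi, rho = mid, rho * p_left
        else:                                    # else recurse on the right half
            lo, rho = mid, rho * (1 - p_left)
    return lo, rho


g = [1, 0, 5, 2, 0, 0, 3, 1]
print(sample_index(g, delta=1 / 3, rng=random.Random(2)))
```

With `delta = 0` the path probabilities telescope and index i is returned with probability exactly g[i] / sum(g); with noisy estimates the returned `rho` records the distorted probability, which is what a later correction step needs to know.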
![Page 33: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/33.jpg)
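Importance Sampling
Example recursion tree for n = 8 (only the path to $g(x_3, y_3)$ is expanded):
$f[1,8]$ has children $f[1,4]$ and $f[5,8]$
$f[1,4]$ has children $f[1,2]$ and $f[3,4]$
$f[3,4]$ has children $g(x_3, y_3)$ and $g(x_4, y_4)$
At $f[1,8]$: with probability $f^*[1,4]/(f^*[1,4] + f^*[5,8])$ go left, else go right
At $f[1,4]$: with probability $f^*[1,2]/(f^*[1,2] + f^*[3,4])$ go left, else go right
At $f[3,4]$: with probability $g(x_3, y_3)/(g(x_3, y_3) + g(x_4, y_4))$ go left, else go right
$$\Pr[g(x_3, y_3) \text{ chosen}] = \frac{f^*[1,4]}{f^*[1,4] + f^*[5,8]} \cdot \frac{f^*[3,4]}{f^*[1,2] + f^*[3,4]} \cdot \frac{g(x_3, y_3)}{g(x_3, y_3) + g(x_4, y_4)} = C \cdot \frac{g(x_3, y_3)}{f(x,y)}$$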
![Page 34: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/34.jpg)
Importance Sampling
The procedure gives a way to sample from a distribution $\rho$: $\rho(i) = C_i \cdot g(x_i, y_i)/f(x,y)$, where $C_i \in [1/2, 2]$
If $i$ is sampled, then we know the probability $\rho(i)$ that we chose it
We can also obtain $g(x_i, y_i)$ efficiently
With probability $g(x_i, y_i)/(\rho(i) \cdot B)$, output $i$, else output $\bot$
$\Pr[\text{don't output } \bot] = \sum_i \rho(i) \cdot g(x_i, y_i)/(\rho(i) \cdot B) = f(x,y)/B$
Hence, we sample from $\mu$: $\forall i \in [n]$, $\mu(i) = g(x_i, y_i)/B$ and $\mu(\bot) = 1 - f(x,y)/B$
(up to negl(n), since there is a small probability that $\Pi$ fails)
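The correction step above (draw i from ρ, then keep it with probability g(x_i, y_i)/(ρ(i)·B)) can be checked empirically. In this sketch the distorted distribution `rho` is built by hand from made-up skew factors, `None` plays the role of ⊥, and B is chosen large enough that every acceptance probability is at most 1; the names and numbers are illustrative assumptions, not values from the talk.

```python
import random


def sample_mu(g, rho, B, rng):
    """Draw i ~ rho, then output i with probability g[i] / (rho[i] * B), else None."""
    i = rng.choices(range(len(g)), weights=rho)[0]
    accept = g[i] / (rho[i] * B)
    assert accept <= 1.0, "B is too small for this rho"
    return i if rng.random() < accept else None


g = [1.0, 4.0, 0.0, 3.0]                 # values g(x_i, y_i); f = 8
skew = [1.4, 0.7, 1.0, 1.0]              # made-up distortion of the ideal weights
weights = [gi * s for gi, s in zip(g, skew)]
rho = [w / sum(weights) for w in weights]
B = 32.0                                 # assumed upper bound on f

rng = random.Random(0)
draws = [sample_mu(g, rho, B, rng) for _ in range(40000)]
print(draws.count(1) / len(draws), draws.count(None) / len(draws))
```

The distortion in `rho` cancels: index 1 should appear with frequency close to g[1]/B = 0.125 and ⊥ with frequency close to 1 - f/B = 0.75, regardless of the skew factors.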
![Page 35: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/35.jpg)
Simulators
For $f'(x,y)$, $S_A$ generates random coins with expectation $f(x,y)/B$, and keeps halving $B$ until there are enough coin tosses equal to 1
For $r_A$, $S_A$ outputs a random $r_A$
$S_A$ outputs $(r_A, x, f'(x,y))$, which is equal to the distribution in $\Pi'$ except with negl(n) probability
$S_A(x, f(x,y)) =_{\mathrm{negl}(n)} (r_A, x, f'(x,y))$
![Page 36: Near-Optimal Private Approximation Protocols via a Black Box Transformation](https://reader035.vdocuments.site/reader035/viewer/2022070420/56815f6a550346895dce6caa/html5/thumbnails/36.jpg)
Conclusions
Any non-private approximation protocol for a function $f = \sum_{i=1}^{n} g(x_i, y_i)$ can be transformed into a private one with an $O^*(1)$ blowup in complexity
Many problems can be expressed this way (e.g., lp-norms), even non-obvious ones (e.g., entropy), for which we had no technique of achieving a private approximation
What about other functions?