![Page 1: Secure Multiparty Regression Based on Homomorphic Encryption Rob Hall Joint work with Yuval Nardi (Technion) and Steve Fienberg 1 rjhallrjhall+@cs.cmu.edu](https://reader035.vdocuments.site/reader035/viewer/2022081518/55199c355503463d068b4a1c/html5/thumbnails/1.jpg)
Secure Multiparty Regression Based on Homomorphic Encryption
Rob Hall
Joint work with Yuval Nardi (Technion) and Steve Fienberg
http://www.cs.cmu.edu/~rjhall [email protected]
Structure
• Setting and motivation.
• Basic tools of cryptography.
• Prior work.
• Techniques for regression.
• Logistic regression.
“Well known”
Our contribution
Setting
• Multiple parties with private data:
• e.g., is this vaccine causing hepatitis?
• Long term vaccine safety surveillance (cf. the FDA’s “sentinel initiative”)
Patient ID Hepatitis
0001 N
0002 Y
0003 N
… …
Patient ID Vaccine
0001 Y
0002 N
0003 N
… …
Health insurance agency
Hospital
Secure Multiparty Regression
Patient ID Vaccine Age Weight Hepatitis
0001 ? 36 170 N
0002 ? 26 150 Y
0003 ? 45 165 N
… … … … …
Patient ID Vaccine Age Weight Hepatitis
0001 Y 36 ? ?
0002 N 26 ? ?
0003 N 45 ? ?
… … … … …
Party 1
Party 2
Each party has a private (partial) data matrix
Additional variables may be present
Secure Multiparty Regression
“Full data”
Patient ID Vaccine Age Weight Hepatitis
0001 Y 36 170 N
0002 N 26 150 Y
0003 N 45 165 N
… … … … …
Goal is regression on full data
Assumptions: Complete and properly joined
Data are “private”
e.g., HIPAA
Alternate Settings
Fictional scenario based on discussion with CyLab corporate partners:
• Store: records of transactions
• TV Network: records of commercial views
Regression of advertising effect
Two Types of Privacy Breach
• Information leakage via the computation itself:
– Focus of this talk.
– Dealt with via “cryptographic protocols.”
• Information leakage via the output:
– Not in this talk.
– Assume the parties have deemed that the regression is “safe” to compute.
– Otherwise may use e.g., “Differential Privacy.”
The Ideal Scenario vs. Real Life
Data submitted to “trusted 3rd party.”
“Trusted party” computes regression, sends coefficients back to each party.
Ideal: Parties see their own data and the output.
Parties exchange messages and perform local computation according to a protocol.
Real: Parties also see intermediate messages.
Protocol is secure if intermediate messages don’t reveal any information beyond whatever is contained in the output.
“Security by Simulation”
Consider the messages to party 1:
– Depends on other’s private inputs.
– A distribution, since the protocol is randomized.
Suppose we construct a simulator:
– Depends on what's available in ideal case.
Try to decide which one a particular transcript is from:
– A poly-time algorithm.
If it can’t decide, the messages reveal no more than input/output.
“Computational Indistinguishability”
Negligible function of a security parameter k
Probability over transcripts and coin tosses of A
Probability that decision is correct ≈ 0.5
A proper relaxation of statistical closeness:
Polynomially (in k) many secure sub-protocols may be composed.
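The formulas on this slide did not survive extraction. A standard textbook statement of computational indistinguishability is sketched below as an assumption about what was shown. The symbols $M_1$ (distribution of real messages to party 1), $S_1$ (simulator output), $A$ (any probabilistic poly-time distinguisher) and $\epsilon$ are illustrative labels, not necessarily the authors' notation.

```latex
\left| \Pr\big[A(M_1) = 1\big] - \Pr\big[A(S_1) = 1\big] \right| \le \epsilon(k),
\qquad
\epsilon(k) < \frac{1}{p(k)} \ \text{for every polynomial } p \text{ and all sufficiently large } k .
```

Under this reading, a distinguisher that guesses which of the two distributions a transcript came from is correct with probability at most $\tfrac{1}{2} + \tfrac{1}{2}\epsilon(k)$, which matches the "≈ 0.5" remark above.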
Basic Tools
• Hide intermediate values as “random shares”:
– Intermediate value, one “share” per party.
– Uniformly distributed among all solutions.
– Sums may be computed locally.
• Use a sub-protocol for computing products of shares:
– Uniformly distributed among all solutions.
• Random shares easy to simulate.
• Sub-protocols compose, yielding a secure protocol.
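A minimal sketch of additive random shares, assuming shares live in the integers mod a public modulus. The modulus `N` and the helper names are hypothetical and chosen only for illustration. It shows that a sum of two shared values needs no communication: each party adds its own shares.

```python
import secrets

N = 2**61 - 1  # hypothetical public modulus, a stand-in for the ring "mod n"

def share(value, parties, modulus=N):
    """Split value into additive shares that sum to value mod modulus."""
    shares = [secrets.randbelow(modulus) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % modulus)
    return shares

def reconstruct(shares, modulus=N):
    """Pool all shares to recover the hidden value."""
    return sum(shares) % modulus

def add_shares(xs, ys, modulus=N):
    """Each party adds its own two shares locally; no messages are exchanged."""
    return [(x + y) % modulus for x, y in zip(xs, ys)]

a_shares = share(20, parties=3)
b_shares = share(22, parties=3)
total = reconstruct(add_shares(a_shares, b_shares))
```

Any strict subset of the shares is uniformly distributed, which is why a simulator can reproduce them without knowing the hidden value.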
Basic Tools
Homomorphic encryption (e.g., Paillier ’99)
• Public key (like e.g., RSA).
• Ciphertexts are indistinguishable.
Allows math operations on encrypted values (note, on ring mod n).
• Public key: n ≈ 2^k, where k is the security parameter.
Allows construction of the “product” sub-protocol…
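The equations for the homomorphic operations were lost in extraction. The sketch below is a toy Paillier implementation (textbook scheme with g = n + 1) that demonstrates the two properties the later slides rely on: multiplying ciphertexts adds plaintexts, and raising a ciphertext to a constant multiplies the plaintext by that constant. The primes and function names are assumptions for illustration, and the key is far too small to be secure. It needs Python 3.8+ for the modular inverse via `pow`.

```python
import math
import secrets

P, Q = 2**31 - 1, 2**61 - 1   # toy primes (both Mersenne primes); insecure key size

def keygen():
    n = P * Q
    lam = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
    mu = pow(lam, -1, n)      # inverse of lambda mod n, valid because g = n + 1
    return n, (lam, mu)       # public key n, private key (lambda, mu)

def encrypt(n, m):
    while True:               # draw a random unit r mod n
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(n, priv, c):
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n

n, priv = keygen()
ca, cb = encrypt(n, 15), encrypt(n, 27)
c_sum = (ca * cb) % (n * n)       # encrypts 15 + 27 (mod n)
c_scaled = pow(ca, 3, n * n)      # encrypts 3 * 15 (mod n)
```

Because encryption is randomized, two encryptions of the same value differ, which is what makes ciphertexts indistinguishable to a party without the private key.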
Secure Products (Integer)
Party 1 (has private key), Party 2
Data held by party 1, data held by party 2
• Encrypt values and send them.
• Draw r uniformly at random.
• Decrypt, add local product.
• Each party ends with a share of the product.
Messages seen: encrypted values, and a uniform random variable.
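The message diagrams for this protocol were lost, so the following is a reconstruction under stated assumptions rather than the authors' exact protocol. It assumes the inputs a = a1 + a2 and b = b1 + b2 are additively shared, that party 1 sends encryptions of its shares, and that party 2 masks its reply with a uniform r. The toy Paillier helpers and all names are hypothetical.

```python
import math
import secrets

P, Q = 2**31 - 1, 2**61 - 1   # toy primes; insecure key size

def keygen():
    n = P * Q
    lam = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
    return n, (lam, pow(lam, -1, n))

def encrypt(n, m):
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(n, priv, c):
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n

def secure_product(a1, b1, a2, b2, n, priv):
    """Return shares (s1, s2) with s1 + s2 = (a1 + a2) * (b1 + b2) mod n."""
    n2 = n * n
    # Party 1: encrypt own shares and send them.
    ca, cb = encrypt(n, a1), encrypt(n, b1)
    # Party 2: draw r uniformly, build E(a1*b2 + b1*a2 + a2*b2 - r) homomorphically.
    r = secrets.randbelow(n)
    reply = (pow(ca, b2, n2) * pow(cb, a2, n2) * encrypt(n, (a2 * b2 - r) % n)) % n2
    # Party 1: decrypt (sees only a uniformly masked value), add local product a1*b1.
    s1 = (decrypt(n, priv, reply) + a1 * b1) % n
    s2 = r  # Party 2 keeps r as its share.
    return s1, s2

n, priv = keygen()
s1, s2 = secure_product(3, 5, 4, 1, n, priv)   # a = 7, b = 6
```

In this sketch party 2 only ever sees ciphertexts and party 1 only sees a value masked by r, which lines up with the "Encrypted" and "Uniform random variable" labels on the slide.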
Yao’s Construction
• In principle may now evaluate any circuit:
– “xor,” “and” for binary a,b
• This is essentially a theoretical construction (nevertheless it is implemented in practice, cf. “Fairplay”).
• To accomplish even a floating point addition would take many encryptions.
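As a rough illustration of why circuits are expressive but expensive, the sketch below builds an integer adder from sums and products mod 2 only. It runs on plain bits; in a secure protocol every `and_` would cost one secure product. The function names are illustrative.

```python
def xor(a, b):
    return (a + b) % 2          # sum mod 2

def and_(a, b):
    return (a * b) % 2          # product mod 2

def or_(a, b):
    return xor(xor(a, b), and_(a, b))

def full_adder(a, b, carry_in):
    partial = xor(a, b)
    total = xor(partial, carry_in)
    carry_out = or_(and_(a, b), and_(carry_in, partial))
    return total, carry_out

def add_integers(x, y, width):
    """Ripple-carry addition of two non-negative integers, bit by bit."""
    carry, result = 0, 0
    for i in range(width):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result | (carry << width)

total = add_integers(5, 6, width=4)
```

Each full adder here uses three products, so even a fixed-width integer addition needs a number of secure products proportional to the bit width.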
Prior Work in Secure Multiparty Regression
Linear regression is sums and products (with tricks): inner products, matrix inversion, inner products.
• Chris Clifton et al.: Inner product protocols for a weak definition of “secure.”
• Alan Karr et al.: Compute , share them.
All reveal some info in addition to the estimate.
• This work: A secure protocol which reveals only the output.
Input Data Setup
• We suppose the data obey the following:
“X” data of party i, “Full” data
• Subsumes all data partitioning schemes.
• Leads to a general protocol for all situations.
– Although, specialized protocols may be faster.
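The condition itself is missing from the extracted slide. One reading that fits the labels, assumed here and not confirmed by the source, is that the full data matrix equals the sum of the parties' matrices, with each party holding zeros where it has no data. The sketch shows how both vertical and horizontal partitioning fit that additive form; the numbers reuse the example table and all names are illustrative.

```python
def add(A, B):
    """Elementwise sum of two matrices given as lists of rows."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

# Columns: vaccine (1 = Y, 0 = N), age, weight.
full = [[1, 36, 170],
        [0, 26, 150],
        [0, 45, 165]]

# Vertical partitioning: party 1 holds column 0, party 2 holds columns 1 and 2.
vertical_1 = [[row[0], 0, 0] for row in full]
vertical_2 = [[0, row[1], row[2]] for row in full]

# Horizontal partitioning: party 1 holds rows 0 and 1, party 2 holds row 2.
horizontal_1 = [full[0], full[1], [0, 0, 0]]
horizontal_2 = [[0, 0, 0], [0, 0, 0], full[2]]
```

Under this assumption a protocol that works on additive pieces does not need to know which partitioning scheme produced them.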
Our Protocol
• Yao’s approach: very clean but inefficient.
• Our approach: messy but fast(er)…
– Fixed precision arithmetic.
Mostly sums and products.
Sadly: real numbers not integers
Secure Products (Real Approx)
Approximate reals with integers: the real number and its integer representation.
Using the previous method is wrong:
– “Decimal point” is pushed left.
– Need to divide off the extra scale factor.
Can’t just correct shares locally:
– Extra term due to “mod” in definition of RS.
Proposed solution:
• Assume bound on magnitude of product (mild assumption).
• Restrict domain of noise to ensure that c’ = 1.
• “Correct” the results of locally dividing shares.
Shares remain computationally indistinguishable (C.I.) from uniform distribution.
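The formulas for this step are missing, so the sketch below only follows the spirit of the proposed solution and is not the authors' exact correction. It assumes non-negative values, a fixed-point scale of 2^16, a known bound on the scaled product, and noise drawn from a restricted range so that the two shares always wrap around the modulus exactly once. All constants and names are hypothetical.

```python
import secrets

N = (2**31 - 1) * (2**61 - 1)   # hypothetical modulus of the share ring
SCALE = 2**16                   # hypothetical fixed-point scale factor
BOUND = 2**60                   # assumed upper bound on any scaled product

def encode(x):
    """Integer representation of a non-negative real."""
    return round(x * SCALE)

def share_with_forced_wrap(z):
    """Shares with a + b == z + N as integers, since a is always above z."""
    a = BOUND + secrets.randbelow(N - BOUND)   # noise restricted to [BOUND, N)
    b = (z - a) % N
    return a, b

def rescale_naive(a, b):
    """Each party divides its own share; the hidden wrap term N is divided too."""
    return (a // SCALE + b // SCALE) % N

def rescale_corrected(a, b):
    """Same local division, but one party also subtracts the known term N // SCALE."""
    return (a // SCALE + (b // SCALE - N // SCALE)) % N

z = encode(3.25) * encode(1.5)          # carries SCALE twice
a, b = share_with_forced_wrap(z)
naive = rescale_naive(a, b)
fixed = rescale_corrected(a, b)
target = z // SCALE                     # fixed-point encoding of 4.875
```

The corrected result can differ from the target by one unit in the last place because of the two floor divisions. Restricting the noise to [BOUND, N) removes only a tiny fraction of the range, which is consistent with the shares staying close to uniform.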
Our Protocol
• We can do sums and products on reals and everything composes nicely!
Matrix inversion is all we need
Inversion by Sums and Products
Computing the reciprocal of a:
• The zero of this function is x = a⁻¹.
• Use Newton’s method.
• Convergence is quadratic if 0 < x₀ < a⁻¹.
[Figure: plot of f(x) against x]
Inverting the matrix A:
• Sums and products.
• Number of iterations required depends on condition of A.
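The function on the slide is not recoverable, but a common choice, assumed here, is f(x) = 1/x − a. Newton's method on it gives the update x ← x(2 − a·x), which uses only sums and products. A minimal plain-arithmetic sketch:

```python
def reciprocal(a, x0, iterations=10):
    """Approximate 1/a by Newton's method on f(x) = 1/x - a (no division used)."""
    x = x0
    for _ in range(iterations):
        x = x * (2 - a * x)
    return x

approx = reciprocal(4.0, x0=0.1)   # starting point satisfies 0 < x0 < 1/a
```

The relative error 1 − a·x is squared at every step, which is the quadratic convergence mentioned above.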
Putting it Together
Step 1: Compute (shares of) X^T X, X^T y.
– Easy to parallelize by slicing X horizontally.
Step 2: Compute shares of inverse.
– Use reciprocal of trace as starting point.
Step 3: Multiply shares of inverse with shares of X^T y.
Step 4: Pool final shares and construct output.
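The four steps can be sketched in plain arithmetic, without any encryption or sharing, to show that the whole estimate reduces to sums and products. The matrix iteration used here is the standard Newton (Newton-Schulz) update X ← X(2I − AX), assumed to be the matrix analogue of the scalar method, started from I / trace(A). The helper names and the tiny data set are illustrative.

```python
def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def newton_inverse(A, iterations=40):
    """Invert a symmetric positive definite matrix using sums and products."""
    d = len(A)
    trace = sum(A[i][i] for i in range(d))
    # In the protocol 1/trace would itself come from the scalar Newton iteration.
    X = [[(1.0 / trace if i == j else 0.0) for j in range(d)] for i in range(d)]
    for _ in range(iterations):
        AX = matmul(A, X)
        correction = [[(2.0 if i == j else 0.0) - AX[i][j] for j in range(d)]
                      for i in range(d)]
        X = matmul(X, correction)
    return X

def least_squares(X, y):
    Xt = transpose(X)
    XtX = matmul(Xt, X)                      # Step 1
    Xty = matmul(Xt, [[v] for v in y])       # Step 1
    inverse = newton_inverse(XtX)            # Step 2
    beta = matmul(inverse, Xty)              # Step 3
    return [row[0] for row in beta]          # Step 4 (plain values here)

X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 4.0]]
y = [3.0, 5.0, 7.0, 9.0]                     # exactly y = 1 + 2x
beta = least_squares(X, y)
```

The eigenvalues of A / trace(A) lie strictly between 0 and 1 for a positive definite A, so this starting point is inside the region of convergence; how many iterations are needed depends on the conditioning of A.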
CPS - Experimental Verification
• Survey data with 50000 samples, 22 covariates.
• Artificially split into 3 “parties” holding 10, 8, 4 covariates respectively (for all cases).
• Using 1024 bit long keys.
• Computation of X^T X, X^T y parallelized on 9 CPUs, takes roughly 1.5 days.
• Matrix inversion takes 1 hour.
Logistic Regression
• Iteratively Re-weighted Least Squares:
Similar to linear regression… except:
• A non-linear thing to compute:
• Repeated matrix inversion
Logistic Regression
[Figure: curve plotted over a horizontal axis from -1 to 4 and a vertical axis from 0.2 to 1]
Think of these as variables to update.
Use Euler’s method to integrate the gradient:
• Multiple steps per iteration.
• Introduces some error.
• Gradient only involves sums and products.
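One way to read this slide, assumed here, is that the logistic function values are tracked as variables and advanced with Euler's method using the identity dσ/da = σ(1 − σ), whose right-hand side needs only sums and products. The sketch below works on plain floats; the function name and step counts are illustrative.

```python
import math

def euler_sigmoid(sigma_old, a_old, a_new, steps):
    """Advance sigma(a_old) to an approximation of sigma(a_new) by Euler steps."""
    h = (a_new - a_old) / steps      # step size; steps is a public constant
    s = sigma_old
    for _ in range(steps):
        s = s + h * s * (1 - s)      # d(sigma)/da = sigma * (1 - sigma)
    return s

exact = 1 / (1 + math.exp(-2.0))
coarse = euler_sigmoid(0.5, 0.0, 2.0, steps=10)
fine = euler_sigmoid(0.5, 0.0, 2.0, steps=1000)
```

More steps per iteration cost more secure products but shrink the integration error, which is the trade-off the slide points at.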
Logistic Regression
• Avoid repeated matrix inversion:
– Invert only once (see e.g., Tom Minka).
• Algorithm converges and has following property:
– Distance between optimizer of approximation and IRLS
– Data dependent constant
– Number of steps of Euler’s method
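The update rule is not shown in the extracted text. A well-known way to invert only once, assumed here to be the idea referenced, is the fixed-Hessian bound: since σ(1 − σ) ≤ 1/4, the matrix (1/4) X^T X bounds the logistic Hessian and can be inverted a single time. The sketch uses an intercept plus one covariate so the 2×2 inverse can be written in closed form; the data and names are made up for illustration.

```python
import math

def sigmoid(t):
    return 1 / (1 + math.exp(-t))

def score(xs, ys, b0, b1):
    """Gradient of the log-likelihood for the model sigmoid(b0 + b1 * x)."""
    resid = [y - sigmoid(b0 + b1 * x) for x, y in zip(xs, ys)]
    return sum(resid), sum(r * x for r, x in zip(resid, xs))

def fit_fixed_hessian(xs, ys, iterations=500):
    n, sx, sxx = len(xs), sum(xs), sum(x * x for x in xs)
    det = n * sxx - sx * sx
    inv = [[sxx / det, -sx / det], [-sx / det, n / det]]   # (X^T X)^-1, computed once
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0, g1 = score(xs, ys, b0, b1)
        # beta <- beta + 4 (X^T X)^-1 X^T (y - sigmoid(X beta))
        b0 += 4 * (inv[0][0] * g0 + inv[0][1] * g1)
        b1 += 4 * (inv[1][0] * g0 + inv[1][1] * g1)
    return b0, b1

xs = [-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
ys = [0, 0, 1, 0, 1, 1]                 # not linearly separable, so the optimum is finite
b0, b1 = fit_fixed_hessian(xs, ys)
g0, g1 = score(xs, ys, b0, b1)
```

Each iteration then costs only matrix-vector products, at the price of more iterations than IRLS with a freshly inverted Hessian.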
Summary
• Intro to cryptographic protocols.
• Secure product protocol.
• Our linear regression protocol:
– Approximation of real math with integer math.
– Reduction of matrix inverse to sums and products.
• Our logistic regression protocol:
– Approximation of logistic function by sums and products.
Ongoing Work
• Record linkage
• Implementation (R bindings?)
• Regression variants
– LARS, Lasso etc.
• Privacy implications of regression coefficients.
Thanks
Privacy Implications
The (2 party) protocol computes the estimate:
At the end, party 1 may conclude that the data of party 2 falls into the set:
e.g., invertible implies total privacy invasion
Privacy Implications (Vertical)
Consider the partitioning scheme:
The OLS estimate may be written as:
We may express M in terms of its projection onto X1
Grinding out the maths gives:
Privacy Implications (Vertical)
Express M2 in terms of the new variables:
q = 1 means A is revealed
Ongoing Work
• Logistic Regression (done but slow).
• Lasso, LARS etc.
• Record linkage (assumed here).
• Imputation of missing data.
• Secure computation of goodness-of-fit statistics.
Questions
• For the technical details and code please see:
http://www.cs.cmu.edu/~rjhall/slr
Logistic Regression (IRLS)
• Newton-Raphson iterates:
• Approximate sigmoid by the empirical CDF:
• Secure computation of “greater than” is well known.
• Approximation error decreases with .
[Figure: sigmoid curve plotted against a, for a from -10 to 10, with values from 0 to 1]
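The approximating formula is missing from the extracted slide. Since the sigmoid is the CDF of the standard logistic distribution, one natural reading, assumed here, is to replace σ(a) by the fraction of fixed thresholds that a exceeds, so that only "greater than" comparisons are needed. Drawing the thresholds at random from the logistic distribution is also an assumption; the names and sample size are illustrative.

```python
import math
import random

def logistic_sample(rng):
    """Draw from the standard logistic distribution by inverting its CDF."""
    u = rng.random()
    while u == 0.0:               # avoid log(0)
        u = rng.random()
    return math.log(u / (1 - u))

def make_thresholds(m, seed=0):
    rng = random.Random(seed)
    return [logistic_sample(rng) for _ in range(m)]

def ecdf_sigmoid(a, thresholds):
    """Fraction of thresholds below a; each term is one 'greater than' test."""
    return sum(1 for t in thresholds if a > t) / len(thresholds)

thresholds = make_thresholds(20000)
approx = ecdf_sigmoid(1.0, thresholds)
exact = 1 / (1 + math.exp(-1.0))
```

With more thresholds the step function tracks the sigmoid more closely, at the cost of more secure comparisons per evaluation.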
CPS - Experimental Verification
No. in Household 0.96 0.95 0.09 0.96 0.03
Age(3) 1.18 1.20 0.10 1.18 0.04
Alternative Approaches
Patient ID Tobacco Age Weight Heart Disease
0001 ? 36 170 ?
0002 N 26 150 ?
0003 N 45 165 ?
… … … … …
Release “Sanitized” Data
• Parties “sanitize” data, i.e., transform the data into something they are willing to release.
• Data are pooled.
• Sanitization scheme may affect estimator.
“Secure Multiparty Computation”
• Distributed computation that ensures privacy.
• Output the correct result.
Yao’s Protocol
• Theoretically can now compute anything!
• How:
– Compose sums and products in mod 2.
– Corresponds to “xor” and “and.”
– Sufficient to compute any circuit.
Theoretically, we’re done already … but
Leads to very slow protocols!