monomi: practical analytical query processing over encrypted data

33
Monomi: Practical Analytical Query Processing over Encrypted Data Network Security Dr. Ali Fanian Mostafa Arjmand

Upload: mostafa-arjmand

Post on 13-Apr-2017

96 views

Category:

Engineering


4 download

TRANSCRIPT

Page 1: Monomi: Practical Analytical Query Processing over Encrypted Data

Monomi: Practical Analytical Query Processing over Encrypted Data

Network Security

Dr. Ali Fanian

Mostafa Arjmand

Page 2: Monomi: Practical Analytical Query Processing over Encrypted Data

The Presentation Include The Following :

• INTRODUTION• SPLIT CLIENT/SERVER EXECUTION• OPTIMIZATION TECHNIQUES• DESIGNER AND PLANNER• EVALUATION• CONCLUSION

Page 3: Monomi: Practical Analytical Query Processing over Encrypted Data

Monomi

• Securely executing analytical query over sensitive data on an untrusted database server

Page 4: Monomi: Practical Analytical Query Processing over Encrypted Data

Problem: Want to run queries over data!

Vulnerable databaseTrusted user

Query

Response

“Give me the # of views of all adults by country”

US 1M

Italy 3K

… …

Page 5: Monomi: Practical Analytical Query Processing over Encrypted Data

Approach 1: Fully Homomorphic Encryption

• Run any computation over encrypted data• Prohibitive overheads in practice

Slowdown order 10^9× compared to computation

on plaintext data

Page 6: Monomi: Practical Analytical Query Processing over Encrypted Data

Approach 2: Specialized Schemes

• Cryptosystems supporting specific operations:• Equality (deterministic) [AES]• Addition [Paillier 99]• Inequality (order preserving) [Boldyreva 09]• Keyword Search [Song 00]

• These operations common in SQL queries…

Page 7: Monomi: Practical Analytical Query Processing over Encrypted Data

Practical state of the art: CryptDB

SELECT country_DET, PAILLIER_SUM(views_HOM) FROM users_ENCRYPTEDWHERE age_OPE > 0xDEADBEEFGROUP BY country_DET

Transformed Query:SELECT country, SUM(views) FROM users WHERE age > 18GROUP BY country

Original Query:Deterministic encryption: EqualityOrder preserving encryption: InequalityPaillier cryptosystem: Addition

0xDEADBEEF = Encrypt_OPE(18)

Under attack

DB Servertransformed queryplain query

Stores encryption keys

Applicationdecrypted results encrypted results

Trusted

Encrypted DB

No client computation: CryptDB requires that all computation in a query are supported by a specialized crypto-system

Page 8: Monomi: Practical Analytical Query Processing over Encrypted Data

Problem: OLTP ≠ OLAP

• CryptDB is designed for OLTP queries• We are interested in OLAP queries• Queries typically involve more computation• CryptDB can only support 4/22 TPC-H queries

Page 9: Monomi: Practical Analytical Query Processing over Encrypted Data

SELECT category, SUM(cost * quantity) AS valueFROM productWHERE made_in = ‘United States’GROUP BY categoryHAVING SUM(cost * quantity) > 1000000ORDER BY value

What happens when we run this query with CryptDB?

SELECT category, SUM(cost * quantity) AS valueFROM productWHERE made_in = ‘United States’GROUP BY categoryHAVING SUM(cost * quantity) > 1000000ORDER BY value

No efficient additive + multiplicative homomorphic cryptosystem

SELECT category, SUM(cost * quantity) AS valueFROM productWHERE made_in = ‘United States’GROUP BY categoryHAVING SUM(cost * quantity) > 1000000ORDER BY value

No efficient additive + order preserving homomorphic cryptosystem

Problem: OLTP ≠ OLAPSELECT category, SUM(cost * quantity) AS valueFROM productWHERE made_in = ‘United States’GROUP BY categoryHAVING SUM(cost * quantity) > 1000000ORDER BY value

Most of the query can be executed on the server, except a few parts

CryptDB can handle queries that involve only computation supported by one of these encryption schemes

But few analytical queries fall in this category (only four out of 22 TPC-H), and for those four, CryptDB incurs significant Overhead

Page 10: Monomi: Practical Analytical Query Processing over Encrypted Data

Solution

• Monomi: A new system for practical analytical query processing • Split client/server query execution• Pre-computation + other runtime optimizations• Query planner/designer

Monomi: Can run TPC-H with 1.24x median overhead (vs. plaintext) using these three techniques.

Page 11: Monomi: Practical Analytical Query Processing over Encrypted Data

Encryption schemes used by MONOMI

Page 12: Monomi: Practical Analytical Query Processing over Encrypted Data

Overall architecture of MONOMI

Page 13: Monomi: Practical Analytical Query Processing over Encrypted Data

2 ) SPLIT CLIENT/SERVER EXECUTION

• In order to execute queries that cannot be computed on the server alone, MONOMI partitions the execution of each query

Page 14: Monomi: Practical Analytical Query Processing over Encrypted Data

TPC-H query 11

Example split query plan for TPC-H query 11

SELECT ps_partkey,SUM(ps_supplycost * ps_availqty) AS valueFROM partsupp JOIN supplier JOIN nationWHERE n_name = :1GROUP BY ps_partkeyHAVING SUM(ps_supplycost * ps_availqty) > (

SELECT SUM(ps_supplycost * ps_availqty) * 0.0001

FROM partsupp JOIN supplier JOIN nationWHERE n_name = :1 )ORDER BY value DESC;

Can execute on server

• Can’t execute on server

• They need decrypt on clien and execute

Pre-computed (ps_supplycost * ps_availqty) (Deterministic)Execute Group by with pre-computed ps_partkey_det

0xabcdef denotes the deterministic encryption of the argument :1

MONOMI’s encryption schemes cannot support such queries directly over encrypted data

Page 15: Monomi: Practical Analytical Query Processing over Encrypted Data

Pseudo-code for GENERATEQUERYPLAN

Page 16: Monomi: Practical Analytical Query Processing over Encrypted Data

3 ) OPTIMIZATION TECHNIQUES

• MONOMI’s designer and planner automatically use these techniques as appropriate to achieve good performance• Without requiring the application developer to perform manual query

rewriting

Page 17: Monomi: Practical Analytical Query Processing over Encrypted Data

Per-row precomputation

• Executing an operator on the client can require downloading a large amount of intermediate data• SUM(ps_supplycost*ps_availqty) • Paillier encryption cannot perform multiplication of two values• MONOMI employs per-row precomputation• Designer materializes an additional column in a table• These materialized expressions can then be used at query runtime

• MONOMI can materialize an additional column storing encryption of ps_supplycost*ps_availqty

• Which allows the server to compute the entire SUM()• This allows the client to download one encrypted aggregate value,

instead of downloading all ps_supplycost and ps_availqty values

Page 18: Monomi: Practical Analytical Query Processing over Encrypted Data

Space-efficient encryption

• Database queries that involve table scans—which are common in analytic workloads—are often bottlenecked by I/O

• Precomputed (o_orderdate_det) • To allow grouping by the extracted year, the value must be encrypted

deterministically, but using the standard AES or Blowfish algorithm would produce a 128- or 64-bit ciphertext even for 8-, 16-, or 32-bit.

• To minimize ciphertext expansion for deterministic encryption of such small data types, MONOMI uses the FFX block cipher mode of operation• Which encrypts n-bit plaintexts to n-bit ciphertexts, as long as n

is less than the block cipher width (e.g., 128 bits for AES).

Page 19: Monomi: Practical Analytical Query Processing over Encrypted Data

Grouped homomorphic addition

Conservative pre-filtering

Page 20: Monomi: Practical Analytical Query Processing over Encrypted Data

3 ) DESIGNER AND PLANNER

• The optimizations do not always apply to every query• In particular, a greedy application of the techniques or a greedy

execution of all parts of the query on the server does not necessarily yield the best performance and can waste space• The goal of MONOMI’s designer is to decide how to encrypt the data

(physical design). • MONOMI’s planner determines how to best execute queries given a

particular physical design. • This planner is also used at runtime to choose a query plan for a new

query from the application

Page 21: Monomi: Practical Analytical Query Processing over Encrypted Data

Input and output

• Input : Query workload Q1,Q2, ...,Qn.• Should be representative of the operations that the user is expecting to perform

over the data• Input : A sample of the data that will be loaded into the database • Is used for estimating statistics about the data• Need not be the exact data that will be eventually loaded on the server

• Input : The user can also specify a space constraint factor S• Controls how much space MONOMI’s designer can consume

• Designer returns to the user a physical design for the server• Is a set of (encrypted) columns to materialize for each table, including pre-

computed and Paillier columns

Page 22: Monomi: Practical Analytical Query Processing over Encrypted Data

Algorithm without constraints step 1

• The designer considers all of the operations in Qi, including all of the expressions in the WHERE and HAVING clauses, any ORDER BY, etc.• For each operation, the designer determines what expression and

encryption scheme would allow that operation to execute on the server.• The set of these (value, scheme) pairs for Qi is called EncSeti, and

defined E to be the set of all possible such pairs; that is, EncSeti E .• For example, a WHERE x = :1 clause generates a (x,DET) pair, referring to the

x column, and an ORDER BY x+y clause generates a (x+y,OPE) pair, referring to a precomputed x+y value.

Page 23: Monomi: Practical Analytical Query Processing over Encrypted Data

Algorithm without constraints step 2

• The designer invokes the planner • The planner computing power set that contains all subsets of EncSeti• It then constructs an execution plan for Qi for each element of the

power set• Where the execution plan describes what parts of the query would be

executed on the server, and what parts would be executed on the client

Page 24: Monomi: Practical Analytical Query Processing over Encrypted Data

Algorithm without constraints step 3

• For each of the execution plans, the planner uses a cost model to estimate how much time it would take to execute that plan.

Page 25: Monomi: Practical Analytical Query Processing over Encrypted Data

4 ) Evaluation

• How many TPC-H queries can Monomi run?• Can MONOMI efficiently execute an analytical workload over encrypted data on

an untrusted server?• How much do MONOMI’s optimization techniques, designer, and planner matter

in achieving good performance?• What are the overheads for MONOMI in terms of space and client-side CPU time?

• Setup:• TPC-H scale 10• Postgres 8.4 on Linux 2.6

• 8GB RAM, 16 cores, six 7200 RPM HDDs

Page 26: Monomi: Practical Analytical Query Processing over Encrypted Data

Most TPC-H queries supported

• Monomi’s approach handles all TPC-H queries• Our prototype handles 19/22 due to missing SQL features (e.g. views, LIKE

’%foo%bar%’, )

• First system we know of that can do this!• CryptDB only supports 4/22

Page 27: Monomi: Practical Analytical Query Processing over Encrypted Data

Execution time of TPC-H queries under various systems

min overhead 1.03x, median overhead 1.24x,

max overhead 2.33x

Page 28: Monomi: Practical Analytical Query Processing over Encrypted Data

Space and CPU overheads

Amount of disk space used on the server CPU overhead

Page 29: Monomi: Practical Analytical Query Processing over Encrypted Data

Sensitivity to designer input

Page 30: Monomi: Practical Analytical Query Processing over Encrypted Data

Number of distinct columns in the TPC-H tables encrypted by MONOM

Page 31: Monomi: Practical Analytical Query Processing over Encrypted Data

Security

• To understand the level of security that MONOMI provides, it is important to consider the encryption schemes chosen by MONOMI• The worst is OPE, which reveals order, followed by DET• A common use of OPE is for date fields, which may be less sensitive

• The next weakest scheme, DET• Is used in quite a few columns, but reveals only duplicates, which may be less

of a concern

Page 32: Monomi: Practical Analytical Query Processing over Encrypted Data

CONCLUSION

• Monomi: analytics on encrypted data can be made practical!• Techniques:• Split client/server execution• Pre-computation + space-efficient encryption + pre-filtering +…• Planner/designer

Page 33: Monomi: Practical Analytical Query Processing over Encrypted Data

THANK YOU FOR LISTENING