Transcript
Page 1: Security approaches in BigTable-like storage systems

Security approaches in BigTable-like storage systems

22951 Research Seminar: Information Security and Privacy July 2014

Open University of Israel

Grisha Weintraub


Abstract

• BigTable: Google’s scalable storage system.
• Designed for internal (i.e. trusted) use.
• Open-source implementations exist (e.g. HBase).
• Can be deployed in a public cloud (i.e. DBaaS).
• However, one may not trust the public cloud provider.
• Our focus is on approaches for making BigTable-like systems secure.


Outline

• BigTable

• Security approaches :

Integrity(iBigTable)

Encryption(BigSecret)

Access Control(Accumulo)


BigTable - Introduction

• Fay Chang et al., Bigtable: A Distributed Storage System for Structured Data, OSDI 2006 (Best Paper)

• Distributed storage system for managing structured data that is designed to scale to a very large size.


BigTable – Data Model

• BigTable is a sparse, distributed, persistent multidimensional sorted map.

• The map is indexed by a row key, column key, and a timestamp.

• (row_key, column_key, time) → string


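BigTable – Data Model

Example rows (each row stores only the columns it actually has):

user_id  name  phone
15       John  178145

user_id  name                   email
29       Bob (t1), Robert (t2)  [email protected]

Lookup by row_key, column_key, and timestamp:
(29, name, t2) → “Robert”

RDBMS approach (one fixed schema; missing values are stored as null):

user_id  name  phone   email
15       John  178145  null
29       Bob   null    [email protected]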


BigTable – Data Model

• Columns are grouped into Column Families:
– family : optional qualifier

contactInfo : email  contactInfo : phone  name:  user_id
[email protected]     17814552             John   15

(In “contactInfo : email”, contactInfo is the Column Family and email is the Optional Qualifier.)

RDBMS approach (two tables):

user_id  name
15       John

user_id  type   value
15       phone  178145
15       email  [email protected]


BigTable – Data Model

Key:   Row-Key | Column (Family, Qualifier) | Timestamp
Value: Value

• Sorting order:
– Row-Key → Family → Qualifier → Timestamp
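The data model can be pictured as a sorted list of cells. The following is a minimal Python sketch, not BigTable code: the class name `SortedCellMap`, the integer timestamps 1 and 2 standing in for t1 and t2, and the empty qualifier are all assumptions made for illustration. Newer versions are placed first within a column, which follows the Bigtable paper's description of decreasing timestamp order.

```python
from bisect import insort


class SortedCellMap:
    """Toy model of the (row, column, timestamp) -> value sorted map."""

    def __init__(self):
        self._cells = []  # kept sorted as a list of (sort_key, value)

    @staticmethod
    def _sort_key(row, family, qualifier, ts):
        # Negating the timestamp makes newer versions sort first.
        return (row, family, qualifier, -ts)

    def put(self, row, family, qualifier, ts, value):
        insort(self._cells, (self._sort_key(row, family, qualifier, ts), value))

    def get(self, row, family, qualifier, ts=None):
        """Return the newest value at or before ts (newest overall if ts is None)."""
        for (r, f, q, neg_ts), value in self._cells:
            if (r, f, q) == (row, family, qualifier) and (ts is None or -neg_ts <= ts):
                return value
        return None

    def keys(self):
        return [(r, f, q, -neg_ts) for (r, f, q, neg_ts), _ in self._cells]


table = SortedCellMap()
table.put("29", "name", "", 1, "Bob")
table.put("29", "name", "", 2, "Robert")
table.put("15", "name", "", 1, "John")
print(table.get("29", "name", "", ts=2))
print(table.keys())
```

A linear scan is used in `get` only to keep the sketch short; a real store would seek to the key.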


BigTable – Data Model

• Tablets:
– Large tables are broken into tablets at row boundaries.
– A tablet holds a contiguous range of rows.
– Approximately 100-200 MB of data per tablet.

Figure: a table sorted by id and split into two tablets. Tablet 1 holds ids 15000 to 20000; Tablet 2 holds ids 20001 to 25000.
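Because each tablet holds a contiguous row range, finding the tablet for a row key is a search over the tablet boundaries. A minimal Python sketch, assuming the two tablets from the figure are identified by their last row key (20000 and 25000); the names `TABLET_END_KEYS`, `TABLET_NAMES`, and `find_tablet` are hypothetical:

```python
from bisect import bisect_left

# Assumption: each tablet is identified by the last row key it holds (inclusive).
TABLET_END_KEYS = [20000, 25000]
TABLET_NAMES = ["Tablet 1", "Tablet 2"]


def find_tablet(row_key):
    """Return the name of the tablet whose row range contains row_key."""
    index = bisect_left(TABLET_END_KEYS, row_key)
    if index == len(TABLET_END_KEYS):
        raise KeyError(f"row key {row_key} is beyond the last tablet")
    return TABLET_NAMES[index]


print(find_tablet(15000))
print(find_tablet(20001))
```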


BigTable – API

• Metadata operations:
– Create and delete tables and column families; modify access control rights.

• Client operations:
– Write/delete values
– Read values
– Scan row ranges

// Open the table
Table *T = OpenOrDie("/bigtable/users");

// Update name and delete a phone
RowMutation r1(T, "29");
r1.Set("name:", "Robert");
r1.Delete("contactInfo:phone");
Operation op;
Apply(&op, &r1);


BigTable – System Structure

• Three major components:

– Client library

– Master (exactly one):
  • Assigns tablets to tablet servers.
  • Detects the addition and expiration of tablet servers.
  • Balances tablet-server load.
  • Garbage-collects files in GFS.
  • Handles schema changes such as table and column family creations.

– Tablet servers (multiple, dynamically added):
  • Each manages 10-1000 tablets.
  • Handles read and write requests to the tablets.
  • Splits tablets that have grown too large.


BigTable – System Structure


BigTable – Tablet Location

• Three-level hierarchy analogous to that of a B+ tree to store tablet location information.

• Client library caches tablet locations.


BigTable – Tablet Serving

• Writes:
– Updates are committed to a commit log.
– Recently committed updates are stored in memory, in the memtable.
– Older updates are stored in a sequence of SSTables.

• Reads:
– A read operation is executed on a merged view of the sequence of SSTables and the memtable.
– Since the SSTables and the memtable are sorted, the merged view can be formed efficiently.
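The merged view can be sketched as a k-way merge over sorted sources. This is an illustrative Python sketch under simplifying assumptions: the memtable and each SSTable are plain dicts, keys are strings such as "29:name" (a made-up encoding), and deletion markers are ignored. The function name `merged_read` is hypothetical.

```python
import heapq


def merged_read(memtable, sstables):
    """Yield (key, value) pairs in key order; the newest source wins per key.

    `sstables` is ordered newest first; the memtable is newer than all of them.
    """
    sources = [memtable] + list(sstables)

    def tagged(age, source):
        # Tag each entry with the age of its source (0 = newest).
        for key, value in sorted(source.items()):
            yield key, age, value

    # heapq.merge combines already-sorted streams without re-sorting everything.
    merged = heapq.merge(*(tagged(age, source) for age, source in enumerate(sources)))

    last_key = object()  # sentinel that never equals a real key
    for key, _age, value in merged:
        if key != last_key:  # first hit for a key comes from the newest source
            last_key = key
            yield key, value


memtable = {"29:name": "Robert"}
sstables = [{"29:name": "Bob", "15:name": "John"}, {"15:phone": "178145"}]
print(list(merged_read(memtable, sstables)))
```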


BigTable - Compactions

• Minor compaction:
– Converts the memtable into an SSTable.
– Reduces memory usage.
– Reduces log reads during recovery.

• Major compaction:
– A merging compaction that results in a single SSTable.
– No deletion records, only live data.
– A good place to apply a policy such as “keep only N versions”.
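The two compaction types can be illustrated with a small Python sketch. It assumes dict-based tables, a hypothetical `TOMBSTONE` object as the deletion record, and an SSTable list ordered newest first; it does not model the “keep only N versions” policy.

```python
TOMBSTONE = object()  # hypothetical deletion record


def minor_compaction(memtable, sstables):
    """Freeze the memtable into a new SSTable; return (empty memtable, SSTables)."""
    new_sstable = dict(sorted(memtable.items()))
    return {}, [new_sstable] + sstables  # newest SSTable goes first


def major_compaction(sstables):
    """Merge all SSTables into a single one that keeps only live data."""
    merged = {}
    for sstable in reversed(sstables):  # oldest first, so newer entries overwrite
        merged.update(sstable)
    live = {key: value for key, value in sorted(merged.items()) if value is not TOMBSTONE}
    return [live]


memtable = {"29:name": "Robert", "29:phone": TOMBSTONE}
sstables = [{"29:name": "Bob", "29:phone": "178145"}]

memtable, sstables = minor_compaction(memtable, sstables)
sstables = major_compaction(sstables)
print(sstables)
```

After the major compaction neither the old name nor the deleted phone remains on disk, which is why deletion records can be dropped at that point.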


Outline

• BigTable √

• Security approaches :

Integrity(iBigTable)

Encryption(BigSecret)

Access Control(Accumulo)


iBigTable - Introduction

• Wei Wei, Ting Yu, Rui Xue: iBigTable: practical data integrity for bigtable in public cloud. CODASPY 2013

• Enhancement of BigTable that provides scalable data integrity assurance.


iBigTable – System Model

Figure: the Data Owner writes to BigTable; Clients read from BigTable.


iBigTable - Goals

• Correctness:
– Returned records have not been modified in any way.

• Completeness:
– No answers have been omitted from the result.

• Freshness:
– Results are based on the most current version of the data.


iBigTable – System Design

• Basic idea:
– Build a Merkle Hash Tree based Authenticated Data Structure for each tablet.

• Verification Object (VO): data returned along with the result and used to authenticate the result.

• Example: the VO for Data block 1 is {Hash 0-1, Hash 1}.
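The VO example can be reproduced with a plain binary Merkle hash tree over four data blocks. This Python sketch is an illustration only: iBigTable itself uses a Merkle B+ tree, and the choice of SHA-256, the block contents, and the function names `build_levels`, `make_vo`, and `verify` are assumptions.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(blocks):
    """Return the tree levels, leaves first; assumes a power-of-two block count."""
    level = [h(block) for block in blocks]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def make_vo(levels, index):
    """Collect the sibling hash at every level below the root."""
    vo = []
    for level in levels[:-1]:
        sibling = index ^ 1
        vo.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return vo


def verify(block, vo, root_hash):
    """Recompute the root from the block and the VO, then compare."""
    digest = h(block)
    for sibling, sibling_is_left in vo:
        digest = h(sibling + digest) if sibling_is_left else h(digest + sibling)
    return digest == root_hash


blocks = [b"block 1", b"block 2", b"block 3", b"block 4"]
levels = build_levels(blocks)
root_hash = levels[-1][0]

vo = make_vo(levels, 0)  # VO for Data block 1: [Hash 0-1, Hash 1]
print(verify(b"block 1", vo, root_hash))
print(verify(b"tampered", vo, root_hash))
```

The verifier only needs the trusted root hash; everything else comes from the untrusted server.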


iBigTable – System Design

Merkle B+ Tree


iBigTable – System Design

Figure: tablet hierarchy (Root Tablet, Meta Tablet, User Tablets) covered by a single root hash, which is kept by the Data Owner.

• Pros:
– Only one hash has to be maintained for all data.

• Cons:
– Requires update propagation.
– Concurrent updates could cause issues.


iBigTable – System Design

Figure: the same hierarchy (Root Tablet, Meta Tablet, User Tablets), but with a separate root hash per tablet; the Data Owner keeps all of these root hashes.


iBigTable – Reads

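Step 1: Client and the tablet server serving the ROOT tablet
1.1 Client → server: getMetaTabletLocation(table name, row key)
1.2 Server: generate VO
1.3 Server → client: meta tablet location, VO
1.4 Client: verify

Step 2: Client and the tablet server serving the META tablet
2.1 Client → server: getUserTabletLocation(table name, row key)
2.2 Server: generate VO
2.3 Server → client: user tablet location, VO
2.4 Client: verify

Step 3: Client and the tablet server serving the USER tablet
3.1 Client → server: getRow(row key)
3.2 Server: generate VO
3.3 Server → client: row data, VO
3.4 Client: verify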


iBigTable – Updates

Data Owner and the tablet server serving the USER tablet:
3.1 Data Owner → server: new/updated row
3.2 Server: generate PT-VO
3.3 Server → Data Owner: PT-VO
3.4 Data Owner: verify and update tablet root hash

Partial Tree Verification Object (PT-VO): the difference between a VO and a PT-VO is that a PT-VO contains keys along with hashes, while a VO does not.


iBigTable – Updates

Figure: initial MB+ row tree of a tablet in a tablet server. Root keys: 30, 60. Second-level keys (as extracted): 10, 50, 80, 70. Leaf keys: 0, 10, 20, 30, 40, 50, 60, 70, 80, 90.


iBigTable – Updates

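Figure: insert a row with key 45 into the partial tree VO. Keys shown: root 30, 60; inner key 50; leaf keys 30, 40, 50; new key 45.

Figure: partial tree VO after 45 is inserted. Keys shown: root 30, 60; inner keys 40, 50; leaf keys 30, 40, 45, 50.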


iBigTable – Authenticated Data Structure

• Projected range queries - expensive to generate and verify VOs.

SL-MBT: A single-level Merkle B+ tree


iBigTable – Authenticated Data Structure

TL-MBT: A two-level Merkle B+ tree.


Outline

• BigTable √

• Security approaches :

Integrity(iBigTable) √

Encryption(BigSecret)

Access Control(Accumulo)


BigSecret - Introduction

• Erman Pattuk et al., BigSecret: A Secure Data Management Framework for Key-Value Stores. IEEE CLOUD 2013

• A secure data management framework for BigTable-like storage systems.


BigSecret – System Model

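Figure: BigSecret sits between the Clients and BigTable and translates requests and responses:
1. Client → BigSecret: get(“Bob”, “email”)
2. BigSecret → BigTable: Get(“A4Vc”, “Zx$23”)
3. BigTable → BigSecret: “DF77Xs9”
4. BigSecret → Client: “[email protected]”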


BigSecret – Goals

• Secure storage of data on untrusted servers.

• Efficient query execution on encrypted data.

• Supported queries:
– Put
– Get
– Delete
– Scan


BigSecret – Preliminaries

• Key:
– row||fam||qua||ts

• Symmetric encryption:
– E(p) → c // encryption
– D(c) → p // decryption

• Pseudo-Random Functions (PRF):
– H(m) → h // deterministic, random-looking output

• Bucketization:
– Partitions p1, p2, … of domain Z.
– Ident function that assigns a unique random identifier to each partition.
– Map function that takes a partitioned domain and a value v from the domain, and returns Ident(p), where v belongs to p.


BigSecret – Bucketization

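Domain 0 to 10000, split at 2000, 4000, 6000, and 8000:

partition    identifier
0-2000       34
2000-4000    97
4000-6000    123
6000-8000    266
8000-10000   771

Map(100) = 34
Map(6451) = 266

Order-preserving mapping: x < y ⇒ Map(x) ≤ Map(y) (values in the same partition share one identifier)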
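The bucketization example can be expressed as a lookup over the partition boundaries. A minimal Python sketch using the boundaries and identifiers from the slide; treating each lower boundary as inclusive and the names `bucket_map` and `translate_scan` are assumptions.

```python
from bisect import bisect_right

BOUNDARIES = [0, 2000, 4000, 6000, 8000, 10000]
IDENTIFIERS = [34, 97, 123, 266, 771]  # one identifier per partition


def bucket_map(value):
    """Return the identifier of the partition that contains value."""
    if not BOUNDARIES[0] <= value <= BOUNDARIES[-1]:
        raise ValueError(f"{value} is outside the domain")
    # Assumption: lower bounds are inclusive; the top value joins the last partition.
    index = min(bisect_right(BOUNDARIES, value) - 1, len(IDENTIFIERS) - 1)
    return IDENTIFIERS[index]


def translate_scan(row_from, row_to):
    """Translate a plaintext row range into a range of partition identifiers."""
    return bucket_map(row_from), bucket_map(row_to)


print(bucket_map(100))
print(bucket_map(6451))
print(translate_scan(200, 300))
```

Since a translated scan covers whole partitions, the server may return rows outside the requested range; the client side would have to decrypt and filter them, which is one source of the performance cost.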


BigSecret – Encryption Models

Naive approach: encrypt values only

Client → BigSecret: Put(row, fam, qua, ts, value)
BigSecret → BigTable: Put(row, fam, qua, ts, E(value))
On reads, BigTable returns E(value) and BigSecret returns D(E(value)).

– All operations are supported.
– Relatively good performance.
– Only minor changes to the system are required.
– Poor privacy.


BigSecret – Encryption Models

Model-1: bucketization for all key parts

Client → BigSecret: Put(row, fam, qua, ts, value)
BigSecret → BigTable: Put(Map(row), Map(fam), Map(qua)||E(key), Map(ts), E(value))

– All operations are supported.
– Relatively bad performance.
– Privacy-performance trade-off.

Scan translation:
Scan(row_from, row_to, fam) → Scan(Map(row_from), Map(row_to), Map(fam))
Scan(200, 300, contactInfo) → Scan(34, 34, 452)


BigSecret – Encryption Models

Model-2: PRF for all key parts

Client → BigSecret: Put(row, fam, qua, ts, value)
BigSecret → BigTable: Put(H(row), H(fam), H(qua)||E(key), H(ts), E(value))

– Scan is not supported.
– Relatively good performance.
– Vulnerable to frequency-based attacks.

Get translation:
Get(row, fam, qua) → Get(H(row), H(fam), H(qua))
Get(200, contactInfo, email) → Get(Az54Et, q8dj8, qWd29h)
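The Get translation can be sketched with a keyed hash standing in for the PRF H. This Python sketch uses HMAC-SHA256 from the standard library as that stand-in; the key, the hex encoding, and the names `prf` and `translate_get` are assumptions and are not taken from BigSecret itself.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-for-illustration-only"  # hypothetical client-side secret


def prf(message: str) -> str:
    """Deterministic, random-looking token for a key part (H in the slides)."""
    return hmac.new(SECRET_KEY, message.encode("utf-8"), hashlib.sha256).hexdigest()


def translate_get(row, fam, qua):
    """Rewrite a plaintext Get into the form sent to the untrusted store."""
    return prf(row), prf(fam), prf(qua)


first = translate_get("200", "contactInfo", "email")
second = translate_get("200", "contactInfo", "email")
print(first == second)            # the same plaintext always yields the same tokens
print(prf("200") == prf("201"))   # neighbouring rows get unrelated tokens
```

Determinism is what makes exact-match Get work, and the lack of order among tokens is why Scan is not supported in this model.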


BigSecret – Encryption Models

Frequency-based attacks (Damiani et al. 2003)

Plaintext table:
id  name   city
19  Alice  Tel-Aviv
24  Bob    New York
32  Carol  Paris
38  Alice  New York

Encrypted table:
id  name  city
j   27    $
a   14    &
t   23    *
z   27    &

An attacker who knows the value frequencies can infer 27 = “Alice” and & = “New York”, and therefore that Alice lives in NY.

Possible solutions:
• Decreasing the range of the PRFs.
• Model-3
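The inference in this example can be replayed in a few lines. The Python sketch below assumes the attacker knows how often each plaintext value occurs and matches only tokens whose frequency is unique; the function name `frequency_match` and the column lists are illustrative.

```python
from collections import Counter


def frequency_match(encrypted_column, known_plaintext_counts):
    """Map tokens to plaintexts when exactly one plaintext has the same count."""
    guesses = {}
    for token, count in Counter(encrypted_column).items():
        candidates = [p for p, c in known_plaintext_counts.items() if c == count]
        if len(candidates) == 1:
            guesses[token] = candidates[0]
    return guesses


# Encrypted columns from the example, in row order.
names = ["27", "14", "23", "27"]
cities = ["$", "&", "*", "&"]

name_guess = frequency_match(names, {"Alice": 2, "Bob": 1, "Carol": 1})
city_guess = frequency_match(cities, {"New York": 2, "Tel-Aviv": 1, "Paris": 1})

# Rows where both tokens were recovered leak a full (name, city) pair.
linked = [
    (name_guess[n], city_guess[c])
    for n, c in zip(names, cities)
    if n in name_guess and c in city_guess
]
print(name_guess, city_guess, linked)
```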


BigSecret – Encryption Models

Model-3: PRF only for the row-key

Client → BigSecret: Put(row, fam, qua, ts, value)
BigSecret → BigTable: Put(H(row), 0, E(key), 1, E(value))

– Scan is not supported.
– Relatively good privacy.
– Performance?

Get translation:
Get(row, fam, qua) → Get(H(row), 0, null)
Get(200, contactInfo, email) → Get(Az54Et, 0, null)


BigSecret – Encryption Models


Outline

• BigTable √

• Security approaches :

Integrity(iBigTable) √

Encryption(BigSecret) √

Access Control(Accumulo)


Accumulo- Introduction

• Adam Fuchs, Apache Accumulo: Extensions to Google's Bigtable Design, 2012, lecture conducted from Morgan State University

• An extension of BigTable that provides cell-level access control.


Accumulo – System Model

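Figure: a BigTable table holding Bob’s record, and the cells each user may read.

Row  Family       Qualifier       Value
                  name            Bob
14   contactInfo  email           [email protected]
14   healthData   blood test      sodium : 137 …
14   healthData   doctor’s notes  Patient suffers from .…
…    …            …               …

Per-user views shown in the figure: “email, blood test” (Bob) and “blood test, notes”.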


Accumulo – System Model

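Flow shown in the figure:
1. credentials, query (sent by the client)
2. lookup user, which returns the user authorization set
3. auth, query (sent to BigTable)
4. data (returned by BigTable and passed back to the client)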


Accumulo- Data Model

BigTable key:  Row-Key | Column (Family, Qualifier) | Timestamp → Value
Accumulo key:  Row-Key | Column (Family, Qualifier, Visibility) | Timestamp → Value

Visibility holds security labels (e.g. A|(B&C)).


Accumulo- Visibility

• Syntax:
– A&B: both A and B are required
– A|B: must have either A or B
– A|(B & C): must have A, or both B and C

• Examples:
– Admin|(Manager & Sales)
– Citizen & Adult
– Secret | Top Secret
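The visibility syntax can be checked with a small recursive-descent evaluator. This Python sketch is not Accumulo's implementation: it assumes labels are single words made of letters, digits, and underscores (so a label containing a space, such as Top Secret, would need quoting that the sketch does not support), treats an empty expression as visible to everyone, and requires parentheses when & and | are mixed. The names `tokenize` and `is_visible` are hypothetical.

```python
import re

TOKEN = re.compile(r"\s*([A-Za-z0-9_]+|[&|()])")


def tokenize(expression):
    tokens, pos = [], 0
    expression = expression.rstrip()
    while pos < len(expression):
        match = TOKEN.match(expression, pos)
        if not match:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def is_visible(expression, authorizations):
    """Return True if the authorization set satisfies the visibility expression."""
    tokens = tokenize(expression)
    if not tokens:
        return True  # assumption: an empty label means no restriction
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            result = expr()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        if token in ("&", "|", ")"):
            raise ValueError(f"unexpected token {token!r}")
        return token in authorizations

    def expr():
        nonlocal pos
        result = term()
        operator = None
        while pos < len(tokens) and tokens[pos] in ("&", "|"):
            if operator not in (None, tokens[pos]):
                raise ValueError("mixing & and | requires parentheses")
            operator = tokens[pos]
            pos += 1
            right = term()
            result = (result and right) if operator == "&" else (result or right)
        return result

    result = expr()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result


cells = [("email", "bob14"), ("blood test", "bob14|doctor"), ("doctor's notes", "doctor")]
print([name for name, label in cells if is_visible(label, {"bob14"})])
```

Filtering a list of cells this way mirrors cell-level access control: each cell carries its own label, and a reader sees only the cells whose label is satisfied.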


Accumulo- Visibility

Row  Family       Qualifier       Visibility    Value
                  name                          Bob
14   contactInfo  email           bob14         [email protected]
14   healthData   blood test      bob14|doctor  sodium : 137 …
14   healthData   doctor’s notes  doctor        Patient suffers from .…
…    …            …               …             …


Accumulo – Visibility

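Example flow for Bob:
1. Bob sends (bob, ***) and a query for health data.
2. lookup user returns the authorization set {bob14}.
3. {bob14} and the health data query are sent to BigTable.
4. Only the blood test cell is returned and passed back to Bob.

Family      Qualifier   Visibility
HealthData  notes       doctor
HealthData  blood test  bob14|doctor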


Accumulo- Iterators

Iterator


Accumulo- Iterators


Outline

• BigTable √

• Security approaches :

Integrity(iBigTable) √

Encryption(BigSecret) √

Access Control(Accumulo) √


References

• Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Michael Burrows, Tushar Chandra, Andrew Fikes, Robert Gruber: Bigtable: A Distributed Storage System for Structured Data (Awarded Best Paper!). OSDI 2006:205-218

• Wei Wei, Ting Yu, Rui Xue: iBigTable: practical data integrity for bigtable in public cloud. CODASPY 2013:341-352

• Erman Pattuk, Murat Kantarcioglu, Vaibhav Khadilkar, Huseyin Ulusoy, Sharad Mehrotra: BigSecret: A Secure Data Management Framework for Key-Value Stores. IEEE CLOUD 2013:147-154

• http://accumulo.apache.org/

