@ carnegie mellon databases data-oriented transaction execution vldb 2010 ippokratis pandis ryan...

28
@ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis †‡ Ryan Johnson †‡ Nikos Hardavellas Anastasia Ailamaki †‡ †Carnegie Mellon University ‡École Polytechnique Fédérale de Lausanne Northwestern University

Upload: mckenna-buffin

Post on 31-Mar-2015

219 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

@ Carnegie MellonDatabases

Data-oriented Transaction Execution

VLDB 2010

Ippokratis Pandis†‡ Ryan Johnson†‡ Nikos Hardavellas Anastasia Ailamaki †‡

†Carnegie Mellon University ‡École Polytechnique Fédérale de Lausanne Northwestern University

Page 2: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 2

Scaling OLTP on multicores/sockets• Shared Everything

– Hard to scale

Multicore multisocket machine

Core Core Core Core

Core Core Core Core

CPU

Core Core Core Core

Core Core Core Core

CPU

Page 3: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 3

Scaling OLTP on multicores/sockets• Shared Everything

– Hard to scale

• Shared Nothing– Multiple processes, physically

separated data– Explicit contention control– Perfectly partition-able

workload– Memory pressure: redundant

data/structures

Multicore multisocket machine

Core Core Core Core

Core Core Core Core

CPU

Core Core Core Core

Core Core Core Core

CPU

Two approaches complementaryFocus on scalability of a single (shared everything) instance

Page 4: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 4

Shore-MT• Multithreaded version of SHORE

• State-of-the-art DBMS features• Two-phase row-level locking • ARIES-style logging/recovery

• Similar at instruction-level with commercial DBMSs 0 4 8 12 16 20 24 28 32

0

200

400

600

800Execution Time

MySqlPostgresDBMS "X"BDBShore-MT

# HW Contexts

Tim

e (s

econ

ds)

High-performing, scalable conventional engineAvailable at: http://www.cs.wisc.edu/shore-mt/

Sun Niagara T1Insert-only mbench

Page 5: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 5

1 2 4 8 16 32 640

1000

2000

3000

4000 BPOOL

LOCK-MAN

LOG-MAN

DATA MOVEMENT

OTHER

# HW Contexts

Tim

e br

eakd

own

(cpu

sec

s)

Problems on Even Higher ParallelismSun Niagara T2TPC-C Payment

1 2 4 8 16 32 640

200

400

600

800

1000

1200

# HW Contexts

Thro

ughp

ut /

# H

W C

onte

xts

Lock manager overhead dominantContention for acquiring locks in compatible mode

Linear

Page 6: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis

Data-oriented Transaction Execution• It is not the transaction which dictates what

data the transaction-executing thread will access

• Break each transaction into smaller actions– Depending on the data they touch

• Execute actions by “data-owning” threads• Distribute and privatize locking, data accesses

across threads

New transaction execution modelReduce overhead of locking and data accesses 6

Page 7: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 7

DORA vs. Conventional – At 100% CPU

Eliminate contention on the centr. lock managerSignificantly reduced work (lightweight locks)

Baseline Dora0%

10%20%30%40%50%60%70%80%90%

100%

TM1

Tim

e Br

eakd

own

(%)

Baseline Dora0%

10%20%30%40%50%60%70%80%90%

100%

TPC-C OrderStatus

Dora

Lock Manager Cont

Lock Manager

Other Cont

Work

Sun Niagara T264 HW contexts

Page 8: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 8

Roadmap• Introduction• Conventional execution• Data-oriented execution• Evaluation• Conclusions

Page 9: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 9

Example

CPU-0

I D

CPU

L1

CPU-1

I D

CPU-2

I D

L2

MEMI/O

WH CUST ORD

Transaction:I Du(wh)u(cust)u(ord)

I = InstructionD = Data

Page 10: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 10

Conventional - Example

CPU-0 CPU

L1

CPU-1 CPU-2

L2

MEMI/O

WH CUST ORD

Transaction:I Du(wh)u(cust)u(ord)

u(wh)u(cust)u(ord)

Page 11: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis

Typical Lock Manager

The higher the HW parallelism Longer Queues of Requests Longer CSs Higher Contention

L1 EX

EXL2 EX

T1

Lock HeadLock Hash Table

Queue Lock Requests

Xct’s Lock Requests

11

Page 12: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis

2 15 40 57 68 76 83100

0%

20%

40%

60%

80%

100%

LM Release ContLM ReleaseLM Acquire ContLM Acquire

CPU Util. (%)

Tim

e Br

eakd

own

(%)

Time Inside the lock manager

Latch contention on the lock manager

Sun Niagara T2TPC-B

12

Page 13: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis

Thread-to-transaction – Access Pattern

200000 300000 400000 500000 600000 700000 800000 9000000

20

40

60

80

100TPC-C Payment

Time (secs)

DIS

TRIC

T Re

cord

s

13

Unpredictable access patternSource of contention

Page 14: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 14

Roadmap• Introduction• Conventional execution• Data-oriented execution• Evaluation• Conclusions

Page 15: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 15

Thread-to-data – Access Pattern

Predictable access patternsOptimizations possible (e.g. no centralized locks)

200000 300000 400000 500000 600000 700000 800000 9000000

20

40

60

80

100TPC-C Payment

Time (secs)

DIS

TRIC

T re

cord

s

Page 16: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 16

Transaction Flow Graph• Each transaction input is a

graph of Actions & RVPs• Actions

– Table/Index it is accessing– Subset of routing fields

• Rendezvous Points – Decision points (commit/abort)– Separate different phases– Counter of the # of actions to report– Last to report initiates next phase– Enqueue the actions of the next phase

TPC-C Payment

Upd(WH) Upd(DI) Upd(CU)

Ins(HI)

Phase 1

Phase 2

Page 17: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis

Partitions & Executors• Routing table at each table

– {Routing fields executor}

• Executor thread– Local lock table

• {RoutingFields + partof(PK), LockMode}• List of blocked actions

– Input queue• New actions

– Completed queue• On xct commit/abort• Remove from local lock table

– Loop completed/input queue– Execute requests in serial order

Completed

Input

Local Lock Table

Pref LM Own Wait

AAB

A{1,0} EX A

{1,3} EX B

A

17

Routing fields: {WH_ID, D_ID}

Range Executor

{0,0}-{5,2} 1

{5,2}-{6,1} 2

Page 18: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 18

DORA - Example

CPU-0 CPU

L1

CPU-1 CPU-2

L2

MEMI/O

WH CUST ORD

u(wh) u(cust) u(ord)

Transaction:I Du(wh)u(cust)u(ord)

Centralized lock freeImproved data reuse

Page 19: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 19

Non-partition aligned accesses• Secondary index probes without part of the

primary key in their definition• Each entry stores the part of the primary key

needed to decide routing, along with the RID• Do the probe with the RVP-executing thread• Enqueue a new action to the corresponding

thread• Do the record retrieve with the RID

Page 20: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 20

Roadmap• Introduction• Conventional execution• Data-oriented execution• Evaluation• Conclusions

Page 21: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 21

8 16 24 32 460

1000

2000

3000

4000

5000

6000Other ContLogMgr ContLockMgr ContUseful

# HW Ctxs8 16 24 32 46 58

0

1000

2000

3000

4000

5000

6000Other Cont

LogMgr Cont

LockMgr Cont

Useful

# HW Ctxs

Tim

e Br

eakd

own

(cpu

sec

s)

0 20 40 60 80 100 1200

10

20

30

40

50

60

70

80DORABASE-DLOG

Real CPU Load (%)

Thro

ughp

ut

Sun Niagara T2TM1

BASELINE

DORA

Baseline DORAEliminates contention on the lock managerLinear scalability to 64 HW ctxsImmune to oversaturation

Eliminating contention on the lock mgr

Page 22: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 22

TM1-G

etNewDest

TM1-U

pdSubData

TM1-Call

FwdM

ix

TPCC-NewOrd

er

TPCC-Payment

TPC-B0

0.2

0.4

0.6

0.8

1

1.2

BaselineDORA

Nor

m. R

espo

nse

Tim

eResponse time for single client

better

Exploits intra-xct parallelismLower response times on low loads

Page 23: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 23

Peak Performance

better

TM1-M

IX

TM1-G

etSubData

TM1-G

etNew

Dest

TM1-G

etAccD

ata

TM1-U

pdSubData

TM1-U

pdLoca

tion

TM1-Call

FwdMix

TPCC-Pay

ment

TPCC-N

ewOrder

TPCC-O

rderStat

usTP

C-B0

0.5

1

1.5

2

74%

98%

66%

100%

78%

100%

60%

100%

85%

95%

76%

95%

88%

95%

90%

90%

70%

70%

95%

100%

85%

88%

Baseline DORA

Nor

m. P

eak

Thro

ughp

ut

Page 24: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 24

Peak Performance

better

TM1-M

IX

TM1-G

etSubData

TM1-G

etNew

Dest

TM1-G

etAccD

ata

TM1-U

pdSubData

TM1-U

pdLoca

tion

TM1-Call

FwdMix

TPCC-Pay

ment

TPCC-N

ewOrder

TPCC-O

rderStat

usTP

C-B0

0.5

1

1.5

2

Baseline Baseline-Ideal DORA DORA-Ideal

Nor

m. P

eak

Thro

ughp

ut

Higher peak performanceAlways close to 100% CPU utilization

Page 25: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 25

Roadmap• Introduction• Conventional execution• Data-oriented execution• Evaluation• Conclusions

Page 26: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 26

DORA vs. Conventional

0 20 40 600

500010000150002000025000300003500040000

TPC-C Payment

DORABASELINE

Clients

Thro

ughp

ut (K

tps)

Avoid expensive (centralized) lock

manager operations

20%

Immune to centr. lock manager

40+%

Intra-xactionparallelism on

light loads

2x

Higher performance in the entire load spectrum

Page 27: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 27

Reducing overheads of locking• Rdb/VMS

– Distributed DBMS– Lock “carry-over” reduces network traffic

• DB2 “keep table lock” setting– Connection holds all table locks until close– Leads to “poor concurrency”

• H-Store– Single-threaded, non-interleaved execution model– No locking or latching at all

Page 28: @ Carnegie Mellon Databases Data-oriented Transaction Execution VLDB 2010 Ippokratis Pandis Ryan Johnson Nikos Hardavellas Anastasia Ailamaki Carnegie

© 2010 Ippokratis Pandis 28

Summary• Large number of active threads stress scalability

of database system• Data-oriented transaction execution (DORA)• Decentralize locking & data accesses• Benefits of shared-nothing w/o physical data

partitioning• Small modifications on a conventional engine• Higher performance on the entire load spectrum