Proposed Solutions
SLICC: Collective Caches
Multi-core – Spread in “Space”
STREX: Stratified Execution
Single-Core – Spread in “Time”
Problem & Opportunity
Key Results
OLTP Execution: Stratified or Collective?
Islam Atta¹, Pınar Tözün², Xin Tong¹, Anastasia Ailamaki², Andreas Moshovos¹
¹University of Toronto, ²École Polytechnique Fédérale de Lausanne
A transaction's instruction footprint fits in the aggregate L1-I capacity of a CMP
[Figure: instruction footprint compared with L1-I size]
Email: [email protected]: 416-805-8790
Website: http://islamatta.com
Many Concurrent Transactions
[Figure: Instruction Overlap; TPC-C and TPC-E]
[Figure: Transactions A, B and C on cores 0 to 3; Conventional ("Divided we Fail") vs. SLICC ("United we Succeed")]
[Figure: execution order over time, A A B A C B C vs. A A A A B B C C C; labels: Cache Thrashing Overhead, STREX]
[Figure panels: Throughput; L1 Instruction Misses]
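The two execution orders in the timeline figure can be compared with a deliberately simple toy model. The sketch below is an illustration only, not the poster's simulator: it assumes (hypothetically) a cache that holds exactly one block at a time, so every change of block costs one refill. The function name `count_refills` is made up for this example.

```python
def count_refills(schedule: str) -> int:
    """Count refills for a toy cache that holds exactly one block at a time.

    Assumption (not from the poster): each letter is one block of work, and
    switching to a different letter evicts the resident block.
    """
    refills = 0
    resident = None
    for block in schedule.replace(" ", ""):
        if block != resident:
            refills += 1       # the new block must be brought into the cache
            resident = block
    return refills


interleaved = "A A B A C B C"      # first order shown in the figure
grouped = "A A A A B B C C C"      # second order shown in the figure

print(count_refills(interleaved))  # 6
print(count_refills(grouped))      # 3
```

Under this toy model, grouping identical blocks back to back halves the refill count even though the grouped sequence is longer, which is the intuition behind spreading execution in "time".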
Opportunity – CMP Integration
Payment
Operations: IT(CUST), R(DIST), R(CUST), U(CUST), U(DIST), U(WH), I(HIST), R(WH)
Sizes shown: 41.4KB, 40.5KB, 41.8KB, 39.1KB, 29.9KB, 28.7KB, 28.7KB, 47.4KB
New Order
Operations: R(DIST), I(NORD), R(WH), U(DIST), R(CUST), R(ITEM), R(STO), U(STO), I(OL), I(ORD); Loop (OL_CNT)
Sizes shown: 41.5KB, 40.5KB, 40.5KB, 28.8KB, 39.6KB, 40.2KB, 65.3KB, 29.4KB, 41.5KB, 41.5KB
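As a rough check, the sizes listed above can be compared against the 32 KB private L1-I given in the Methodology. The sketch assumes the KB values are per-operation instruction footprints, which the transcript does not state explicitly, and it makes no attempt to pair sizes with specific operations. The helper name `count_exceeding` is hypothetical.

```python
L1I_KB = 32.0  # private L1-I size from the Methodology

# Sizes copied from the poster, in the order they appear.
# Assumption: they are per-operation instruction footprints in KB.
payment_kb = [41.4, 40.5, 41.8, 39.1, 29.9, 28.7, 28.7, 47.4]
new_order_kb = [41.5, 40.5, 40.5, 28.8, 39.6, 40.2, 65.3, 29.4, 41.5, 41.5]


def count_exceeding(sizes_kb, capacity_kb=L1I_KB):
    """Return how many of the listed sizes are larger than one L1-I."""
    return sum(1 for size in sizes_kb if size > capacity_kb)


print(count_exceeding(payment_kb), "of", len(payment_kb))      # 5 of 8
print(count_exceeding(new_order_kb), "of", len(new_order_kb))  # 8 of 10
```

If the assumption holds, most of the listed values do not fit in a single 32 KB L1-I.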
[Figure: execution phases 1 to 5 with a Leader Transaction per phase (T1, T1, T1, T1, T2); blocks labeled A, B and C]
[Figure legend: Baseline, STREX, SLICC, HYBRID]
[Figure: Execution Cycles Breakdown (0% to 100%) into Busy, Other Stalls and Instruction Stalls; TPC-C and TPC-E]
[Figure: IPC (0 to 4); TPC-C and TPC-E]
Example Transaction Control Flow (Transaction A, Transaction B, Transaction C)
Possible execution flows; Significant Overlap
[Figure panels: L1 Data Misses; Operation Overlap]
Intel Xeon X5660, 4-way issue (marker: Ideal)
Core Cycles Wasted!
Instruction Stalls Dominate
[Figure: Cache Refill Count; values shown: 9, 3, 10, 4]
[Figure: Relative Throughput (0 to 7) for 2, 4, 8 and 16 cores; TPC-C and TPC-E]
[Figure: D-MPKI (0 to 40) for 2, 4, 8 and 16 cores; TPC-C and TPC-E]
[Figure: I-MPKI (0 to 40) for 2, 4, 8 and 16 cores; TPC-C and TPC-E]
[Figure: Conventional; rows 0 to 2; blocks labeled A, B and C]
OLTP Micro-Architectural Evaluation
Instruction Caches are Thrashed
Why Instruction Stalls?
Opportunity – Inter-Transaction Behavior
Methodology
Simulator: x86 CMP
CPU: Out-of-Order, 2.5 GHz
L1-I/D: Private, 32 KB
L2: Unified, 1 MB per core
Memory: DDR3, 1.6 GHz
Storage Manager: Shore-MT
TPC-C: 10 warehouses, 1 GB
TPC-E: 1000 clients, 20 GB
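To make the "aggregate L1-I capacity" concrete, the following sketch multiplies the 32 KB private L1-I from the Methodology by the core counts used in the result charts (2, 4, 8 and 16). It is plain arithmetic under the assumption that every core's L1-I contributes fully to the aggregate; the function name `aggregate_l1i_kb` is hypothetical.

```python
L1I_KB = 32                  # private L1-I per core, from the Methodology
CORE_COUNTS = (2, 4, 8, 16)  # core counts shown in the result charts


def aggregate_l1i_kb(cores: int, l1i_kb: int = L1I_KB) -> int:
    """Aggregate L1-I capacity, assuming every core's cache counts in full."""
    return cores * l1i_kb


for cores in CORE_COUNTS:
    print(f"{cores:>2} cores: {aggregate_l1i_kb(cores)} KB")
# prints 64 KB, 128 KB, 256 KB and 512 KB respectively
```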
HYBRID: STREX + SLICC
Dynamically Selects the Better Scheduler
[Flow: Measure Dynamic Instruction Footprint and Runtime Aggregate Cache Capacity → Compare → STREX or SLICC]
SLICC: Threads Migrate Chasing Locality
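The HYBRID selection step can be sketched as a single comparison. This is a minimal illustration, not the authors' implementation: it assumes SLICC is chosen when the measured footprint fits in the aggregate L1-I capacity (consistent with the poster's premise for collective caches) and STREX otherwise. The function name, its parameters and the 128 KB footprint are all hypothetical.

```python
def select_scheduler(footprint_kb: float, cores: int, l1i_kb: float = 32.0) -> str:
    """Pick a scheduler by comparing footprint with aggregate L1-I capacity.

    Assumption (hedged): SLICC when the footprint fits across the cores'
    L1-I caches, STREX when it does not.
    """
    aggregate_kb = cores * l1i_kb  # runtime aggregate cache capacity
    return "SLICC" if footprint_kb <= aggregate_kb else "STREX"


# Hypothetical footprint of 128 KB, for illustration only.
print(select_scheduler(128, cores=2))  # STREX (64 KB aggregate is too small)
print(select_scheduler(128, cores=8))  # SLICC (256 KB aggregate is enough)
```

A real system would measure the footprint at run time and might add hysteresis to avoid flipping between schedulers; neither is modeled here.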