phase reconciliation for contended in-memory transactions
DESCRIPTION
Phase Reconciliation for Contended In-Memory Transactions. Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris MIT CSAIL and Harvard. Incr Txn ( k Key) { INCR ( k , 1) } LikePageTxn ( page Key, user K ey) { INCR ( page , 1) liked_pages := GET ( user ) - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/1.jpg)
1
Phase Reconciliation for Contended In-Memory
Transactions
Neha Narula, Cody Cutler, Eddie Kohler, Robert Morris
MIT CSAIL and Harvard
![Page 2: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/2.jpg)
2
IncrTxn(k Key) {INCR(k, 1)
}
LikePageTxn(page Key, user Key) {INCR(page, 1)liked_pages := GET(user)PUT(user, liked_pages + page)
}
FriendTxn(u1 Key, u2 Key) {PUT(friend:u1:u2, 1)PUT(friend:u2:u1, 1)
}
![Page 3: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/3.jpg)
3
IncrTxn(k Key) {INCR(k, 1)
}
LikePageTxn(page Key, user Key) {INCR(page, 1)liked_pages := GET(user)PUT(user, liked_pages + page)
}
FriendTxn(u1 Key, u2 Key) {PUT(friend:u1:u2, 1)PUT(friend:u2:u1, 1)
}
![Page 4: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/4.jpg)
4
Applications experience write contention on popular data
Problem
![Page 5: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/5.jpg)
5
![Page 6: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/6.jpg)
6
Concurrency Control Enforces Serial Execution
core 0
core 1
core 2
INCR(x,1)
INCR(x,1)
INCR(x,1)
time
Transactions on the same records execute one at a time
![Page 7: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/7.jpg)
7
Throughput on a Contentious Transactional Workload
![Page 8: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/8.jpg)
8
Throughput on a Contentious Transactional Workload
![Page 9: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/9.jpg)
9
INCR on the Same Records Can Execute in Parallel
core 0
core 1
core 2
INCR(x0,1)
INCR(x1,1)
INCR(x2,1)
time
• Transactions on the same record can proceed in parallel on per-core slices and be reconciled later
• This is correct because INCR commutes
1
1
1
per-core slices of record x
x is split across cores
![Page 10: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/10.jpg)
10
Databases Must Support General Purpose Transactions
IncrTxn(k Key) {INCR(k, 1)
}
PutMaxTxn(k1 Key, k2 Key) {v1 := GET(k1)v2 := GET(k2)if v1 > v2:
PUT(k1, v2)else:
PUT(k2, v1)return v1,v2
}
IncrPutTxn(k1 Key, k2 Key, v Value) {
INCR(k1, 1)PUT(k2, v)
}
Must happen
atomically
Must happen atomically
Returns a value
![Page 11: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/11.jpg)
11
Challenge
Fast, general-purpose serializable transaction execution with per-core slices for contended records
![Page 12: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/12.jpg)
12
Phase Reconciliation
• Database automatically detects contention to split a record among cores
• Database cycles through phases: split, reconciliation, and joined
reconciliation
Joined Phase
Split Phase
Doppel, an in-memory transactional database
![Page 13: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/13.jpg)
13
Contributions
Phase reconciliation– Splittable operations– Efficient detection and response to
contention on individual records– Reordering of split transactions and
reads to reduce conflict– Fast reconciliation of split values
![Page 14: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/14.jpg)
14
Outline
1. Phase reconciliation2. Operations3. Detecting contention4. Performance evaluation
![Page 15: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/15.jpg)
15
Split Phase
core 0
core 1
core 2
INCR(x,1)
INCR(x,1) PUT(y,2)
INCR(x,1) PUT(z,1)
core 3 INCR(x,1) PUT(y,2)
core 0
core 1
core 2
INCR(x0,1)
INCR(x1,1) PUT(y,2)
INCR(x2,1) PUT(z,1)
core 3 INCR(x3,1) PUT(y,2)
• The split phase transforms operations on contended records (x) into operations on per-core slices (x0, x1, x2, x3)
split phase
![Page 16: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/16.jpg)
16
• Transactions can operate on split and non-split records• Rest of the records use OCC (y, z)• OCC ensures serializability for the non-split parts of the
transaction
core 0
core 1
core 2
INCR(x0,1)
INCR(x1,1) PUT(y,2)
INCR(x2,1) PUT(z,1)
core 3 INCR(x3,1) PUT(y,2)
split phase
![Page 17: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/17.jpg)
17
• Split records have assigned operations for a given split phase• Cannot correctly process a read of x in the current state• Stash transaction to execute after reconciliation
core 0
core 1
core 2
INCR(x0,1)
INCR(x1,1) PUT(y,2)
INCR(x2,1) PUT(z,1)
core 3 INCR(x3,1) PUT(y,2)
split phase
INCR(x1,1)
INCR(x2,1)
INCR(x1,1)
GET(x)
![Page 18: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/18.jpg)
18
core 0
core 1
core 2
INCR(x0,1)
INCR(x1,1) PUT(y,2)
INCR(x2,1) PUT(z,1)
core 3 INCR(x3,1) PUT(y,2)
reco
ncile
split phase
• All threads hear they should reconcile their per-core state
• Stop processing per-core writes
GET(x)
INCR(x1,1)
INCR(x2,1)
INCR(x1,1)
![Page 19: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/19.jpg)
19
• Reconcile state to global store• Wait until all threads have finished reconciliation• Resume stashed read transactions in joined phase
core 0
core 1
core 2
core 3
All done
reco
ncile
reconciliation phase joined phase
x = x + x0
x = x + x1
x = x + x2
x = x + x3
GET(x)
![Page 20: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/20.jpg)
20
core 0
core 1
core 2
core 3
x = x + x0
x = x + x1
x = x + x2
x = x + x3
reco
ncile
reconciliation phase
GET(x)
All done
joined phase
• Reconcile state to global store• Wait until all threads have finished reconciliation• Resume stashed read transactions in joined phase
![Page 21: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/21.jpg)
21
core 0
core 1
core 2
core 3
GET(x)
• Process new transactions in joined phase using OCC• No split data
All done
joined phase
INCR(x)
INCR(x, 1)
GET(x)GET(x)
![Page 22: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/22.jpg)
22
Batching Amortizes the Cost of Reconciliation
core 0
core 1
core 2
INCR(x0,1)
INCR(x1,1) INCR(y,2)
INCR(x2,1) INCR(z,1)
core 3 INCR(x3,1) INCR(y,2)
GET(x)
• Wait to accumulate stashed transactions, batch for joined phase• Amortize the cost of reconciliation over many transactions• Reads would have conflicted; now they do not
INCR(x1,1)
INCR(x2,1) INCR(z,1)
GET(x)
GET(x)
All done
reco
ncile
GET(x)
GET(x)
GET(x)
split phasejoined phase
![Page 23: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/23.jpg)
23
Phase Reconciliation Summary
• Many contentious writes happen in parallel in split phases
• Reads and any other incompatible operations happen correctly in joined phases
![Page 24: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/24.jpg)
24
Outline
1. Phase reconciliation2. Operations3. Detecting contention4. Performance evaluation
![Page 25: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/25.jpg)
Ordered PUT and insert to an ordered
list
Operation Model
Developers write transactions as stored procedures which are composed of operations on keys and values:
25
value GET(k)void PUT(k,v)
void INCR(k,n)void MAX(k,n)void MULT(k,n)
void OPUT(k,v,o)void TOPK_INSERT(k,v,o)
Traditional key/value operations
Operations on numeric values
which modify the existing value
Not splittable
Splittable
![Page 26: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/26.jpg)
26
27
55
MAX Can Be Efficiently Reconciled
core 0
core 1
core 2
MAX(x0,55)
MAX(x1,10)
MAX(x2,21)
21
• Each core keeps one piece of state xi
• O(#cores) time to reconcile x• Result is compatible with any order
MAX(x0,2)
MAX(x1,27)
x = 55
![Page 27: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/27.jpg)
27
What Operations Does Doppel Split?
Properties of operations that Doppel can split:– Commutative– Can be efficiently reconciled– Single key– Have no return value
However:– Only one operation per record per split
phase
![Page 28: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/28.jpg)
28
Outline
1. Phase reconciliation2. Operations3. Detecting contention4. Performance evaluation
![Page 29: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/29.jpg)
29
Which Records Does Doppel Split?
• Database starts out with no split data• Count conflicts on records–Make key split if #conflicts >
conflictThreshold
• Count stashes on records in the split phase–Move key back to non-split if #stashes
too high
![Page 30: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/30.jpg)
30
Outline
1. Phase reconciliation2. Operations3. Detecting contention4. Performance evaluation
![Page 31: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/31.jpg)
31
Experimental Setup and Implementation
• All experiments run on an 80 core Intel server running 64 bit Linux 3.12 with 256GB of RAM
• Doppel implemented as a multithreaded Go server; one worker thread per core
• Transactions are procedures written in Go• All data fits in memory; don’t measure RPC• All graphs measure throughput in
transactions/sec
![Page 32: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/32.jpg)
32
Performance Evaluation
• How much does Doppel improve throughput on contentious write-only workloads?
• What kinds of read/write workloads benefit?
• Does Doppel improve throughput for a realistic application: RUBiS?
![Page 33: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/33.jpg)
33
Doppel Executes Conflicting Workloads in Parallel
Th
rou
gh
pu
t (m
illio
ns
txn
s/se
c)
20 cores, 1M 16 byte keys, transaction: INCR(x,1) all on same key
Doppel OCC 2PL0
5000000
10000000
15000000
20000000
25000000
30000000
35000000
![Page 34: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/34.jpg)
34
Doppel Outperforms OCC Even With Low Contention
20 cores, 1M 16 byte keys, transaction: INCR(x,1) on different keys
5% of writes to contended
key
![Page 35: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/35.jpg)
35
Contentious Workloads Scale Well
1M 16 byte keys, transaction: INCR(x,1) all writing same key
Communication of phase changing
![Page 36: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/36.jpg)
36
LIKE Benchmark
• Users liking pages on a social network• 2 tables: users, pages• Two transactions:
– Increment page’s like count, insert user like of page– Read a page’s like count, read user’s last like
• 1M users, 1M pages, Zipfian distribution of page popularity
Doppel splits the page-like-counts for popular pagesBut those counts are also read more often
![Page 37: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/37.jpg)
37
Benefits Even When There Are Reads and Writes to the Same Popular Keys
Doppel OCC0
1000000
2000000
3000000
4000000
5000000
6000000
7000000
8000000
9000000
Th
rou
gh
pu
t (m
illio
ns
txn
s/se
c)
20 cores, transactions: 50% LIKE read, 50% LIKE write
![Page 38: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/38.jpg)
38
Doppel Outperforms OCC For A Wide Range of Read/Write Mixes
20 cores, transactions: LIKE read, LIKE write
Doppel does not split any data and
performs the same as OCC
More stashed read transactions
![Page 39: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/39.jpg)
39
RUBiS
• Auction application modeled after eBay– Users bid on auctions, comment, list new items,
search
• 1M users and 33K auctions• 7 tables, 17 transactions• 85% read only transactions (RUBiS bidding mix)
• Two workloads:– Uniform distribution of bids– Skewed distribution of bids; a few auctions are very
popular
![Page 40: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/40.jpg)
40
StoreBid Transaction
StoreBidTxn(bidder, amount, item) {INCR(NumBidsKey(item),1)MAX(MaxBidKey(item), amount)
OPUT(MaxBidderKey(item), bidder, amount)
PUT(NewBidKey(), Bid{bidder, amount, item})}
All commutative operations on potentially conflicting auction
metadata
Inserting new bids is not likely to conflict
![Page 41: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/41.jpg)
41
Uniform Skewed0
2000000
4000000
6000000
8000000
10000000
12000000
DoppelOCC
Doppel Improves Throughput on an Application Benchmark
Th
rou
gh
pu
t (m
illio
ns
txn
s/se
c)
80 cores, 1M users 33K auctions, RUBiS bidding mix
8% StoreBid Transactions
3.2x throughput improveme
nt
![Page 42: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/42.jpg)
42
Related Work
• Commutativity in distributed systems and concurrency control– [Weihl ’88]
– CRDTs [Shapiro ’11]
– RedBlue consistency [Li ’12]
– Walter [Lloyd ’12]
• Optimistic concurrency control– [Kung ’81]
– Silo [Tu ’13]
• Split counters in multicore OSes
![Page 43: Phase Reconciliation for Contended In-Memory Transactions](https://reader035.vdocuments.site/reader035/viewer/2022062408/5681336f550346895d9a8317/html5/thumbnails/43.jpg)
43
ConclusionDoppel:
• Achieves parallel performance when many
transactions conflict by combining per-core data and
concurrency control
• Performs comparably to OCC on uniform or read-
heavy workloads while improving performance
significantly on skewed workloads.
http://pdos.csail.mit.edu/doppel