MICA: A Holistic Approach to Fast In-Memory Key-Value Storage
![Page 1: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/1.jpg)
MICA: A Holistic Approach to Fast In-Memory Key-Value Storage
Hyeontaek Lim¹, Dongsu Han², David G. Andersen¹, Michael Kaminsky³
¹Carnegie Mellon University, ²KAIST, ³Intel Labs
![Page 2: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/2.jpg)
Goal: Fast In-Memory Key-Value Store
• Improve per-node performance (op/sec/node)
  • Less expensive
  • Easier hotspot mitigation
  • Lower latency for multi-key queries
• Target: small key-value items (fit in single packet)
• Non-goals: cluster architecture, durability
![Page 3: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/3.jpg)
Q: How Good (or Bad) are Current Systems?
• Workload: YCSB [SoCC 2010]
  • Single-key operations
• In-memory storage
  • Logging turned off in our experiments
• End-to-end performance over the network
• Single server node
![Page 4: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/4.jpg)
End-to-End Performance Comparison
[Bar chart: throughput (M operations/sec) of Memcached, RAMCloud, MemC3, Masstree, and MICA]
• 95% GET ORIG: published results; logging on RAMCloud/Masstree
• 95% GET OPT: using Intel DPDK (kernel bypass I/O); no logging
• 50% GET OPT: (write-intensive workload)
![Page 5: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/5.jpg)
End-to-End Performance Comparison
Throughput (M operations/sec)

| System | 95% GET ORIG | 95% GET OPT | 50% GET OPT |
|---|---|---|---|
| Memcached | 1.4 | 1.3 | 0.7 |
| RAMCloud | 0.1 | 5.8 | 1 |
| MemC3 | 4.4 | 5.7 | 0.9 |
| Masstree | 8.9 | 16.5 | 6.5 |
| MICA | - | 65.6 | 70.4 |

Performance collapses under heavy writes
• 95% GET ORIG: published results; logging on RAMCloud/Masstree
• 95% GET OPT: using Intel DPDK (kernel bypass I/O); no logging
• 50% GET OPT: (write-intensive workload)
![Page 6: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/6.jpg)
End-to-End Performance Comparison
[Same bar chart and values as on the previous slide (throughput in M operations/sec), annotated with "13.5x", "4x", and "Maximum packets/sec attainable using UDP"]
![Page 7: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/7.jpg)
MICA Approach
• MICA: Redesigning in-memory key-value storage
• Applies new SW architecture and data structures to general-purpose HW in a holistic way
[Diagram: client and server node (NIC, CPUs, memory)]
1. Parallel data access
2. Request direction
3. Key-value data structures (cache & store)
![Page 8: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/8.jpg)
Parallel Data Access
• Modern CPUs have many cores (8, 15, …)
• How to exploit CPU parallelism efficiently?
![Page 9: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/9.jpg)
Parallel Data Access Schemes
Concurrent Read Concurrent Write
[Diagram: CPU cores sharing memory]
+ Good load distribution
- Limited CPU scalability (e.g., synchronization)
- Cross-NUMA latency
Exclusive Read Exclusive Write
[Diagram: each CPU core paired with its own partition]
+ Good CPU scalability
- Potentially low performance under skewed workloads
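The exclusive (EREW) scheme can be illustrated with a minimal sketch. This is not MICA's code: the hash function, the `Partition` class, and the one-partition-per-core mapping are assumptions made for illustration. The point is only that every key maps to one partition, and each partition is touched by a single owner core, so no locking is needed inside it.

```python
import hashlib

NUM_PARTITIONS = 4  # assumption: one partition per CPU core


def partition_of(key: bytes, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a key to its partition with a stable hash (hash choice is an assumption)."""
    digest = hashlib.blake2b(key, digest_size=8).digest()
    return int.from_bytes(digest, "little") % num_partitions


class Partition:
    """Key-value data owned by exactly one core, so it needs no locks."""

    def __init__(self, owner_core: int):
        self.owner_core = owner_core
        self.items = {}

    def _check_owner(self, core: int) -> None:
        # Exclusive read, exclusive write: reject access from any other core.
        if core != self.owner_core:
            raise PermissionError("only the owner core may access this partition")

    def set(self, core: int, key: bytes, value: bytes) -> None:
        self._check_owner(core)
        self.items[key] = value

    def get(self, core: int, key: bytes):
        self._check_owner(core)
        return self.items.get(key)


# Hypothetical setup: core i owns partition i.
partitions = [Partition(owner_core=i) for i in range(NUM_PARTITIONS)]


def owner_core_of(key: bytes) -> int:
    """Core that a request for this key has to be delivered to."""
    return partitions[partition_of(key)].owner_core
```

Under a skewed workload, many hot keys may hash to the same partition, which is the load-imbalance risk the slide lists for this scheme.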
![Page 10: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/10.jpg)
In MICA, Exclusive Outperforms Concurrent
Throughput (Mops)

| Workload | Concurrent Access | Exclusive Access |
|---|---|---|
| Uniform, 50% GET | 35.1 | 76.9 |
| Uniform, 95% GET | 69.3 | 76.3 |
| Skewed, 50% GET | 21.6 | 70.4 |
| Skewed, 95% GET | 69.4 | 65.6 |

End-to-end performance with kernel bypass I/O
![Page 11: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/11.jpg)
Request Direction
• Sending requests to appropriate CPU cores for better data access locality
• Exclusive access benefits from correct delivery
  • Each request must be sent to the corresponding partition's core
![Page 12: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/12.jpg)
Request Direction Schemes
Flow-based Affinity
[Diagram: clients send requests through the NIC to the server node's CPUs]
• Classification using 5-tuple
+ Good locality for flows (e.g., HTTP over TCP)
- Suboptimal for small key-value processing
Object-based Affinity
[Diagram: requests for Key 1 and Key 2 from different clients pass through the NIC, each key going to its own CPU]
• Classification depends on request content
+ Good locality for key access
- Client assist or special HW support needed for efficiency
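The difference between the two schemes can be sketched as follows. The code is illustrative only: CRC32 stands in for whatever hash a NIC would use, and `BASE_PORT`, `client_dst_port`, and `nic_steer` are hypothetical names showing one possible form of client assistance (the client encodes the target core in the UDP destination port so that the NIC can steer on a header field). The slides do not spell out this mechanism.

```python
import zlib

NUM_CORES = 4      # assumption: one partition per core
BASE_PORT = 10000  # hypothetical first UDP port of the server


def flow_based_core(src_ip, src_port, dst_ip, dst_port, proto,
                    num_cores: int = NUM_CORES) -> int:
    """Pick a core from the 5-tuple only; the requested key plays no role."""
    five_tuple = f"{src_ip}:{src_port}-{dst_ip}:{dst_port}-{proto}".encode()
    return zlib.crc32(five_tuple) % num_cores


def object_based_core(key: bytes, num_cores: int = NUM_CORES) -> int:
    """Pick a core from the request content (the key)."""
    return zlib.crc32(key) % num_cores


def client_dst_port(key: bytes) -> int:
    """Client assist (hypothetical): encode the target core in the UDP port."""
    return BASE_PORT + object_based_core(key)


def nic_steer(dst_port: int) -> int:
    """Stand-in for a NIC rule that maps a destination port to a core."""
    return dst_port - BASE_PORT


# Two different clients asking for the same key reach the same core.
core_a = nic_steer(client_dst_port(b"Key 1"))
core_b = nic_steer(client_dst_port(b"Key 1"))
```

With flow-based affinity, the same key requested by two clients may land on different cores, which is why it fits per-flow protocols better than small key-value requests.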
![Page 13: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/13.jpg)
Crucial to Use NIC HW for Request Direction
Throughput (Mops), using exclusive access for parallel data access

| Request direction | Uniform | Skewed |
|---|---|---|
| Done solely by software | 33.9 | 28.1 |
| Client-assisted, hardware-based | 76.9 | 70.4 |
![Page 14: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/14.jpg)
Key-Value Data Structures
• Significant impact on key-value processing speed
• New design required for very high op/sec for both read and write
• "Cache" and "store" modes
![Page 15: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/15.jpg)
MICA's "Cache" Data Structures
• Each partition has:
  • Circular log (for memory allocation)
  • Lossy concurrent hash index (for fast item access)
• Exploit Memcached-like cache semantics
  • Lost data is easily recoverable (not free, though)
  • Favor fast processing
• Provide good memory efficiency & item eviction
![Page 16: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/16.jpg)
Circular Log
• Allocates space for key-value items of any length
• Conventional logs + Circular queues
  • Simple garbage collection/free space defragmentation
[Diagram: fixed-size log with head and tail pointers]
• New item is appended at tail
• Insufficient space for new item (fixed log size)? Evict oldest item at head (FIFO)
Support LRU by reinserting recently accessed items
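A minimal sketch of such a circular log follows. It is not MICA's implementation: the 4-byte item header, the byte-wise copy, and the `CircularLog` API are assumptions chosen to keep the example short. Offsets grow monotonically and are wrapped with a modulo when touching the buffer, so an offset below `head` identifies an evicted item.

```python
import struct

HEADER = struct.Struct("<HH")  # key length, value length (layout is an assumption)


class CircularLog:
    def __init__(self, size: int):
        self.size = size
        self.buf = bytearray(size)
        self.head = 0  # offset of the oldest live item
        self.tail = 0  # offset where the next item is appended

    def _write(self, offset: int, data: bytes) -> None:
        for i, b in enumerate(data):
            self.buf[(offset + i) % self.size] = b  # wrap around the buffer

    def _read(self, offset: int, length: int) -> bytes:
        return bytes(self.buf[(offset + i) % self.size] for i in range(length))

    def _item_len(self, offset: int) -> int:
        klen, vlen = HEADER.unpack(self._read(offset, HEADER.size))
        return HEADER.size + klen + vlen

    def append(self, key: bytes, value: bytes) -> int:
        """Append an item at the tail and return its offset."""
        item = HEADER.pack(len(key), len(value)) + key + value
        if len(item) > self.size:
            raise ValueError("item larger than the log")
        # Insufficient space? Evict the oldest items at the head (FIFO).
        while self.tail + len(item) - self.head > self.size:
            self.head += self._item_len(self.head)
        offset = self.tail
        self._write(offset, item)
        self.tail += len(item)
        return offset

    def get(self, offset: int):
        """Return (key, value), or None if the item has been evicted."""
        if not (self.head <= offset < self.tail):
            return None
        klen, vlen = HEADER.unpack(self._read(offset, HEADER.size))
        body = self._read(offset + HEADER.size, klen + vlen)
        return body[:klen], body[klen:]
```

Eviction needs no separate garbage collector: advancing `head` frees the space. Reinserting an item with `append` when it is read moves it back to the tail, which approximates LRU as the slide suggests.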
![Page 17: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/17.jpg)
Lossy Concurrent Hash Index
• Indexes key-value items stored in the circular log
• Set-associative table
  • Full bucket? Evict oldest entry from it
• Fast indexing of new key-value items
[Diagram: hash(Key) selects one of bucket 0, bucket 1, …, bucket N-1 in the hash index; the index entry points to the (Key, Val) item in the circular log]
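The lossy, set-associative behavior can be sketched as below. This is a single-threaded illustration under assumptions: the hash function, the 16-bit tag, and the `LossyIndex` API are invented for the example, and the concurrency control of the real index is left out. Each bucket holds a fixed number of entries; inserting into a full bucket silently drops its oldest entry.

```python
import hashlib
from collections import deque


class LossyIndex:
    def __init__(self, num_buckets: int, ways: int):
        self.num_buckets = num_buckets
        # A bounded deque drops its oldest entry when a new one is appended.
        self.buckets = [deque(maxlen=ways) for _ in range(num_buckets)]

    def _locate(self, key: bytes):
        digest = hashlib.blake2b(key, digest_size=8).digest()
        h = int.from_bytes(digest, "little")
        return h % self.num_buckets, h >> 48  # bucket number, 16-bit tag

    def put(self, key: bytes, log_offset: int) -> None:
        b, tag = self._locate(key)
        bucket = self.buckets[b]
        for i, (t, _) in enumerate(bucket):
            if t == tag:
                del bucket[i]  # replace the stale entry with the same tag
                break
        bucket.append((tag, log_offset))  # full bucket: oldest entry is evicted

    def get(self, key: bytes):
        """Return a candidate log offset, or None if the key is not indexed."""
        b, tag = self._locate(key)
        for t, offset in self.buckets[b]:
            if t == tag:
                return offset  # caller must compare the full key stored in the log
        return None
```

Because only a short tag is stored, a lookup can return the offset of a different key; the caller is expected to check the full key in the log entry. Losing an index entry makes the item unreachable, which is acceptable under cache semantics.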
![Page 18: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/18.jpg)
MICA's "Store" Data Structures
• Required to preserve stored items
• Achieve similar performance by trading memory
  • Circular log -> Segregated fits
  • Lossy index -> Lossless index (with bulk chaining)
• See our paper for details
![Page 19: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/19.jpg)
Evaluation
• Going back to end-to-end evaluation…
• Throughput & latency characteristics
![Page 20: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/20.jpg)
Throughput Comparison
Throughput (Mops); end-to-end performance with kernel bypass I/O

| Workload | Memcached | MemC3 | RAMCloud | Masstree | MICA |
|---|---|---|---|---|---|
| Uniform, 50% GET | 0.7 | 0.9 | 0.9 | 5.7 | 76.9 |
| Uniform, 95% GET | 1.3 | 5.4 | 6.1 | 12.2 | 76.3 |
| Skewed, 50% GET | 0.7 | 0.9 | 1 | 6.5 | 70.4 |
| Skewed, 95% GET | 1.3 | 5.7 | 5.8 | 16.5 | 65.6 |

• Other systems: bad at high write ratios
• MICA: similar performance regardless of skew/write
• Large performance gap
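The size of the gap can be worked out from the numbers on this slide. The snippet below is only arithmetic over the chart labels as read from the slide (the variable and function names are made up for the example): it divides MICA's throughput by that of the fastest other system for each workload.

```python
# Throughput in Mops, as read from the chart labels on this slide.
throughput_mops = {
    "Uniform, 50% GET": {"Memcached": 0.7, "MemC3": 0.9, "RAMCloud": 0.9,
                         "Masstree": 5.7, "MICA": 76.9},
    "Uniform, 95% GET": {"Memcached": 1.3, "MemC3": 5.4, "RAMCloud": 6.1,
                         "Masstree": 12.2, "MICA": 76.3},
    "Skewed, 50% GET": {"Memcached": 0.7, "MemC3": 0.9, "RAMCloud": 1.0,
                        "Masstree": 6.5, "MICA": 70.4},
    "Skewed, 95% GET": {"Memcached": 1.3, "MemC3": 5.7, "RAMCloud": 5.8,
                        "Masstree": 16.5, "MICA": 65.6},
}


def speedup_over_best_other(row: dict) -> float:
    """MICA's throughput divided by the fastest non-MICA system's throughput."""
    best_other = max(v for name, v in row.items() if name != "MICA")
    return row["MICA"] / best_other


speedups = {workload: round(speedup_over_best_other(row), 1)
            for workload, row in throughput_mops.items()}

for workload, factor in speedups.items():
    print(f"{workload}: {factor}x")
```

The resulting factors range from about 4x (skewed, 95% GET) to about 13.5x (uniform, 50% GET), with Masstree being the fastest other system in every workload.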
![Page 21: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/21.jpg)
Throughput-Latency on Ethernet
[Plot: average latency (μs, 0 to 100) versus throughput (Mops, 0 to 80) for Original Memcached and MICA, with an inset covering 0 to 0.3 Mops]
Original Memcached using standard socket I/O; both use UDP
200x+ throughput
![Page 22: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/22.jpg)
MICA
• Redesigning in-memory key-value storage
• 65.6+ Mops/node even for heavy skew/write
• Source code: github.com/efficient/mica
[Diagram: client and server node (NIC, CPUs, memory)]
1. Parallel data access
2. Request direction
3. Key-value data structures (cache & store)
![Page 23: MICA : A Holistic Approach to Fast In-Memory Key-Value Storage](https://reader035.vdocuments.site/reader035/viewer/2022062302/5681674a550346895ddbfa50/html5/thumbnails/23.jpg)
Reference
• [DPDK] http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/packet-processing-is-enhanced-with-software-from-intel-dpdk.html
• [FacebookMeasurement] Berk Atikoglu, Yuehai Xu, Eitan Frachtenberg, Song Jiang, and Mike Paleczny. Workload analysis of a large-scale key-value store. In Proc. SIGMETRICS 2012.
• [Masstree] Yandong Mao, Eddie Kohler, and Robert Tappan Morris. Cache Craftiness for Fast Multicore Key-Value Storage. In Proc. EuroSys 2012.
• [MemC3] Bin Fan, David G. Andersen, and Michael Kaminsky. MemC3: Compact and Concurrent MemCache with Dumber Caching and Smarter Hashing. In Proc. NSDI 2013.
• [Memcached] http://memcached.org/
• [RAMCloud] Diego Ongaro, Stephen M. Rumble, Ryan Stutsman, John Ousterhout, and Mendel Rosenblum. Fast Crash Recovery in RAMCloud. In Proc. SOSP 2011.
• [YCSB] Brian F. Cooper, Adam Silberstein, Erwin Tam, Raghu Ramakrishnan, and Russell Sears. Benchmarking Cloud Serving Systems with YCSB. In Proc. SoCC 2010.