transactions in a flash
TRANSCRIPT
![Page 1: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/1.jpg)
1
Transactions in a Flash
Morgan Price
Scott Nettles
![Page 2: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/2.jpg)
2
Flash Memory: Pros
• Solid-state memory
• Billions of dollars a year spent
• Persistent
• High-density
• Cheap
• Low latency
• High read bandwidth
![Page 3: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/3.jpg)
3
Flash Memory: Cons
• Low write bandwidth
• Write once until explicitly erased
• Erased in large blocks
• Erasing is slow
• Limited lifetime
• High cost compared to disks
![Page 4: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/4.jpg)
4
Key Issues
• Is Flash useful for transaction systems
• Where in the memory hierarchy– Disk-like block-oriented device
– Byte-oriented device
– Directly accessed as user memory
![Page 5: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/5.jpg)
5
Our Experiment
• Compare transaction logs using– Disk
– Battery-backed DRAM
– Flash Memory
• Emulate Flash with DRAM, timers
• Simple, fair comparison– No special Flash optimizations
![Page 6: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/6.jpg)
6
Outline
• Background– Transaction systems
– Flash memory
• Design & Implementation
• Experiments
• Performance Results
• Discussion
• Related & Future Work
![Page 7: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/7.jpg)
7
Transactions
• Reliability
• Resiliency to machine failure– Permanent storage
• Lots more to transaction systems besidespermanent storage– Consistency
– Concurrent access
![Page 8: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/8.jpg)
8
Permanent Storage
• Disks– file-system
– raw partitions
• Protection from disk failure– mirroring, RAID, tape
• Commit
– atomic update to persistent data
![Page 9: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/9.jpg)
9
Fast Commit
• Store updates separately– in an append-only log
– easy to append atomically
• When log fills– Truncate & reuse
• Appending disk logs is relatively fast– No seeks, just rotational delay
– Batch transactions to reduce latency
![Page 10: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/10.jpg)
10
Batched Commits
• High disk latency for writes
• Reduced by batching commits– Increases transaction throughput
– Increases transaction latency
• Requires lots of concurrency– Great for database systems
– Maybe not for other transaction systems• persistent heaps
• filesystem meta-data
![Page 11: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/11.jpg)
11
Flash Logging
• Write-once
• Erase on truncation
• Automatic wear-leveling
• Low write latency
• High parallelism is feasible– hardware failures should be very rare
– simpler than RAID
![Page 12: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/12.jpg)
12
Flash Memory
• EEPROM, but with block erases
• 1 transistor per Flash cell
• Flash cells are 30% smaller than DRAM
• Lower cost per bit, eventually
• Intel has samples with >1 bit per cell– Flash could be used for main memory
• Cheaper than battery-backed DRAM
![Page 13: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/13.jpg)
13
Reading from Flash Memory
• Organized similarly to DRAM
• Flash chips with DRAM read interfaces
• Flash read performance matches DRAM
![Page 14: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/14.jpg)
14
Writing to Flash Memory
• Slow: 6 µs versus 65 ns for read
• Each bit can change from 1 to 0– but not back
– Writing Flash is an AND
• Very low latency compared to disk
• 163 KB/s per chip– low bandwidth compared to disk
![Page 15: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/15.jpg)
15
Erasing Flash Memory
• Erases a whole block: 64 KB
• Conditioning– Forces each bit to 0 before erasing
– Slows down erase
– Raises lifetimes
• 600 ms latency
• 107 KB/s per chip
![Page 16: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/16.jpg)
16
Flash Lifetimes
• Write and erase “stress” the chip
• Too-slow blocks have “failed”– no data loss
• 100,000 erase cycles guaranteed
• 1,000,000 expected given– wear-leveling
– retirement of (rare) failed blocks
![Page 17: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/17.jpg)
17
Outline
• Background
• Design and Implementation– The transaction system
– Using Flash as a transaction log
– Flash emulator
– Device drivers for different access models
• Experiments
• Performance Results
• Discussion
![Page 18: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/18.jpg)
18
The Transaction System
• Recoverable Virtual Memory (RVM)
• Persistence on regions of virtual memory
• Assumes small persistent working set– Must fit in physical memory for good
performance
![Page 19: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/19.jpg)
19
RVM’s Transaction Log
• Circular disk-based log
• Truncation forces updates to actual data– Big operation with high latency
– Can be asynchronous
![Page 20: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/20.jpg)
20
Changes to RVM
• Small– Added 500 lines of C out of 20,000 original
• Erase log after truncation
• Erase entire log on recovery
• Minor changes to device configuration
• Occasional in-place updates to meta-data– Replaced with a mini-log
![Page 21: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/21.jpg)
21
Flash Mini-log
• Data block has a log of values
• Commit appends value & mark
• To reclaim space in a block– at least two data blocks
– third block points to valid data block
• Infrequently used– performance is not a concern
![Page 22: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/22.jpg)
22
Sidney
• Persistent heap for Standard ML (SML)
• Based on SML/NJ and RVM
• No changes to Sidney were needed
• No garbage collection– Sidney doesn’t use the transaction log
– Flash & copying garbage collection
![Page 23: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/23.jpg)
23
The Flash Emulator
• Emulates Flash with DRAM
• Reading Flash is fast– RVM forces an unnecessary copy
• Delay writes and erases with timers
• Worst error was 30 ns/byte
![Page 24: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/24.jpg)
24
Flash Configurations
• Vary the bandwidth– as if varying the parallelism
• Emulate battery-backed DRAM– just turn off delays
• Vary memory hierarchy
![Page 25: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/25.jpg)
25
Flash as Kernel Memory
• Simple character device driver
• Emulates Flash with kernel memory
• Erase as ioctl
![Page 26: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/26.jpg)
26
Flash as a Disk
• Another simple device driver
• Ignore overhead of in-place write semantics– Writes are contiguous
• Ignore cost of I/O bus
• Faster writes through page buffers?
![Page 27: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/27.jpg)
27
Outline
• Design & Implementation
• Experiments– Benchmarks
– The Flash memory simulated
– The benchmarking environment
• Performance Results
• Discussion
• Related & Future Work
![Page 28: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/28.jpg)
28
C++ Debit-Credit
• Closely based on TPC-B
• Does not scale database size by TPS– Fixed at 100,000 accounts
• Each transaction modifies 448 bytes– RVM writes 748 byte commit record
• Run 50,000 transactions and truncate
• 16 MB log fills twice
![Page 29: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/29.jpg)
29
Sidney Debit-Credit
• Written in SML instead of C++
• Sidney implicitly logs writes– 80 bytes sent to RVM
– 548 byte commit record
– 16 MB log fills once
![Page 30: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/30.jpg)
30
SML/NJ Compiler
• Compiles 38 of its own files
• Stores data in persistent heap
• Commits after each file
• Lots of actual computation
• 15 KB transactions
![Page 31: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/31.jpg)
31
Flash Memory Details
• Parallelism from 4 to 64– Our memory controller goes up to 512-way
• Intel’s 28F016SV: 2M by 8 bits– 65 ns reads, 6 µs writes (163 KB/s)
– 32 erasable blocks of 64 KB each: 107 KB/s
– Two 256-byte page buffers• bulk writes at 465 KB/s
– Suspends slow operations for faster
![Page 32: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/32.jpg)
32
Benchmarking Machine
• SGI Challenge-L– 4 R4400 processors at 250 MHz
– 384 MB of main memory
– IRIX 5.3
• Disks rotate at 7200 RPM– limits RVM’s disk log to 120 TPS
• ~6 MB/s of raw disk bandwidth
• Flash emulator delays accurate to 30 ns/byte
![Page 33: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/33.jpg)
33
RVM Configuration
• RVM data stored in the EFS filesystem
• Disk-based log stored on a raw partition
• Log size fixed at 16 MB– not including the mini-log
Truncation Update Overheads (µs)
0 200 400 600 800 1000 1200
81632
Log
Siz
e (M
B)
![Page 34: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/34.jpg)
34
Performance Evaluation
• Vary Flash parallelism– Flash results in significant speedups
– Flash is bandwidth limited
– Disk is latency limited
• Vary the memory hierarchy– Flash-disks suffer from fragmentation
– Kernel overheads are insignificant
![Page 35: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/35.jpg)
35
• Flash log is bandwidth-limited
• Flash approaches battery-backed DRAM
C++ Debit-Credit Throughput
Disk
Flash-disk
Flash
DRAM
0
200
400
600
800
1000
1200
4 8 16 32 64
Parallelism (log-scale)
Tra
nsac
tions
per
Sec
ond
![Page 36: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/36.jpg)
36
Why is the Flash-Disk Slow?
• Not due to kernel overheads– Turn off cycle timers
– Battery-backed DRAM vs. fast Flash-Disk
– No difference
• Due to fragmentation– Writes 67% more bytes per transaction
![Page 37: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/37.jpg)
37
Using Page-Buffers• Triples the write bandwidth (optimistically)
– Sector size < Parallelism * Page Buffer Size
TPS at 4-way parallelism
0 50 100 150 200 250 300
Flash-disk
Flash-disk withPage Buffers
Flash
![Page 38: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/38.jpg)
38
• Low peak throughput– CPU overheads rise from 0.3 to 1.0 ms
Debit-Credit Throughput: C++ vs. Sidney
Sidney Disk
Sidney Flash
C++ FlashSidney DRAM
0100200300400500600700800900
4 8 16 32 64
Parallelism (log-scale)
Tra
nsac
tions
per
Sec
ond
![Page 39: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/39.jpg)
39
Why isn’t Low-ParallelismSidney Slower?
• Big log bandwidth savings!– Writes 548 vs. 748 bytes
• Flash requires compact logs
![Page 40: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/40.jpg)
40
A Better Log Representation
• RVM has high header overheads– 76 bytes per transaction
– 56 bytes per range modified
• Reasonable headers are 8 byte each!Commit Record Sizes (bytes)
0 200 400 600 800
Optimized
Original SidneyC++
![Page 41: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/41.jpg)
41
• Big win for Sidney version
• C++ programmer could do the same
• RVM could compare the modified ranges
Debit-Credit with Optimized Headers
Ideal Disk
Sidney Flash
C++ Flash
Sidney DRAM
0
200
400
600
800
1000
1200
4 8 16 32 64
Parallelism (log-scale)
Tra
nsac
tions
per
Sec
ond
![Page 42: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/42.jpg)
42
Sidney Debit-Credit Overheads (ms)
0 2 4 6 8 10
Disk
4x
64x
Truncation EraseTruncation UpdateCommit I/OCompute
![Page 43: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/43.jpg)
43
• 670 ms of Compute time not shown– Allows background truncation
SML/NJ Compiler Overheads (ms)
0 20 40 60 80
Disk
4x
64x
Truncation EraseTruncation UpdateCommit I/O
![Page 44: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/44.jpg)
44
Outline
• Performance Results
• Discussion– Flash Life-time
– How to Extend it
– Bigger Transactions
• Related Work
• Future Work
• Conclusions
![Page 45: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/45.jpg)
45
Flash Life-time
1,000,000 erase cycles• 1,000,000 erase cycles– Conservative given block retirement
• Block retirement by– virtual to physical remapping
– sector remapping
• At least 13 MB of data at 8x
• (13 MB * 1,000,000 erases) /
(748 bytes * 1000 TPS) = 200 days
![Page 46: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/46.jpg)
46
Extending Life-time
• More Flash extends life– at expense of price-performance
• Header optimizations– Extend lifetime to at least 2 years
• Better hardware
• Bursty work-loads
• Actually reading the Flash
![Page 47: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/47.jpg)
47
Log Compression
• Compiler log compresses by 2x– Real application
– Would double flash lifetime
• Compress/Decompress runs at 1 MB/s
• Improves performance if– Write&Erase Bandwidth < 1/2 MB/s
– Breaks even at 8x parallelism
![Page 48: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/48.jpg)
48
Hybrid Logging
• For large transactions, want both– low-latency (Flash)
– high-bandwidth (disk)
• Write-ahead logging– standard optimization
– “speculatively” writes to disk
• Use Flash for the final commit write
![Page 49: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/49.jpg)
49
Related Work: eNVy
• Persistent memory controller– 256-bytes wide
– 2 GB of flash
• Allow in-place update via 64 MB of SRAM
• Differences of– Data area
– Scale: eNVy supports I/O rates of 30,000 TPS
– Custom hardware, SRAM
• Wu & Zwaenepoel, in ASPLOS ‘94
![Page 50: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/50.jpg)
50
Related Work: Filesystems
• Flash file-systems for mobile computers– Low-power
– High durability
– Douglas et al, in OSDI ‘94
– Kawaguchi et al, in Usenix ‘95
![Page 51: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/51.jpg)
51
Outline
• Related Work
• Future Work– Real Hardware
– Improved Logger
– Redesigning Sidney to use Flash directly
• Conclusions
![Page 52: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/52.jpg)
52
Real Hardware
• A simple memory controller– The Intel 28F016XD has a DRAM interface
• New performance measurements– Effect of background truncation on throughput
• Allow page-buffer use on small writes
• System cache structure– Forcing persistent writes to Flash
– Flushing stale data upon erase
![Page 53: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/53.jpg)
53
Improved Logger
• Header space optimizations
• Replace RVM logger– Designed for disks
– CPU overhead for error checking
• Retirement of slowed blocks– page-remapping
• Batched commits
![Page 54: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/54.jpg)
54
Flash-Based Persistent Heaps
• Use Flash as bulk of main memory– Eliminate the disk update overheads
• Most Sidney data is immutable
• Copying garbage collection– append-only
– frees up large chunks to be erased
• Keep mutable data, young data in DRAM
![Page 55: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/55.jpg)
55
Conclusions
• Flash memory is well-suited for transactionlogging.
• Flash logging is– easy to implement.
– fast for small transactions
– can rival battery-backed DRAM for speed
![Page 56: Transactions in a Flash](https://reader034.vdocuments.site/reader034/viewer/2022052523/555c262cd8b42a09438b4ca5/html5/thumbnails/56.jpg)
56
Thanks to
• Satya and the CODA group for RVM
• Puneet Kumar for C++ Debit-Credit
• Mark Foster for help with memory systems
• Michael Wu for information about eNVy