Jin-Soo Kim ([email protected])
Computer Systems Laboratory
Sungkyunkwan University
http://csl.skku.edu
Solid State Storage Technologies
ICE3028: Embedded Systems Design | Spring 2016 | Jin-Soo Kim ([email protected])
NVMe (1)
▪ NVM Express (NVMe)
• For accessing PCIe-based SSDs
• Bypass block I/O layer
• Low latency
• Read, Write, Flush, Format
• Deallocate, Atomic Write
• Data Set Management
NVMe (2)
▪ Deep queue: 64K commands/queue, up to 64K queues
▪ Streamlined command set: only 13 required commands
▪ One register write to issue a command (“doorbell”)
▪ Support for MSI-X and interrupt aggregation
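The queue-plus-doorbell mechanism above can be sketched as a toy model. This is a minimal illustration, not driver code: the class `SubmissionQueue` and all of its fields are hypothetical names, the "register" is just an integer, and completion queues and interrupts are left out.

```python
from dataclasses import dataclass, field


@dataclass
class SubmissionQueue:
    """Toy model of one submission queue: a circular buffer shared by host and device."""
    depth: int                                  # number of slots (hypothetical small value in tests)
    slots: list = field(default_factory=list)
    tail: int = 0                               # next free slot, advanced by the host
    head: int = 0                               # next slot to fetch, advanced by the device
    doorbell: int = 0                           # stand-in for the tail doorbell register
    doorbell_writes: int = 0                    # how many register writes the host issued

    def __post_init__(self) -> None:
        self.slots = [None] * self.depth

    def is_full(self) -> bool:
        # One slot stays unused so that "full" and "empty" can be told apart.
        return (self.tail + 1) % self.depth == self.head

    def submit(self, command: str) -> None:
        """Host side: place a command in memory. The device is not notified yet."""
        if self.is_full():
            raise RuntimeError("submission queue full")
        self.slots[self.tail] = command
        self.tail = (self.tail + 1) % self.depth

    def ring_doorbell(self) -> None:
        """Host side: a single register write publishes every command queued so far."""
        self.doorbell = self.tail
        self.doorbell_writes += 1

    def device_fetch(self) -> list:
        """Device side: consume commands up to the last tail value written to the doorbell."""
        fetched = []
        while self.head != self.doorbell:
            fetched.append(self.slots[self.head])
            self.head = (self.head + 1) % self.depth
        return fetched


sq = SubmissionQueue(depth=8)
sq.submit("READ")
sq.submit("WRITE")
sq.ring_doorbell()           # one register write covers both commands
print(sq.device_fetch())
```

In this model the device sees nothing until the doorbell is written, and one doorbell write can cover several queued commands.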
All-Flash Array
▪ Interfaces
• 10Gb/40Gb Ethernet (iSCSI) or 16Gb Fibre Channel or PCIe
• SAS or NVMe SSDs
▪ Functionalities
• Volume management
• Virtualization support
• RAID
• Snapshot
• Deduplication
• Compression, …
Traditional Block Interface
▪ SATA/SCSI/SAS
• Read (sector #, length)
• Write (sector #, length, data)
• No block-level liveness information
• No high-level semantics on data
• Several “unwritten contracts” do not hold for SSDs
– Sequential accesses are several tens of times better than random accesses
– Distant LBNs lead to longer seek times
– Data written is equal to data issued
– …
[Figure: Host (file system, block device driver) → Block I/F → SSD (FTL → Flash I/F → NAND flash)]
Extending Block I/F
▪ TRIM command
• “The data in the specified sectors is no longer needed”
• ATA interface standard (T13 technical committee)
• Non-queued command
• SATA 3.1 introduces the Queued TRIM command
[Figure: Host (file system, block device driver) → Block I/F + SSD-specific I/F → SSD (FTL → Flash I/F → NAND flash)]
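Why TRIM helps the FTL can be shown with a toy page-mapping model. This is a sketch under simplifying assumptions: `ToyFTL` and its methods are hypothetical names, each sector maps to one flash page, and garbage collection is reduced to counting the pages that would still have to be copied.

```python
class ToyFTL:
    """Toy page-mapping FTL that only tracks which physical pages hold live data."""

    def __init__(self) -> None:
        self.mapping = {}       # logical sector -> physical page
        self.valid = set()      # physical pages the FTL believes are live
        self.next_page = 0      # next free physical page (append-only, no erase modeled)

    def write(self, sector: int) -> None:
        old = self.mapping.get(sector)
        if old is not None:
            self.valid.discard(old)        # an overwrite invalidates the old copy
        self.mapping[sector] = self.next_page
        self.valid.add(self.next_page)
        self.next_page += 1

    def trim(self, sector: int, length: int = 1) -> None:
        """Host tells the device that these sectors are no longer needed."""
        for s in range(sector, sector + length):
            old = self.mapping.pop(s, None)
            if old is not None:
                self.valid.discard(old)

    def pages_to_copy(self) -> int:
        """Pages garbage collection would have to preserve."""
        return len(self.valid)


ftl = ToyFTL()
for sector in range(8):
    ftl.write(sector)

# The file system deletes a file occupying sectors 0..3. Without TRIM the
# device cannot know, so all 8 pages still look live.
print(ftl.pages_to_copy())   # 8

ftl.trim(0, 4)
print(ftl.pages_to_copy())   # 4
```

Without the `trim` call, the deleted file's pages stay valid from the device's point of view until the host happens to overwrite those sectors.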
Atomic Write
▪ Transaction support for multi-block writes
• Simplifies file systems and DBMSes
X. Ouyang, et al., “Beyond Block I/O: Rethinking Traditional Storage Primitives,” HPCA, 2011.
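The all-or-nothing behavior of a multi-block atomic write can be sketched as follows. This is an illustration of the semantics only and does not reproduce the mechanism of the cited paper; `ToyAtomicDevice`, `atomic_write`, and the `fail_after` fault-injection parameter are hypothetical.

```python
from typing import Dict, Optional


class ToyAtomicDevice:
    """Toy device whose multi-block writes become visible all at once or not at all."""

    def __init__(self) -> None:
        self.committed: Dict[int, bytes] = {}   # block number -> data visible to readers

    def atomic_write(self, updates: Dict[int, bytes],
                     fail_after: Optional[int] = None) -> bool:
        """Stage every block first; publish the whole group in one final step.

        fail_after simulates a crash after that many blocks have been staged.
        """
        staged: Dict[int, bytes] = {}
        for count, (block, data) in enumerate(updates.items()):
            if fail_after is not None and count >= fail_after:
                return False                    # staged blocks are simply discarded
            staged[block] = data
        self.committed.update(staged)           # single commit point
        return True


dev = ToyAtomicDevice()
dev.atomic_write({1: b"old-a", 2: b"old-b"})

# A crash in the middle of the second write leaves both blocks at their old values.
ok = dev.atomic_write({1: b"new-a", 2: b"new-b"}, fail_after=1)
print(ok, dev.committed)
```

With this guarantee in the device, a file system or DBMS does not need its own journal or double-write buffer just to keep a group of blocks consistent.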
FusionIO’s VFSL
▪ Virtualized Flash Storage Layer
• Provides 64-bit virtual block-addressed space
• Virtual-to-physical block mapping: a variation of B-trees in memory
• FTL functionalities
• read, write, trim, and atomic update supported
W. Josephson, et al., “DFS: A File System for Virtualized Flash Storage,” FAST, 2010.
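A sparse virtual-to-physical mapping over a very large address space can be sketched with a sorted list of extents. This is an assumption-laden stand-in: a sorted Python list searched with `bisect` replaces the in-memory B-tree variant, and `ExtentMap` with its methods is a hypothetical name, not part of any FusionIO interface.

```python
import bisect


class ExtentMap:
    """Sparse map from virtual block addresses to physical blocks, stored as extents.

    Assumes extents never overlap and each virtual start address is inserted once.
    """

    def __init__(self) -> None:
        self.starts = []     # sorted virtual start addresses
        self.extents = {}    # virtual start -> (length, physical start)

    def insert(self, vstart: int, length: int, pstart: int) -> None:
        bisect.insort(self.starts, vstart)
        self.extents[vstart] = (length, pstart)

    def lookup(self, vaddr: int):
        """Return the physical block for vaddr, or None if the address is unmapped."""
        i = bisect.bisect_right(self.starts, vaddr) - 1
        if i < 0:
            return None
        vstart = self.starts[i]
        length, pstart = self.extents[vstart]
        if vaddr < vstart + length:
            return pstart + (vaddr - vstart)
        return None


m = ExtentMap()
m.insert(0, 8, 100)            # virtual blocks 0..7 -> physical 100..107
m.insert(2 ** 40, 4, 200)      # a distant region of the 64-bit space
print(m.lookup(3), m.lookup(2 ** 40 + 1), m.lookup(8))
```

Only mapped extents consume memory, which is what makes a 64-bit virtual block space practical on a device with far fewer physical blocks.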
DFS over VFSL
▪ Direct File System
• Simple metadata and data layout (2TB chunk)
Annotating Block Semantics
▪ For differentiated storage service
• “Class” mapped to Group Number field (5 bits) of the SCSI CDB
• Selective caching on SSDs
▪ Similar mechanism in eMMC 4.5
• ContextID
M. Mesnier, et al., “Differentiated Storage Services,” SOSP, 2011.
Ext3 Class        Group Number   Cache priority
Unclassified      0              12
Superblock        1              0
Group desc.       2              0
Bitmap            3              0
Inode             4              0
Indirect block    5              0
Directories       6              0
Journal           7              0
File <= 4KB       8              1
File <= 16KB      9              2
…                 …              …
File > 1GB        18             11
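How a class could travel in the 5-bit Group Number field can be sketched by building a SCSI WRITE(10) command descriptor block by hand. WRITE(10) is chosen here only for illustration, and the function and dictionary names are hypothetical. The lookup tables contain only the rows listed in the table above; the elided classes are deliberately left out.

```python
import struct

# Ext3 class -> group number, limited to the rows listed in the table above.
GROUP_NUMBER = {
    "Unclassified": 0, "Superblock": 1, "Group desc.": 2, "Bitmap": 3,
    "Inode": 4, "Indirect block": 5, "Directories": 6, "Journal": 7,
    "File <= 4KB": 8, "File <= 16KB": 9, "File > 1GB": 18,
}

# Group number -> cache priority, again only for the listed rows.
CACHE_PRIORITY = {0: 12, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0, 6: 0, 7: 0,
                  8: 1, 9: 2, 18: 11}

WRITE_10 = 0x2A  # SCSI WRITE(10) operation code


def write10_cdb(lba: int, num_blocks: int, group_number: int) -> bytes:
    """Build a 10-byte WRITE(10) CDB with the class in the low 5 bits of byte 6."""
    if not 0 <= group_number < 32:
        raise ValueError("group number must fit in 5 bits")
    return struct.pack(
        ">BBIBHB",
        WRITE_10,              # byte 0: operation code
        0,                     # byte 1: flags, all cleared in this sketch
        lba,                   # bytes 2-5: logical block address, big-endian
        group_number & 0x1F,   # byte 6: group number (bits 4..0)
        num_blocks,            # bytes 7-8: transfer length in blocks
        0,                     # byte 9: control
    )


cdb = write10_cdb(lba=0x1000, num_blocks=8, group_number=GROUP_NUMBER["Journal"])
print(cdb.hex(), CACHE_PRIORITY[cdb[6] & 0x1F])
```

The storage side only needs to read byte 6 of the CDB and look up the caching policy for that class; the data path itself is unchanged.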
Multi-streamed SSD (1)
▪ Previous write patterns (= current state) matter
Multi-streamed SSD (2)
▪ Mapping data with different lifetime to different streams
▪ Standardized in T10 SCSI (SAS SSDs)
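The effect of separating data by lifetime can be sketched with a small model. Everything here is an assumption for illustration: a block size of 4 pages, a hypothetical `fill_blocks` helper that places writes into per-stream open blocks, and a garbage-collection cost measured as the number of still-valid pages in blocks that contain at least one dead page.

```python
from collections import defaultdict

PAGES_PER_BLOCK = 4  # hypothetical, tiny on purpose


def fill_blocks(writes, use_streams):
    """Place (page_id, stream_id) writes into flash blocks.

    With use_streams=False every write goes to one shared open block.
    """
    open_blocks = defaultdict(list)
    closed = []
    for page_id, stream_id in writes:
        key = stream_id if use_streams else 0
        open_blocks[key].append(page_id)
        if len(open_blocks[key]) == PAGES_PER_BLOCK:
            closed.append(open_blocks.pop(key))
    closed.extend(block for block in open_blocks.values() if block)
    return closed


def gc_copies(blocks, dead_pages):
    """Valid pages that must be copied before blocks holding dead pages can be erased."""
    copies = 0
    for block in blocks:
        dead = sum(1 for page in block if page in dead_pages)
        if dead:
            copies += len(block) - dead
    return copies


# Short-lived (stream 1) and long-lived (stream 2) pages arrive interleaved.
writes = [(i, 1 if i % 2 == 0 else 2) for i in range(8)]
dead = {0, 2, 4, 6}  # the short-lived pages are invalidated later

print(gc_copies(fill_blocks(writes, use_streams=False), dead))  # 4
print(gc_copies(fill_blocks(writes, use_streams=True), dead))   # 0
```

When pages with similar lifetimes share a block, the whole block tends to become invalid together, so it can be erased without copying anything.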
Multi-streamed SSD (3)
▪ Cassandra with Multi-streamed SSD
Multi-streamed SSD (4)
▪ Cassandra’s normalized update throughput with 5 streams
OSSD: Object-based SSD (1)
▪ OSD (Object-based Storage Device)
• Virtualizes physical storage as a pool of objects
• Offloads space management to storage devices
• Standardized as a subset of SCSI command set
[Figure: block interface vs. object interface]
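The difference between the two interfaces can be sketched with a toy object device that owns its own free-space management. `ToyOSD` and its methods are hypothetical names; they do not correspond to the commands of the SCSI OSD standard.

```python
class ToyOSD:
    """Toy object-based device: the host names objects, the device allocates blocks."""

    def __init__(self, num_blocks: int) -> None:
        self.free = list(range(num_blocks))   # free list owned by the device
        self.objects = {}                     # object ID -> list of block numbers
        self.blocks = {}                      # block number -> stored bytes

    def create(self, oid: int) -> None:
        if oid in self.objects:
            raise KeyError(f"object {oid} already exists")
        self.objects[oid] = []

    def append(self, oid: int, data: bytes) -> None:
        """The device, not the host, decides where the data lands."""
        if not self.free:
            raise RuntimeError("device full")
        block = self.free.pop(0)
        self.blocks[block] = data
        self.objects[oid].append(block)

    def read(self, oid: int) -> bytes:
        return b"".join(self.blocks[b] for b in self.objects[oid])

    def remove(self, oid: int) -> None:
        """Removing an object tells the device exactly which blocks are dead."""
        for block in self.objects.pop(oid):
            del self.blocks[block]
            self.free.append(block)

    def free_blocks(self) -> int:
        return len(self.free)


osd = ToyOSD(num_blocks=4)
osd.create(7)
osd.append(7, b"hello ")
osd.append(7, b"world")
print(osd.read(7), osd.free_blocks())
osd.remove(7)
print(osd.free_blocks())
```

Because allocation and removal both happen inside the device, it always knows which blocks are live, which a plain read/write block interface cannot convey.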
OSSD: Object-based SSD (2)
▪ OSD storage model
[Figure: Traditional vs. OSD storage model, layers from top to bottom, with the host above the interface and the storage device below it]
• Traditional: Application, System Call Interface, File System User Component, File System Storage Management, Sector/LBA Interface, Block I/O Manager, Physical Media
• OSD: Application, System Call Interface, File System User Component, OSD Interface, OSD Storage Management, Block I/O Manager, Physical Media
OSSD: Object-based SSD (3)
[Figure: OSSD prototype architecture]
• Host: OAQ, OFS, VFS, iSCSI Initiator
• Target: iSCSI Target Daemon, OSSD Framework (OML, FML, FAL)
• RawSSD: READ/WRITE/ERASE over SATA-2
• Host and target connected by the OSD Interface (iSCSI) over TCP 1Gbps
• Insets: OID-Context Hash, priority queues (Q0 ... Qn), I/O contexts for object I/O instances (e.g., oid = 7, oid = 46); Descriptor, Object Attr., Object Data, μ-Tree (Extents), Allocation Bitmap, Object Data Buffer
OSSD: Object-based SSD (4)
▪ Simplified host file system
• No need for SSD-specific parameter tuning
▪ More efficient management of flash storage
• Block-level liveness
• Metadata separation
• Object-aware storage management (allocation, dedup, ...)
▪ Application-aware storage management
• Application hints, QoS
▪ Storage virtualization
• Pooling, tiering, caching, backup, replication, etc.
In-Storage Computing (1)
▪ Samsung ISC SSD Prototype
• Commodity SSD: Samsung PM1725 NVMe with the ISC feature
• PCIe 3.0 x4
• 800 GB
▪ Software
• C++11
• C++ STL
• G++
• Software emulator
In-Storage Computing (2)
▪ ISC Application Development Process
In-Storage Computing (3)
▪ ISC Dataflow Programming Model
In-Storage Computing (4)
▪ Example: Simple Key-Value Store