ICE3028: Embedded Systems Design, Fall 2019, Dongkun Shin (dongkun@skku.edu)

Solid State Storage Technologies

Dongkun Shin (dongkun@skku.edu)

Embedded Software Laboratory

Sungkyunkwan University

http://nyx.skku.ac.kr/

NVMe (1)

• The industry standard interface for high-performance NVM storage

– NVMe 1.0 specification in 2011 (now 1.3)

– Supported by major OSes: Windows, Linux, Solaris, …

• PCIe-based

– Low latency: direct connection to CPU

– Scalable performance: about 1 GB/s per lane (PCIe 3.0), up to 32 lanes

– No HBA required: reduced power & cost

• Form factors

– Add-in-Card, M.2, BGA, etc.
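The per-lane scaling above can be checked with simple arithmetic. This is a rough sketch that takes the slide's figure of about 1 GB/s per lane as given; the function name `link_bandwidth_gb_per_s` is a hypothetical helper, and real throughput depends on the PCIe generation and protocol overhead.

```python
# Rough link-bandwidth arithmetic, assuming the slide's ~1 GB/s per lane.
GB_PER_LANE = 1.0   # assumption taken from the slide, not a measured value
MAX_LANES = 32      # upper bound quoted on the slide


def link_bandwidth_gb_per_s(lanes: int, per_lane: float = GB_PER_LANE) -> float:
    """Return the approximate one-direction bandwidth in GB/s for a lane count."""
    if not 1 <= lanes <= MAX_LANES:
        raise ValueError("lane count must be between 1 and 32")
    return lanes * per_lane


for lanes in (1, 4, 8, 16, 32):
    print(f"x{lanes}: ~{link_bandwidth_gb_per_s(lanes):.0f} GB/s")
```

Under this assumption, an x4 device such as a typical M.2 SSD tops out at roughly 4 GB/s.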

NVMe (2)

• Deep queue: 64K commands/queue, up to 64K queues

• Streamlined command set: only 13 required commands

• One register write to issue a command (“doorbell”)

• Support for MSI-X and interrupt aggregation
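The queue-and-doorbell mechanism can be sketched as a toy model. This is not a real driver: the class `QueuePair` and its methods are hypothetical names, the "doorbell" is a plain attribute standing in for a device register, and completion handling is reduced to a Python list.

```python
class QueuePair:
    """Toy model of one NVMe submission/completion queue pair (not a real driver)."""

    def __init__(self, depth: int):
        self.depth = depth
        self.sq = [None] * depth   # submission ring in host memory
        self.sq_tail = 0           # next free slot, owned by the host
        self.sq_head = 0           # next slot to fetch, owned by the controller
        self.doorbell = 0          # stand-in for the SQ tail doorbell register
        self.completions = []      # stand-in for the completion queue

    def is_full(self) -> bool:
        # One slot stays empty so that "full" and "empty" are distinguishable.
        return (self.sq_tail + 1) % self.depth == self.sq_head

    def submit(self, command) -> None:
        """Place a command in the ring; the device cannot see it yet."""
        if self.is_full():
            raise RuntimeError("submission queue full")
        self.sq[self.sq_tail] = command
        self.sq_tail = (self.sq_tail + 1) % self.depth

    def ring_doorbell(self) -> None:
        """A single register write publishes every command queued so far."""
        self.doorbell = self.sq_tail

    def controller_step(self) -> None:
        """Device side: fetch commands up to the doorbell value and complete them."""
        while self.sq_head != self.doorbell:
            self.completions.append(("ok", self.sq[self.sq_head]))
            self.sq_head = (self.sq_head + 1) % self.depth


qp = QueuePair(depth=8)
for lba in (0, 8, 16):
    qp.submit(("read", lba))
qp.controller_step()                  # doorbell not rung yet: nothing is fetched
done_before = len(qp.completions)
qp.ring_doorbell()                    # one write covers all three commands
qp.controller_step()
done_after = len(qp.completions)
print(done_before, done_after)
```

The point of the sketch is that command data lives in host memory and the only device access on the submission path is the doorbell write.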

User-level NVMe Drivers

• Example: Intel SPDK (Storage Performance Development Kit)

– All I/O operations issued in user-land

– Polling or interrupt (signal to user process)

User-level NVMe Drivers

• NVMeDirect framework

– User-level access + kernel-level access

[Figure: NVMeDirect architecture. Labels by layer. User: NVMeDirect API, NVMeDirect library (I/O handles, I/O queues, block cache, I/O scheduler, I/O completion thread), admin tool. Kernel: NVMe driver. HW: NVMe controller with default queues and user queues.]

H.-J. Kim, Y.-S. Lee, and J.-S. Kim, “NVMeDirect: A User-space I/O Framework for Application-specific Optimization on NVMe SSDs,” HotStorage, 2016.

All-Flash Array

• Interfaces

– 10Gb/40Gb Ethernet (iSCSI), 16Gb Fibre Channel, or PCIe

– SAS or NVMe SSDs

• Functionalities

– Volume management

– Virtualization support

– RAID

– Snapshot

– Deduplication

– Compression, …

Traditional Block Interface

• SATA/SCSI/SAS

– Read (sector #, length), Write (sector #, length, data)

– No block-level liveness information

– No high-level semantics on data

– Several “unwritten contracts” do not hold for SSDs

  • Sequential accesses are several tens of times better than random accesses

  • Distant LBNs lead to longer seek times

  • Data written is equal to data issued

  • …

[Figure: Host (file system, block device driver) connected through the block I/F to the SSD (FTL, flash I/F, NAND flash)]

Extending Block I/F

• TRIM command

– “The data in the specified sectors is no longer needed”

– ATA interface standard (T13 technical committee)

– Non-queued command

– SATA 3.1 introduces the Queued TRIM command
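The effect of TRIM on block-level liveness can be sketched with a toy model. The class `TinyFTL` and its methods are hypothetical and track only which sectors the device still considers live; a real FTL also manages mapping, erase blocks, and wear.

```python
class TinyFTL:
    """Toy model that only tracks which logical sectors the device must preserve."""

    def __init__(self):
        self.live = set()

    def write(self, sector: int, length: int) -> None:
        # Every written sector is assumed live until the host says otherwise.
        self.live.update(range(sector, sector + length))

    def trim(self, sector: int, length: int) -> None:
        # The host declares these sectors dead, so garbage collection may skip them.
        self.live.difference_update(range(sector, sector + length))

    def gc_copy_cost(self, sector: int, length: int) -> int:
        """Number of sectors GC would have to copy out of the given range."""
        return sum(1 for s in range(sector, sector + length) if s in self.live)


ftl = TinyFTL()
ftl.write(0, 64)                       # a file occupying sectors 0..63
# The file system deletes the file by updating its own metadata only;
# without TRIM the device still believes all 64 sectors are live.
cost_without_trim = ftl.gc_copy_cost(0, 64)
ftl.trim(0, 64)                        # the host passes the liveness information down
cost_with_trim = ftl.gc_copy_cost(0, 64)
print(cost_without_trim, cost_with_trim)
```

In this model, the copy cost of reclaiming the deleted file's range drops from 64 sectors to 0 once the TRIM is issued.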

[Figure: Host (file system, block device driver) connected to the SSD (FTL, flash I/F, NAND flash) through the block I/F + SSD-specific I/F]

Atomic Write

• Transaction support for multi-block writes

– Simplifies file systems and DBMSes

X. Ouyang, et al., “Beyond Block I/O: Rethinking Traditional Storage Primitives,” HPCA, 2011.
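A minimal sketch of all-or-nothing multi-block writes follows. It is not the implementation from the cited paper; it only assumes an out-of-place-writing FTL in which new pages become visible through one mapping update. The class `AtomicWriteFTL` and the `fail_after` parameter (a simulated power loss) are hypothetical.

```python
class AtomicWriteFTL:
    """Toy FTL in which a multi-block write becomes visible all at once."""

    def __init__(self):
        self.flash = []      # append-only list of physical pages
        self.mapping = {}    # LBA -> index into self.flash (committed view)

    def read(self, lba):
        idx = self.mapping.get(lba)
        return None if idx is None else self.flash[idx]

    def atomic_write(self, blocks, fail_after=None) -> bool:
        """Write {lba: data}; fail_after simulates power loss after that many pages."""
        staged = {}
        for count, (lba, data) in enumerate(blocks.items()):
            if fail_after is not None and count == fail_after:
                return False             # staged pages are never mapped
            self.flash.append(data)      # out-of-place write to a new page
            staged[lba] = len(self.flash) - 1
        self.mapping.update(staged)      # single commit point
        return True


ftl = AtomicWriteFTL()
ftl.atomic_write({0: "a0", 1: "b0"})
ok = ftl.atomic_write({0: "a1", 1: "b1"}, fail_after=1)   # interrupted mid-write
print(ok, ftl.read(0), ftl.read(1))
```

After the interrupted write, both blocks still hold their old contents, so a file system or DBMS built on this primitive would not need its own journal to avoid torn multi-block updates.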

Multi-streamed SSD (1)

• Previous write patterns (= current state) matter

Multi-streamed SSD (2)

• Mapping data with different lifetimes to different streams

• Standardized in T10 SCSI/SAS (2015), NVMe 1.3 (2017)
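Why separating data by lifetime helps can be sketched with a small simulation. The block size, the "hot"/"cold" labels, and the helper names `place` and `gc_copies` are all assumptions made for illustration; they do not model any particular device.

```python
from collections import defaultdict

PAGES_PER_BLOCK = 4   # hypothetical erase-block size in pages


def place(writes, use_streams):
    """Fill erase blocks in arrival order; each block is a list of (key, stream)."""
    open_blocks = defaultdict(list)   # stream id -> currently open block
    closed = []
    for key, stream in writes:
        sid = stream if use_streams else 0   # without streams everything is mixed
        open_blocks[sid].append((key, stream))
        if len(open_blocks[sid]) == PAGES_PER_BLOCK:
            closed.append(open_blocks.pop(sid))
    closed.extend(block for block in open_blocks.values() if block)
    return closed


def gc_copies(blocks, dead_stream):
    """Pages GC must copy to reclaim every block that holds at least one dead page."""
    copies = 0
    for block in blocks:
        dead = sum(1 for _, stream in block if stream == dead_stream)
        if dead:
            copies += len(block) - dead
    return copies


# Alternating short-lived ("hot") and long-lived ("cold") pages.
writes = [(i, "hot" if i % 2 == 0 else "cold") for i in range(16)]
mixed = gc_copies(place(writes, use_streams=False), dead_stream="hot")
separated = gc_copies(place(writes, use_streams=True), dead_stream="hot")
print(mixed, separated)
```

In this toy setup, once the hot data dies, the mixed layout forces GC to copy the cold pages out of every block, while the stream-separated layout leaves whole blocks invalid and reclaimable without copying.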

Multi-streamed SSD (3)

• Cassandra with Multi-streamed SSD

Multi-streamed SSD (4)

• Cassandra’s normalized update throughput with 5 streams

Open-Channel SSD (1)

• Why is an Open-Channel SSD required?

– I/O predictability & isolation

[Figure: read latency under mixed read/write workloads]

  • 0% writes: latency is consistent

  • 20% writes: big impact on read latency

  • 50% writes: can make SSDs as slow as spinning drives

  • I/O performance is unpredictable because writes are buffered

Open-Channel SSD (2)

• Why is an Open-Channel SSD required?

– Log-on-log, indirection, and narrow I/O

[Figure: log-on-log software stack]

  • User space: log-structured database (e.g., RocksDB) with its own metadata management, address mapping, and garbage collection; issues pread/pwrite

  • Kernel space: VFS, log-structured file system (metadata management, address mapping, garbage collection), block layer; issues Read/Write/Trim

  • HW: solid-state drive (metadata management, address mapping, garbage collection)

Black-boxed SSD

– SSD state is hidden due to the narrow I/O interface

– Data placement + buffering → best effort

We need application-driven SSDs!

Open-Channel SSD (3)

• Open-Channel SSD exposes its geometry

• LightNVM: Open-Channel SSD subsystem in Linux Kernel

– Functionalities of Flash Translation Layer (FTL)

– Administration of drive instances

– Interface between application/filesystem and Open-Channel SSD

[Figure: a traditional block I/O SSD vs. an Open-Channel SSD]

Open-Channel SSD (4)

                        Traditional SSD       Open-Channel SSD
User visibility         X                     O
Command format          Read/Write            Program/Read/Erase
Address                 LBA                   PPA (Physical Page Address)
Timing info.            None                  Program/read/erase timing
L2P mapping table       On device             On host OS
DRAM buffer             On device             On host OS
Bad block management    Firmware on device    Thread on host OS
Write handling          Firmware on device    Thread on host OS
Garbage collection      Firmware on device    Thread on host OS
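Physical page addressing can be illustrated by packing geometry fields into one integer. The field names and bit widths below are hypothetical; a real Open-Channel device reports its own geometry and address format, which the host must query rather than assume.

```python
from typing import NamedTuple


class Geometry(NamedTuple):
    """Hypothetical bit widths for each address field."""
    ch_bits: int = 4     # channel
    pu_bits: int = 3     # parallel unit (LUN) within a channel
    blk_bits: int = 10   # block within a parallel unit
    pg_bits: int = 8     # page within a block


class PPA(NamedTuple):
    channel: int
    pu: int
    block: int
    page: int


def encode(ppa: PPA, geo: Geometry = Geometry()) -> int:
    """Pack a PPA into an integer, most significant field first."""
    for value, bits in zip(ppa, geo):
        if not 0 <= value < (1 << bits):
            raise ValueError("field out of range for this geometry")
    addr = ppa.channel
    addr = (addr << geo.pu_bits) | ppa.pu
    addr = (addr << geo.blk_bits) | ppa.block
    addr = (addr << geo.pg_bits) | ppa.page
    return addr


def decode(addr: int, geo: Geometry = Geometry()) -> PPA:
    """Unpack an integer produced by encode() back into its fields."""
    page = addr & ((1 << geo.pg_bits) - 1)
    addr >>= geo.pg_bits
    block = addr & ((1 << geo.blk_bits) - 1)
    addr >>= geo.blk_bits
    pu = addr & ((1 << geo.pu_bits) - 1)
    addr >>= geo.pu_bits
    return PPA(channel=addr, pu=pu, block=block, page=page)


example = PPA(channel=1, pu=2, block=3, page=4)
packed = encode(example)
print(packed, decode(packed))
```

Because the host sees channel and parallel-unit fields directly, it can steer writes from different tenants to different parallel units, which a flat LBA does not allow.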

Open-Channel SSD (5)

• Experiment – multi-tenant workloads

[Figure: NVMe SSD vs. OC-SSD]

Open-Channel SSD (6)

• Experiment – Predictable Latency

– 4K reads during 64K concurrent writes

– Read PUs and write PUs are separated on the OC-SSD

– Consistently low latency at the 99.99th, 99.999th, and 99.9999th percentiles

Open-Channel SSD (7)

• DIDACache: A deep integration of device and application for flash-based key-value caching (FAST ’17)

– Integrates the key-value cache system with the FTL

– A single-level direct mapping from keys to physical flash memory locations

– An integrated garbage collection

– Throughput ↑ and Latency ↓

KeyValue-SSD (1)

• Internally manages variable-length key-value pairs

• Provides an interface similar to that of a conventional host-side KV store

• Offloads the key-value management layer to the SSD

– Reduces host system resource usage
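The host-visible side of such a device can be sketched as a small put/get/delete interface. The class `ToyKVSSD` is a hypothetical in-memory stand-in, not a vendor API: the byte array plays the role of flash, and the index that a host-side KV store would normally maintain is kept "inside the device".

```python
from typing import Optional


class ToyKVSSD:
    """In-memory stand-in for a key-value SSD with variable-length keys and values."""

    def __init__(self):
        self._log = bytearray()   # stand-in for flash, written append-only
        self._index = {}          # key -> (offset, length), kept inside the device

    def put(self, key: bytes, value: bytes) -> None:
        # New versions are appended; the old version simply becomes unreachable.
        offset = len(self._log)
        self._log += value
        self._index[key] = (offset, len(value))

    def get(self, key: bytes) -> Optional[bytes]:
        location = self._index.get(key)
        if location is None:
            return None
        offset, length = location
        return bytes(self._log[offset:offset + length])

    def delete(self, key: bytes) -> bool:
        """Return True if the key existed."""
        return self._index.pop(key, None) is not None


dev = ToyKVSSD()
dev.put(b"user:42", b"alice")
dev.put(b"a-much-longer-key", b"x" * 100)
dev.put(b"user:42", b"bob")            # overwrite with a new version
print(dev.get(b"user:42"), dev.delete(b"user:42"), dev.get(b"user:42"))
```

The host issues only key-level operations, so it needs no block allocation, logging, or address translation of its own for this data.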

KeyValue-SSD (2)

KAML: fixed-size 8B keys; variable-size keys require an additional log and translation in the host

KeyValue-SSD (3)

In-Storage Computing (1)

• Samsung ISC SSD Prototype

– Commodity SSD: Samsung PM1725 NVMe with the ISC feature

– PCIe 3.0 x4

– 800 GB

• Software

– C++11

– C++ STL

– G++

– Software emulator

In-Storage Computing (2)

• ISC Application Development Process

In-Storage Computing (3)

• ISC Dataflow Programming Model

In-Storage Computing (4)

• Example: Simple Key-Value Store

In-Storage Computing (5)

In-Storage Computing (6)

• FPGA SSD

In-Storage Computing (7)

• Cognitive SSD (ATC’19)
