key value ssd: a scalable smart storage for objects · pdf filedisclaimer 2 this presentation...

Post on 15-Feb-2018

222 Views

Category:

Documents

4 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Key Value SSD: a Scalable Smart Storage for Objects

Yang Seok KIDirector/Principal Engineer

Memory Solutions LabSamsung Electronics

Disclaimer

2

This presentation and/or accompanying oral statements by Samsung representatives collectively, the

“Presentation”) is intended to provide information concerning the SSD and memory industry and

Samsung Electronics Co., Ltd. and certain affiliates (collectively, “Samsung”). While Samsung strives

to provide information that is accurate and up-to-date, this Presentation may nonetheless contain

inaccuracies or omissions. As a consequence, Samsung does not in any way guarantee the

accuracy or completeness of the information provided in this Presentation.

This Presentation may include forward-looking statements, including, but not limited to, statements

about any matter that is not a historical fact; statements regarding Samsung’s intentions, beliefs or

current expectations concerning, among other things, market prospects, technological developments,

growth, strategies, and the industry in which Samsung operates; and statements regarding products

or features that are still in development. By their nature, forward-looking statements involve risks and

uncertainties, because they relate to events and depend on circumstances that may or may not occur

in the future. Samsung cautions you that forward looking statements are not guarantees of future

performance and that the actual developments of Samsung, the market, or industry in which

Samsung operates may differ materially from those made or suggested by the forward-looking

statements in this Presentation. In addition, even if such forward-looking statements are shown to be

accurate, those developments may not be indicative of developments in future periods.

.

Agenda

• Background

• Use Cases & Challenges

• Key Value SSD

• Experiments

• Conclusion

BC/AD in IT

6

Source: Human Computer Interaction % Knowledge Discovery

Everything is object!

5

OSD Object Storage

ID Attributes User Data

KV Storage

Key Value

Key Value Stores in Systems at Scale

Case 1: Abstraction of Scalable Store in DC

File System

KV Store

Scalable Storage (Block/Object)

KV Store

Applications

File System

KV Store

Scalable Database (Redis, MongoDB)

KV Store

Applications

• Application database • Write-optimized storage

RocksDB: Key Value Database

Performance Implications of IO Amplification

Ceph performance on RocksDB with an SSD in the case of IO saturation

Block Size Data Written

WRITE READ WAF RAF

IOPS BW (MBps) IOPS BW (MBps)

4K 47.01 GB 2416 9.43 13733 53.64 12.05 1.57

• Device Throughput from Ceph: 9.43 x 12.05 = 113.63 MB/s• Device Throughput with 4KB blocks from FIO: 117.16MB/s

Case 2: Extent Service for Scalable Storage• Extent store or chunk store uses a key value store to map a logical

block address space to a physical block address space

Physical Disks

LUN: Linear Storage SpaceThin

Pro

vision

ing Laye

r

Block: Block Layer

FS: File System Layer

Dev: Logical Device

Clie

nt

LUN LUN

Disks

Distrib

ute

d Sto

rage

Extent Store: Content to Physical Location Mapping

Volume: Independent Logical Storage Space(Deduplication)

Volume

Shard (Cache) SSD

KV SSD

Extent Store Use Case• Storage capacity tends to be limited by DRAM capacity of server

to manage a map between logical address space and physical address space

MD

Extent Body

MD

Extent

MD

Extent Body

MD

Extent

Blo

ck

MapKV2File(DRAM)

Blo

ck

MD M

D

Extent

Key

Key

System Resource

• Return the DRAM space for mapping in the host to applications

• Return the user storage space for GC to applications (e.g., 5% user OP)

KVS

MongoDB

KVS+FTL

MongoDB

FTL

1/250

1/1000 1/250

MS-SSD KV-SSD

Physical NAND

User Capacity

SSD CapacityKV-SSD

User Capacity

SSD Capacity

Physical NAND

Legacy SSD

Key Value SW Stack Consolidation• SSD with native key value interface through hardware software co-design

Datacenter S/W Infra

Storage Plugin Interface

Block Interface

POSIX API

Key Value API

S/W Key Value Store

Key Value Glue Logic

File System

Block Device Driver

Index Log

JournalBlock Map

Command Protocol

Block DeviceMap Log

Datacenter S/W Infra

Storage Plugin Interface

KV Interface

Key Value API

Thin KV Library

Key Value Glue Logic

KV Device Driver

Command Protocol

KV DeviceIndex Log

TX/s WAF, RAF, Latency

Index Log

Block Map

Index Log

13

Physical Location / Offset

Index

User/Device Hash Key

NAND Page (32KB)

Lookup / Check hash collision

< NAND >

Read/Write User Data

Key Size Range ? Value Size Range ?

Key Value I/F Command

Key Value SSD device driver

Value Size

Meta data Key Value

Key Size

Get (key) / Put (Key, Value)

KV SSD Design Overview• Key/Value Range

– Key : 4~255B

– Value : 64B~2GB (32B granularity)

– The large value is stored into multiple NAND pages

Storage Server Key Value SSD

Samsung KV-PM983 Prototype

15

NGSFF KV SSD

Form factor: NGSFF/U.2Capacity: 1-16TBInterface: NVMe PCIe Gen.3

NVMe Extension for Key Value SSD• Defines a new device type for a Key Value device• A controller performs either KV or traditional block storage

commands

PUT GET DELETE EXISTS

Admin command

Identify commands for KV

Other non-block specific commands

New Key Value Commands

Existing Command Extension

KV SSD Ecosystem

Key Value SSD

Partners

Product

ApplicationsSDK

Standard

17

DBBench/KVBenchDBBench/KVBench

Scale-Out: RocksDB & KV Stacks Configuration

RDMA Fabric

RocksDBvs

KV Stacks

NVMeoFover

RDMA

Mission PeakKV-PM983 SSDs

…Software CentOS 7.3 w/ KV Target

NICs 2x 100GbE + 2x 50GbE

CPUXeon 8160 CPU @2.10GHz1-node 2-socket, 48-core 96-thread

# SSDs 36x 1TB

Software CentOS 7.3 w/ KV SWNIC 2x 100GbE

CPUXeon E5-2699V4 CPU @2.20GHz1-node 2-socket44-core 88-thread

DRAM 256 GB

Client: kvbench Client: kvbench Client: kvbench Client: kvbench

0

1

2

3

4

5

6

7

8

9

RocksDB(PM983) KV Stacks(KV-PM983)0

2

4

6

8

10

12

14

RocksDB(PM983) KV Stacks(KV-PM983)

Performance: Random PUT• 8x more QPS (Query Per Second) with KV Stacks than RocksDB on

block SSD

• 90+% less traffic goes from host to device with KV SSD than RocksDB on block device

8x

* Workload: 100% random put, 16 byte keys of random uniform distribution, 4KB-fixed values on single PM983 and KV-PM983 in a clean state

12.7

1.0

Re

lati

ve Q

PS

8.0

Dev

ice

IO/U

ser

IO

0

0.5

1

1.5

2

2.5

RocksDB (PM983) KV Stacks (KV-PM983)

Scale-up Performance: Sequential Key PUT

• 3.4x IO performance over S/W key value store on block devices

20

Re

lati

ve Q

PS

1.0

2.0

Dev

ice

IO/U

ser

IO

Relative performance to the maximum aggregate RocksDB random Put QPS for 1 SSD with a default configuration for 1 PM983 SSD in a clean state.System: Ubuntu 16.04.2 LTS, , Ext4, RAID0 for block SSDs, Actual CPU utilization could be 90% at CPU saturation point.Workload: 100% puts, 16 byte keys of random uniform distribution for RocksDB v. 5.0.2, 4KB-fixed values, 36 RocksDB instances with 1 client thread, 34GB/Instance or 1.2TB Data is used

0.00

5.00

10.00

15.00

20.00

25.00

30.00

35.00

1 6 12 18

RocksDB (PM983) - S KV Stacks (KV-PM983)

3.4x

# of SSDs

Local vs NVMeoF PUT Latency

0

200

400

600

800

1000

1200

1 2 4 8 16 32 64 128

Average Latency

Local Avg

RDMA Avg

Queue Depth

Mic

rose

con

ds

@Qdepth: 1-8Overhead: 4-7us

RDMASwitch

kvbench

0

10

20

30

40

50

60

70

80

90

100

CPU Utilization for Clients

0

10

20

30

40

50

60

70

80

90

100

1

38

75

11

2

14

9

18

6

22

3

26

0

29

7

33

4

37

1

40

8

44

5

48

2

51

9

55

6

0

10

20

30

40

50

60

70

80

90

100

1

41

81

12

1

16

1

20

1

24

1

28

1

32

1

36

1

40

1

44

1

48

1

52

1

56

1

Avg Utilization 10% Higher

Time Time Time

2.1 M QPS

Key Value SSD is a Scalable Solution with Better TCO

23

• Performance

• Capability

KV SSD

• Capacity

• Performance

Scale-Up• CPU

• Server

Scale-In

• TCO

• Power

Scale-Down• Capacity

• Performance

Scale-Out

TCO($)

Questions?

kvssd@ssi.samsung.com

top related