key value ssd: a scalable smart storage for objects · pdf filedisclaimer 2 this presentation...

24
Key Value SSD: a Scalable Smart Storage for Objects Yang Seok KI Director/Principal Engineer Memory Solutions Lab Samsung Electronics

Upload: dangkhuong

Post on 15-Feb-2018

222 views

Category:

Documents


4 download

TRANSCRIPT

Page 1: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

Key Value SSD: a Scalable Smart Storage for Objects

Yang Seok KIDirector/Principal Engineer

Memory Solutions LabSamsung Electronics

Page 2: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

Disclaimer

2

This presentation and/or accompanying oral statements by Samsung representatives collectively, the

“Presentation”) is intended to provide information concerning the SSD and memory industry and

Samsung Electronics Co., Ltd. and certain affiliates (collectively, “Samsung”). While Samsung strives

to provide information that is accurate and up-to-date, this Presentation may nonetheless contain

inaccuracies or omissions. As a consequence, Samsung does not in any way guarantee the

accuracy or completeness of the information provided in this Presentation.

This Presentation may include forward-looking statements, including, but not limited to, statements

about any matter that is not a historical fact; statements regarding Samsung’s intentions, beliefs or

current expectations concerning, among other things, market prospects, technological developments,

growth, strategies, and the industry in which Samsung operates; and statements regarding products

or features that are still in development. By their nature, forward-looking statements involve risks and

uncertainties, because they relate to events and depend on circumstances that may or may not occur

in the future. Samsung cautions you that forward looking statements are not guarantees of future

performance and that the actual developments of Samsung, the market, or industry in which

Samsung operates may differ materially from those made or suggested by the forward-looking

statements in this Presentation. In addition, even if such forward-looking statements are shown to be

accurate, those developments may not be indicative of developments in future periods.

.

Page 3: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

Agenda

• Background

• Use Cases & Challenges

• Key Value SSD

• Experiments

• Conclusion

Page 4: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

BC/AD in IT

6

Source: Human Computer Interaction % Knowledge Discovery

Page 5: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

Everything is object!

5

OSD Object Storage

ID Attributes User Data

KV Storage

Key Value

Page 6: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

Key Value Stores in Systems at Scale

Page 7: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

Case 1: Abstraction of Scalable Store in DC

File System

KV Store

Scalable Storage (Block/Object)

KV Store

Applications

File System

KV Store

Scalable Database (Redis, MongoDB)

KV Store

Applications

Page 8: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

• Application database • Write-optimized storage

RocksDB: Key Value Database

Page 9: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

Performance Implications of IO Amplification

Ceph performance on RocksDB with an SSD in the case of IO saturation

Block Size Data Written

WRITE READ WAF RAF

IOPS BW (MBps) IOPS BW (MBps)

4K 47.01 GB 2416 9.43 13733 53.64 12.05 1.57

• Device Throughput from Ceph: 9.43 x 12.05 = 113.63 MB/s• Device Throughput with 4KB blocks from FIO: 117.16MB/s

Page 10: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

Case 2: Extent Service for Scalable Storage• Extent store or chunk store uses a key value store to map a logical

block address space to a physical block address space

Physical Disks

LUN: Linear Storage SpaceThin

Pro

vision

ing Laye

r

Block: Block Layer

FS: File System Layer

Dev: Logical Device

Clie

nt

LUN LUN

Disks

Distrib

ute

d Sto

rage

Extent Store: Content to Physical Location Mapping

Volume: Independent Logical Storage Space(Deduplication)

Volume

Page 11: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

Shard (Cache) SSD

KV SSD

Extent Store Use Case• Storage capacity tends to be limited by DRAM capacity of server

to manage a map between logical address space and physical address space

MD

Extent Body

MD

Extent

MD

Extent Body

MD

Extent

Blo

ck

MapKV2File(DRAM)

Blo

ck

MD M

D

Extent

Key

Key

Page 12: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

System Resource

• Return the DRAM space for mapping in the host to applications

• Return the user storage space for GC to applications (e.g., 5% user OP)

KVS

MongoDB

KVS+FTL

MongoDB

FTL

1/250

1/1000 1/250

MS-SSD KV-SSD

Physical NAND

User Capacity

SSD CapacityKV-SSD

User Capacity

SSD Capacity

Physical NAND

Legacy SSD

Page 13: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

Key Value SW Stack Consolidation• SSD with native key value interface through hardware software co-design

Datacenter S/W Infra

Storage Plugin Interface

Block Interface

POSIX API

Key Value API

S/W Key Value Store

Key Value Glue Logic

File System

Block Device Driver

Index Log

JournalBlock Map

Command Protocol

Block DeviceMap Log

Datacenter S/W Infra

Storage Plugin Interface

KV Interface

Key Value API

Thin KV Library

Key Value Glue Logic

KV Device Driver

Command Protocol

KV DeviceIndex Log

TX/s WAF, RAF, Latency

Index Log

Block Map

Index Log

13

Page 14: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

Physical Location / Offset

Index

User/Device Hash Key

NAND Page (32KB)

Lookup / Check hash collision

< NAND >

Read/Write User Data

Key Size Range ? Value Size Range ?

Key Value I/F Command

Key Value SSD device driver

Value Size

Meta data Key Value

Key Size

Get (key) / Put (Key, Value)

KV SSD Design Overview• Key/Value Range

– Key : 4~255B

– Value : 64B~2GB (32B granularity)

– The large value is stored into multiple NAND pages

Storage Server Key Value SSD

Page 15: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

Samsung KV-PM983 Prototype

15

NGSFF KV SSD

Form factor: NGSFF/U.2Capacity: 1-16TBInterface: NVMe PCIe Gen.3

Page 16: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

NVMe Extension for Key Value SSD• Defines a new device type for a Key Value device• A controller performs either KV or traditional block storage

commands

PUT GET DELETE EXISTS

Admin command

Identify commands for KV

Other non-block specific commands

New Key Value Commands

Existing Command Extension

Page 17: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

KV SSD Ecosystem

Key Value SSD

Partners

Product

ApplicationsSDK

Standard

17

Page 18: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

DBBench/KVBenchDBBench/KVBench

Scale-Out: RocksDB & KV Stacks Configuration

RDMA Fabric

RocksDBvs

KV Stacks

NVMeoFover

RDMA

Mission PeakKV-PM983 SSDs

…Software CentOS 7.3 w/ KV Target

NICs 2x 100GbE + 2x 50GbE

CPUXeon 8160 CPU @2.10GHz1-node 2-socket, 48-core 96-thread

# SSDs 36x 1TB

Software CentOS 7.3 w/ KV SWNIC 2x 100GbE

CPUXeon E5-2699V4 CPU @2.20GHz1-node 2-socket44-core 88-thread

DRAM 256 GB

Client: kvbench Client: kvbench Client: kvbench Client: kvbench

Page 19: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

0

1

2

3

4

5

6

7

8

9

RocksDB(PM983) KV Stacks(KV-PM983)0

2

4

6

8

10

12

14

RocksDB(PM983) KV Stacks(KV-PM983)

Performance: Random PUT• 8x more QPS (Query Per Second) with KV Stacks than RocksDB on

block SSD

• 90+% less traffic goes from host to device with KV SSD than RocksDB on block device

8x

* Workload: 100% random put, 16 byte keys of random uniform distribution, 4KB-fixed values on single PM983 and KV-PM983 in a clean state

12.7

1.0

Re

lati

ve Q

PS

8.0

Dev

ice

IO/U

ser

IO

Page 20: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

0

0.5

1

1.5

2

2.5

RocksDB (PM983) KV Stacks (KV-PM983)

Scale-up Performance: Sequential Key PUT

• 3.4x IO performance over S/W key value store on block devices

20

Re

lati

ve Q

PS

1.0

2.0

Dev

ice

IO/U

ser

IO

Relative performance to the maximum aggregate RocksDB random Put QPS for 1 SSD with a default configuration for 1 PM983 SSD in a clean state.System: Ubuntu 16.04.2 LTS, , Ext4, RAID0 for block SSDs, Actual CPU utilization could be 90% at CPU saturation point.Workload: 100% puts, 16 byte keys of random uniform distribution for RocksDB v. 5.0.2, 4KB-fixed values, 36 RocksDB instances with 1 client thread, 34GB/Instance or 1.2TB Data is used

0.00

5.00

10.00

15.00

20.00

25.00

30.00

35.00

1 6 12 18

RocksDB (PM983) - S KV Stacks (KV-PM983)

3.4x

# of SSDs

Page 21: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

Local vs NVMeoF PUT Latency

0

200

400

600

800

1000

1200

1 2 4 8 16 32 64 128

Average Latency

Local Avg

RDMA Avg

Queue Depth

Mic

rose

con

ds

@Qdepth: 1-8Overhead: 4-7us

RDMASwitch

kvbench

Page 22: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

0

10

20

30

40

50

60

70

80

90

100

CPU Utilization for Clients

0

10

20

30

40

50

60

70

80

90

100

1

38

75

11

2

14

9

18

6

22

3

26

0

29

7

33

4

37

1

40

8

44

5

48

2

51

9

55

6

0

10

20

30

40

50

60

70

80

90

100

1

41

81

12

1

16

1

20

1

24

1

28

1

32

1

36

1

40

1

44

1

48

1

52

1

56

1

Avg Utilization 10% Higher

Time Time Time

2.1 M QPS

Page 23: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

Key Value SSD is a Scalable Solution with Better TCO

23

• Performance

• Capability

KV SSD

• Capacity

• Performance

Scale-Up• CPU

• Server

Scale-In

• TCO

• Power

Scale-Down• Capacity

• Performance

Scale-Out

TCO($)

Page 24: Key Value SSD: a Scalable Smart Storage for Objects · PDF fileDisclaimer 2 This presentation and/or accompanying oral statements by Samsung representatives collectively, the “Presentation”)

Questions?

[email protected]