An Empirical Study of Hot/Cold Data Separation Policies in Solid State Drives (SSDs). Jongsung Lee and Jin-Soo Kim, Sungkyunkwan University, South Korea

DESCRIPTION

Presentation file for SYSTOR 2013

TRANSCRIPT

Page 1: An Empirical Study of Hot/Cold Data Separation Policies in Solid State Drives (SSDs)

An Empirical Study of Hot/Cold Data Separation Policies in Solid State Drives (SSDs)

Jongsung Lee and Jin-Soo Kim

Sungkyunkwan University

South Korea

Page 2

Flash memory

• Flash characteristics
  • No overwrite
• Flash Translation Layer (FTL)
  • Out-of-place update
  • Garbage collection
• Write amplification factor (WAF)
  • Performance metric of SSDs
  • $\mathrm{WAF} = \dfrac{\text{Actual amount of data written to flash}}{\text{Amount of data written by the host}}$
  • A lower WAF means better SSD performance

[Figure: Garbage Collection Procedure. The valid pages of the victim block are copied, then the block is erased. Labels: Block, Page, Victim.]
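The garbage collection procedure and the WAF definition on this slide can be sketched in a few lines. This is a minimal illustration, not the firmware used in the study: the `Block` class, the function names, and the four-page block are hypothetical, and flash writes are assumed to consist only of host writes plus GC copies (no metadata writes).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Block:
    # Each slot holds a logical page number (LPN), or None if the page is invalid.
    pages: List[Optional[int]] = field(default_factory=list)


def garbage_collect(victim: Block, spare: Block) -> int:
    """Copy the victim's valid pages into a spare block, then erase the victim.

    Returns the number of valid pages that had to be copied.
    """
    valid = [lpn for lpn in victim.pages if lpn is not None]
    spare.pages.extend(valid)  # valid page copy
    victim.pages.clear()       # erase
    return len(valid)


def write_amplification_factor(host_pages: int, gc_copied_pages: int) -> float:
    # Assumption: flash writes = host writes + GC copies.
    return (host_pages + gc_copied_pages) / host_pages


victim = Block([10, None, 11, None])  # two valid pages, two invalid pages
spare = Block()
copied = garbage_collect(victim, spare)
waf = write_amplification_factor(host_pages=4, gc_copied_pages=copied)
print(copied, waf)  # 2 1.5
```

Every valid page copied during GC adds to the numerator of the WAF without adding to the denominator, which is why the number of valid pages in the victim block matters.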

Page 3

Hot/Cold Data Separation

• No separation
  • Cold pages are frequently copied during GC
  • They should be copied before erasing
• With separation
  • Reduces the number of pages copied during GC

[Figure: Block layouts with and without separation. Legend: clean page, cold page, hot page, invalid page.]
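A toy example makes the effect concrete. The page names, the two-block layout, and the greedy victim selection are assumptions made for illustration; the hot pages are taken to have been overwritten already, so their old copies are invalid.

```python
def pages_copied_by_gc(blocks, invalid):
    """Greedy GC: pick the block with the fewest valid pages and return
    how many valid pages must be copied before it can be erased."""
    return min(sum(1 for page in block if page not in invalid) for block in blocks)


# Hot pages h0..h3 were overwritten, so their old copies are invalid.
invalid = {"h0", "h1", "h2", "h3"}

# No separation: hot and cold pages share blocks.
mixed = [["h0", "c0", "h1", "c1"], ["h2", "c2", "h3", "c3"]]

# With separation: hot and cold pages live in different blocks.
separated = [["h0", "h1", "h2", "h3"], ["c0", "c1", "c2", "c3"]]

print(pages_copied_by_gc(mixed, invalid))      # 2
print(pages_copied_by_gc(separated, invalid))  # 0
```

In the mixed layout every candidate victim still holds cold pages that must be moved, while the separated layout yields a victim with nothing left to copy.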

Page 4

Motivation

• List of target policies
  • 2-Level LRU
    • L.-P. Chang and T.-W. Kuo. An adaptive striping architecture for flash memory storage systems of embedded systems. (RTAS '02)
  • Multiple Bloom Filter
    • D. Park. Hot data identification for flash-based storage systems using multiple bloom filters. (MSST '11)
  • Dynamic dAta Clustering
    • M.-L. Chiang, P. C. H. Lee, and R.-C. Chang. Using data clustering to improve cleaning performance for flash memory. (Software: Practice & Experience '99)
• Evaluate Hot/Cold separation policies
  • Under fair conditions on a real SSD platform

Page 5

Related Works

• 2-Level LRU (LRU)

Advantages
• Simple design

Disadvantages
• Fixed size of each list
• Long latency for list searching

[Figure: Hot List and Candidate List, each ordered from MRU to LRU. Labels: Write miss, Write hit, Full.]
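A minimal sketch of a two-level LRU classifier, based on the labels in the figure and the commonly described behavior of the scheme. The class name, method names, and list sizes are hypothetical. The assumed rules are: a write miss enters the candidate list, a write hit in the candidate list promotes the LPN to the hot list, and a full list evicts its LRU entry (a hot-list eviction is demoted to the candidate list).

```python
from collections import OrderedDict


class TwoLevelLRU:
    def __init__(self, hot_size, candidate_size):
        # In both lists the last entry is the MRU end, the first is the LRU end.
        self.hot = OrderedDict()
        self.candidate = OrderedDict()
        self.hot_size = hot_size
        self.candidate_size = candidate_size

    @staticmethod
    def _insert(lru_list, capacity, lpn):
        """Insert at the MRU end; return the evicted LRU entry, or None."""
        lru_list[lpn] = True
        lru_list.move_to_end(lpn)
        if len(lru_list) > capacity:
            evicted, _ = lru_list.popitem(last=False)
            return evicted
        return None

    def on_write(self, lpn):
        """Record a write and return True if the LPN is classified as hot."""
        if lpn in self.hot:                      # write hit in the hot list
            self.hot.move_to_end(lpn)
            return True
        if lpn in self.candidate:                # write hit in the candidate list
            del self.candidate[lpn]
            demoted = self._insert(self.hot, self.hot_size, lpn)
            if demoted is not None:              # hot list was full
                self._insert(self.candidate, self.candidate_size, demoted)
            return True
        # Write miss: remember the LPN as a candidate.
        self._insert(self.candidate, self.candidate_size, lpn)
        return False


lru = TwoLevelLRU(hot_size=2, candidate_size=2)
results = [lru.on_write(lpn) for lpn in [1, 1, 2, 3, 4, 2, 1]]
print(results)  # [False, True, False, False, False, False, True]
```

LPN 2 is classified as cold on its second write because it had already been pushed out of the small candidate list, which illustrates the fixed-size disadvantage listed above. A real implementation that walks linked lists also pays the search latency that `OrderedDict` hides here.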

Page 6

Related Works

• Multiple Bloom Filter (MBF)

Advantages
• Small memory consumption

Disadvantages
• Fixed parameters (e.g., number of filters, filter size, decaying period)

[Figure: Step 1: Calculate the hash value (Hash 1, Hash 2). Step 2: Set the corresponding bit in Current Filter.]
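A simplified sketch of hot data identification with multiple Bloom filters, loosely following the steps in the figure. Everything beyond "hash the address and set the bits in the current filter" is an assumption: the class and parameter names are hypothetical, the rule of trying the next filter when the current one already records the LPN and the unweighted hit count are simplifications, and all parameter values are arbitrary.

```python
import hashlib


class MultiBloomFilter:
    def __init__(self, num_filters=4, bits=1024, num_hashes=2,
                 decay_period=512, hot_threshold=2):
        # One byte per bit position keeps the sketch simple.
        self.filters = [bytearray(bits) for _ in range(num_filters)]
        self.bits = bits
        self.num_hashes = num_hashes
        self.decay_period = decay_period
        self.hot_threshold = hot_threshold
        self.current = 0
        self.writes = 0

    def _positions(self, lpn):
        # Derive num_hashes bit positions from salted BLAKE2b digests.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(f"{i}:{lpn}".encode(), digest_size=8).digest()
            yield int.from_bytes(digest, "big") % self.bits

    def _contains(self, flt, lpn):
        return all(flt[pos] for pos in self._positions(lpn))

    def on_write(self, lpn):
        # Record the write in the first filter, starting from the current
        # one, that does not already contain this LPN.
        n = len(self.filters)
        for step in range(n):
            flt = self.filters[(self.current + step) % n]
            if not self._contains(flt, lpn):
                for pos in self._positions(lpn):
                    flt[pos] = 1
                break
        self.writes += 1
        if self.writes % self.decay_period == 0:
            # Decay: move to the next filter and clear it.
            self.current = (self.current + 1) % n
            self.filters[self.current] = bytearray(self.bits)

    def is_hot(self, lpn):
        hits = sum(self._contains(flt, lpn) for flt in self.filters)
        return hits >= self.hot_threshold


mbf = MultiBloomFilter()
for _ in range(3):
    mbf.on_write(7)   # written three times
mbf.on_write(9)       # written once
print(mbf.is_hot(7), mbf.is_hot(9))
```

The number of filters, the filter size, the decay period, and the threshold are all fixed at construction time, which is the disadvantage the slide points out.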

Page 7

Related Works

• Dynamic dAta Clustering (DAC)

Advantages
• Small memory consumption
• Low computation overhead

Disadvantages
• Optimal number of regions depends on workload pattern

[Figure: Regions 0 to 3, with transitions between regions labeled Write and Garbage collection.]
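The region transitions in the figure can be sketched as a small state machine. This is a simplified reading, assuming that a write promotes the data to the next higher region and that being copied during garbage collection demotes it to the next lower one. The class and method names are hypothetical, and any additional conditions of the original DAC scheme (such as age thresholds) are omitted.

```python
class DAC:
    def __init__(self, num_regions=4):
        self.num_regions = num_regions
        self.region_of = {}  # LPN -> current region index

    def on_write(self, lpn):
        """New data starts in region 0; an update promotes it by one region."""
        if lpn not in self.region_of:
            region = 0
        else:
            region = min(self.region_of[lpn] + 1, self.num_regions - 1)
        self.region_of[lpn] = region
        return region

    def on_gc_copy(self, lpn):
        """A page that is still valid when its block is collected is demoted."""
        region = max(self.region_of[lpn] - 1, 0)
        self.region_of[lpn] = region
        return region


dac = DAC(num_regions=4)
print([dac.on_write(5) for _ in range(5)])  # [0, 1, 2, 3, 3]
print(dac.on_gc_copy(5))                    # 2
```

The only per-page state is a region index and each event costs one comparison, which matches the small memory and low overhead listed above. The number of regions is the tunable whose best value depends on the workload.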

Page 8

Evaluations

Page 9

Device Information

• Jasmine OpenSSD Platform
  • Runs as a normal SATA drive
  • Programmable
• Specification
  • ARM7TDMI-S core
  • 64MB SDRAM
  • 8 NAND module slots
• Configuration
  • Total capacity: 32GB
  • Clustered block size: 4MB
  • Clustered page size: 32KB
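The configuration above implies the following geometry. This is a quick arithmetic check that assumes binary units (1GB = 2^30 bytes) and ignores any spare or over-provisioned blocks, which the slide does not mention.

```python
KB, MB, GB = 1 << 10, 1 << 20, 1 << 30

total_capacity = 32 * GB
block_size = 4 * MB   # clustered block
page_size = 32 * KB   # clustered page

num_blocks = total_capacity // block_size
pages_per_block = block_size // page_size
total_pages = num_blocks * pages_per_block

print(num_blocks, pages_per_block, total_pages)  # 8192 128 1048576
```

Under these assumptions, erasing one victim block can require copying up to 128 clustered pages.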

Page 10

Synthetic Workloads

• SkewX (Skew70, Skew90, Skew95, Skew99)
  • X% of writes are concentrated on (100-X)% of the area
• SkewInc
  • Skew rate changes: 70% → 90% → 95% → 99%
• SkewDec
  • Skew rate changes: 99% → 95% → 90% → 70%

[Figure: Write pattern, with regions labeled X% and 100-X%.]
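A sketch of how such traces could be generated. The function names, the placement of the hot area at the start of the LPN space, the uniform choice within each area, and the equal-length phases in `skew_inc` are assumptions; the slide only specifies the X% / (100-X)% split and the order of the skew rates.

```python
import random


def skew_writes(x_percent, num_lpns, count, seed=0):
    """Yield `count` LPNs where x_percent% of writes hit (100 - x_percent)% of the area."""
    rng = random.Random(seed)
    hot_size = max(1, num_lpns * (100 - x_percent) // 100)
    for _ in range(count):
        if rng.random() * 100 < x_percent:
            yield rng.randrange(hot_size)            # hot area
        else:
            yield rng.randrange(hot_size, num_lpns)  # cold area


def skew_inc(num_lpns, count_per_phase, seed=0):
    """SkewInc: the skew rate steps through 70%, 90%, 95%, 99%."""
    for phase, x_percent in enumerate((70, 90, 95, 99)):
        yield from skew_writes(x_percent, num_lpns, count_per_phase, seed + phase)


trace = list(skew_writes(90, num_lpns=1000, count=10_000))
hot_fraction = sum(lpn < 100 for lpn in trace) / len(trace)
print(f"{hot_fraction:.2f} of writes hit the first 10% of LPNs")
```

SkewDec would follow the same pattern with the skew rates in reverse order.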

Page 11

Real Workloads

• Financial
  • Collected from OLTP (On-Line Transaction Processing) applications running at a financial institution
• Web search
  • Surfing the web for one day
• General
  • Running an office suite, downloading and playing MP3 files, and playing movies over five days
• TPC-C
  • Gathered from a commercial DBMS while running the TPC-C benchmark for three hours

Page 12

Performance – Synthetic workloads

Hot/Cold separation policies are quite effective

Page 13

Performance – Synthetic workloads

Oracle shows the best performance in most cases

Page 14

Performance – Synthetic workloads

Oracle correctly separates Hot/Cold data, but this does not mean that it minimizes the WAF

Page 15

Performance – Synthetic workloads

DAC improves the WAF by up to 73% (46% on average)

Page 16

Performance – Real workloads

DAC also works well in most cases

Page 17

Performance – Real workloads

Hot/Cold separation does not work well for the TPC-C workload

Page 18

Performance – Real workloads

This is because the LPNs written in TPC-C are uniformly distributed
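One way to check whether a trace has any hot data to separate is to measure the share of writes that go to the most frequently written LPNs. The function below is a hypothetical helper and the two traces are made-up toy inputs, not data from the study.

```python
from collections import Counter


def write_share_of_hottest(trace, top_percent=10):
    """Fraction of all writes that target the top_percent% most-written LPNs."""
    counts = Counter(trace)
    k = max(1, len(counts) * top_percent // 100)
    return sum(count for _, count in counts.most_common(k)) / len(trace)


uniform = list(range(100)) * 5             # every LPN written equally often
skewed = [0] * 90 + list(range(1, 11))     # one LPN receives 90% of the writes

print(write_share_of_hottest(uniform))  # 0.1
print(write_share_of_hottest(skewed))   # 0.9
```

When the hottest 10% of LPNs receive only about 10% of the writes, as in the uniform toy trace, no classifier can find a meaningful hot set, so separation has little to offer.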

Page 19

Conclusion & Future Works

• Hot/Cold data separation is effective in most cases
• DAC reduces the WAF by up to 74% in synthetic workloads and by up to 58% in real workloads
• Run more diverse workloads
• Develop a brand-new hot/cold separation policy

Page 20

Thank you!