Gecko: A Contention-Oblivious Design for the Cloud · jyshin/talks/hotstorage12-shin.pdf

Ji-Yong Shin

Cornell University

In collaboration with Mahesh Balakrishnan (MSR SVC), Tudor Marian (Google),

Lakshmi Ganesh (UT Austin), and Hakim Weatherspoon (Cornell)

HotStorage Talk on June 13, 2012

Gecko: A Contention-Oblivious Design for Cloud Storage

Cloud and Virtualization

• What happens to storage?

[Figure: a VMM hosts four guest VMs that share four disks; labels: SEQUENTIAL, RANDOM]

• Sequential streams are no longer sequential

– 1~8 VM + EXT4 FS

– 4-disk RAID-0 setting

– Sequential Writer (256KB)

– Random Writer (4KB)

[Figure, two panels: Throughput (MB/s, 0 to 350) vs. # of VMs. Panel "Sequential Writers Only": 1 to 8 VMs. Panel "Sequential Writers + 1 Random Writer": 1 to 7 VMs.]

Existing Solutions for IO Contention?

• IO scheduling

– Entails increased latency for certain workloads

– May still require moving disk head

• Workload placement

– Requires prior knowledge or dynamic prediction

– Limits freedom of placing VMs in the cloud

Log-structured File System to the Rescue?

– Write everything as log to tail

– Perfect prediction for writes

– Assume reads are handled by cache

[Figure: RAID0 + LFS over shared disks; address space Addr 0, 1, 2, …, N; all writes go to the log tail.]
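The "write everything as log to tail" idea can be sketched in a few lines of Python. This is only an illustration, not the talk's implementation: the class name `LogStructuredStore`, the in-memory list standing in for the disk, and the dictionary map are all assumptions made for the example. It shows why writes become perfectly predictable: whatever logical address a VM writes, the physical position is always the current tail.

```python
class LogStructuredStore:
    """Toy log-structured store (hypothetical names; a list stands in for the disk)."""

    def __init__(self, capacity_blocks):
        self.log = [None] * capacity_blocks  # physical address space
        self.tail = 0                        # next physical slot to write
        self.l2p = {}                        # logical block -> physical slot

    def write(self, lba, data):
        """Append the block at the tail, regardless of its logical address."""
        if self.tail >= len(self.log):
            raise RuntimeError("log is full; garbage collection would be needed")
        phys = self.tail
        self.log[phys] = (lba, data)
        self.l2p[lba] = phys  # any older copy of this block becomes garbage
        self.tail += 1
        return phys

    def read(self, lba):
        """Follow the map to the newest copy of the block."""
        return self.log[self.l2p[lba]][1]
```

Writing logical blocks 900, 3, and 512 in that order lands them in physical slots 0, 1, and 2, so a random writer and a sequential writer look the same to the disk. Overwrites leave stale copies behind, which is where garbage collection comes in.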

Challenges of Log-Structured File System

• Garbage collection is the Achilles’ Heel of LFS

[Figure: RAID0 + LFS over shared disks, address space … N …; labels: Log Tail, Garbage Collection (GC) from Log Head.]

Challenges of Log-Structured File System

• Garbage collection is the Achilles’ Heel of LFS

– 2-disk RAID-0 setting of LFS

– GC under write-only workload

[Figure: RAID 0 + LFS. Throughput (MB/s, 0 to 250) vs. Time (s, 0 to 120); series: Aggregate Writes, Application Writes, GC Writes.]

Summary of Challenges in the Cloud

• Server consolidation through cloud and virtualization

– The number of cores and VMs per server increases

– Storage is not yet maturely virtualized

• RAID cannot preserve high throughput

– IO performance varies depending on coexisting VMs

• LFS only solves write-write contention

– GC operation interferes with logging

– First-class reads can interfere with logging

Rest of the Talk

• Gecko, a chain logging design

– Overview

– Caching reads

– Garbage collection strategies

– Metadata management

• Evaluation

• Summary

Gecko: Chain logging Design

• Cutting the log tail from the body

– GC reads do not interrupt the sequential write

– 1 uncontended drive is much faster than N contended drives

[Figure: one physical address space chained across Disk 0, Disk 1, and Disk 2; labels: Log Tail, Garbage Collection (GC) from Log Head.]
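To make the contrast with RAID-0 concrete, here is a small Python sketch of how a log position could map to a disk under each layout. The function names, the disk size, the one-block stripe unit, and the wrap-around of the chain are assumptions for illustration; the slides do not specify them.

```python
def locate_chained(log_pos, disk_blocks, num_disks):
    """Chained log: fill one disk completely before moving to the next.

    Returns (disk index, block offset). Wrapping back to the first disk
    after the last one is an assumption of this sketch.
    """
    wrapped = log_pos % (disk_blocks * num_disks)
    return divmod(wrapped, disk_blocks)


def locate_raid0(log_pos, num_disks, stripe_blocks=1):
    """RAID-0: consecutive stripes rotate across all disks."""
    stripe, within = divmod(log_pos, stripe_blocks)
    disk = stripe % num_disks
    offset = (stripe // num_disks) * stripe_blocks + within
    return disk, offset


# Disks touched by 8 consecutive tail writes, with 3 disks of 1000 blocks each.
tail_window = range(2000, 2008)
chained_disks = {locate_chained(p, 1000, 3)[0] for p in tail_window}
raid0_disks = {locate_raid0(p, 3)[0] for p in tail_window}
```

In this setup the chained layout keeps the whole tail window on a single disk, while RAID-0 spreads it over every disk. GC reads near the log head (for example positions 0 to 7) fall on a different disk from the tail in the chained layout, which is the sense in which the tail is cut from the body.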

Gecko Overview and Properties

No write-write contention, No GC-write contention

Fault tolerance + Read performance

Power saving w/o consistency concerns

[Figure: a primary chain (Disk 0, Disk 1, Disk 2) and a mirror chain (Disk 0’, Disk 1’, Disk 2’); labels: Log Tail, Read; in the power-saving view Disk 0’ and Disk 1’ are marked Off.]

Gecko Caching

• What happens to reads going to tail drives?

[Figure: Disk 0, Disk 1, Disk 2 with a Tail Cache (Flash); labels: Write, Read.]

Keeps at least 86% of reads in a real workload from reaching the tail drive (500GB disk, 34GB cache).

Prevents first-class read-write contention.

Revival of LFS using Flash
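A minimal Python sketch of a tail cache follows. It assumes the cache simply holds the most recently written blocks and evicts the oldest first; the slides only say that the cache is flash and sits in front of the tail drive, so the FIFO policy, the class name `TailCache`, and the `read_from_disk` callback are illustrative assumptions.

```python
from collections import OrderedDict


class TailCache:
    """Toy cache of recently written blocks (hypothetical names and policy)."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.blocks = OrderedDict()  # lba -> data, oldest write first

    def on_write(self, lba, data):
        """Record a block as it is appended to the log tail."""
        self.blocks.pop(lba, None)  # re-inserting moves it to the newest position
        self.blocks[lba] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict the oldest written block

    def read(self, lba, read_from_disk):
        """Return (data, hit). Only misses go to the disks."""
        if lba in self.blocks:
            return self.blocks[lba], True
        return read_from_disk(lba), False
```

Reads of recently written data are then served from the cache, so they never move the head of the drive that is taking sequential writes.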

Gecko Garbage Collection (GC)

• Move-to-tail GC

– + Simple

– - GC shares write bandwidth

• Compact-in-body GC

– + GC is independent from writes

– - Complicates metadata management

[Figure: Disk 0, Disk 1, Disk 2]
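The move-to-tail strategy can be sketched as follows. This is an illustrative Python version under simplifying assumptions (a flat list as the log, no wrap-around, a dictionary as the logical-to-physical map); the function name and signature are hypothetical. It reads slots from the log head, re-appends the ones that are still live at the tail, and frees the cleaned slots.

```python
def move_to_tail_gc(log, l2p, head, tail, count):
    """Clean `count` slots starting at `head` by moving live blocks to the tail.

    log:  list of (lba, data) tuples or None (free slot)
    l2p:  dict mapping lba -> physical slot of the newest copy
    Returns (new_head, new_tail, moved).
    """
    if head + count > tail:
        raise ValueError("cannot clean past the log tail")
    moved = 0
    for phys in range(head, head + count):
        entry = log[phys]
        if entry is not None:
            lba, data = entry
            if l2p.get(lba) == phys:  # the map still points here: block is live
                log[tail] = (lba, data)
                l2p[lba] = tail
                tail += 1
                moved += 1
            log[phys] = None          # the slot is free either way
    return head + count, tail, moved
```

Every moved block consumes a tail slot that an application write could have used, which is the "GC shares write bandwidth" cost on the slide. Compact-in-body GC avoids that cost by relocating live data within the body disks instead, at the price of more involved map updates; it is not sketched here.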

Gecko Metadata and Persistence

• Logical-to-physical map (in memory): 4-byte entries

– Primary map: less than 8 GB RAM for an 8 TB storage

• Physical-to-logical map (in flash)

– Inverse map: 8 GB flash for an 8 TB storage (flushed every 1024 writes)

• Data (in disk): 4KB pages, empty or filled

[Figure: log head and tail across Disk 0, Disk 1, Disk 2]
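The map sizes quoted above follow from one entry per 4KB page. The short Python check below reproduces the arithmetic; the function name is hypothetical, and whether "8 TB" means decimal or binary units is an assumption (both readings are shown).

```python
def map_bytes(capacity_bytes, page_bytes=4096, entry_bytes=4):
    """Size of a flat map with one fixed-size entry per page."""
    pages = capacity_bytes // page_bytes
    return pages * entry_bytes


decimal_map = map_bytes(8 * 10**12)  # 8 TB read as 8 * 10^12 bytes
binary_map = map_bytes(8 * 2**40)    # 8 TB read as 8 TiB

# Largest capacity a 4-byte page index can address with 4KB pages.
max_addressable = 2**32 * 4096
```

With decimal units the map comes to 7,812,500,000 bytes, just under 8 GB, which is consistent with "less than 8 GB". With binary units it is exactly 8 GiB. Under the same assumptions, a 4-byte entry can index up to 16 TiB of 4KB pages.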

Evaluation Setup

• In-kernel version

– Implemented as block device for portability

– Similar to software RAID

– Move-to-tail GC

• User-level emulator

– For fast prototyping

– Runs block traces

– Compact-in-body GC

Evaluation

• Performance under move-to-tail GC

– 2-disk Gecko chain, write only workload

– GC does not affect aggregate throughput

[Figure, two panels: "RAID 0 + LFS" and "Gecko". Throughput (MB/s, 0 to 250) vs. Time (s, 0 to 120); series: Aggregate Writes, Application Writes, GC Writes.]

Evaluation

• Performance under compact-in-body GC (CIB GC)

– Write only workload is used

– Application throughput is not affected

[Figure: Average Application Throughput (MB/s, 0 to 120) for No GC vs. CIB GC.]

Summary

• Log-structured designs

– Oblivious to write-write contention

– Sensitive to GC/read-write contention

• Gecko fixes the GC-write and read-write contention

– Separates the tail of the log from its body

– Flash re-enables log-structured designs

• Tail flash cache for read-write contention

• Small flash memory for persistence

Future work

• Experiments with real workloads

• Exploration to minimize read-read contention

• IO handling policy inside Gecko
