don’t stack your log on my log - usenix · collapsing logs stacking logs is not always optimal...

25
1 c Don’t stack your Log on my Log Oct 5, 2014 Jingpei Yang, Ned Plasson, Greg Gillis, Nisha Talagala, Swaminathan Sundararaman

Upload: others

Post on 18-Jul-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

1c

Don’t stack your Log on my Log

Oct 5, 2014

Jingpei Yang, Ned Plasson, Greg Gillis,

Nisha Talagala, Swaminathan Sundararaman

Page 2: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

2

Outline

� Introduction

� Log-stacking models

� Problems with stacking logs

� Solutions to mitigate log-stacking issues

Page 3: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

3

Log-structuring

� Append data at the tail

� Maximized write throughput by aggregating small random writes to large sequential ones

� Easier storage management

– Free space management

� Fast recovery

– Good for transactional consistency

– Low overhead snapshots

� Flexibility on reclaiming space or data chunks

– Good for database systems that unmodified data could be grouped during GC process

free space

Page 4: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

4

Log-structuring - why on flash?

� Flash does not support in place update

– Erase is costly

� Asymmetric read/write performance

– Write is much slower than Read

� Operations are in a unit of page (4KB, 8KB)

– High overhead for small random I/O

– e.g. a 512-byte write may consume a 4KB flash page

page 0

page 1

page 2

page 3

page 4

page 5

page 6

page 7

page 8

page 9

page 10

page 11

block 0 block 1 block 2

logging

Log-structure is a better choice for flash memory!

Page 5: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

5

Log-structuring - who?

logginglogging

flash aware file systems, FTL

e.g. data log, metadata log

database

e.g. change and access logging

records

logging system, pub/sub

everything is logged

Page 6: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

6

Are more logs better?

� Individual logging is optimized for performance

� Individual logging provides better isolation

� Multiple logs within one application/layer is even better

– Hot/cold data separation

– Easy metadata management

� Multiple layers of logs?

Page 7: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

7

Outline

� Introduction

� Log-stacking models

� Problems with stacking logs

� Solutions to mitigate log-stacking issues

Page 8: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

8

Log-on-log models

Log Structured Application or Filesystem

Log Segments

(size=N)

Garbage Collection User Data Meta Data

JournalingCrash Recovery Other Services

Data Append Point #1 Data Append Point #3Data Append Point #2

Device Flash Translation Layer (FTL) Log

Log Segments

(size=1.5N)

Garbage Collection User Data Meta Data

JournalingCrash Recovery Other Services

Data Append Point #1 Data Append Point #2

Page 9: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

9

Log-on-log models

FTL Log

LogNon-log

FTL Log

Log Log

FTL Log

Log

Log

Page 10: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

10

Outline

� Introduction

� Log-stacking models

� Problems with stacking logs

� Solutions to mitigate log-stacking issues

Page 11: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

11

Problems with stacking logs

� Increased write pressure

� Destroyed sequentiality

� Duplicated log capabilities

� Lack of coordination

– Application logs are not FTL aware

– FTL log is not application aware

app 1 app 2 app 3

Flash logging

Page 12: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

12

Experimental methodology

� A flash-aware file system – F2FS

– Up to 6 logs

– On top of a single log (one append point) SSD

� A log-on-log simulator

– Mimic a file system to device storage system

– Data placing, storage management, garbage collector

Garbage

collector

Storage

Mgmt.

Metadat

a Mgmt.

Upper log storage medium

Garbage

collector

Storage

Mgmt.

Metadat

a Mgmt.

Lower log storage medium

readwrite

virtual

logical

physical

I/O interface

Page 13: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

13

1 - Metadata footprint increase

� Each layer/log has its own metadata

� On-media metadata is increased – resulting in more writes to the device

� In-memory metadata is increased – higher memory consumption

Increased metadata results in reduced

storage for user data and additional

writes which reduces endurance.

Page 14: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

14

2 - Fragmented logs

� Mixed workload from logs and other traffic destroys sequentiality

– Each log writes sequentially, but the device gets mixed workload, most likely to be random

– Underlying flash-based SSD also writes its own metadata

Log 1 Log 2 Log 3

Application layer

Flash layer

write

readSEG 1 SEG 2 SEG 3

Other non-log traffic

Multiple logs on a shared device results in random traffic seen by underlying device.

Page 15: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

15

2 - Fragmented logs (cont.)

� Unaligned segment size

• Garbage collecting one upper segment results in data invalidation across multiple segments in

device log

• Matching segment sizes does not prevent data fragmentation

application log

flash log

……

……

app_seg 1 app_seg 2 app_seg 3

dev_seg 1 dev_seg 2 dev_seg 3

Unaligned segment size results in fragmented device space

caused by garbage collection.

Page 16: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

16

3 - Decoupled segment cleaning

� Without TRIM, data has different ‘validation’ view on each log layer

� Data could be moved multiple times across log layers

Segment 2

valid block

1 2 3 4 5 6 7

2 4 6 7

file system logs

device log

invalid block

Segment 1

free block

Uncoordinated garbage collection on each log destroys the sequentiality of data that

the application log intends to maintain, and leads to more flash writes.

2 4 6

2 4 6

Even with TRIM support, the sequentiality could hardly be guaranteed in the lower flash log.

Page 17: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

17

Fragmented logs + decoupled cleaning

� A combined effect on the overall WA - TCWA

� Higher log ratio results in higher device WA

More upper log results in higher device WA, and hence higher TCWA

60GB random writes, 4K direct IO

< 1% 4 - 32% 3 - 24%

Page 18: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

18

Outline

� Introduction

� Log-stacking models

� Problems with stacking logs

� Solutions to mitigate log-stacking issues

Page 19: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

19

Methods for log coordination

Tpce - like: upper/lower size ratio 90%

Turning slope: low_seg_size > up_seg_size

� Smaller flash segment size is better in terms of WA

� Total WA increases as lower log segment size increases

� Turning slope when low_seg_size > up_seg_size

0.00

1.00

2.00

3.00

4.00

5.00

6.00

7.00

8.00

9.00

2 4 8 16 32 64 128 256 512

Tota

l C

om

bin

ed

WA

Lower log segment size MB

up_2

up_4

up_8

up_16

up_32

Upper log

segment

size MB

Page 20: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

20

Methods for log coordination (cont.)

upper/lower size ratio 90% upper/lower size ratio 70%

Total WA is a combined effect of capacity ratio, segment size ratio, GC frequency, and etc.

0.00

1.00

2.00

3.00

4.00

5.00

6.00

7.00

8.00

9.00

2 4 8 16 32 64 128 256 512

Tota

l C

om

bin

ed

WA

Lower log segment size MB

up_2

up_4

up_8

up_16

up_32

Upper log

segment

size MB

0.00

0.50

1.00

1.50

2.00

2.50

3.00

2 4 8 16 32 64 128 256 512

Tota

l C

om

bin

ed

WA

Lower log segment size MB

Page 21: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

21

Methods for log coordination (cont.)

� Size coordination to reduce Total Combined WA - TCWA

– TCWA = (upper log WA) * (lower log WA)

– Capacity ratio

– Segment size ratio

– Tradeoff the space management and GC efficiency by adjusting upper log segment

size

� Coordinated GC activity

– Postpone the device log GC while the upper GC is active

– Avoid/minimize the same data to be moved multiple times

– Start multiple upper logs GC simultaneously

Page 22: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

22

Collapsing logs

� Stacking logs is not always optimal for flash-based system

– Log coordination is less possible with multiple layers be involved

� Two approaches to collapsing logs

– NVMFS (formerly called DirectFS): sparse space, atomic writes, PTRIM

– Object-based flash: breaking the standard block interface

Page 23: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

23

Collapsing logs (cont.)

Application

VFS abstraction layer

Kernel block layer

Flash-aware file system

Flash memory device

Metadata mgmt.

Space allocation

Addr. mapping

Garbage collection

Crash recovery

Logging/journaling

Log-structured file system

Metadata

mgmt.

Space

allocation

Addr.

mapping

Garbage

collection

Crash

recovery

FTL

Logging

FTL-based flash memory

Metadata

mgmt.

Space

allocation

Addr.

mapping

Garbage

collection

Crash

recovery

Atomic

transaction

� Eliminate redundant log layers

– Remove similar log functionalities

� Break the block interface

� Keep the log capabilities in only one layer

– File system layer with a lightweight flash device design

– FTL layer with a log-less file system design

Page 24: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

24

Conclusion

� Log structured system is designed to provide high throughput

– Combining small random requests to large sequential ones

� Multiple logs/append points in different system layers tends to provide better data and space management

– hot/cold separation

� Stacking logs on another log achieves sub-optimal performance

– Lack of coordination on each layer mixes the traffic

� Log-on-log can perform better with improved coordination

– Size coordination, GC coordination

– File system support

� Collapsing logs – breaking the block interface

– NVMFS: sparse space, atomic writes, PTRIM

– Object-based flash storage

Page 25: Don’t stack your Log on my Log - USENIX · Collapsing logs Stacking logs is not always optimal for flash-based system – Log coordination is less possible with multiple layers

25

Thank You