Speculative Execution In Distributed File System and External Synchrony Edmund B.Nightingale, Kaushik Veeraraghavan Peter Chen, Jason Flinn Presented by Han Wang Slides based on the SOSP and OSDI presentations


Post on 14-Dec-2015


Page 1

Speculative Execution in Distributed File System and External Synchrony

Edmund B. Nightingale, Kaushik Veeraraghavan, Peter Chen, Jason Flinn

Presented by Han Wang
Slides based on the SOSP and OSDI presentations

Page 2

Consistency, Availability, Partition Tolerance

Page 3

“… consistency, availability, and partition tolerance. It is impossible to achieve all three.”

-- Gilbert and Lynch, MIT

“So in reality, there are only two types of systems: CP/CA and AP” -- Daniel Abadi, Yale

“There is no ‘free lunch’ with distributed data.” -- Anonymous, HP

Page 4

AP: Lack Consistency

CP: Lack Availability

CA: Lack Partition Tolerance

Page 5

Synchrony

Asynchrony

Page 6

Synchrony Asynchrony

Page 7

Synchronous abstractions: strong reliability guarantees, but slow

Asynchronous counterparts: relaxed reliability guarantees in exchange for reasonable performance

Page 8

External Synchrony

Page 9

• provide the reliability and simplicity of a synchronous abstraction

• approximate the performance of an asynchronous abstraction.

Page 10

Speculative Execution in a Distributed File System
Edmund B. Nightingale, Peter M. Chen, and Jason Flinn

Rethink the Sync
Edmund B. Nightingale, Kaushik Veeraraghavan, Peter M. Chen, and Jason Flinn

Page 11

Authors

• Edmund B. Nightingale
  – PhD from UMich (Jason Flinn)
  – Microsoft Research
  – Best Paper Award (OSDI 2006)
• Kaushik Veeraraghavan
  – PhD student at UMich (Jason Flinn)
  – Best Paper Award (FAST 2010, ASPLOS 2011)
• Peter M. Chen
  – PhD from Berkeley (David Patterson)
  – Faculty at UMich
• Jason Flinn
  – PhD from CMU (Mahadev Satyanarayanan)
  – Faculty at UMich

Page 12

Speculative Execution in a Distributed File System
Edmund B. Nightingale, Peter M. Chen, and Jason Flinn

Rethink the Sync
Edmund B. Nightingale, Kaushik Veeraraghavan, Peter M. Chen, and Jason Flinn

Page 13

• Idea
• Example
• Design
• Evaluation

Page 14

External Synchrony

• Question
  – How to improve both durability and performance for a local file system?
• Two extremes
  – Synchronous I/O
    • Easy to use
    • Guarantees ordering
  – Asynchronous I/O
    • Fast

Page 15

When a sync() is really async

• On sync(), data is written only to the disk's volatile cache
  – 10x performance penalty and data NOT safe
• 100x slower than asynchronous I/O if the cache is disabled

[Figure: Operating System above a Disk that contains a Volatile Cache and Cylinders]

From Nightingale’s presentation
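
To make the point above concrete, here is a minimal Python sketch of what an application typically does when it asks for durability. The function name and file name are hypothetical and not from the slides. `os.fsync` only asks the OS to push data to the device; whether the drive's own volatile cache is also flushed depends on the OS, file system, and drive settings, which is the gap this slide describes.

```python
import os
import tempfile

def durable_write(path: str, data: bytes) -> None:
    """Write data and ask the OS to push it to the storage device."""
    with open(path, "wb") as f:
        f.write(data)          # lands in user-space and OS buffers first
        f.flush()              # drain Python's user-space buffer
        os.fsync(f.fileno())   # ask the kernel to write through to the device
        # Assumption: the drive may still hold the data in its own volatile
        # cache unless write barriers or cache flushes are enabled.

# Hypothetical usage with a temporary file
path = os.path.join(tempfile.mkdtemp(), "example.dat")
durable_write(path, b"work done")
```

The call blocks until the kernel reports completion, so the caller pays the latency even in the case where the data has only reached the drive cache.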

Page 16

To whom are guarantees provided?

• Synchronous I/O definition:
  – Caller blocked until operation completes

[Figure: three Apps above the OS kernel, with disk, screen, and network; label: "Guarantee provided to application"]

From Nightingale’s presentation

Page 17

To whom are guarantees provided?

• Guarantee really provided to the user

[Figure: three Apps above the OS kernel, with disk, screen, and network]

From Nightingale’s presentation

Page 18

Example: Synchronous I/O

    write(buf_1);
    write(buf_2);
    print("work done");
    foo();

[Figure: timeline across Process, OS Kernel, and Disk; the application blocks on each of the two writes, and the terminal then shows "work done"]

From Nightingale’s presentation

Page 19

Observing synchronous I/O

    write(buf_1);
    write(buf_2);
    print("work done");
    foo();

[Figure annotations on the code: "Depends on 1st write"; "Depends on 1st & 2nd write"]

• Sync I/O externalizes output based on causal ordering
  – Enforces causal ordering by blocking an application
• External sync: same causal ordering without blocking applications

From Nightingale’s presentation

Page 20

Example: External synchrony

    write(buf_1);
    write(buf_2);
    print("work done");
    foo();

[Figure: timeline across Process, OS Kernel, and Disk; the terminal shows "work done"]

From Nightingale’s presentation
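
The behaviour in this example can be sketched as a toy model. This is not xsyncfs code: the class and method names are invented for illustration, and the "disk" and "screen" are plain lists. It only shows the ordering rule, where writes return immediately and output that depends on them is held back until they commit.

```python
class ExternallySyncProcess:
    """Toy model: writes do not block, dependent output waits for the commit."""

    def __init__(self):
        self.uncommitted = []      # writes not yet durable
        self.disk = []             # durable writes
        self.pending_output = []   # output that depends on uncommitted writes
        self.screen = []           # what the user can actually observe

    def write(self, buf):
        self.uncommitted.append(buf)            # returns without blocking

    def emit(self, text):
        if self.uncommitted:
            self.pending_output.append(text)    # hold back: depends on writes
        else:
            self.screen.append(text)            # nothing pending, show at once

    def commit(self):
        self.disk.extend(self.uncommitted)      # earlier writes become durable
        self.uncommitted.clear()
        self.screen.extend(self.pending_output) # only now externalize output
        self.pending_output.clear()

p = ExternallySyncProcess()
p.write("buf_1")
p.write("buf_2")
p.emit("work done")
before = list(p.screen)   # nothing visible while the writes are uncommitted
p.commit()
after = list(p.screen)    # output released once both writes are durable
```

An observer of `screen` cannot tell this run apart from a synchronous one: "work done" never appears before both buffers are on `disk`.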

Page 21

External Synchrony Design Overview

• Synchrony defined by externally observable behavior.
  – I/O is externally synchronous if its output cannot be distinguished from output that could be produced by synchronous I/O.
  – File system does all the same processing as for synchronous I/O.
• Two optimizations made to improve performance.
  – Group commit is used (commits are atomic).
  – External output is buffered and processes continue execution.
• Output guaranteed to be committed every 5 seconds.
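
A minimal sketch of the group-commit idea, under stated assumptions: the class name, the `maybe_commit` method, and the explicit `now` argument are all hypothetical, and the trigger policy (commit when output is waiting, or when the oldest pending write is 5 seconds old) is a simplification of what the slides describe, not the xsyncfs implementation.

```python
class GroupCommitJournal:
    MAX_DELAY = 5.0   # seconds; the bound quoted on the slide

    def __init__(self):
        self.pending = []   # (timestamp, operation), oldest first
        self.commits = []   # each entry is one atomic group of operations

    def write(self, op, now):
        self.pending.append((now, op))

    def maybe_commit(self, now, output_waiting=False):
        """Commit all pending operations as one group if a trigger fires."""
        if not self.pending:
            return False
        oldest = self.pending[0][0]
        if output_waiting or now - oldest >= self.MAX_DELAY:
            self.commits.append([op for _, op in self.pending])
            self.pending.clear()
            return True
        return False

j = GroupCommitJournal()
j.write("a", now=0.0)
j.write("b", now=1.0)
early = j.maybe_commit(now=2.0)                        # no trigger yet
timed = j.maybe_commit(now=5.0)                        # oldest write is 5 s old
j.write("c", now=6.0)
forced = j.maybe_commit(now=6.1, output_waiting=True)  # output is waiting
```

Batching "a" and "b" into one commit is what lets the cost of a disk flush be shared across several operations.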

Page 22

External Synchrony Implementation

• Xsyncfs leverages Speculator infrastructure for output buffering and dependency tracking for uncommitted state.

• Speculator tracks commit dependencies between processes and uncommitted file system transactions.

• ext3 operates in journaled mode.

Page 23

Evaluation

• Durability
• Performance
  – I/O-intensive application (Postmark)
  – Application that synchronizes explicitly (MySQL)
  – Network-intensive, read-heavy application (SPECweb)
  – Output-triggered commit on delay

Page 24

Postmark benchmark

Xsyncfs within 7% of ext3 mounted asynchronously

[Figure: bar chart of time in seconds (log scale, 1 to 10000) for ext3-async, xsyncfs, ext3-sync, and ext3-barrier]

From Nightingale’s presentation

Page 25

The MySQL benchmark

Xsyncfs can group commit from a single client

[Figure: new-order transactions per minute (0 to 5000) versus number of db clients (0 to 20) for xsyncfs and ext3-barrier]

From Nightingale’s presentation

Page 26

Specweb99 throughput

Xsyncfs within 8% of ext3 mounted asynchronously

[Figure: bar chart of throughput in Kb/s (0 to 400) for ext3-async, xsyncfs, ext3-sync, and ext3-barrier]

From Nightingale’s presentation

Page 27

Specweb99 latency

Request size     ext3-async         xsyncfs
0-1 KB           0.064 seconds      0.097 seconds
1-10 KB          0.150 seconds      0.180 seconds
10-100 KB        1.084 seconds      1.094 seconds
100-1000 KB      10.253 seconds     10.072 seconds

Xsyncfs adds no more than 33 ms of delay

From Nightingale’s presentation

Page 28

Discussions

• Is the idea sound?
  – Nice idea, new idea.
• Flaws?
  – Are the experiments realistic?
• What are your take-aways from this paper?

Page 29

Speculative Execution in a Distributed File System
Edmund B. Nightingale, Peter M. Chen, and Jason Flinn

Rethink the Sync
Edmund B. Nightingale, Kaushik Veeraraghavan, Peter M. Chen, and Jason Flinn

Page 30

• Idea
• Example
• Design
• Evaluation

Page 31

Speculative Execution

• Question
  – How to improve distributed file system performance?
• Characteristics of DFS
  – Single, coherent namespace
• Existing approach
  – Trade off consistency for performance

Page 32

The Idea

• Speculative execution
  – Hide I/O latency
    • Issue multiple I/O operations concurrently
  – Also improve I/O throughput
    • Group commit
• For it to succeed
  – Correct
  – Efficient
  – Easy to use

Page 33

Conditions for Success of Speculations

• Results of speculation are highly predictable
  – Concurrent updates on cached files are rare
• Checkpointing is faster than remote I/O
  – 50 µs to 6 ms (amortizable) vs. network RTT
• Modern computers have spare resources
  – CPUs are idle for significant portions of time
  – Extra memory is available for checkpoints

Page 34

Speculator Interface

• Speculator provides a lightweight checkpoint and rollback mechanism
• Interface to encapsulate implementation details:
  – create_speculation
  – commit_speculation
  – fail_speculation
• Separation of policy and mechanism
  – Speculator remains ignorant of why clients speculate
  – The DFS is not concerned with how speculation is done
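
The three calls named above can be sketched as a toy checkpoint-and-rollback object. Only the three method names come from the slide. The class, the dict that stands in for process state, and the use of `copy.deepcopy` as the checkpoint are assumptions made for illustration (the real Speculator works on kernel state), and this sketch ignores nested speculations.

```python
import copy

class Speculator:
    """Toy checkpoint/rollback over a dict that stands in for process state."""

    def __init__(self, state):
        self.state = state
        self._checkpoints = {}   # speculation id -> saved copy of the state
        self._next_id = 0

    def create_speculation(self):
        spec_id = self._next_id
        self._next_id += 1
        self._checkpoints[spec_id] = copy.deepcopy(self.state)
        return spec_id

    def commit_speculation(self, spec_id):
        del self._checkpoints[spec_id]   # guess was right: drop the checkpoint

    def fail_speculation(self, spec_id):
        saved = self._checkpoints.pop(spec_id)
        self.state.clear()
        self.state.update(saved)         # guess was wrong: roll back in place

state = {"cwd": "/"}
spec = Speculator(state)

sid = spec.create_speculation()
state["cwd"] = "/tmp"            # speculative change
spec.fail_speculation(sid)       # state is back to {"cwd": "/"}

sid = spec.create_speculation()
state["files"] = 1               # speculative change
spec.commit_speculation(sid)     # change is kept
```

The caller decides when and why to speculate; the `Speculator` object only saves and restores state, which mirrors the policy/mechanism split on the slide.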

Page 35

Implementing Speculation

1) System call
2) Create speculation

[Figure: process timeline with three structures]
• Checkpoint: copy-on-write fork()
• Undo log: ordered list of speculative operations
• Spec: tracks kernel objects that depend on it

From Nightingale’s presentation

Page 36

Speculation Success

1) System call
2) Create speculation
3) Commit speculation

[Figure: process timeline with Checkpoint, Undo log (ordered list of speculative operations), and Spec (tracks kernel objects that depend on it)]

From Nightingale’s presentation

Page 37

Speculation Failure

1) System call
2) Create speculation
3) Fail speculation

[Figure: process timeline with Checkpoint, Undo log (ordered list of speculative operations), and Spec (tracks kernel objects that depend on it)]

From Nightingale’s presentation

Page 38

Ensuring correctness

• Two invariants
  – Speculative state should never be visible to the user or any external device
  – A process should never view speculative state unless it speculatively depends on that state
    • A non-speculative process must block or become speculative when viewing speculative state
• Three ways to ensure correct execution:
  – Block
  – Buffer
  – Propagate speculations (dependencies)

Page 39

Output Commits

1) sys_stat
2) sys_mkdir
3) Commit speculation

[Figure: process timeline with two checkpoints, Spec(stat), Spec(mkdir), an undo log, and the output “stat worked” and “mkdir worked”]

From Nightingale’s presentation

Page 40

Multi-Process Speculation

• Processes often cooperate
  – Example: “make” forks children to compile, link, etc.
  – Would block if speculation were limited to one task
• Allow kernel objects to have speculative state
  – Examples: inodes, signals, pipes, Unix sockets, etc.
  – Propagate dependencies among objects
  – Objects rolled back to prior states when speculations fail
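
Dependency propagation and rollback can be sketched with a small toy model. Everything here is hypothetical: the class, the method names, and the use of Python sets for dependencies are illustrative, and the sketch assumes one outstanding speculation per object. It is not how Speculator represents kernel objects.

```python
class KernelObject:
    """Stand-in for an inode or pipe that can hold speculative state."""

    def __init__(self, name, value):
        self.name = name
        self.value = value
        self.deps = set()     # speculations this object's state depends on
        self.undo_log = []    # prior values, oldest first

    def write_from(self, writer_deps, new_value):
        if writer_deps or self.deps:
            self.undo_log.append(self.value)   # remember the value to restore
        self.value = new_value
        self.deps |= writer_deps    # object inherits the writer's dependencies

    def read_into(self, reader_deps):
        reader_deps |= self.deps    # reader inherits the object's dependencies
        return self.value

    def resolve(self, spec, success):
        if spec not in self.deps:
            return
        if success:
            self.deps.discard(spec)
            if not self.deps:
                self.undo_log.clear()       # nothing left to undo
        else:
            self.value = self.undo_log[0]   # back to the pre-speculation value
            self.undo_log.clear()
            self.deps.clear()

inode = KernelObject("inode 3456", "owner=root")
pid_8000 = {"spec1"}     # a process that is already speculative
pid_8001 = set()         # a process that is not

inode.write_from(pid_8000, "owner=alice")
seen = inode.read_into(pid_8001)        # pid_8001 now depends on spec1 too
inode.resolve("spec1", success=False)   # the inode is rolled back
```

After the read, `pid_8001` carries the same dependency as the writer, which is the alternative to blocking it until the speculation resolves.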

Page 41

Multi-Process Speculation

[Figure: pid 8000, pid 8001, and inode 3456, each with checkpoints; undo-log entries Chown-1 and Write-1; dependencies on Spec 1 and Spec 2]

From Nightingale’s presentation

Page 42

Multi-Process Speculation

• Supports
  – Objects in the distributed file system
  – Objects in the local memory file system (RAMFS)
  – Modified local ext3 file system
  – IPC:
    • Pipes and FIFOs, Unix sockets, signals, fork and exit
• Does not support
  – System V IPC, futexes, shared memory

Page 43

Using Speculation

Client 1                Client 2
1. cat foo > bar
                        2. cat bar

Question: What does client 2 view in ‘bar’?

Reproduced from Nightingale’s Presentation

Handling Mutating Operations

• Server permits other processes to see a speculatively changed file only if the cached version matches the server version
• Server must process messages in the same order as clients see them
• Server never stores speculative data
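
The version check in the first rule can be sketched as follows. This is an illustration only: the `Server` class, the integer version counter, and the `expected_version` parameter are assumptions, not the protocol used by SpecNFS or BlueFS.

```python
class Server:
    """Toy server that holds committed state only, never speculative data."""

    def __init__(self):
        self.files = {}   # name -> (version, data)

    def read(self, name):
        return self.files.get(name, (0, b""))

    def write(self, name, data, expected_version):
        """Apply a write only if the client's cached version is current."""
        version, _ = self.read(name)
        if version != expected_version:
            return False   # client's cache was stale: its speculation fails
        self.files[name] = (version + 1, data)
        return True

server = Server()
first = server.write("bar", b"from client 1", expected_version=0)
stale = server.write("bar", b"from client 2", expected_version=0)
fresh = server.write("bar", b"from client 2", expected_version=1)
```

A client that speculated on a stale cached copy gets `False` back and would roll back to its checkpoint; the server's own state is never speculative.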

Page 44

Using Speculation

• Speculator makes group commit possible

[Figure: two client/server message timelines showing write and commit messages]

Reproduced from Nightingale’s Presentation

Page 45

Evaluation: Speculative Execution

• To answer the following questions
  – Performance gain from propagating dependencies
  – Impact on performance when speculation fails
  – Impact on performance of group commit and sharing state

Page 46

Apache Build

• With delays, SpecNFS up to 14 times faster

[Figure: two bar charts of time in seconds for NFS, SpecNFS, BlueFS, and ext3; panels: no delay (0 to 300) and 30 ms delay (0 to 4500)]

From Nightingale’s presentation

Page 47

The Cost of Rollback

• With all files out of date, SpecNFS up to 11x faster

[Figure: two bar charts of time in seconds for NFS, SpecNFS, and ext3; panels: no delay (0 to 140) and 30 ms delay (0 to 2000); series: no files invalid, 10% files invalid, 50% files invalid, 100% files invalid]

From Nightingale’s presentation

Page 48

Group Commit & Sharing State

[Figure: two bar charts of time in seconds for NFS, SpecNFS, and BlueFS; panels: 0 ms delay (0 to 500) and 30 ms delay (0 to 4500); series: Default, No prop, No grp commit, No grp commit & no prop]

From Nightingale’s presentation

Page 49

Discussions

• Is speculation in the OS the right level of abstraction?
  – Similar ideas:
    • Transactions and rollback in relational databases
    • Transactional memory
    • Speculative execution in the OS
• What if the conditions for success do not hold?
• Portability of code
  – Code performs worse if the OS does not speculate
  – What about transforming source code to perform speculation?
• Why isn’t this used nowadays?

Page 50

Conclusions

• Performance need not be sacrificed for durability

• The transaction and rollback infrastructure in the OS is very useful; two good papers!

• Ideas are not new, but are generic.

Page 51

Thanks!

Page 52

Things they did not do

• No mechanism to prevent disk corruption when a crash occurs. They used the default journaled mode.

Page 53

Comparison

Speculative Execution          Rethink the Sync
Synchronous I/O -> Asynchronous I/O (spans both columns)
Distributed File System        Local File System
Checkpointing                  --
Pipelining sequential I/O      --
Propagate dependencies         Propagate dependencies
Group commit                   Group commit
--                             Output-triggered commit

Page 54

System Calls

• Modify system call jump table
• Block calls that externalize state
  – Allow read-only calls (e.g. getpid)
  – Allow calls that modify only task state (e.g. dup2)
• File system calls -- need to dig deeper
  – Mark file systems that support Speculator

Examples:
getpid: call sys_getpid()
reboot: block until speculations are resolved
mkdir: allow only if the file system supports Speculator
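
The policy on this slide can be sketched as a lookup table. The four system calls and their treatment come from the slide. The `Policy` enum, the `decide` function, and the choice to block unknown calls by default are assumptions made for illustration, not the kernel's actual jump-table code.

```python
from enum import Enum

class Policy(Enum):
    ALLOW = "allow"
    BLOCK = "block until speculations are resolved"
    FS_CHECK = "allow only if the file system supports Speculator"

# Per-call policy for a process that is currently speculative
SYSCALL_POLICY = {
    "getpid": Policy.ALLOW,     # read-only
    "dup2": Policy.ALLOW,       # modifies only task state
    "reboot": Policy.BLOCK,     # externalizes state
    "mkdir": Policy.FS_CHECK,   # depends on the file system
}

def decide(syscall, speculative, fs_supports_speculator=False):
    """Return "run" or "block" for a system call made by a process."""
    if not speculative:
        return "run"
    # Assumption: calls missing from the table are blocked, to be conservative.
    policy = SYSCALL_POLICY.get(syscall, Policy.BLOCK)
    if policy is Policy.ALLOW:
        return "run"
    if policy is Policy.FS_CHECK and fs_supports_speculator:
        return "run"
    return "block"

results = [
    decide("getpid", speculative=True),
    decide("reboot", speculative=True),
    decide("mkdir", speculative=True, fs_supports_speculator=True),
    decide("mkdir", speculative=True, fs_supports_speculator=False),
]
```

A non-speculative process is never affected: `decide` returns "run" for it regardless of the call.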

Page 55

Scenario 1:

    write();
    print();
    write();
    print();

Source: OSDI official blog

Question: Does xsyncfs perform similarly to synchronous I/O?

Page 56

• Scenario 2:

Process A               Process B
acquire_mutex(x)
write(val)              acquire_mutex(x)
release_mutex(x)
                        read(val)
                        release_mutex(x)
                        print(val)

(Time runs downward.)

Question: Will process B fail to read (Step 4) the update by process A? Will the print come before the write in process A has committed?

Source: OSDI official blog