Storage Systems, CSE 598d, Spring 2007
Rethink the Sync
April 3, 2007
Mark Johnson

Goals of a File System

• Durability
  – Programmers expect that what they write to the file system will exist at a later point

• Performance
  – Programmers expect a certain level of performance from the file system

• Contrasting roles!

What is a Synchronous File System?

• Pretty much what we are used to.

• Blocks the calling application until modifications are committed to disk.
  – The application never believes data was written when it has not been
  – Some exception is made for disk-level caching, because true synchronous writes are very slow.
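
A minimal sketch of the blocking behavior described above, assuming a POSIX-like system and Python's standard `os` module. The names `synchronous_write` and `demo` are hypothetical and not from the talk. Note that `os.fsync` only returns once the operating system reports the flush; whether the data has left the drive's own cache is a separate question.

```python
import os
import tempfile


def synchronous_write(path: str, data: bytes) -> None:
    """Write data and return only after fsync reports completion."""
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | getattr(os, "O_BINARY", 0)
    fd = os.open(path, flags, 0o644)
    try:
        view = memoryview(data)
        while view:                       # os.write may write fewer bytes than asked
            written = os.write(fd, view)
            view = view[written:]
        os.fsync(fd)                      # the caller blocks here until the OS reports the flush
    finally:
        os.close(fd)


def demo() -> bytes:
    """Hypothetical usage: write one record, then read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "record.log")
        synchronous_write(path, b"committed\n")
        with open(path, "rb") as f:
            return f.read()
```

The cost of this pattern is that every call waits for the device, which is the performance problem the rest of the talk addresses.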

Asynchronous File System

• Modifications may be committed long after the call completes

• Complicates applications that require durability or ordering guarantees.
  – Shifts the burden to programmers.

• However, most file systems provide an asynchronous abstraction by default in spite of these potential problems.

Interesting Compromises

• fsync does not fully protect against data loss in the event of power failure: data may still be in the drive's write cache

• This is a conscious decision, because truly synchronous behavior is too slow.

• Applications that require very strong durability guarantees (databases) often recommend disabling the drive cache.

External Synchronization

• The API defines a set of guarantees provided to the clients of the file system.

• Application-centric view
  – Synchronous IO

• File system provides guarantees to the applications

• Seems simple, what else is there?

External Sync

• User-centric view!
  – The file system provides guarantees to the USER of an application

• Ties output to the commit strategy
  – Network
  – Screen
  – Other I/O

External Sync

• Returns control to the application before data is committed.

• But it buffers all output until the data is committed.

• Modifications are committed in order, and only then are external output buffers flushed.

• External I/O rarely blocks (screen/net)
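
The behavior in the bullets above can be sketched as a toy in-memory model. This is an illustration only, not the xsyncfs implementation; the class `ExternallySyncFS` and its method names are hypothetical.

```python
class ExternallySyncFS:
    """Toy model: writes return at once, external output waits for the commit."""

    def __init__(self):
        self.pending = []      # modifications issued but not yet committed, in order
        self.disk = []         # committed modifications
        self.held_output = []  # external output buffered until the commit
        self.visible = []      # what an external observer has actually seen

    def write(self, data):
        # Control returns to the application before the data is committed.
        self.pending.append(data)

    def emit(self, message):
        # Output is held while anything is uncommitted (or earlier output is
        # still held), so the observer never sees output ahead of its data.
        if self.pending or self.held_output:
            self.held_output.append(message)
        else:
            self.visible.append(message)

    def commit(self):
        # Modifications are committed in order; only then is output released.
        self.disk.extend(self.pending)
        self.pending.clear()
        self.visible.extend(self.held_output)
        self.held_output.clear()


fs = ExternallySyncFS()
fs.write("block 1")
fs.emit("saved!")               # held: "block 1" is not on disk yet
seen_before_commit = list(fs.visible)
fs.commit()
seen_after_commit = list(fs.visible)
```

The application never blocked on `write`, yet the observer saw "saved!" only after the data was committed, which is what a synchronous system would also have shown.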

xsyncfs

• File system developed as part of the Speculator project
  – “Speculative Execution in a distributed FS”

• xsyncfs adds commit dependencies
  – These dependencies are also inherited across processes

• Most output is delayed no more than the time to commit a single transaction
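
One way to picture commit dependencies and their inheritance is the sketch below. It is a simplified model under assumed names (`Proc`, `write_file`, `send_to`, `may_output`), not the kernel data structures used by xsyncfs: each process carries the set of uncommitted transactions it depends on, and that set spreads when processes fork or communicate.

```python
class Proc:
    """Toy process that tracks which uncommitted transactions it depends on."""

    def __init__(self, name, deps=()):
        self.name = name
        self.deps = set(deps)          # copy, so parent and child diverge later

    def write_file(self, txn_id):
        # Modifying the file system makes the process depend on that transaction.
        self.deps.add(txn_id)

    def fork(self, child_name):
        # A child inherits every dependency its parent had at fork time.
        return Proc(child_name, self.deps)

    def send_to(self, other):
        # Any process that receives data from this one inherits its dependencies.
        other.deps |= self.deps

    def may_output(self, committed):
        # External output is safe only once all dependencies are committed.
        return self.deps <= committed


writer = Proc("writer")
writer.write_file(1)
logger = Proc("logger")
writer.send_to(logger)                 # logger now depends on transaction 1
child = writer.fork("child")
child.write_file(2)                    # does not affect the parent
```

Here `logger.may_output(set())` is false until transaction 1 commits, even though `logger` never touched the file system itself.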

Output Triggered Commits

• When a process produces external output, it triggers a commit of any pending disk data the output depends on.

• Relies on a causal relationship between FS writes and external output.
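
A toy sketch of an output-triggered commit, with hypothetical names (`OutputTriggeredFS`, `commits`). In this simplified model the commit happens inline inside `emit`; a real system would release the output asynchronously once the commit completes.

```python
class OutputTriggeredFS:
    """Toy model: writes accumulate, and external output forces a commit."""

    def __init__(self):
        self.pending = []   # uncommitted modifications
        self.disk = []      # committed modifications
        self.visible = []   # output seen by an external observer
        self.commits = 0    # number of commits actually performed

    def write(self, data):
        self.pending.append(data)      # no commit yet: nothing observable depends on it

    def commit(self):
        if self.pending:               # skip empty commits
            self.disk.extend(self.pending)
            self.pending.clear()
            self.commits += 1

    def emit(self, message):
        self.commit()                  # output causally follows the writes, so commit first
        self.visible.append(message)


fs = OutputTriggeredFS()
for block in ("a", "b", "c"):
    fs.write(block)
commits_before_output = fs.commits
fs.emit("done")                        # one commit covers all three writes
fs.emit("still done")                  # nothing pending, so no extra commit
```

Three writes cost a single commit here, because only the output made the data externally relevant.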

Design

• Defined by externally observable behavior

• Application State is an internal property of the computer system.

• “An IO is externally synchronous if the external output produced by the computer system cannot be distinguished from the output that would have been produced had the IO been synchronous.”

More Design

• Traditional systems have two partitions, kernel and user.

• Usually the return from a system call is an 'external event'

• This approach adds a third layer, external interfaces.

• A commit is required only at the boundary between the application and the external interfaces

Still More Design

• Requires that all applications access services through the OS

• There must be an abstraction layer for everything

• Thoughts? Does this prevent direct I/O?

Correctness

• Must ensure that external output occurs in the same causal order that would have occurred had the IO been synchronous.

Limitations

• Complicates application recovery after a catastrophic failure, because the application continues executing before the failure is detected.

• Possible idea: checkpoint and roll back
  – Rejected due to overhead concerns

• The user may have temporal expectations.
  – xsyncfs commits every 5 seconds anyway

Bottom Line

• Uncommitted output is never visible to an external observer.

Implementation

• Uses much from Speculator
  – Avoids the overhead of rollback/checkpoints

• Propagates dependencies

• New transactions are created on commit of the previous transaction

• Uses journaled mode to guarantee causality.

Shared Memory

• Sticky issue since more than one process could be using the memory to output.

• Easy Approach: All processes using shared memory become part of the dependency chain.

• This is very conservative, though.
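
The "easy approach" can be sketched as follows. This is a hypothetical model (`SharedSegment`, `attach`, `add_dependency` are illustrative names): every process attached to a shared segment ends up with the union of all attached processes' dependencies, whether or not it ever read the data in question.

```python
class SharedSegment:
    """Toy model: all processes attached to shared memory share dependencies."""

    def __init__(self):
        self.attached = {}             # process name -> set of uncommitted txn ids

    def attach(self, name, deps=()):
        self.attached[name] = set(deps)
        self._merge()

    def add_dependency(self, name, txn_id):
        self.attached[name].add(txn_id)
        self._merge()

    def _merge(self):
        # Conservative rule: everyone inherits the union of everyone's dependencies.
        union = set().union(*self.attached.values())
        for name in self.attached:
            self.attached[name] = set(union)


seg = SharedSegment()
seg.attach("writer")
seg.attach("reader")
seg.add_dependency("writer", 7)        # reader now waits on txn 7 as well
```

The reader's output is delayed by transaction 7 even if it never looked at what the writer stored, which is why the approach is described as conservative.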

Correctness Results

• Actually provided more visible durability than the synchronous file system.

Performance

• Within 7% of a true asynchronous file system

• Analysis shows that the overhead of tracking accounts for most of the 7%

• Synchronous was an order of magnitude slower
  – Interesting, since xsyncfs provides better durability guarantees.

Conclusion

• This seems to be a very good idea

• Performance indicators show that xsyncfs is a good replacement for, if not an improvement on, traditional synchronous file systems

• However, given the current technology of caching in drives, is it necessary?
