Transcript

Don’t Give Up on Distributed File Systems

Jeremy Stribling, Emil Sit, Frans Kaashoek, Jinyang Li, and Robert Morris

MIT CSAIL and NYU

• New apps tend to use new storage layers• Examples:

• Can we invent this layer once?

Reinventing the Storage Wheel

BLAST

What About a File System?

• A FS enables quick-prototyping for apps– A familiar interface– Language-independent usage model– Hierarchical namespace useful for apps– Write distributed apps in shell scripts

if [ -f /fs/cwc/$URL ]; then if notexpired /fs/cwc/$URL; then cat /fs/cwc/$URL exit fifiwget $URL –O - | tee /fs/cwc/$URL

Why Won’t That Work Today?

• Needs of distributed apps:– Control over consistency and delays– Efficient data sharing between peers

• Current systems focus on FS transparency– Hide faults with long timeouts– Centralized file servers

Example: Cooperative Web Cache

• Would rather fail and refetch than wait

• Perfect consistency isn’t crucial

• Avoid hotspots

if [ -f /fs/cwc/$URL ]; then if notexpired /fs/cwc/$URL; then cat /fs/cwc/$URL exit fifiwget $URL –O - | tee /fs/cwc/$URL

Our Proposal: WheelFS

• A distributed wide-area FS to simplify apps

• Main contributions:

1) Give apps control with semantic cues

2) Provide good performance according to Read Globally, Write Locally

Basic Design: Reading and Writing

Node653

Node076

Node150 Node

554

Node402

Node257

File 135?

076 150257 402554 653

File135

135

135135

File135v2

File135v3

135v2

135v2

135v3

135v3

Cached135

Createfoo/bar

550

File550(bar)

Dir209(foo)

bar = 550

Explicit Semantic Cues

• Allow direct control over system behavior

• Meta-data that attach to files, dirs, or refs

• Apply recursively down dir tree

• Possible impl: intra-path component– /wfs/cwc/.cue/foo/bar

Semantic Cues: Writability• Applies to files

• WriteMany (default)

• WriteOnce Node653 Node

076

Node150

Node554

Node402

Node257

File 135?

File135

File135v2

File135v3

Cached135v3

Cached135

Semantic Cues: Freshness• Applies to file references

• LatestVersion (default)

• AnyVersion

• BestVersion

Node653 Node

076

Node150

Node554

Node402

Node257

File 135?

File135

Cached135

Semantic Cues: Write Consistency• Applies to files or directories

• Strict (default)

• Lax Node653 Node

076

Node150

Node554

Node402

Node257

WriteFile 135

File135

135

WriteFile 135

File135v2

135v2

Example: Cooperative Web Cache

• Reading an older version is ok:– cat /wfs/cwc/.bestversion,maxtime=250/foo

• Writing conflicting versions is ok:– wget http://foo > /wfs/cwc/.lax,writemany/foo

if [ -f /wfs/cwc/.maxtime=250,bestversion/$URL ]; then if notexpired /wfs/cwc/.maxtime=250,bestversion/$URL; then cat /wfs/cwc/.maxtime=250,bestversion/$URL exit fifiwget $URL –O - | tee /wfs/cwc/.lax,writemany/$URL

Example: Cooperative Web Cache

Node653 Node

076

Node150

Node554

Node402

Node257

File135

Cached135

Client $URL

“$URL”?135

135?135 = v1402

Chunk

Chunk

Chunk

Cached135

No!

$URL

File550

“$URL” == 550

Dir070

(/wfs/cwc)

Discussion• Current set of cues enough for many apps

– All-sites-pings– Grid computations– OverCite

• Stuff we swept under the rug:– Security– Atomic renames across dirs– Storage load-balancing– Unreferenced files

Related Work

• Every FS paper ever written

• Specifically:– Cluster FS: Farsite, GFS, xFS, Ceph– Wide-area FS: JetFile, CFS, Shark– Grid: LegionFS, GridFTP, IBP– POSIX I/O High Performance Computing

Extensions

Conclusion

• WheelFS: distributed storage layer for newly-written applications

• Control through explicit semantic cues

• Performance by reading globally and writing locally


Top Related