wide area data sharing with logistical networking micah beck, assoc. prof. & director logistical...
TRANSCRIPT
Wide Area Data Sharing with Logistical Networking
Micah Beck, Assoc. Prof. & DirectorLogistical Computing & Internetworking (LoCI) LabComputer Science Department
End-to-End Workshop, Miami Feb 5, 2003
» Funding• Dept. of Energy
SciDAC• National Science
Foundation ANIR• UT Center for Info
Technology Research
» LoCI Lab developers
Logistical Networking Research
» University of Tennessee• Micah Beck• James S. Plank• Jack Dongarra
» University of California, Santa Barbara• Rich Wolski
The Data Sharing Problem
» Large data objects created as byproducts of common operations
» A large community of potential collaborators that might need access to the data
» Asynchrony between collaborators (especially when in different time zones)
» No single administrative domain» No centrally managed resource pool (DB or FS)» Control of access to data is necessary
The Internet in Collaboration
» Providing bandwidth resources on demand» Network community is fluid, loosely organized» Any two endpoints can communicate» User authentication, security are managed by
endpoints» But: there is no persistence, hence no direct
support for asynchronous collaboration!
Logistical Networking in Collaboration
» Adding persistence of data to the network while maintaining other important properties:• Resources available to any community member
»10TB now, 50TB by 2003, 100TB-1PB goal• No centralized administrative domain
»Each allocation is individually managed• No central management of resource pool
»Allocations are time limited»Storage reclaimed when they expire!
• Access control, security managed by endpoints.
The Network Storage Stack
Applications
Logistical File System
Logistical Tools
L-Bone
IBP
Local Access
Physical
exNode
• Our adaption of the network stack architecture for storage
• Like the IP Stack
• Each level encapsulates details from the lower levels, while still exposing details to higher levels
IBP: The Internet Backplane Protocol
» Storage provisioned on community “depots”» Very primitive service (similar to block service, but
more sharable)• Goal is to be a common platform (exposed)• Also part of end-to-end design
» Best effort service – no heroic measures• Availability, reliability, security, performance
» Allocations are time-limited!• Leases are respected, can be renewed• Permanent storage is to strong to share!
Models of Sharing: Logistical Networking
» Moderately valuable resources• Storage, server cycles
» Sharing enabled by relative plenty
» Internet-like policies• Loose access control• No per-use accounting
» Primary design goal: scalability• Application autonomy• Resource
transparency» Burdens of scalability
• The End-to-End Principles
• Weak operation semantics
• Vulnerability to Denial of Service
The Network Storage Stack
The L-bone:Resource Discovery& Proximity queries
IBP: Allocating and managing networkstorage (like a network malloc)
The exNode:A data structurefor aggregation
LoRS: The Logistical Runtime System:Aggregation tools and methodologies
The Logistical Backbone (L-Bone)
» LDAP-based storage resource discovery.
» Query by capacity, network proximity, geographical proximity, stability, etc.
» Periodic monitoring of depots.
» Currently10 Terabytes of shared storage.• 50 TB awarded, 100TB proposed • Our goal is 1PB global total
L-Bone: January 2003
The Network Storage Stack
The L-bone:Resource Discovery& Proximity queries
IBP: Allocating and managing networkstorage (like a network malloc)
The exNode:A data structurefor aggregation
LoRS: The Logistical Runtime System:Aggregation tools and methodologies
The exNode
» The Network “File Descriptor» XML-based data structure/serialization» Map byte-extents to IBP buffers (or other
allocations).» Allows for replication, flexible decomposition of
data.» Also allows for error-correction/checksums» Arbitrary metadata.
ExNode vs inode
exNode
inode
IBP Allocations
the network
local system
disk blocks
kernel
capabilities
block addresses
user
ExNode Mobility
XML Serialization
The exNode serialization is a portable soft link
The Network Storage Stack
The L-bone:Resource Discovery& Proximity queries
IBP: Allocating and managing networkstorage (like a network malloc)
The exNode:A data structurefor aggregation
LoRS: The Logistical Runtime System:Aggregation tools and methodologies
Logistical Runtime System
» Basic Primitives:• Upload, Download, Augment, Refresh
» End-to-end Services• Checksums, Encryption, Compression
» Other Things We Can Do• Routing through an intermediate depot to
reduce IP RTT, speeding up TCP transfers• Overlay multicast using either multiple
TCP streams or IP multicast at tree nodes» What’s missing?
• Management by Applications!
Upload
Augment
Download
Routing through Intermediate Depots
IBP Enables Data Intensive Collaboration
» Large files can be uploaded to nearby depots, then managed by movement between depots• End systems are not involved in long distance
transfers» Data can be moved near to distant collaborator
without being downloaded into their end system• Direct access to collaborators private storage is
not required» Depot-to-depot transfers can take advantage of
multithreading, UDP transfer, Net/Web 100, other high-performance optimizations
The Next Step: Computation!
» Depots can store data, but cannot compute, e.g.• Recomputing checksums for stored data would
help maintain redundancy• Operations such as XOR required to recover
redundantly stored data in case of loss» The Network Functional Unit is an extension of the
depot that operates on stored data• NFU operations are limited, cannot access data
outside of depot• Management of “process state” must be
performed at end systems.
LoCI Lab Online http://loci.cs.utk.edu
» IBP server and clients for Unix/Linux/OS X• Additional clients for Java, Win32
» Logistical Runtime System libraries and tools• Run under Unix/Linux/OS X natively• Ported to Windows under Cygwin • Includes visualization (Tcl/tk)• Web interface
» Logistical Backbone resource discovery server• Unix/Linux/OS X only
» Publications, documentation, L-Bone status