Navigating and Sharing in a Decentralized World
Francisco Matias Cuenca-Acuna
http://www.panic-lab.rutgers.edu/
People
• Graduate students
  − Christopher Peery
  − Konstantinos Kleisouris
• Faculty
  − Richard Martin
  − Thu Nguyen
Federated computing
• Current trend toward ubiquitous Internet connectivity is driving a new model of federated computing
  − Computing systems that are geographically distributed and may span multiple organizations
• Concurrently, deep penetration of computer usage
  − 500 million PCs in operation worldwide (1 for every 12 people)
  − 80% of them are in desktops
  − 40% annual growth
  − 600 million Internet users worldwide
• Federated computing appearing at every level
  − Social group-based sharing
    − P2P: Gnutella, KaZaA, DirectConnect
    − Web-based: Ebay, Google groups, Yahoo groups, DMOZ
  − Scientific computing
    − Many emerging research grids: http://www.gpds.org/
  − Business-to-business ecommerce
Source http://news.com.com/2100-1040-940713.html
The challenge
• Federated computing provides the opportunity to harness vast amounts of resources
  − Consider just data sharing
    − Users produce 740TB of information per year
    − Information per person is growing continuously
    − 80% annual growth in total disk capacity sold per year
• Emergence of huge distributed data repositories
  − Local community of 3000 undergraduate students sharing 20TB of data
  − WWW: Google had indexed 1 billion pages (20TB of content) by 2000
  − The European Data Grid has only hundreds of nodes but petabytes of data
• Challenge: how to manage and actually use these resources
  − Decentralized control
  − Widely distributed
  − Heterogeneous components
Source http://www.sims.berkeley.edu/research/projects/how-much-info/
The PlanetP Project
• Information and resource management for networked communities
  − Data sharing
    − Provide content-based access & ranking of results
    − Allow users to cooperatively organize data
    − Provide predictable data availability
  − Deployment, monitoring, and management of federated services
    − Provide a common runtime environment
    − Follow sysadmin guidelines for service deployment in a distributed fashion
    − Example: UDDI naming service for web services
• Dealing with decentralization
  − Self-management & self-configuration
  − Autonomous cooperation
  − Loosely synchronized global information
  − Randomized algorithms
Current state of the project
Data propagation
Content indexing and ranking
Automatic replication for availability
Global namespace and storage management
Service management
Current state of the project: Data propagation
• Based on epidemic communication
• Very resilient to node/network failures
• Membership management
• Every node has a loosely synchronized view of the community
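To make the idea of epidemic communication concrete, here is a minimal sketch of push-style gossip. It is an illustration only, not PlanetP's actual protocol: the function names, the versioned key/value view, and the choice of one random peer per round are all assumptions made for this example.

```python
import random

def gossip_round(views, rng):
    """One push round: every node sends its view to one random peer, which merges it."""
    nodes = list(views)
    for sender in nodes:
        # Assumption: any other node can be picked uniformly at random.
        peer = rng.choice([n for n in nodes if n != sender])
        # Merge by keeping the highest version seen for each key.
        for key, version in list(views[sender].items()):
            if version > views[peer].get(key, -1):
                views[peer][key] = version

def rounds_until_converged(num_nodes, seed=0, max_rounds=1000):
    """Count gossip rounds until every node has heard a fact first known to node 0.

    Requires num_nodes >= 2. Returns None if max_rounds is exhausted.
    """
    rng = random.Random(seed)
    views = {n: {} for n in range(num_nodes)}
    views[0]["node-0-joined"] = 1  # hypothetical membership event
    for round_number in range(1, max_rounds + 1):
        gossip_round(views, rng)
        if all(v.get("node-0-joined") == 1 for v in views.values()):
            return round_number
    return None

if __name__ == "__main__":
    print(rounds_until_converged(32))
```

Because each node only ever talks to random peers, no single failure blocks propagation, and every node ends up with an eventually consistent (loosely synchronized) copy of the shared view.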
Current state of the project: Content indexing and ranking
• Distributed information ranking algorithm
• Allows search-engine-like queries
• Two-step search & rank to deal with outdated information
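The two-step idea can be sketched as follows: first use the loosely synchronized (and possibly stale) global summaries to decide which peers to contact, then let each contacted peer rank its current documents. This is a sketch under assumptions, not the project's algorithm: the `directory` and `peers` structures are hypothetical, and the plain term-count score is a stand-in for whatever ranking function the system actually uses.

```python
from collections import Counter

def candidate_peers(directory, query_terms):
    """Step 1: use the (possibly stale) gossiped term summaries to pick peers."""
    wanted = set(query_terms)
    return [peer for peer, terms in directory.items() if terms & wanted]

def local_rank(documents, query_terms):
    """Step 2, run on a peer: score its current documents by query-term frequency."""
    scores = {}
    for name, text in documents.items():
        counts = Counter(text.lower().split())
        score = sum(counts[term] for term in query_terms)
        if score > 0:
            scores[name] = score
    return scores

def search(directory, peers, query_terms, top_k=3):
    """Contact candidate peers and merge their fresh local rankings."""
    merged = {}
    for peer in candidate_peers(directory, query_terms):
        for name, score in local_rank(peers.get(peer, {}), query_terms).items():
            merged[(peer, name)] = score
    # Highest score first; ties broken by (peer, document) name.
    return sorted(merged, key=lambda key: (-merged[key], key))[:top_k]

if __name__ == "__main__":
    # Peer "b" is still advertised for "gossip" but has since removed the document.
    directory = {"a": {"gossip", "ranking"}, "b": {"gossip"}, "c": {"storage"}}
    peers = {"a": {"d1": "gossip gossip ranking"}, "b": {}, "c": {"d3": "storage"}}
    print(search(directory, peers, ["gossip"]))
```

The second step is what absorbs outdated information: a peer that is wrongly advertised in the global view simply returns no results when asked directly.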
Current state of the project: Automatic replication for availability
• Allow users to specify data availability
• Present a probabilistic availability model
• Monitor availability as the community changes
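As a simple illustration of what a probabilistic availability model can look like, the sketch below assumes every node is online independently with the same probability, so a file with k replicas is reachable with probability 1 - (1 - p)^k. The independence assumption, the uniform uptime, and the function names are assumptions for this example; the slides do not spell out the model actually used.

```python
def availability(node_uptime, replicas):
    """Probability that at least one of `replicas` independent copies is online."""
    return 1.0 - (1.0 - node_uptime) ** replicas

def replicas_needed(node_uptime, target):
    """Smallest replica count whose availability meets the user's target."""
    if not 0.0 < node_uptime <= 1.0:
        raise ValueError("node_uptime must be in (0, 1]")
    if not 0.0 <= target < 1.0:
        raise ValueError("target must be in [0, 1)")
    count = 1
    while availability(node_uptime, count) < target:
        count += 1
    return count

if __name__ == "__main__":
    # With nodes online half the time, how many copies give 99% availability?
    print(replicas_needed(0.5, 0.99))
```

Monitoring then amounts to re-evaluating this calculation as measured uptimes and membership change, and adding replicas when the estimate drops below the target.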
Work in progress: Global namespace and storage management
• File system interface over communal content
• Unlike the Web, the namespace is writable
• Dynamic namespace management
• Automated local storage management
  − Remove local content if we can recover it
  − Hoarding for disconnected operation
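One way to picture "remove content if we can recover it" is an eviction pass that only drops files with enough copies elsewhere and never touches hoarded files. The sketch below is hypothetical: the record fields, the `min_remote_copies` threshold, and the ordering heuristic are assumptions for illustration, not the project's design.

```python
def choose_evictions(files, bytes_to_free, min_remote_copies=2):
    """Pick local files to drop, preferring those with the most copies elsewhere.

    Each file is a dict with keys: name, size, remote_copies, hoarded.
    Hoarded files are kept for disconnected operation and are never evicted.
    """
    candidates = [
        f for f in files
        if not f["hoarded"] and f["remote_copies"] >= min_remote_copies
    ]
    # Most widely replicated first; larger files first among equals.
    candidates.sort(key=lambda f: (-f["remote_copies"], -f["size"]))
    evicted, freed = [], 0
    for f in candidates:
        if freed >= bytes_to_free:
            break
        evicted.append(f["name"])
        freed += f["size"]
    return evicted, freed

if __name__ == "__main__":
    files = [
        {"name": "a", "size": 100, "remote_copies": 3, "hoarded": False},
        {"name": "b", "size": 50, "remote_copies": 0, "hoarded": False},
        {"name": "c", "size": 200, "remote_copies": 5, "hoarded": True},
        {"name": "d", "size": 80, "remote_copies": 2, "hoarded": False},
    ]
    print(choose_evictions(files, 150))
```

File "b" is kept because no other copy exists, and "c" is kept because the user has hoarded it, even though it is the most widely replicated.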
Work in progress: Service management
• Distributed runtime for Web Services
• Administrators just dictate the policy
• They reason about
  − capacity
  − availability
  − privacy issues
• Provide self-deployment and monitoring
The PlanetP Project http://www.panic-lab.rutgers.edu/
Questions?