![Page 1: Peer-to-peer computing research a fad? - DTC · P2P: an exciting social development •Internet users cooperating to share, for example, music files •Napster, Gnutella, Morpheus,](https://reader033.vdocuments.site/reader033/viewer/2022060602/6056c4c45efe1144d1790456/html5/thumbnails/1.jpg)
Peer-to-peer computing research a fad?
Frans [email protected]
NSF Project IRIS
http://www.project-iris.net
Berkeley, ICSI, MIT, NYU, Rice
What is a P2P system?
• A distributed system architecture:
  • No centralized control
  • Nodes are symmetric in function
• Large number of unreliable nodes
• Enabled by technology improvements
[Figure: several nodes connected to one another through the Internet]
P2P: an exciting social development
• Internet users cooperating to share, for example, music files
  • Napster, Gnutella, Morpheus, KaZaA, etc.
• Lots of attention from the popular press
  • “The ultimate form of democracy on the Internet”
  • “The ultimate threat to copyright protection on the Internet”
How to build robust services?
• Many critical services use Internet
  • Hospitals, government agencies, etc.
• These services need to be robust
  • Node and communication failures
  • Load fluctuations (e.g., flash crowds)
  • Attacks (including DDoS)
Example: peer-to-peer data archiver
• Back up hard disk to other users’ machines
• Why?
  • Backup is usually difficult
  • Many machines have lots of spare disk space
• Requirements for cooperative archiver:
  • Divide data evenly among many computers
  • Find data
  • Don’t lose any data
  • High performance: backups are big
• More challenging than sharing music!
The promise of P2P computing
• Reliability: no central point of failure
  • Many replicas
  • Geographic distribution
• High capacity through parallelism:
  • Many disks
  • Many network connections
  • Many CPUs
• Automatic configuration
• Useful in public and proprietary settings
Traditional distributed computing: client/server
• Successful architecture, and will continue to be so
• Tremendous engineering necessary to make server farms scalable and robust
[Figure: several clients connecting to one server through the Internet]
Application-level overlays
• One per application
• Nodes are decentralized
• NOC is centralized
[Figure: overlay nodes (N) at Sites 1 to 4, connected across ISP1, ISP2, and ISP3]
P2P systems are overlay networks without central control
Distributed hash table (DHT)
[Figure: layered architecture, top to bottom]
Distributed application (Backup)
  put(key, data), get(key) → data
Distributed hash table (DHash)
  lookup(key) → node IP address
Lookup service (Chord)
  node, node, node, …
• DHT distributes data storage over perhaps millions of nodes
• Many applications can use the same DHT infrastructure
A DHT has a good interface
• Put(key, value) and get(key) → value
  • Simple interface!
• API supports a wide range of applications
  • DHT imposes no structure/meaning on keys
• Key/value pairs are persistent and global
  • Can store keys in other DHT values
  • And thus build complex data structures
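The put/get interface, and the idea of storing keys inside other values, can be sketched in a few lines. This is a single-process illustration only, not DHash: `ToyDHT`, `put_block`, `put_list`, and `read_list` are hypothetical names, and using the SHA-1 hash of a value as its key is an assumption made for the example.

```python
import hashlib
import json


class ToyDHT:
    """In-memory stand-in for a DHT: opaque keys map to opaque byte values."""

    def __init__(self):
        self._store = {}

    def put(self, key: str, value: bytes) -> None:
        self._store[key] = value

    def get(self, key: str) -> bytes:
        return self._store[key]


def put_block(dht: ToyDHT, value: bytes) -> str:
    """Store a value under the SHA-1 of its content and return that key."""
    key = hashlib.sha1(value).hexdigest()
    dht.put(key, value)
    return key


def put_list(dht: ToyDHT, items):
    """Build a linked list inside the DHT: each cell's value names the next key."""
    next_key = None
    for item in reversed(items):
        cell = json.dumps({"item": item, "next": next_key}).encode()
        next_key = put_block(dht, cell)
    return next_key  # key of the head cell


def read_list(dht: ToyDHT, head_key):
    """Follow the embedded keys from the head cell to the end of the list."""
    items, key = [], head_key
    while key is not None:
        cell = json.loads(dht.get(key))
        items.append(cell["item"])
        key = cell["next"]
    return items


dht = ToyDHT()
head = put_list(dht, ["a", "b", "c"])
restored = read_list(dht, head)
```

The DHT itself never interprets the values; only the application knows that a value contains another key.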
A DHT makes a good shared infrastructure
• Many applications can share one DHT service
  • Much as applications share the Internet
• Eases deployment of new applications
• Pools resources from many participants
  • Efficient due to statistical multiplexing
  • Fault-tolerant due to geographic distribution
Many applications for DHTs
• Streaming [SplitStream, …]
• Content distribution networks [Coral, Squirrel, …]
• File systems [CFS, OceanStore, PAST, Ivy, …]
• Archival/Backup store [HiveNet, Mojo, Pastiche]
• Censor-resistant stores [Eternity, FreeNet, …]
• DB query and indexing [PIER, …]
• Event notification [Scribe]
• Naming systems [ChordDNS, Twine, …]
• Communication primitives [I3, …]
Common thread: data is location-independent
DHT implementation challenges
1. Scalable lookup
2. Handling failures
3. Network-awareness for performance
4. Data integrity
5. Coping with systems in flux
6. Balance load (flash crowds)
7. Robustness with untrusted participants
8. Heterogeneity
9. Anonymity
10. Indexing
Goal: simple, provably-good algorithms
[Slide annotation: a “this talk” label marks the challenges covered here]
1. The lookup problem
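[Figure: a publisher calls put(key, data) and a client calls get(key); nodes N1 to N6 are spread across the Internet and it is unknown (“?”) which node holds the key]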
How do you find the node responsible for a key?
Centralized lookup (Napster)
• Central server knows where all keys are
• Simple, but O(N) state for server
• Server can be attacked (lawsuit killed Napster)
[Figure: the client sends Lookup(key) to a central DB, which answers “key is at N4”; N4 holds the key/value among nodes N1 to N9]
Flooding queries (Gnutella)
• Lookup by asking every node
• Robust but expensive: O(N) messages per lookup
[Figure: the client’s Lookup(key) is forwarded from node to node (N1 to N9) until it reaches the node holding the key/value]
Routed lookups
• Each node has a numeric identifier (ID)
• Route lookup(key) in ID space
[Figure: Client1 issues put(key=50, data=…) and Client2 issues get(key=50) on a network of nodes N5, N10, N12, N15, N25, N30, N45, N50, N60, N100]
Routing algorithm goals
• Fair (balanced) key range assignments
• Easy to maintain routing table
  • Dead nodes lead to timeouts
• Small number of hops to route message
• Low stretch
• Stay robust despite rapid change
• Solutions:
  • Small table and log(N) hops: CAN, Chord, Kademlia, Pastry, Tapestry, Koorde, etc.
  • Big table and one/two hops: Kelips, EpiChord, etc.
Chord key assignments
• Each node has 160-bit ID
• ID space is circular
• Data keys are also IDs
• A key is stored on the next higher node
  • Good load balance
  • Easy to find keys slowly
[Figure: circular ID space with nodes N32, N60, N90, N105 and keys K5, K20, K80; N90 is responsible for keys K61 through K90]
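The rule “a key is stored on the next higher node” on a circular ID space can be sketched as below. The small integer IDs are the ones in the figure; `successor` and `chord_id` are hypothetical helper names, and deriving 160-bit IDs with SHA-1 is an assumption for illustration (the slide only states the ID width).

```python
import hashlib
from bisect import bisect_left


def chord_id(name: bytes) -> int:
    """Assumed ID derivation: interpret a SHA-1 digest as a 160-bit integer."""
    return int.from_bytes(hashlib.sha1(name).digest(), "big")


def successor(node_ids, key):
    """Return the first node ID >= key, wrapping around the circle."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key)
    return ring[i % len(ring)]  # past the highest node, wrap to the lowest


nodes = [32, 60, 90, 105]
assignment = {k: successor(nodes, k) for k in (5, 20, 80)}
wrapped = successor(nodes, 110)
```

Here K5 and K20 land on N32 and K80 lands on N90, matching the figure; a key above N105 wraps around to N32.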
Chord’s routing table
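• Routing table lists nodes:
  • 1/2 way around circle
  • 1/4 way around circle
  • 1/8 way around circle
  • …
  • next around circle
• The table is small:
  • log N entries
[Figure: node N80 with routing-table entries pointing 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, and 1/128 of the way around the circle]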
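A routing table of this shape can be sketched as follows. The 7-bit ID space (128 positions) is an assumption chosen so that the entries fall 1/128, 1/64, …, 1/2 of the way around the circle as in the figure; the node IDs are borrowed from the N32/K19 lookup example in these slides, and `finger_table` is a hypothetical name.

```python
from bisect import bisect_left

M = 7          # assumed ID width in bits for illustration; the slides use 160-bit IDs
RING = 2 ** M  # number of positions on the circle


def successor(ring_nodes, key):
    """First node at or after key on the circle (ring_nodes must be sorted)."""
    i = bisect_left(ring_nodes, key % RING)
    return ring_nodes[i % len(ring_nodes)]


def finger_table(n, ring_nodes):
    """Entry i points at the first node at or after n + 2**i (mod RING)."""
    return [successor(ring_nodes, n + 2 ** i) for i in range(M)]


nodes = sorted([5, 10, 20, 32, 60, 80, 99, 110])
fingers = finger_table(80, nodes)
```

With only eight nodes, several entries point at the same node (N99); the table still has M entries, which is log of the ID-space size.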
Chord lookups take O(log N) hops
• Each step goes at least halfway to the destination
• Lookups are fast:
  • log N steps
[Figure: node N32 looks up key K19 on a ring with nodes N5, N10, N20, N32, N60, N80, N99, N110; K19 is stored at N20]
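The lookup in the figure (N32 looking up K19) can be simulated in one process. This is a sketch under assumptions: a 7-bit ID space, routing entries computed on demand from a global node list rather than stored per node, and hypothetical helper names (`in_open`, `closest_preceding`, `lookup`).

```python
from bisect import bisect_left

M = 7          # assumed ID width in bits; the slides use 160-bit IDs
RING = 2 ** M


def successor(ring_nodes, key):
    """First node at or after key on the circle (ring_nodes must be sorted)."""
    i = bisect_left(ring_nodes, key % RING)
    return ring_nodes[i % len(ring_nodes)]


def in_open(x, a, b):
    """True if x lies strictly between a and b when walking clockwise from a."""
    x, a, b = x % RING, a % RING, b % RING
    if a < b:
        return a < x < b
    return x > a or x < b  # the interval wraps past zero


def closest_preceding(n, key, ring_nodes):
    """Farthest routing entry of n that does not overshoot the key."""
    for i in reversed(range(M)):
        f = successor(ring_nodes, n + 2 ** i)
        if in_open(f, n, key):
            return f
    return n


def lookup(start, key, ring_nodes):
    """Return (node responsible for key, list of nodes visited)."""
    n, path = start, [start]
    while True:
        succ = successor(ring_nodes, n + 1)
        if in_open(key, n, succ) or key % RING == succ:
            return succ, path
        n = closest_preceding(n, key, ring_nodes)
        path.append(n)


nodes = sorted([5, 10, 20, 32, 60, 80, 99, 110])
owner, path = lookup(32, 19, nodes)
```

In this run the query visits N32, N99, N5, and N10 before N10 reports that its successor N20 is responsible for K19.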
2. Handling failures: redundancy
• Each node knows about next r nodes on circle
• Each key is stored by the r nodes after it on the circle
• To save space, each node stores only a piece of the block
• Collecting half the pieces is enough to reconstruct the block
[Figure: ring with nodes N5, N10, N20, N32, N40, N60, N80, N99, N110; K19 appears at three successive nodes (N20, N32, N40)]
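Placement on the r nodes after a key, and the “half the pieces is enough” rule, can be sketched as below. The code only tracks which nodes hold pieces; it does not implement an erasure code, and `replica_nodes` and `can_reconstruct` are hypothetical names.

```python
from bisect import bisect_left


def replica_nodes(ring_nodes, key, r):
    """The r nodes that follow key on the circle, starting with its successor."""
    ring = sorted(ring_nodes)
    start = bisect_left(ring, key) % len(ring)
    return [ring[(start + j) % len(ring)] for j in range(min(r, len(ring)))]


def can_reconstruct(holders, failed, needed):
    """True if at least `needed` of the piece holders are still alive."""
    alive = [h for h in holders if h not in failed]
    return len(alive) >= needed


nodes = [5, 10, 20, 32, 40, 60, 80, 99, 110]
k19_holders = replica_nodes(nodes, 19, 3)    # as in the figure
four_holders = replica_nodes(nodes, 19, 4)   # pieces on 4 nodes, any 2 suffice
ok_after_two_failures = can_reconstruct(four_holders, {20, 40}, needed=2)
ok_after_three_failures = can_reconstruct(four_holders, {20, 32, 40}, needed=2)
```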
Redundancy handles failures
• 1000 DHT nodes
• Average of 5 runs
• 6 replicas for each key
• Kill fraction of nodes
• Then measure how many lookups fail
• All replicas must be killed for lookup to fail
[Figure: plot of Failed Lookups (Fraction) against Failed Nodes (Fraction)]
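A rough model of this experiment: if each node is killed independently with probability p, a key with 6 replicas is lost only when all 6 holders are dead, which happens with probability p^6. The sketch below compares that estimate with a small simulation. It is not the authors' experiment; the independence assumption, the single run, the fixed seed, and the names `predicted_failure` and `simulate` are all assumptions for illustration.

```python
import random


def predicted_failure(p, replicas):
    """Chance that every replica holder is dead, assuming independent failures."""
    return p ** replicas


def simulate(n_nodes=1000, replicas=6, p=0.5, seed=0):
    """Kill a fraction p of nodes, then count keys whose replicas are all dead."""
    rng = random.Random(seed)
    dead = set(rng.sample(range(n_nodes), int(p * n_nodes)))
    failed = 0
    for first in range(n_nodes):  # one key per possible first holder
        holders = [(first + j) % n_nodes for j in range(replicas)]
        if all(h in dead for h in holders):
            failed += 1
    return failed / n_nodes


estimate = predicted_failure(0.5, 6)  # 0.5 ** 6 = 0.015625
measured = simulate()
```

Even with half of the nodes killed, the model predicts that fewer than 2% of lookups fail.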
3. Reducing Chord lookup delay
• Any node with a closer ID could be used in route
  • Knowing about proximity could help performance
  • Key challenge: avoid O(N^2) pings
• N20’s routing table may point to distant nodes
[Figure: node N20 and candidate next hops N40, N41, N80]
Estimate latency using synthetic coordinate system
• Each node estimates its position
  • Position = (x,y): “synthetic coordinates”
  • x and y units are time (milliseconds)
• Distance between two nodes’ coordinates predicts network latency
• Challenges: triangle inequality, etc.
Vivaldi synthetic coordinates
• Each node starts with a random incorrect position
• Each node “pings” a few other nodes to measure network latency (distance)
• Each node “moves” to cause measured distance to match coordinates
• Minimize force in spring network
[Figure: nodes joined by springs labeled 1 and 2 (measured latencies)]
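One “move” of this spring relaxation can be sketched as below: after a ping, the node shifts along the line to the remote node by a fraction of the difference between measured and predicted latency. The constant step size `delta`, the 2D coordinates, the fixed fallback direction for coincident nodes, and the name `vivaldi_update` are assumptions for illustration, not the published algorithm's exact parameters.

```python
import math


def vivaldi_update(me, remote, rtt_ms, delta=0.25):
    """Return my new (x, y) after one ping to `remote` with measured RTT."""
    dx, dy = me[0] - remote[0], me[1] - remote[1]
    dist = math.hypot(dx, dy)  # latency predicted by the coordinates
    if dist == 0.0:
        ux, uy = 1.0, 0.0      # arbitrary direction when the nodes coincide
    else:
        ux, uy = dx / dist, dy / dist
    error = rtt_ms - dist      # positive: the spring is compressed, push apart
    return (me[0] + delta * error * ux, me[1] + delta * error * uy)


# Two nodes whose coordinates are 5 ms apart but whose measured latency is 10 ms.
a, b = (0.0, 0.0), (3.0, 4.0)
first = vivaldi_update(a, b, 10.0)

for _ in range(20):            # the nodes take turns pinging each other
    a = vivaldi_update(a, b, 10.0)
    b = vivaldi_update(b, a, 10.0)
final_distance = math.hypot(a[0] - b[0], a[1] - b[1])
```

Each update removes a quarter of the remaining error, so the distance between the two coordinates approaches the measured 10 ms.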
Vivaldi in action
• Execution on 86 PlanetLab Internet hosts
• Each host only pings a few other random hosts
• Most hosts find useful coordinates after a few dozen pings
• Use 3D coordinates
Vivaldi vs. network coordinates
• Vivaldi’s coordinates match geography well
  • over-sea distances shrink (faster than over-land)
  • orientation of Australia and Europe is “wrong”
• Simulations confirm results for larger networks
Vivaldi predicts latency well
[Figure: scatter plot of actual latency (ms) against predicted latency (ms), both axes 0 to 600, with the line y = x and series labeled NYU and AUS]
DHT fetches with Vivaldi (1)
• Choose the lowest-latency copy of data
[Figure: bar chart of fetch time (ms, 0 to 500), split into Lookup and Download, for “Without Vivaldi” and “With Replica Selection”]
DHT fetches with Vivaldi (2)
• Choose the lowest-latency copy of data
• Route Chord lookup through nearby nodes
[Figure: bar chart of fetch time (ms, 0 to 500), split into Lookup and Download, for “Without Vivaldi”, “With Replica Selection”, and “With Proximity Routing”]
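Choosing the lowest-latency copy needs no extra pings once every node has coordinates: the client compares predicted distances. A minimal sketch, assuming 2D coordinates in milliseconds; the coordinate values and the name `choose_replica` are hypothetical.

```python
import math


def predicted_latency(a, b):
    """Latency predicted by the coordinates: Euclidean distance in ms."""
    return math.dist(a, b)


def choose_replica(my_coord, replica_coords):
    """Name of the replica holder with the lowest predicted latency."""
    return min(
        replica_coords,
        key=lambda name: predicted_latency(my_coord, replica_coords[name]),
    )


me = (0.0, 0.0)
replicas = {
    "N20": (30.0, 40.0),   # predicted 50 ms
    "N32": (6.0, 8.0),     # predicted 10 ms
    "N40": (60.0, 80.0),   # predicted 100 ms
}
best = choose_replica(me, replicas)
```

The same comparison could be applied to candidate next hops during a Chord lookup, which is the proximity-routing idea on this slide.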
DHT implementation summary
• Chord for looking up keys
• Replication at successors for fault tolerance
• Fragmentation and erasure coding to reduce storage space
• Vivaldi network coordinate system for
  • Server selection
  • Proximity routing
Backup system on DHT
• Store file system image snapshots as hash trees
  • Can access daily images directly
  • Yet images share storage for common blocks
  • Only incremental storage cost
  • Encrypt data
• User-level NFS server parses file system images to present dump hierarchy
• Application is ignorant of DHT challenges
  • DHT is just a reliable block store
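Why snapshots stored as hash trees share storage can be seen in a small sketch: every block is keyed by the hash of its content, so an unchanged file maps to a key that is already stored. This is an illustration only. A plain dict stands in for the DHT, the tree has a single level, encryption is omitted, and `put_block` and `snapshot` are hypothetical names.

```python
import hashlib


def put_block(store, data: bytes) -> str:
    """Store a block under the SHA-1 of its content; identical blocks coincide."""
    key = hashlib.sha1(data).hexdigest()
    store[key] = data
    return key


def snapshot(store, files):
    """Store each file as a block plus a root block listing name/key pairs."""
    entries = []
    for name in sorted(files):
        entries.append(f"{name} {put_block(store, files[name])}")
    return put_block(store, "\n".join(entries).encode())  # root key


store = {}  # stand-in for the DHT block store
day1 = snapshot(store, {"a.txt": b"alpha", "b.txt": b"beta"})
blocks_after_day1 = len(store)
day2 = snapshot(store, {"a.txt": b"alpha", "b.txt": b"beta v2"})
blocks_after_day2 = len(store)
```

The second snapshot adds only two blocks (the changed file and a new root); the unchanged file is shared, and both root keys remain usable for reading their day's image.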
Future work
DHTs
• Improve performance
• Handle untrusted nodes
Vivaldi
• Does it scale to larger and more diverse networks?
Apps
• Need lots of interesting applications
Philosophical questions
• How decentralized should systems be?
  • Gnutella versus content distribution network
  • Have a bit of both? (e.g., CDNs)
• Why does the distributed systems community have more problems with decentralized systems than the networking community?
  • “A distributed system is a system in which a computer you don’t know about renders your own computer unusable”
  • Internet (BGP, NetNews)
Related Work
Lookup algs
• CAN, Kademlia, Koorde, Pastry, Tapestry, Viceroy, …
DHTs
• DOLR, Past, OpenHash, …
Network coordinates and springs
• GNP, Hoppe’s mesh relaxation
Applications
• Ivy, OceanStore, Pastiche, Twine, …
Conclusions
• Peer-to-peer promises some great properties
• DHTs are a good way to build peer-to-peer applications:
  • Easy to program
  • Single, shared infrastructure for many applications
  • Robust in the face of failures
  • Scalable to large number of servers
  • Self configuring
  • Can provide high performance
http://www.project-iris.net