TRANSCRIPT
1
Peer-peer and Application-level Networking
Don Towsley, UMass-Amherst
with the help of lots of others (J. Kurose, B. Levine, J. Crowcroft, CMPSCI 791N class)
Don Towsley 2002
2
0. Introduction
background
motivation
outline of the tutorial
3
Peer-peer networking
4
Peer-peer networking: focus at the application level
5
Peer-peer networking: peer-peer applications
Napster, Gnutella, CAN: file sharing
ad hoc networks
multicast overlays (e.g., video distribution)
6
Peer-peer networking
Q: What are the new technical challenges?
Q: What new services/applications are enabled?
Q: Is it just “networking at the application-level”?
“There is nothing new under the sun” (Ecclesiastes)
7
Tutorial Contents
introduction: client-server v. P2P architectures
centralized search: Napster
distributed search – flooding: Gnutella
distributed search – hashing: CAN, Chord, …
application-level multicast
research issues: security, modeling, more general applications
summary
8
Client Server v. Peer to Peer (1)
RPC/RMI: synchronous, asymmetric; emphasis on language integration and binding models (stub IDL/XDR compilers, etc.); Kerberos-style security – access control, crypto
messages: asynchronous, symmetric; emphasis on service location, content addressing, application-layer routing; anonymity, high availability, integrity; harder to get right ☺
9
Peer-to-peer systems are actually old
IP routers are peer to peer:
routers discover topology, and maintain it
routers are neither client nor server
routers continually talk to each other
routers are inherently fault tolerant
routers are autonomous
10
Peer to peer systems
nodes have no distinguished role
no single point of bottleneck or failure
need distributed algorithms for:
service discovery (name, address, route, metric, etc.)
neighbour status tracking
application-layer routing (based possibly on content, interest, etc.)
resilience, handling link and node failures
…
11
Ad hoc networks and peer2peer
wireless ad hoc networks have many similarities to peer-to-peer systems:
no a priori knowledge
no given infrastructure
have to construct it from “thin air”!
12
Overlays and peer 2 peer systems
P2P technology often used to create overlays offering services that could be offered at the IP level
useful deployment strategy
often economically a way around other barriers to deployment
IP was an overlay (on telephone core infrastructure)
not all overlays are P2P (Akamai)
13
P2P Architecture Classification
centralized service location (CSL): Napster
distributed service location with flooding (DSLF): Gnutella
distributed service location with hashing (DSLH): CAN, Pastry, Tapestry, Chord
14
Centralized Search Architecture
centralized directory service
search directory
[diagram: peers A–E query the directory; “Lord of the Rings?” returns “A, C”]
15
NAPSTER
the most (in)famous
not the first (cf. probably Eternity, from Ross Anderson in Cambridge)
but instructive for what it gets right, and also wrong…
also has a political message… and economic and legal…
16
Napster
program for sharing files over the Internet
a “disruptive” application/technology?
history:
5/99: Shawn Fanning (freshman, Northeastern U.) founds Napster Online music service
12/99: first lawsuit
3/00: 25% of UWisc traffic is Napster
2/01: US Circuit Court of Appeals: Napster knew users violating copyright laws
7/01: # simultaneous online users: Napster 160K, Gnutella 40K, Morpheus 300K
17
judge orders Napster to stop in July ’01; other filesharing apps take over!
[chart: traffic in bits per sec (0–8M) over time for gnutella, napster, fastrack]
18
Napster: how does it work
Application-level, client-server protocol over point-to-point TCP
Four steps:
1. connect to Napster server
2. upload your list of files (push) to server
3. give server keywords to search the full list with
4. select “best” of correct answers (pings)
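These four steps amount to a centralized index with peer-to-peer transfer. A minimal sketch of that pattern in Python (all names hypothetical; this is not the real Napster wire protocol):

```python
import os

def keywords(filename):
    """Split a filename (minus extension) into lowercase search words."""
    stem, _ = os.path.splitext(filename.lower())
    return stem.split()

class IndexServer:
    """Hypothetical central index: keyword -> set of (host, filename)."""
    def __init__(self):
        self.index = {}

    def upload_list(self, host, files):          # step 2: peer pushes its file list
        for f in files:
            for word in keywords(f):
                self.index.setdefault(word, set()).add((host, f))

    def search(self, keyword):                   # step 3: keyword search at server
        return self.index.get(keyword.lower(), set())

server = IndexServer()
server.upload_list("A", ["Lord of the Rings.mp3"])
server.upload_list("C", ["the lord of the rings.mp3"])
print(server.search("rings"))                    # step 4: ping each hit, pick best
```
Note that only the lookup is centralized; the file itself moves peer-to-peer.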
19
Napster
napster.com
users
File list is uploaded
1.
20
Napster
napster.com
user
Request and results
User requests search at server.
2.
21
Napster
napster.com
user
pings
User pings hosts that apparently have data.
Looks for best transfer rate.
3.
22
Napster
napster.com
user
Retrieves file
User retrieves file
4.
23
Napster: architecture notes
centralized server: single logical point of failure
can load balance among servers using DNS rotation
potential for congestion
Napster “in control” (freedom is an illusion)
no security: passwords in plain text, no authentication, no anonymity
24
Distributed Search/Flooding
25
Distributed Search/Flooding
26
Gnutella
peer-to-peer networking: applications connect to peer applications
focus: decentralized method of searching for files
each application instance serves to:
store selected files
route queries (file searches) from and to its neighboring peers
respond to queries (serve file) if file stored locally
27
Gnutella
Gnutella history:
3/14/00: release by AOL, almost immediately withdrawn
too late: 23K users on Gnutella by 8 AM
many iterations to fix poor initial design (poor design turned many people off)
what we care about:
how much traffic does one query generate?
how many hosts can it support at once?
what is the latency associated with querying?
is there a bottleneck?
28
Gnutella: how it works
Searching by flooding:
if you don’t have the file you want, query 7 of your partners
if they don’t have it, they contact 7 of their partners, for a maximum hop count of 10
requests are flooded, but there is no tree structure
no looping, but packets may be received twice
reverse path forwarding (?)
Note: Play gnutella animation at: http://www.limewire.com/index.jsp/p2p
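A minimal sketch of this flooded search, assuming the overlay is given as an adjacency map (names illustrative; real Gnutella drops duplicate descriptor IDs per peer rather than keeping one global set):

```python
from collections import deque

def flooded_search(graph, start, have_file, ttl=10):
    """graph: peer -> list of neighbor peers; have_file: peer -> bool.
    Floods a query with a max hop count (TTL); a 'seen' set gives
    loop prevention, so the flood terminates despite cycles."""
    hits, seen = [], {start}
    queue = deque([(start, ttl)])
    while queue:
        peer, hops_left = queue.popleft()
        if have_file(peer):
            hits.append(peer)            # query-hit travels back on reverse path
        if hops_left == 0:
            continue
        for nbr in graph[peer]:
            if nbr not in seen:          # drop already-seen peers
                seen.add(nbr)
                queue.append((nbr, hops_left - 1))
    return hits

overlay = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
print(flooded_search(overlay, "A", lambda p: p == "D", ttl=10))
```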
29
Flooding in Gnutella: loop prevention
Seen already list: “A”
30
Gnutella: initial problems and fixes
freeloading: WWW sites offering search/retrieval from Gnutella network without providing file sharing or query routing
fix: block file-serving to browser-based non-file-sharing users
prematurely terminated downloads:
long download times over modems
modem users run gnutella peer only briefly (Napster problem also!), or any user becomes overloaded
fix: peer can reply “I have it, but I am busy. Try again later”
late 2000: only 10% of downloads succeed
2001: more than 25% of downloads successful (is this success or failure?)
www.limewire.com/index.jsp/net_improvements
31
Gnutella: initial problems and fixes (more)
2000: avg size of reachable network only 400-800 hosts. Why so small?
modem users: not enough bandwidth to provide search routing capabilities: routing black holes
fix: create peer hierarchy based on capabilities
previously: all peers identical, most modem blackholes
connection preferencing:
• favors routing to well-connected peers
• favors reply to clients that themselves serve a large number of files: prevents freeloading
Limewire gateway functions as Napster-like central server on behalf of other peers (for searching purposes)
www.limewire.com/index.jsp/net_improvements
32
Gnutella Discussion:
architectural lessons learned?
anonymity and security?
other?
good source for technical info/open questions: http://www.limewire.com/index.jsp/tech_papers
33
Kazaa
hierarchical Gnutella: supernodes and regular nodes
most popular p2p app: >120M downloads
not well understood: binaries, encrypted communications
[diagram: supernodes]
34
Internet-scale hash tables
hash tables: essential building block in software systems
Internet-scale distributed hash tables: equally valuable to large-scale distributed systems?
• peer-to-peer systems – CAN, Chord, Pastry, …
• large-scale storage management systems – Publius, OceanStore, CFS …
• mirroring on the Web
36
Content-Addressable Network [Ratnasamy et al.]
introduction
design
evaluation
strengths & weaknesses
ongoing work
37
Content-Addressable Network (CAN)
CAN: Internet-scale hash table
interface:
insert(key,value)
value = retrieve(key)
38
Content-Addressable Network (CAN)
CAN: Internet-scale hash table
interface:
insert(key,value)
value = retrieve(key)
properties:
scalable
operationally simple
good performance (w/ improvement)
39
Outline
introduction
design
evaluation
strengths & weaknesses
ongoing work
40
CAN: basic idea
[diagram: (K,V) pairs spread across all nodes of the system]
41
CAN: basic idea
insert(K1,V1)
[diagram: the insert request enters the system of nodes]
43
CAN: basic idea
(K1,V1)
[diagram: (K1,V1) now stored at one node]
44
CAN: basic idea
retrieve(K1)
[diagram: retrieve(K1) routed to the node holding (K1,V1)]
45
CAN: solution
virtual Cartesian coordinate space
entire space is partitioned amongst all the nodes
every node “owns” a zone in the overall space
abstraction:
can store data at “points” in the space
can route from one “point” to another
point = node that owns the enclosing zone
46
CAN: simple example
[diagram: node 1 owns the entire coordinate space (unit square)]
47
CAN: simple example
[diagram: node 2 joins; space split between nodes 1 and 2]
48
CAN: simple example
[diagram: node 3 joins; space now split into three zones]
49
CAN: simple example
[diagram: node 4 joins; four zones]
50
CAN: simple example
[diagram: space partitioned among more nodes]
51
CAN: simple example
[diagram: node I highlighted]
52
CAN: simple example
node I::insert(K,V)
53
CAN: simple example
node I::insert(K,V)
(1) a = hx(K)
54
CAN: simple example
node I::insert(K,V)
(1) a = hx(K), b = hy(K)
55
CAN: simple example
node I::insert(K,V)
(1) a = hx(K), b = hy(K)
(2) route(K,V) -> (a,b)
56
CAN: simple example
node I::insert(K,V)
(1) a = hx(K), b = hy(K)
(2) route(K,V) -> (a,b)
(3) (a,b) stores (K,V)
57
CAN: simple example
node J::retrieve(K)
(1) a = hx(K), b = hy(K)
(2) route “retrieve(K)” to (a,b)
[diagram: (K,V) stored at (a,b)]
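A small sketch of step (1): two salted hashes stand in for hx and hy, mapping a key to a point in the unit square (function names and the SHA-1 choice are assumptions, not CAN's specified hash):

```python
import hashlib

def h(key: str, axis: str) -> float:
    """Hash a key into [0, 1) for one coordinate axis (hx, hy via a salt)."""
    digest = hashlib.sha1((axis + ":" + key).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def key_to_point(key: str):
    return (h(key, "x"), h(key, "y"))    # (a, b) = (hx(K), hy(K))

a, b = key_to_point("K1")
# insert: route (K1,V1) to the zone containing (a,b); that node stores it.
# retrieve: any node recomputes (a,b) and routes "retrieve(K1)" there.
```
Because the mapping is deterministic, any node can recompute the same point and find the data without a directory.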
58
Data stored in the CAN is addressed by name (i.e. key), not location (i.e. IP address)
CAN
59
CAN: routing table
60
CAN: routing
[diagram: route from a source zone (x,y) toward destination point (a,b) through intermediate zones]
61
CAN: routing
A node only maintains state for its immediate neighboring nodes
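A sketch of this greedy, neighbor-only forwarding, assuming each node record holds just its zone center and neighbor list (the data layout is illustrative):

```python
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def route(nodes, start, target):
    """nodes: id -> {"center": (x, y), "neighbors": [ids]}.
    Each hop forwards to the neighbor whose zone center is closest to
    the target point; stops when no neighbor improves on the current node."""
    path, current = [start], start
    while True:
        here = nodes[current]["center"]
        best = min(nodes[current]["neighbors"],
                   key=lambda n: dist(nodes[n]["center"], target),
                   default=None)
        if best is None or dist(nodes[best]["center"], target) >= dist(here, target):
            return path                  # current node owns the enclosing zone
        current = best
        path.append(current)
```
Only per-neighbor state is needed, which is what keeps the routing table O(d).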
62
CAN: node insertion
[diagram: new node contacts bootstrap node]
1) discover some node “I” already in CAN
64
CAN: node insertion
2) pick random point (p,q) in space
65
CAN: node insertion
3) I routes to (p,q), discovers node J
66
CAN: node insertion
4) split J’s zone in half… new node owns one half
67
CAN: node insertion
Inserting a new node affects only a single other node and its immediate neighbors
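A sketch of the step-4 zone split, assuming rectangular zones halved along their longer side (an assumption for illustration; CAN actually cycles through dimensions in a fixed order):

```python
def split_zone(zone):
    """zone: ((x0, x1), (y0, y1)). Halve along the longer side; the joining
    node takes one half, and (key,value) pairs whose points fall in that
    half move with it."""
    (x0, x1), (y0, y1) = zone
    if x1 - x0 >= y1 - y0:
        mid = (x0 + x1) / 2
        return ((x0, mid), (y0, y1)), ((mid, x1), (y0, y1))
    mid = (y0 + y1) / 2
    return ((x0, x1), (y0, mid)), ((x0, x1), (mid, y1))

j_half, new_half = split_zone(((0.0, 1.0), (0.0, 1.0)))
print(j_half, new_half)
```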
68
CAN: node failures
Need to repair the space
recover database (weak point)
• soft-state updates
• use replication, rebuild database from replicas
repair routing
• takeover algorithm
69
CAN: takeover algorithm
simple failures:
know your neighbor’s neighbors
when a node fails, one of its neighbors takes over its zone
more complex failure modes:
simultaneous failure of multiple adjacent nodes
scoped flooding to discover neighbors
hopefully, a rare event
70
CAN: node failures
Only the failed node’s immediate neighbors are required for recovery
71
Design recap
basic CAN:
completely distributed
self-organizing
nodes only maintain state for their immediate neighbors
additional design features:
multiple, independent spaces (realities)
background load balancing algorithm
simple heuristics to improve performance
72
Evaluation
scalability
low-latency
load balancing
robustness
73
CAN: scalability
for a uniformly partitioned space with n nodes and d dimensions:
per node, number of neighbors is 2d
average routing path is (d/4)·n^(1/d) hops
simulations show that the above results hold in practice
can scale the network without increasing per-node state
optimal choice of d for given n yields ~log(n) neighbors with ~log(n) hops
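These formulas are easy to tabulate; a few lines of Python showing the state/path-length tradeoff for a million-node CAN:

```python
def can_stats(n, d):
    """Per-node neighbor count 2d and average path (d/4) * n**(1/d)."""
    return {"neighbors": 2 * d, "avg_hops": round((d / 4) * n ** (1 / d), 1)}

for d in (2, 4, 8, 16):
    print(d, can_stats(2 ** 20, d))
# higher d: a bit more neighbor state, far fewer hops
# (for d ~ log n, both are ~log n)
```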
74
CAN: low-latency
[charts: latency stretch vs. #nodes (16K, 32K, 65K, 131K), w/o heuristics and w/ heuristics, for #dimensions = 2 and #dimensions = 10]
75
CAN: load balancing
dealing with hot-spots (popular (key,value) pairs):
nodes cache recently requested entries
overloaded node replicates popular entries at neighbors
uniform coordinate space partitioning:
uniformly spreads (key,value) entries
uniformly spreads out routing load
76
Uniform Partitioning
added check at join time:
pick a zone
check neighboring zones
pick the largest zone and split that one
77
CAN: node insertion
[diagram: new node arrives at (p,q); neighboring zones are checked and the largest is split]
80
Uniform Partitioning
[chart: percentage of nodes vs. zone volume (V/16 through 8V, where V = total volume / n), w/o check and w/ check; 65,000 nodes, 3 dimensions]
81
CAN: Robustness
completely distributed: no single point of failure (not applicable to pieces of the database when a node failure happens)
not exploring database recovery (in case there are multiple copies of the database)
resilience of routing: can route around trouble
82
Strengths
more resilient than flooding broadcast networks
efficient at locating information
fault-tolerant routing
node & data high availability (w/ improvement)
manageable routing table size & network traffic
can build variety of services (application multicast)
83
Multicast
associate multicast group with index I
reverse path forwarding tree
84
Multicast
associate multicast group with index I
reverse path forwarding tree
send to (hx(I), hy(I))
85
Weaknesses
impossible to perform a fuzzy search
susceptible to malicious activity
maintain coherence of all the indexed data (network overhead, efficient distribution)
still relatively higher routing latency
poor performance w/o improvement
nodes coming and going?
86
Summary
CAN: an Internet-scale hash table
potential building block in Internet applications
scalability: O(d) per-node state
low-latency routing: simple heuristics help a lot
robust: decentralized, can route around trouble
87
Related Work
Tapestry: Zhao, Kubiatowicz, Joseph (UCB)
Chord: Stoica, Morris, Karger, Kaashoek, Balakrishnan (MIT / UCB)
Pastry: Druschel and Rowstron (Rice / Microsoft Research)
88
Basic Idea of Chord
m-bit identifier space for both keys and nodes
key identifier = SHA-1(key), e.g. Key=“LetItBe” → SHA-1 → ID=60
node identifier = SHA-1(IP address), e.g. IP=“198.10.10.1” → SHA-1 → ID=123
both are uniformly distributed
how to map key IDs to node IDs?
89
Consistent Hashing [Karger 97]
A key is stored at its successor: node with next higher ID
[diagram: circular 7-bit ID space starting at 0; nodes N32, N90, N123 (IP=“198.10.10.1”); keys K5, K20, K101, and K60 (Key=“LetItBe”) each stored at their successor node]
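A sketch of successor placement on the ring, using SHA-1 truncated to the m-bit space (helper names are illustrative):

```python
import hashlib
from bisect import bisect_left

M = 7                                    # 7-bit circular ID space: [0, 128)

def chord_id(s: str) -> int:
    """SHA-1 of a key or IP address, truncated onto the m-bit ring."""
    return int.from_bytes(hashlib.sha1(s.encode()).digest(), "big") % (2 ** M)

def successor(node_ids, key_id):
    """A key is stored at the first node ID >= the key ID, wrapping past 0."""
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]

nodes = sorted([32, 90, 123])            # N32, N90, N123 from the diagram
print(successor(nodes, 60))              # K60 ("LetItBe") -> N90
print(successor(nodes, 5))               # K5 -> N32
```
The wrap-around in `successor` is what makes node joins and leaves disturb only neighboring keys.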
90
Consistent Hashing
every node knows of every other node: requires global information
routing tables are large: O(N)
lookups are fast: O(1)
[diagram: “Where is LetItBe?” — Hash(“LetItBe”) = K60, answered in one hop by the node holding K60]
91
Chord: Basic Lookup
every node knows its successor in the ring
lookup requires O(N) time
[diagram: “Where is LetItBe?” — Hash(“LetItBe”) = K60; query forwarded successor-to-successor around the ring until the node holding K60 answers]
92
“Finger Tables”
every node knows m other nodes in the ring
distances increase exponentially
[diagram: node N80’s finger targets 80+2^0, 80+2^1, …, 80+2^6; e.g. 80+2^4 → N96, 80+2^5 → N112, 80+2^6 → N16]
93
“Finger Tables”
finger i points to successor of n+2^i
table contains O(log N) entries
[diagram: node N80’s fingers on a ring with N16, N96, N112, N120]
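A sketch of finger construction and lookup, using the example ring from the lookup slide below (helper names illustrative; real Chord also handles joins, failures, and stabilization):

```python
from bisect import bisect_left

M = 7                                              # 7-bit identifier space
NODES = sorted([5, 10, 20, 32, 60, 80, 99, 110])   # example ring

def successor(ident):
    """First node whose ID is >= ident, wrapping around the ring."""
    i = bisect_left(NODES, ident % 2 ** M)
    return NODES[i % len(NODES)]

def fingers(n):
    """Finger i of node n points to successor(n + 2**i): O(log N) entries."""
    return [successor(n + 2 ** i) for i in range(M)]

def between(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    return (a < x <= b) if a < b else (x > a or x <= b)

def lookup(start, key_id):
    """Forward via the closest preceding finger; takes O(log N) hops."""
    n, path = start, [start]
    while not between(key_id, n, successor(n + 1)):
        preceding = [f for f in fingers(n) if between(f, n, key_id)]
        n = max(preceding, key=lambda f: (f - n) % 2 ** M)
        path.append(n)
    return successor(key_id), path

print(fingers(80))       # [99, 99, 99, 99, 99, 5, 20]
print(lookup(80, 19))    # K19 is stored at N20; path of nodes visited
```
Each hop roughly halves the remaining ring distance to the key, which is where the O(log N) bound comes from.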
94
Lookups are Faster
lookups take O(log N) hops
[diagram: Lookup(K19) resolved in a few finger hops on a ring of N5, N10, N20, N32, N60, N80, N99, N110]
95
Issues
joins/leaves
load balancing
…
96
Performance
several measurement studies (Napster, Gnutella):
highly variable connection times
lots of freeloading
little analysis
97
Performance Modeling [Ge et al.]
evaluation of different architectures
population of users cycles through on-off states
system provides:
common services (e.g., search)
download services
98
[diagram: closed model of M users cycling among think time, common services (rate µs(Na)), file download services (rate µf(Na,·), failure probability qf(Na,·)), and off-line states]
99
Download Services
heavy-tailed Zipf preference distribution: p_i ∝ 1/i^α
service capacity proportional to popularity:
µf(Na, i) = p_i · C_f
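A few lines making the model concrete (the symbol-to-code mapping is an assumption; the values of C_f and α are arbitrary):

```python
def zipf_popularity(n_files, alpha):
    """p_i proportional to 1/i**alpha, normalized to sum to 1."""
    weights = [1 / i ** alpha for i in range(1, n_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]

C_f = 100.0                              # aggregate download capacity
p = zipf_popularity(1000, alpha=1.0)
mu = [p_i * C_f for p_i in p]            # mu_f(Na, i) for each file i
print(p[0], mu[0])                       # the most popular file gets the most capacity
```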
100
Common Services
CSL: µs(Na) = C1; qf(Na) = 0
DSLF: µs(Na) = Cq · Na / T^β, where T = max. TTL for flooding and β > 1 is a connectivity parameter; qf(Na) > 0
DSLH: µs(Na) = Cq · Na / log(Na); qf(Na) = 0
101
Solution Methods
bounding analysis
fixed point solutions
102
Comparison of Three Architectures
CSL has a services bottleneck
DSLF suffers from occasional failure to find file
DSLH more scalable
[chart: system throughput T vs. total population N (10^4–10^9), log-log, for CSL, DSLF, DSLH]
103
Supernodes (Kazaa)
[chart: system throughput T vs. total population N (up to 4×10^8) for three supernode configurations, with capacity ratios (supernode : other node) of 10:1, 2:1, and 1:1 and correspondingly decreasing supernode population fractions]
hierarchy helps; placing well-provisioned nodes at the top is a good idea
104
Anonymity: Crowds [Reiter98]
decentralized P2P solution
anonymous within the Crowd
jondo (John Doe): acts as both proxy and user
path based
105
Path-based Initiator Anonymity
[diagram: I → X → Y → Z → R]
Packets are passed from the initiator, I, to peers, which deliver the packet to the responder, R.
106
Crowds Paths
[diagram: several Crowds paths from I through jondos X, Y, Z to R]
• weighted coin flip
• spinner
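A sketch of the weighted-coin path construction (p_forward = 3/4 is an illustrative value; Crowds requires the forwarding probability to exceed 1/2):

```python
import random

def build_path(jondos, initiator, p_forward=0.75):
    """At each jondo, flip a weighted coin: with probability p_forward,
    the 'spinner' picks a uniformly random next jondo; otherwise the
    current jondo submits the request to the responder."""
    path = [initiator, random.choice(jondos)]   # first hop is always random
    while random.random() < p_forward:          # weighted coin flip
        path.append(random.choice(jondos))      # spinner: next jondo
    return path                                 # last jondo contacts responder

crowd = ["I", "X", "Y", "Z"]
print(build_path(crowd, "I"))
```
Because any jondo on the path may or may not be the initiator, the responder (and colluding peers) cannot pin the request on I.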
107
Performance Issues
routing in overlays incurs a performance penalty:
can it be quantified?
can it be mitigated?
dynamic nature of user population:
robustness?
performance?
tradeoff between service location paradigms?
108
Performance Issues
p2p file sharing vs. CDN/web (Akamai):
compare robustness?
compare performance?
handling flash crowds?
p2p measurement: kazaa!!
security of p2p: how to measure/evaluate security?
109
Wrapup discussion questions (1):
What is a peer-peer network (what is not a peer-to-peer network)? Necessary:
every node is designed to (but may not, by user choice) provide some service that helps other nodes in the network get service
each node potentially has the same responsibility, functionality (maybe nodes can be polymorphic)
some applications (e.g., Napster) are a mix of peer-peer and centralized (lookup is centralized, file service is peer-peer) [recursive def. of peer-peer]
(logical connectivity rather than physical connectivity) routing will depend on service and data
110
Overlays?
What is the relationship between peer-peer and application overlay networks?
peer-peer and application overlays are different things. It is possible for an application-level overlay to be built using peer-peer (or vice versa), but not always necessary
overlay (in a wired net): two nodes can communicate in the overlay using a path that is not the path the network-level routing would define for them; a logical network on top of the underlying network
• source routing?
wireless ad hoc nets – what commonality is there REALLY?
111
Wrapup discussion questions (3):
is ad hoc networking a peer-peer application?
Yes (30-1)
why peer-peer over client-server?
a well-designed p2p provides better “scalability”
why client-server over peer-peer?
peer-peer is harder to make reliable
availability different from client-server (p2p is more often only partially “up”)
more trust is required
if all music were free in the future (and organized), would we have peer-peer?
Is there another app: ad hoc networking, any copyrighted data, peer-peer sensor data gathering and retrieval, simulation
evolution #101 – what can we learn about systems?
112
THANKS!
slides can be found at http://gaia.cs.umass.edu/towsley/p2p-tutorial.pdf