TRANSCRIPT
1
Peer-peer and Application-level Networking
Don Towsley, UMass-Amherst
with the help of lots of others (J. Kurose, B. Levine, J. Crowcroft, CMPSCI 791N class)
Don Towsley 2002
2
0. Introduction
background
motivation
outline of the tutorial
3
Peer-peer networking
4
Peer-peer networking: focus at the application level
5
Peer-peer networking: peer-peer applications
Napster, Gnutella, CAN: file sharing
ad hoc networks
multicast overlays (e.g., video distribution)
6
Peer-peer networking
Q: What are the new technical challenges?
Q: What new services/applications are enabled?
Q: Is it just “networking at the application-level”?
“There is nothing new under the sun” (Ecclesiastes)
7
Tutorial Contents
introduction: client-server v. P2P architectures
centralized search: Napster
distributed search – flooding: Gnutella
distributed search – hashing: CAN, Chord, …
application-level multicast
research issues: security, modeling, more general applications
summary
8
Client Server v. Peer to Peer (1)
RPC/RMI: synchronous, asymmetric; emphasis on language integration and binding models (stub IDL/XDR compilers, etc.); Kerberos-style security – access control, crypto
messages: asynchronous, symmetric; emphasis on service location, content addressing, application-layer routing; anonymity, high availability, integrity; harder to get right ☺
9
Peer-to-peer systems are actually old
IP routers are peer to peer:
routers discover topology, and maintain it
routers are neither client nor server
routers continually talk to each other
routers are inherently fault tolerant
routers are autonomous
10
Peer to peer systems
nodes have no distinguished role
no single point of bottleneck or failure
need distributed algorithms for:
service discovery (name, address, route, metric, etc.)
neighbour status tracking
application-layer routing (based possibly on content, interest, etc.)
resilience, handling link and node failures
…
11
Ad hoc networks and peer2peer
wireless ad hoc networks have many similarities to peer-to-peer systems:
no a priori knowledge
no given infrastructure
have to construct it from “thin air”!
12
Overlays and peer 2 peer systems
P2P technology often used to create overlays offering services that could be offered at the IP level
useful deployment strategy
often economically a way around other barriers to deployment
IP was an overlay (on telephone core infrastructure)
not all overlays are P2P (Akamai)
13
P2P Architecture Classification
centralized service location (CSL): Napster
distributed service location with flooding (DSLF): Gnutella
distributed service location with hashing (DSLH): CAN, Pastry, Tapestry, Chord
14
Centralized Search Architecture
centralized directory service
search directory
[diagram: peers A–E query the directory; “Lord of the Rings?” returns “A, C”]
15
NAPSTER
the most (in)famous
not the first (cf. probably Eternity, from Ross Anderson in Cambridge)
but instructive for what it gets right, and also wrong…
also has a political message… and economic and legal…
16
Napster
program for sharing files over the Internet
a “disruptive” application/technology?
history:
5/99: Shawn Fanning (freshman, Northeastern U.) founds Napster Online music service
12/99: first lawsuit
3/00: 25% of UWisc traffic is Napster
2/01: US Circuit Court of Appeals: Napster knew users violating copyright laws
7/01: # simultaneous online users: Napster 160K, Gnutella 40K, Morpheus 300K
17
judge orders Napster to stop in July ’01; other filesharing apps take over!
[chart: traffic in bits per sec (0–8M) over time for gnutella, napster, fastrack]
18
Napster: how does it work
Application-level, client-server protocol over point-to-point TCP
Four steps:
1. connect to Napster server
2. upload your list of files (push) to server
3. give server keywords to search the full list with
4. select “best” of correct answers (pings)
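These four steps amount to a centralized index with peer-to-peer transfer. A minimal sketch of that pattern in Python (all names hypothetical; this is not the real Napster wire protocol):

```python
import os

def keywords(filename):
    """Split a filename (minus extension) into lowercase search words."""
    stem, _ = os.path.splitext(filename.lower())
    return stem.split()

class IndexServer:
    """Hypothetical central index: keyword -> set of (host, filename)."""
    def __init__(self):
        self.index = {}

    def upload_list(self, host, files):          # step 2: peer pushes its file list
        for f in files:
            for word in keywords(f):
                self.index.setdefault(word, set()).add((host, f))

    def search(self, keyword):                   # step 3: keyword search at server
        return self.index.get(keyword.lower(), set())

server = IndexServer()
server.upload_list("A", ["Lord of the Rings.mp3"])
server.upload_list("C", ["the lord of the rings.mp3"])
print(server.search("rings"))                    # step 4: ping each hit, pick best
```
Note that only the lookup is centralized; the file itself moves peer-to-peer.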
19
Napster
napster.com
users
File list is uploaded
1.
20
Napster
napster.com
user
Request and results
User requests search at server.
2.
21
Napster
napster.com
user
pings
User pings hosts that apparently have data.
Looks for best transfer rate.
3.
22
Napster
napster.com
user
Retrieves file
User retrieves file
4.
23
Napster: architecture notes
centralized server: single logical point of failure
can load balance among servers using DNS rotation
potential for congestion
Napster “in control” (freedom is an illusion)
no security: passwords in plain text, no authentication, no anonymity
24
Distributed Search/Flooding
25
Distributed Search/Flooding
26
Gnutella
peer-to-peer networking: applications connect to peer applications
focus: decentralized method of searching for files
each application instance serves to:
store selected files
route queries (file searches) from and to its neighboring peers
respond to queries (serve file) if file stored locally
27
Gnutella
Gnutella history:
3/14/00: release by AOL, almost immediately withdrawn
too late: 23K users on Gnutella by 8 AM
many iterations to fix poor initial design (poor design turned many people off)
what we care about:
how much traffic does one query generate?
how many hosts can it support at once?
what is the latency associated with querying?
is there a bottleneck?
28
Gnutella: how it works
Searching by flooding:
if you don’t have the file you want, query 7 of your partners
if they don’t have it, they contact 7 of their partners, for a maximum hop count of 10
requests are flooded, but there is no tree structure
no looping, but packets may be received twice
reverse path forwarding (?)
Note: Play gnutella animation at: http://www.limewire.com/index.jsp/p2p
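A minimal sketch of this flooded search, assuming the overlay is given as an adjacency map (names illustrative; real Gnutella drops duplicate descriptor IDs per peer rather than keeping one global set):

```python
from collections import deque

def flooded_search(graph, start, have_file, ttl=10):
    """graph: peer -> list of neighbor peers; have_file: peer -> bool.
    Floods a query with a max hop count (TTL); a 'seen' set gives
    loop prevention, so the flood terminates despite cycles."""
    hits, seen = [], {start}
    queue = deque([(start, ttl)])
    while queue:
        peer, hops_left = queue.popleft()
        if have_file(peer):
            hits.append(peer)            # query-hit travels back on reverse path
        if hops_left == 0:
            continue
        for nbr in graph[peer]:
            if nbr not in seen:          # drop already-seen peers
                seen.add(nbr)
                queue.append((nbr, hops_left - 1))
    return hits

overlay = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"], "D": ["C"]}
print(flooded_search(overlay, "A", lambda p: p == "D", ttl=10))
```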
29
Flooding in Gnutella: loop prevention
Seen already list: “A”
30
Gnutella: initial problems and fixes
freeloading: WWW sites offering search/retrieval from Gnutella network without providing file sharing or query routing
fix: block file-serving to browser-based non-file-sharing users
prematurely terminated downloads:
long download times over modems
modem users run gnutella peer only briefly (Napster problem also!), or any user becomes overloaded
fix: peer can reply “I have it, but I am busy. Try again later”
late 2000: only 10% of downloads succeed
2001: more than 25% of downloads successful (is this success or failure?)
www.limewire.com/index.jsp/net_improvements
31
Gnutella: initial problems and fixes (more)
2000: avg size of reachable network only 400-800 hosts. Why so small?
modem users: not enough bandwidth to provide search routing capabilities: routing black holes
fix: create peer hierarchy based on capabilities
previously: all peers identical, most modem blackholes
connection preferencing:
• favors routing to well-connected peers
• favors reply to clients that themselves serve a large number of files: prevents freeloading
Limewire gateway functions as Napster-like central server on behalf of other peers (for searching purposes)
www.limewire.com/index.jsp/net_improvements
32
Gnutella Discussion:
architectural lessons learned?
anonymity and security?
other?
good source for technical info/open questions: http://www.limewire.com/index.jsp/tech_papers
33
Kazaa
hierarchical Gnutella: supernodes and regular nodes
most popular p2p app: >120M downloads
not well understood: binaries, encrypted communications
[diagram: supernodes]
34
Internet-scale hash tables
hash tables: essential building block in software systems
Internet-scale distributed hash tables: equally valuable to large-scale distributed systems?
• peer-to-peer systems – CAN, Chord, Pastry, …
• large-scale storage management systems – Publius, OceanStore, CFS …
• mirroring on the Web
36
Content-Addressable Network [Ratnasamy et al.]
introduction
design
evaluation
strengths & weaknesses
ongoing work
37
Content-Addressable Network (CAN)
CAN: Internet-scale hash table
interface:
insert(key,value)
value = retrieve(key)
38
Content-Addressable Network (CAN)
CAN: Internet-scale hash table
interface:
insert(key,value)
value = retrieve(key)
properties:
scalable
operationally simple
good performance (w/ improvement)
39
Outline
introduction
design
evaluation
strengths & weaknesses
ongoing work
40
CAN: basic idea
[diagram: (K,V) pairs spread across all nodes of the system]
41
CAN: basic idea
insert(K1,V1)
[diagram: the insert request enters the system of nodes]
43
CAN: basic idea
(K1,V1)
[diagram: (K1,V1) now stored at one node]
44
CAN: basic idea
retrieve(K1)
[diagram: retrieve(K1) routed to the node holding (K1,V1)]
45
CAN: solution
virtual Cartesian coordinate space
entire space is partitioned amongst all the nodes
every node “owns” a zone in the overall space
abstraction:
can store data at “points” in the space
can route from one “point” to another
point = node that owns the enclosing zone
46
CAN: simple example
[diagram: node 1 owns the entire coordinate space (unit square)]
47
CAN: simple example
[diagram: node 2 joins; space split between nodes 1 and 2]
48
CAN: simple example
[diagram: node 3 joins; space now split into three zones]
49
CAN: simple example
[diagram: node 4 joins; four zones]
50
CAN: simple example
[diagram: space partitioned among more nodes]
51
CAN: simple example
[diagram: node I highlighted]
52
CAN: simple example
node I::insert(K,V)
53
CAN: simple example
node I::insert(K,V)
(1) a = hx(K)
54
CAN: simple example
node I::insert(K,V)
(1) a = hx(K), b = hy(K)
55
CAN: simple example
node I::insert(K,V)
(1) a = hx(K), b = hy(K)
(2) route(K,V) -> (a,b)
56
CAN: simple example
node I::insert(K,V)
(1) a = hx(K), b = hy(K)
(2) route(K,V) -> (a,b)
(3) (a,b) stores (K,V)
57
CAN: simple example
node J::retrieve(K)
(1) a = hx(K), b = hy(K)
(2) route “retrieve(K)” to (a,b)
[diagram: (K,V) stored at (a,b)]
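A small sketch of step (1): two salted hashes stand in for hx and hy, mapping a key to a point in the unit square (function names and the SHA-1 choice are assumptions, not CAN's specified hash):

```python
import hashlib

def h(key: str, axis: str) -> float:
    """Hash a key into [0, 1) for one coordinate axis (hx, hy via a salt)."""
    digest = hashlib.sha1((axis + ":" + key).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def key_to_point(key: str):
    return (h(key, "x"), h(key, "y"))    # (a, b) = (hx(K), hy(K))

a, b = key_to_point("K1")
# insert: route (K1,V1) to the zone containing (a,b); that node stores it.
# retrieve: any node recomputes (a,b) and routes "retrieve(K1)" there.
```
Because the mapping is deterministic, any node can recompute the same point and find the data without a directory.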
58
Data stored in the CAN is addressed by name (i.e. key), not location (i.e. IP address)
CAN
59
CAN: routing table
60
CAN: routing
[diagram: route from a source zone (x,y) toward destination point (a,b) through intermediate zones]
61
CAN: routing
A node only maintains state for its immediate neighboring nodes
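A sketch of this greedy, neighbor-only forwarding, assuming each node record holds just its zone center and neighbor list (the data layout is illustrative):

```python
import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def route(nodes, start, target):
    """nodes: id -> {"center": (x, y), "neighbors": [ids]}.
    Each hop forwards to the neighbor whose zone center is closest to
    the target point; stops when no neighbor improves on the current node."""
    path, current = [start], start
    while True:
        here = nodes[current]["center"]
        best = min(nodes[current]["neighbors"],
                   key=lambda n: dist(nodes[n]["center"], target),
                   default=None)
        if best is None or dist(nodes[best]["center"], target) >= dist(here, target):
            return path                  # current node owns the enclosing zone
        current = best
        path.append(current)
```
Only per-neighbor state is needed, which is what keeps the routing table O(d).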
62
CAN: node insertion
[diagram: new node contacts bootstrap node]
1) discover some node “I” already in CAN
64
CAN: node insertion
2) pick random point (p,q) in space
65
CAN: node insertion
3) I routes to (p,q), discovers node J
66
CAN: node insertion
4) split J’s zone in half… new node owns one half
67
CAN: node insertion
Inserting a new node affects only a single other node and its immediate neighbors
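A sketch of the step-4 zone split, assuming rectangular zones halved along their longer side (an assumption for illustration; CAN actually cycles through dimensions in a fixed order):

```python
def split_zone(zone):
    """zone: ((x0, x1), (y0, y1)). Halve along the longer side; the joining
    node takes one half, and (key,value) pairs whose points fall in that
    half move with it."""
    (x0, x1), (y0, y1) = zone
    if x1 - x0 >= y1 - y0:
        mid = (x0 + x1) / 2
        return ((x0, mid), (y0, y1)), ((mid, x1), (y0, y1))
    mid = (y0 + y1) / 2
    return ((x0, x1), (y0, mid)), ((x0, x1), (mid, y1))

j_half, new_half = split_zone(((0.0, 1.0), (0.0, 1.0)))
print(j_half, new_half)
```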
68
CAN: node failures
Need to repair the space
recover database (weak point)
• soft-state updates
• use replication, rebuild database from replicas
repair routing
• takeover algorithm
69
CAN: takeover algorithm
simple failures:
know your neighbor’s neighbors
when a node fails, one of its neighbors takes over its zone
more complex failure modes:
simultaneous failure of multiple adjacent nodes
scoped flooding to discover neighbors
hopefully, a rare event
70
CAN: node failures
Only the failed node’s immediate neighbors are required for recovery
71
Design recap
basic CAN:
completely distributed
self-organizing
nodes only maintain state for their immediate neighbors
additional design features:
multiple, independent spaces (realities)
background load balancing algorithm
simple heuristics to improve performance
72
Evaluation
scalability
low-latency
load balancing
robustness
73
CAN: scalability
for a uniformly partitioned space with n nodes and d dimensions:
per node, number of neighbors is 2d
average routing path is (d/4)·n^(1/d) hops
simulations show that the above results hold in practice
can scale the network without increasing per-node state
optimal choice of d for given n yields ~log(n) neighbors with ~log(n) hops
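These formulas are easy to tabulate; a few lines of Python showing the state/path-length tradeoff for a million-node CAN:

```python
def can_stats(n, d):
    """Per-node neighbor count 2d and average path (d/4) * n**(1/d)."""
    return {"neighbors": 2 * d, "avg_hops": round((d / 4) * n ** (1 / d), 1)}

for d in (2, 4, 8, 16):
    print(d, can_stats(2 ** 20, d))
# higher d: a bit more neighbor state, far fewer hops
# (for d ~ log n, both are ~log n)
```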
74
CAN: low-latency
[charts: latency stretch vs. #nodes (16K, 32K, 65K, 131K), w/o heuristics and w/ heuristics, for #dimensions = 2 and #dimensions = 10]
75
CAN: load balancing
dealing with hot-spots (popular (key,value) pairs):
nodes cache recently requested entries
overloaded node replicates popular entries at neighbors
uniform coordinate space partitioning:
uniformly spreads (key,value) entries
uniformly spreads out routing load
76
Uniform Partitioning
added check at join time:
pick a zone
check neighboring zones
pick the largest zone and split that one
77
CAN: node insertion
[diagram: new node arrives at (p,q); neighboring zones are checked and the largest is split]
80
Uniform Partitioning
[chart: percentage of nodes vs. zone volume (V/16 through 8V, where V = total volume / n), w/o check and w/ check; 65,000 nodes, 3 dimensions]
81
CAN: Robustness
completely distributed: no single point of failure (not applicable to pieces of the database when a node failure happens)
not exploring database recovery (in case there are multiple copies of the database)
resilience of routing: can route around trouble
82
Strengths
more resilient than flooding broadcast networks
efficient at locating information
fault-tolerant routing
node & data high availability (w/ improvement)
manageable routing table size & network traffic
can build variety of services (application multicast)
83
Multicast
associate multicast group with index I
reverse path forwarding tree
84
Multicast
associate multicast group with index I
reverse path forwarding tree
send to (hx(I), hy(I))
85
Weaknesses
impossible to perform a fuzzy search
susceptible to malicious activity
maintain coherence of all the indexed data (network overhead, efficient distribution)
still relatively higher routing latency
poor performance w/o improvement
nodes coming and going?
86
Summary
CAN: an Internet-scale hash table
potential building block in Internet applications
scalability: O(d) per-node state
low-latency routing: simple heuristics help a lot
robust: decentralized, can route around trouble
87
Related Work
Tapestry: Zhao, Kubiatowicz, Joseph (UCB)
Chord: Stoica, Morris, Karger, Kaashoek, Balakrishnan (MIT / UCB)
Pastry: Druschel and Rowstron (Rice / Microsoft Research)
88
Basic Idea of Chord
m-bit identifier space for both keys and nodes
key identifier = SHA-1(key), e.g. Key=“LetItBe” → SHA-1 → ID=60
node identifier = SHA-1(IP address), e.g. IP=“198.10.10.1” → SHA-1 → ID=123
both are uniformly distributed
how to map key IDs to node IDs?
89
Consistent Hashing [Karger 97]
A key is stored at its successor: node with next higher ID
[diagram: circular 7-bit ID space starting at 0; nodes N32, N90, N123 (IP=“198.10.10.1”); keys K5, K20, K101, and K60 (Key=“LetItBe”) each stored at their successor node]
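A sketch of successor placement on the ring, using SHA-1 truncated to the m-bit space (helper names are illustrative):

```python
import hashlib
from bisect import bisect_left

M = 7                                    # 7-bit circular ID space: [0, 128)

def chord_id(s: str) -> int:
    """SHA-1 of a key or IP address, truncated onto the m-bit ring."""
    return int.from_bytes(hashlib.sha1(s.encode()).digest(), "big") % (2 ** M)

def successor(node_ids, key_id):
    """A key is stored at the first node ID >= the key ID, wrapping past 0."""
    i = bisect_left(node_ids, key_id)
    return node_ids[i % len(node_ids)]

nodes = sorted([32, 90, 123])            # N32, N90, N123 from the diagram
print(successor(nodes, 60))              # K60 ("LetItBe") -> N90
print(successor(nodes, 5))               # K5 -> N32
```
The wrap-around in `successor` is what makes node joins and leaves disturb only neighboring keys.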
90
Consistent Hashing
every node knows of every other node: requires global information
routing tables are large: O(N)
lookups are fast: O(1)
[diagram: “Where is LetItBe?” — Hash(“LetItBe”) = K60, answered in one hop by the node holding K60]
91
Chord: Basic Lookup
every node knows its successor in the ring
lookup requires O(N) time
[diagram: “Where is LetItBe?” — Hash(“LetItBe”) = K60; query forwarded successor-to-successor around the ring until the node holding K60 answers]
92
“Finger Tables”
every node knows m other nodes in the ring
distances increase exponentially
[diagram: node N80’s finger targets 80+2^0, 80+2^1, …, 80+2^6; e.g. 80+2^4 → N96, 80+2^5 → N112, 80+2^6 → N16]
93
“Finger Tables”
finger i points to successor of n+2^i
table contains O(log N) entries
[diagram: node N80’s fingers on a ring with N16, N96, N112, N120]
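A sketch of finger construction and lookup, using the example ring from the lookup slide below (helper names illustrative; real Chord also handles joins, failures, and stabilization):

```python
from bisect import bisect_left

M = 7                                              # 7-bit identifier space
NODES = sorted([5, 10, 20, 32, 60, 80, 99, 110])   # example ring

def successor(ident):
    """First node whose ID is >= ident, wrapping around the ring."""
    i = bisect_left(NODES, ident % 2 ** M)
    return NODES[i % len(NODES)]

def fingers(n):
    """Finger i of node n points to successor(n + 2**i): O(log N) entries."""
    return [successor(n + 2 ** i) for i in range(M)]

def between(x, a, b):
    """True if x lies in the circular interval (a, b]."""
    return (a < x <= b) if a < b else (x > a or x <= b)

def lookup(start, key_id):
    """Forward via the closest preceding finger; takes O(log N) hops."""
    n, path = start, [start]
    while not between(key_id, n, successor(n + 1)):
        preceding = [f for f in fingers(n) if between(f, n, key_id)]
        n = max(preceding, key=lambda f: (f - n) % 2 ** M)
        path.append(n)
    return successor(key_id), path

print(fingers(80))       # [99, 99, 99, 99, 99, 5, 20]
print(lookup(80, 19))    # K19 is stored at N20; path of nodes visited
```
Each hop roughly halves the remaining ring distance to the key, which is where the O(log N) bound comes from.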
94
Lookups are Faster
lookups take O(log N) hops
[diagram: Lookup(K19) resolved in a few finger hops on a ring of N5, N10, N20, N32, N60, N80, N99, N110]
95
Issues
joins/leaves
load balancing
…
96
Performance
several measurement studies (Napster, Gnutella):
highly variable connection times
lots of freeloading
little analysis
97
Performance Modeling [Ge et al.]
evaluation of different architectures
population of users cycles through on-off states
system provides:
common services (e.g., search)
download services
98
[diagram: closed model of M users cycling among think time, common services (rate µs(Na)), file download services (rate µf(Na,·), failure probability qf(Na,·)), and off-line states]
99
Download Services
heavy-tailed Zipf preference distribution: p_i ∝ 1/i^α
service capacity proportional to popularity:
µf(Na, i) = p_i · C_f
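A few lines making the model concrete (the symbol-to-code mapping is an assumption; the values of C_f and α are arbitrary):

```python
def zipf_popularity(n_files, alpha):
    """p_i proportional to 1/i**alpha, normalized to sum to 1."""
    weights = [1 / i ** alpha for i in range(1, n_files + 1)]
    total = sum(weights)
    return [w / total for w in weights]

C_f = 100.0                              # aggregate download capacity
p = zipf_popularity(1000, alpha=1.0)
mu = [p_i * C_f for p_i in p]            # mu_f(Na, i) for each file i
print(p[0], mu[0])                       # the most popular file gets the most capacity
```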
100
Common Services
CSL: µs(Na) = C1; qf(Na) = 0
DSLF: µs(Na) = Cq · Na / T^β, where T = max. TTL for flooding and β > 1 is a connectivity parameter; qf(Na) > 0
DSLH: µs(Na) = Cq · Na / log(Na); qf(Na) = 0
101
Solution Methods
bounding analysis
fixed point solutions
102
Comparison of Three Architectures
CSL has a services bottleneck
DSLF suffers from occasional failure to find file
DSLH more scalable
[chart: system throughput T vs. total population N (10^4–10^9), log-log, for CSL, DSLF, DSLH]
103
Supernodes (Kazaa)
[chart: system throughput T vs. total population N (up to 4×10^8) for three supernode configurations, with capacity ratios (supernode : other node) of 10:1, 2:1, and 1:1 and correspondingly decreasing supernode population fractions]
hierarchy helps; placing well-provisioned nodes at the top is a good idea
104
Anonymity: Crowds [Reiter98]
decentralized P2P solution
anonymous within the Crowd
jondo (John Doe): acts as both proxy and user
path based
105
Path-based Initiator Anonymity
[diagram: I → X → Y → Z → R]
Packets are passed from the initiator, I, to peers, which deliver the packet to the responder, R.
106
Crowds Paths
[diagram: several Crowds paths from I through jondos X, Y, Z to R]
• weighted coin flip
• spinner
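A sketch of the weighted-coin path construction (p_forward = 3/4 is an illustrative value; Crowds requires the forwarding probability to exceed 1/2):

```python
import random

def build_path(jondos, initiator, p_forward=0.75):
    """At each jondo, flip a weighted coin: with probability p_forward,
    the 'spinner' picks a uniformly random next jondo; otherwise the
    current jondo submits the request to the responder."""
    path = [initiator, random.choice(jondos)]   # first hop is always random
    while random.random() < p_forward:          # weighted coin flip
        path.append(random.choice(jondos))      # spinner: next jondo
    return path                                 # last jondo contacts responder

crowd = ["I", "X", "Y", "Z"]
print(build_path(crowd, "I"))
```
Because any jondo on the path may or may not be the initiator, the responder (and colluding peers) cannot pin the request on I.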
107
Performance Issues
routing in overlays incurs a performance penalty:
can it be quantified?
can it be mitigated?
dynamic nature of user population:
robustness?
performance?
tradeoff between service location paradigms?
108
Performance Issues
p2p file sharing vs. CDN/web (Akamai):
compare robustness?
compare performance?
handling flash crowds?
p2p measurement: kazaa!!
security of p2p: how to measure/evaluate security?
109
Wrapup discussion questions (1):
What is a peer-peer network (what is not a peer-to-peer network)? Necessary:
every node is designed to (but may not, by user choice) provide some service that helps other nodes in the network get service
each node potentially has the same responsibility, functionality (maybe nodes can be polymorphic)
some applications (e.g., Napster) are a mix of peer-peer and centralized (lookup is centralized, file service is peer-peer) [recursive def. of peer-peer]
(logical connectivity rather than physical connectivity) routing will depend on service and data
110
Overlays?
What is the relationship between peer-peer and application overlay networks?
peer-peer and application overlays are different things. It is possible for an application-level overlay to be built using peer-peer (or vice versa), but not always necessary
overlay (in a wired net): two nodes can communicate in the overlay using a path that is not the path the network-level routing would define for them; a logical network on top of the underlying network
• source routing?
wireless ad hoc nets – what commonality is there REALLY?
111
Wrapup discussion questions (3):
is ad hoc networking a peer-peer application?
Yes (30-1)
why peer-peer over client-server?
a well-designed p2p provides better “scalability”
why client-server over peer-peer?
peer-peer is harder to make reliable
availability different from client-server (p2p is more often only partially “up”)
more trust is required
if all music were free in the future (and organized), would we have peer-peer?
Is there another app: ad hoc networking, any copyrighted data, peer-peer sensor data gathering and retrieval, simulation
evolution #101 – what can we learn about systems?
112
THANKS!
slides can be found at http://gaia.cs.umass.edu/towsley/p2p-tutorial.pdf