YAPPERS: A Peer-to-Peer Lookup Service over Arbitrary Topology
Qixiang Sun
Prasanna Ganesan
Hector Garcia-Molina
Stanford University
Outline
• Background and Motivation
• High-level overview of YAPPERS
• Brief evaluation
Problem
Where is X?
Problem (2)
1. Search
2. Node join/leave
3. Register/remove content
Background
• Gnutella-style
  • join
    – anywhere in the overlay
  • register
    – do nothing
  • search
    – flood the overlay
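The flood-based search can be pictured as a breadth-first traversal with a hop limit. The sketch below is only an illustration, not Gnutella's wire protocol: the adjacency-dict overlay, the `ttl` parameter, and the name `flood_search` are assumptions made for the example.

```python
from collections import deque

def flood_search(overlay, content, start, key, ttl):
    """Flood from `start` up to `ttl` hops.

    Returns (nodes that hold `key`, number of nodes contacted).
    """
    seen = {start}
    queue = deque([(start, 0)])
    hits = []
    while queue:
        node, dist = queue.popleft()
        if key in content.get(node, set()):
            hits.append(node)
        if dist == ttl:
            continue  # hop limit reached, do not forward further
        for nbr in overlay[node]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, dist + 1))
    return hits, len(seen)

# Hypothetical 5-node overlay with edges A-B, A-C, B-D, C-D, D-E
overlay = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"],
           "D": ["B", "C", "E"], "E": ["D"]}
content = {"E": {"X"}}  # only node E stores item X
```

In this toy overlay, finding X from A requires contacting every node, which is the inefficiency that motivates the rest of the talk.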
Background (2)
• Distributed hash table (DHT)
  • join
    – a unique location in the overlay
  • register
    – place pointer at a unique node
  • search
    – route towards the unique node
  • Examples: Chord, CAN, ...
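To make "a unique node" concrete, here is a minimal sketch of how a ring-based DHT can map a key to exactly one node. It assumes a simplified successor rule on an identifier ring with a globally known node list; real systems such as Chord route hop by hop instead. The names `ring_id` and `responsible_node` and the 16-bit ring size are hypothetical.

```python
import hashlib
from bisect import bisect_left

def ring_id(name, bits=16):
    """Map a string to a position on a 2**bits identifier ring (SHA-1 based)."""
    digest = hashlib.sha1(name.encode()).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)

def responsible_node(node_ids, key_id):
    """Return the first node id at or after key_id, wrapping around the ring."""
    ids = sorted(node_ids)
    i = bisect_left(ids, key_id)
    return ids[i % len(ids)]
```

Because every key has exactly one responsible position, a joining node cannot sit anywhere it likes, which is the "restricted overlay" drawback listed on the next slide.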
Background (3)
• Gnutella-style
  + Simple
  + Local control
  + Robust
  + Arbitrary topology
  – Inefficient
  – Disturbs many nodes
• DHT
  + Efficient search
  – Restricted overlay
  – Difficulty with dynamism
Motivation
• Best of both worlds
  – Gnutella’s local interactions
  – DHT-like efficiency
• Respect application-defined topology
  – Social network
  – Ad hoc wireless network
  – Physical-network proximity
Partition Nodes
Given any overlay, first partition nodes into buckets (colors) based on a hash of each node’s IP address
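A minimal sketch of the partitioning step: the slides only say "hash of IP", so the use of SHA-1, the modulo reduction, and the name `color_of` are assumptions chosen for illustration.

```python
import hashlib

def color_of(ip, num_colors):
    """Bucket (color) index in [0, num_colors), derived from a hash of the IP string."""
    digest = hashlib.sha1(ip.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_colors

# Every node can compute any other node's color from its address alone,
# so no coordination is needed to agree on the partition.
palette = [color_of(f"10.0.0.{i}", 4) for i in range(1, 9)]
```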
Partition Nodes (2)
Around each node, there is at least one node of each color.
This may require backup color assignments.
Register Content
Partition the content space into buckets (colors) and register a pointer at a “nearby” node of the matching color.
Example at node Z (a red node in the figure):
• register red content locally
• register yellow content at a yellow node
Nodes around Z form a small hash table!
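The registration rule can be sketched as "find the closest node of the content's color within h hops and store a pointer there". This is an illustrative reading of the slide, not the authors' implementation: the helper names, the explicit `colors` dict, and the `tables` structure are hypothetical, and backup color assignments are not modeled.

```python
from collections import deque

def nearest_of_color(overlay, colors, start, color, h):
    """Breadth-first search for the closest node of `color` within h hops of `start`."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if colors[node] == color:
            return node
        if dist < h:
            for nbr in overlay[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    queue.append((nbr, dist + 1))
    return None  # no node of that color nearby (backup colors not modeled here)

def register(overlay, colors, tables, owner, key, key_color, h=2):
    """Store a pointer (key -> owner) at a nearby node of the key's color."""
    target = nearest_of_color(overlay, colors, owner, key_color, h)
    if target is not None:
        tables.setdefault(target, {}).setdefault(key, set()).add(owner)
    return target

# Hypothetical neighborhood around a red node Z
overlay = {"Z": ["P", "Q"], "P": ["Z", "R"], "Q": ["Z"], "R": ["P"]}
colors = {"Z": "red", "P": "blue", "Q": "yellow", "R": "green"}
tables = {}
```

Calling `register(overlay, colors, tables, "Z", "song", "red")` keeps the pointer at Z itself, while yellow content from Z lands on its yellow neighbor Q.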
Searching Content
Start at a “nearby” node of the content’s color, then search other nodes of the same color.
Searching Content (2)
Build a smaller overlay for each color and use a Gnutella-style flood within it.
Fan-out = degree of nodes in the smaller overlay
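To illustrate what "fan-out in the smaller overlay" means, the sketch below derives a per-color overlay and reads off each node's degree. The linking rule used here (connect same-colored nodes within `reach` hops of each other) is an assumption for the example, not the rule from the talk; `within_hops` and `color_overlay` are hypothetical names.

```python
from collections import deque

def within_hops(overlay, start, limit):
    """Return {node: hop distance} for nodes reachable within `limit` hops."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == limit:
            continue
        for nbr in overlay[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

def color_overlay(overlay, colors, color, reach):
    """Link same-colored nodes that lie within `reach` hops of each other (assumed rule)."""
    members = [n for n in overlay if colors[n] == color]
    return {n: sorted(m for m in within_hops(overlay, n, reach)
                      if m != n and colors[m] == color)
            for n in members}

# Hypothetical path overlay: R1 - B1 - R2 - B2 - R3
overlay = {"R1": ["B1"], "B1": ["R1", "R2"], "R2": ["B1", "B2"],
           "B2": ["R2", "R3"], "R3": ["B2"]}
colors = {"R1": "red", "R2": "red", "R3": "red", "B1": "blue", "B2": "blue"}
red = color_overlay(overlay, colors, "red", reach=2)
fan_out = {n: len(links) for n, links in red.items()}
```

A flood for red content then touches only R1, R2, and R3; the blue nodes are never disturbed.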
Recap
• Hybrid approach
  – Around each node, act like a hash table
  – Flood the relevant nodes in the entire network
• What do we gain?
  – Respect original overlay
  – Efficient search for popular data
  – Avoid disturbing nodes unnecessarily
Brief Evaluation
• Using a 24,702-node Gnutella snapshot as the underlying overlay
• We study
  – Number of nodes contacted per query when searching the entire network
  – Trade-off in using our hybrid approach when flooding the entire network
Nodes Searched per Query
[Figure: fraction of nodes searched (y-axis, 0 to 0.3) versus number of buckets (colors) (x-axis, 0 to 50). Annotation: limited by the number of nodes “nearby”.]
Trade-off
• Fan-out = degree of each colored node when flooding “nearby” nodes of the same color

              Average Fan-out
  Vanilla     835
  Heuristics  82

• Good at searching nearby nodes quickly
• Bad at searching the entire network
Conclusion
Does YAPPERS work?
– YES
  • Respects the original overlay
  • Searches efficiently in a small area
  • Disturbs fewer nodes than Gnutella
  • Handles node arrival/departure better than DHT
– NO
  • Large fan-out (vanilla flooding won’t work)
For More Information
• A short position paper advocating locally-organized P2P systems
  http://dbpubs.stanford.edu/pub/2002-60
• Other P2P work at Stanford
  http://www-db.stanford.edu/peers
Recap
• node join
  – anywhere in the overlay
• register content
  – at nearby node(s) of the appropriate color
• search
  – start at a nearby node of the search color and then flood nodes of the same color
What Do We Gain?
• Respect original overlay
• Efficient search for popular data
• Avoid disturbing nodes unnecessarily
• Better handling of dynamic node arrival and departure
Design Issues
• How to build a small hash table around each node, i.e., assign colors?
• How to connect nodes of the same color?
Small-scale Hash Table
Small = all nodes within h hops (e.g., h=2)
– Consistent across overlapping hash tables
– Stable when nodes enter/leave
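The "all nodes within h hops" neighborhood can be computed with a bounded breadth-first search. The sketch below is a plain illustration on a hypothetical four-node path; the function name `neighborhood` is an assumption.

```python
from collections import deque

def neighborhood(overlay, node, h=2):
    """All nodes within h hops of `node` (including itself), via breadth-first search."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        cur = queue.popleft()
        if dist[cur] == h:
            continue
        for nbr in overlay[cur]:
            if nbr not in dist:
                dist[nbr] = dist[cur] + 1
                queue.append(nbr)
    return set(dist)

# Hypothetical path overlay: A - B - C - D
overlay = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
table_a = neighborhood(overlay, "A")  # hash table around A
table_b = neighborhood(overlay, "B")  # overlaps with A's table
shared = table_a & table_b
```

Nodes in `shared` belong to both small hash tables, which is why a color assignment has to be consistent across overlapping tables: a node cannot be red for A and yellow for B.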
Small-scale Hash Table (2)
• Fixed number of buckets (colors)
• Determine each node’s bucket (color) based on the hash value of its IP address
  – A neighborhood may contain multiple nodes of the same color
  – A neighborhood may contain no node of some color
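Both situations are easy to detect once the colors in a neighborhood are known. The sketch below only reports them; how YAPPERS resolves them (for example through backup color assignments) is not shown. The name `color_coverage` and the integer color indices are assumptions.

```python
from collections import Counter

def color_coverage(neighborhood_colors, num_colors):
    """Return (missing colors, colors held by more than one node) for one neighborhood."""
    counts = Counter(neighborhood_colors.values())
    missing = [c for c in range(num_colors) if counts[c] == 0]
    crowded = [c for c in range(num_colors) if counts[c] > 1]
    return missing, crowded

# Hypothetical neighborhood of three nodes and four colors (0..3)
missing, crowded = color_coverage({"A": 0, "B": 0, "C": 2}, num_colors=4)
```

Here colors 1 and 3 have no node at all, while color 0 is shared by A and B.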
Searching the Overlay
Find another node of the same color in a “nearby” hash table.
[Figure: nodes A, B, and C; labels: “All nodes within h hops”, “Frontier Node”]
Need to track all nodes within 2h+1 hops.
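One plausible reading of the 2h+1 bound: if a frontier node is taken to be a node exactly h+1 hops away (an assumption here, the slide does not define it), then that frontier node's own h-hop hash table reaches at most (h+1) + h = 2h+1 hops from the starting node. The sketch checks this on a hypothetical seven-node path; all function names are illustrative.

```python
from collections import deque

def hop_distances(overlay, start, limit):
    """Return {node: hop distance} for nodes reachable within `limit` hops."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        cur = queue.popleft()
        if dist[cur] == limit:
            continue
        for nbr in overlay[cur]:
            if nbr not in dist:
                dist[nbr] = dist[cur] + 1
                queue.append(nbr)
    return dist

def frontier(overlay, node, h):
    """Nodes exactly h+1 hops away (assumed definition of a frontier node)."""
    return {n for n, d in hop_distances(overlay, node, h + 1).items() if d == h + 1}

# Hypothetical path overlay: n0 - n1 - n2 - n3 - n4 - n5 - n6
overlay = {f"n{i}": [f"n{j}" for j in (i - 1, i + 1) if 0 <= j < 7]
           for i in range(7)}
h = 1
tracked = set(hop_distances(overlay, "n0", 2 * h + 1))
front = frontier(overlay, "n0", h)
# Every frontier node's h-hop hash table lies inside the tracked region.
covered = all(set(hop_distances(overlay, v, h)) <= tracked for v in front)
```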
Searching the Overlay (2)
For a color C and each frontier node v,
1. determine which nodes v might contact to search for color C
2. contact these nodes
Theorem: Regardless of the starting node, one can search all nodes of all colors.
Buckets per Node
• Using 32 buckets (colors) per hash table
[Figure: fraction of nodes (%) (y-axis, 0 to 30) versus number of buckets (colors) per node (x-axis, 1 to 15). Annotation: AVG = 3.7 colors per node; 3.7 / 32 = 11.5% of the buckets.]
Overloading a Node
• A node may have many colors even if it has a large neighborhood.