Chord: A scalable peer-to-peer lookup service for Internet applications
Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan
Kathleen Ting
23 November 2004
The Chord project aims to build scalable, robust, distributed systems using peer-to-peer ideas.
- The basis is the Chord distributed hash lookup primitive.
- Chord is completely decentralized and symmetric, and can find data using only log(N) messages, where N is the number of nodes in the system.
- Chord's lookup mechanism is provably robust in the face of frequent node failures and re-joins.

Goal: Better Peer-to-Peer Storage
- Lookup is the key problem
- Lookup is not easy:
  - Gnutella scales badly
  - Freenet is imprecise
- Chord lookup provides:
  - Good naming semantics and efficiency
  - Elegant base for layered features

Chord Architecture
- Interface:
  - lookup(DocumentID) -> NodeID, IP-Address
- Chord consists of:
  - Consistent hashing
  - Small routing tables: log(n)
  - Fast join/leave protocol
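
The interface on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the 16-bit identifier space, the function names `chord_id`, `successor`, and `lookup`, and the example addresses (taken from the 192.0.2.0/24 documentation range) are all hypothetical, and the whole ring is held in one sorted list, which a real Chord node never has.

```python
import hashlib
from bisect import bisect_left

M = 16  # identifier bits; an assumption chosen to keep the numbers small


def chord_id(name: str, m: int = M) -> int:
    """Hash a name onto the 2**m identifier circle using SHA-1."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)


def successor(ring: list, ident: int) -> int:
    """Return the first node ID at or after ident, wrapping around the ring."""
    idx = bisect_left(ring, ident)
    return ring[idx % len(ring)]


def lookup(nodes: dict, document_id: str) -> tuple:
    """Map a document ID to the (NodeID, IP-Address) of the responsible node."""
    ring = sorted(nodes)
    node_id = successor(ring, chord_id(document_id))
    return node_id, nodes[node_id]


# Hypothetical node addresses; each node's ID is the hash of its address.
addresses = [f"192.0.2.{i}" for i in range(1, 9)]
nodes = {chord_id(ip): ip for ip in addresses}
print(lookup(nodes, "report.pdf"))
```

Both node addresses and document IDs are hashed into the same circular space, so "which node is responsible" reduces to "which node ID comes next on the circle".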

Chord Uses log(N) "Fingers"
[Figure: circular 7-bit ID space with position (0) marked. Node N80's fingers reach ½, ¼, 1/8, 1/16, 1/32, 1/64, and 1/128 of the way around the ring.]
N80 knows of only seven other nodes.
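
As a small worked example, the seven finger targets of N80 in the 7-bit space can be computed directly. The formula (n + 2^i) mod 2^m is the usual Chord finger definition; the function name `finger_starts` is made up for this sketch. Each finger then points at the first live node at or after its target, which is not modelled here.

```python
from fractions import Fraction


def finger_starts(n: int, m: int) -> list:
    """Return the m finger targets (n + 2**i) mod 2**m for i = 0 .. m-1."""
    return [(n + 2 ** i) % (2 ** m) for i in range(m)]


starts = finger_starts(80, 7)
print(starts)  # [81, 82, 84, 88, 96, 112, 16]

# Each target sits a power-of-two fraction of the ring ahead of node 80.
for i, start in enumerate(starts):
    print(f"finger {i + 1}: target {start:3d}, {Fraction(2 ** i, 2 ** 7)} of the ring ahead")
```

The last target wraps past 0 (80 + 64 = 144, and 144 mod 128 = 16), which is why the ID space is drawn as a circle.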

Contributions from the Chord paper
1. Protocol that solves the lookup problem
   1. Addition and deletion of Chord server nodes
   2. Insert, update, and lookup of unstructured key/value pairs
2. Simple system that uses it for storing information
3. Evaluation of Chord protocol and system
   1. Theoretical proofs
   2. Simulation results based on 10,000 nodes
   3. Measurement of actual implementation of Chord system

- Chord protocol supports just one operation
  - Determine the node in a distributed system that stores the value for a given key
- Chord protocol uses a variant of consistent hashing to assign keys to Chord server nodes
  - Load tends to be balanced
  - When the Nth node joins (or leaves) the network, only an O(1/N) fraction of key/value pairs are moved to a different location
- Previous research on consistent hashing: impractical to scale because nodes know about every other node in the network
- Chord: each node only maintains information about O(log N) nodes, resolves lookups using only O(log N) messages, and updates require only O(log^2 N) messages when a node joins or leaves
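
The O(1/N) claim can be checked empirically with a rough sketch. Everything here is an assumption made for illustration: the 32-bit identifier space, the synthetic `node-i` and `key-i` names, and the helper names `ident`, `owner`, and `fraction_moved`. It uses plain consistent hashing on a sorted list, not the distributed protocol.

```python
import hashlib
from bisect import bisect_left


def ident(name: str, bits: int = 32) -> int:
    """Hash a node name or key onto a 2**bits identifier circle."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


def owner(ring: list, key_id: int) -> int:
    """The node responsible for key_id is its successor on the sorted ring."""
    return ring[bisect_left(ring, key_id) % len(ring)]


def fraction_moved(n_nodes: int, n_keys: int) -> float:
    """Fraction of keys whose owner changes when one more node joins."""
    before = sorted(ident(f"node-{i}") for i in range(n_nodes))
    after = sorted(before + [ident("node-new")])
    keys = [ident(f"key-{i}") for i in range(n_keys)]
    moved = sum(owner(before, k) != owner(after, k) for k in keys)
    return moved / n_keys


print(f"moved: {fraction_moved(100, 5000):.4f}, 1/N: {1 / 100:.4f}")
```

Only keys that fall between the new node and its predecessor change owner, and they all move to the new node. The measured fraction varies from run to run of the naming scheme because arc lengths are uneven, so it should be read as "on the order of 1/N", not exactly 1/N.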

Benefits of Chord
- Decentralized
- Automatically adapts when hosts leave and join
- Scalable
- Guarantees that queries make a logarithmic number of hops and that keys are well balanced
- Uses heuristic to achieve network proximity
- Doesn't require the availability of geographic-location information
- Prevents single points of failure

Chord vs. DNS
- Similarities
  - Map names to values
- Differences
  - No special root servers
  - No restrictions on the format and meaning of names, as Chord names are just the keys of key/value pairs
  - No attempt to solve administrative problems

Chord vs. Freenet
- Similarities
  - Decentralized, symmetric, and automatically adapts when hosts leave and join
- Differences
  - Queries always result in success or definitive failure
  - Scalable
    - Cost of inserting and retrieving values, and cost of adding and removing hosts, grows slowly with the total number of hosts and key/value pairs

API of the Chord system

Function               Description
insert(key, value)     Inserts a key/value binding at r distinct nodes.
lookup(key)            Returns the value associated with the key.
update(key, newval)    Inserts the key/newval binding at r nodes. Under stable
                       conditions, exactly r nodes contain the key/newval binding.
join(n)                Causes a node to add itself as a server to the Chord
                       system that node n is part of. Returns success or failure.
leave()                Leaves the Chord system. No return value.
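
To make the five calls concrete, here is a toy single-process model. It is not the Chord implementation and departs from the table in ways that should be read as assumptions: `ToyChord` and its helpers are invented names, `join` and `leave` take the name of the node that is joining or leaving (the real `join(n)` is called by the new node and names an existing node n), the r replicas are placed on the r nodes that follow the key on the ring, and every membership change simply re-places all bindings.

```python
import hashlib
from bisect import bisect_left


def ident(name: str, bits: int = 16) -> int:
    """Hash a node name or key onto a 2**bits identifier circle."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** bits)


class ToyChord:
    """Single-process model of the five calls; every node lives in one dict."""

    def __init__(self, r: int = 2) -> None:
        self.r = r        # number of replicas per binding
        self.stores = {}  # node ID -> that node's local key/value pairs

    def _replicas(self, key: str) -> list:
        # Assumption: replicas are the r nodes that follow the key on the ring.
        ring = sorted(self.stores)
        first = bisect_left(ring, ident(key))
        return [ring[(first + i) % len(ring)] for i in range(min(self.r, len(ring)))]

    def _replace_all(self, extra=None) -> None:
        # Toy shortcut: gather every binding and place it again from scratch.
        bindings = dict(extra or {})
        for store in self.stores.values():
            bindings.update(store)
            store.clear()
        for key, value in bindings.items():
            self.insert(key, value)

    def join(self, name: str) -> bool:
        if ident(name) in self.stores:
            return False  # identifier already taken
        self.stores[ident(name)] = {}
        self._replace_all()
        return True

    def leave(self, name: str) -> None:
        departed = self.stores.pop(ident(name), {})
        self._replace_all(departed)

    def insert(self, key: str, value: str) -> None:
        for node_id in self._replicas(key):
            self.stores[node_id][key] = value

    def update(self, key: str, newval: str) -> None:
        self.insert(key, newval)

    def lookup(self, key: str):
        for node_id in self._replicas(key):
            if key in self.stores[node_id]:
                return self.stores[node_id][key]
        return None


dht = ToyChord(r=2)
for name in ("node-a", "node-b", "node-c"):
    dht.join(name)
dht.insert("song.mp3", "v1")
dht.update("song.mp3", "v2")
dht.leave("node-b")
print(dht.lookup("song.mp3"))
```

The point of the sketch is the shape of the API: callers only ever name keys and values, and replica placement stays an internal detail.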

Chord system
- Implemented as an application-layer overlay network of Chord server nodes
- Each node maintains a subset of the key/value pairs as well as routing table entries that point to a subset of carefully chosen Chord servers
- Chord clients may, but don't have to, run on the same hosts as Chord server nodes

Chord system design goals
- Scalability
- Availability
- Load-balanced operation
- Dynamism
- Updatability
- Locating according to proximity

Scalable key location
- Each node stores information about only a small number of other nodes
- Amount of information maintained about other nodes falls off exponentially with the distance in key-space between the two nodes
- Finger table of a node may not contain enough information to determine the successor of an arbitrary key k

Scalable key location
- What happens when a node n doesn't know the successor of a key k?
- Theorem: With high probability, the number of nodes that must be contacted to resolve a successor query in an N-node network is O(log N).
- Each recursive call to find the successor halves the distance to the target identifier.
Resolve successor query
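
The routing step behind this theorem can be sketched as follows. This is a simplified, iterative, single-process version of finger routing, not the authors' code: the names `in_open`, `in_half_open`, `build_fingers`, and `find_successor` are chosen for this example, node IDs are drawn at random instead of being hashed from addresses, and the 16-bit space is an assumption.

```python
import random
from bisect import bisect_left

M = 16          # identifier bits (assumption)
SPACE = 2 ** M


def in_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the circular open interval (a, b)."""
    if a < b:
        return a < x < b
    return x > a or x < b  # interval wraps past 0; a == b means "all but a"


def in_half_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the circular interval (a, b]."""
    return x == b or in_open(x, a, b)


def ring_successor(ring: list, ident: int) -> int:
    """Ground truth: first node at or after ident on the sorted ring."""
    return ring[bisect_left(ring, ident) % len(ring)]


def build_fingers(ring: list) -> dict:
    """finger[i] of node n is successor((n + 2**i) mod 2**M), i = 0 .. M-1."""
    return {n: [ring_successor(ring, (n + 2 ** i) % SPACE) for i in range(M)]
            for n in ring}


def find_successor(fingers: dict, start: int, key: int) -> tuple:
    """Route from node start toward key; return (successor(key), hops)."""
    node, hops = start, 0
    # Stop once the key falls between this node and its immediate successor.
    while not in_half_open(key, node, fingers[node][0]):
        # Forward to the farthest finger that still precedes the key.
        node = next(f for f in reversed(fingers[node]) if in_open(f, node, key))
        hops += 1
    return fingers[node][0], hops


rng = random.Random(1)
ring = sorted(rng.sample(range(SPACE), 1000))
fingers = build_fingers(ring)
trials = [find_successor(fingers, rng.choice(ring), rng.randrange(SPACE))
          for _ in range(2000)]
print("average hops:", sum(h for _, h in trials) / len(trials))
```

Each node consults only its own M-entry finger table, yet every forward covers at least half of the remaining distance to the key, which is where the logarithmic hop count comes from.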

Node joins and departures
- In a dynamic network, nodes can join and leave at any time.
- Preserve ability to locate every key in the network
  - Each node's finger table is correctly filled
  - Each key k is stored at node successor(k)
- If a node is not the immediate predecessor of the key, then its finger table will hold a node closer to the key, to which the query will be forwarded, until the key's successor node is reached.
- Theorem: With high probability, any node joining or leaving an N-node Chord network will use O(log^2 N) messages to re-establish the Chord routing invariants.

Predecessor pointer
Update fingers and predecessors of existing nodes

Chord lookup cost is O(log N)
[Figure: average messages per lookup versus number of nodes. Constant is ½.]
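
Reading the ½ on this slide as the constant in front of the logarithm, the expected lookup cost can be worked out for the 10,000-node simulation size mentioned earlier. This is a back-of-the-envelope estimate, not a figure taken from the plot:

```latex
\[
  \text{average messages per lookup} \;\approx\; \tfrac{1}{2}\log_2 N ,
  \qquad
  N = 10^{4}:\quad \tfrac{1}{2}\log_2 10^{4} = \tfrac{1}{2}\cdot 13.29 \approx 6.64 .
\]
```

One way to see the constant: the distance to the target is a number of up to log2 N bits, about half of which are 1 on average, and each 1 bit costs roughly one finger hop.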

Chord properties
- As long as every node knows its immediate predecessor and successor, no lookup will stall anywhere except at the node responsible for a key.
- Any other node will know of at least one node (its successor) that is closer to the key than itself, and will forward the query to that closer node.
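
The property above can be illustrated with a lookup that uses successor pointers only and no fingers at all. The eight node IDs in a 7-bit space are made up for the example, as are the function names. The walk always reaches the responsible node, but it may take up to N - 1 hops; the finger table exists to shorten it.

```python
def in_half_open(x: int, a: int, b: int) -> bool:
    """True if x lies in the circular interval (a, b]."""
    if a < b:
        return a < x <= b
    return x > a or x <= b  # interval wraps past 0


def lookup_by_successors(successor: dict, start: int, key: int) -> tuple:
    """Walk successor pointers until key falls in (node, successor[node]]."""
    node, hops = start, 0
    while not in_half_open(key, node, successor[node]):
        node = successor[node]  # the successor is always closer to the key
        hops += 1
    return successor[node], hops


# Hypothetical ring of eight nodes in a 7-bit identifier space.
ring = [1, 12, 30, 47, 63, 80, 101, 120]
successor = {n: ring[(i + 1) % len(ring)] for i, n in enumerate(ring)}

print(lookup_by_successors(successor, 80, 40))   # (47, 5)
print(lookup_by_successors(successor, 1, 125))   # (1, 7)
```

In the first call the query starts at node 80, wraps past 0, and stops at node 30, whose successor 47 is responsible for key 40.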

Chord properties
- Log(n) lookup messages and table space.
- Well-defined location for each ID.
- No search required.
- Natural load balance.
- No name structure imposed.
- Minimal join/leave disruption.
- Does not store documents…

Conclusion
- Intended to be used by decentralized, large-scale distributed applications
  - Because many distributed applications need to determine the node that stores a data item
- Given a key, Chord will determine the node responsible for storing the key's value
  - Maintains routing information for O(log N) nodes
  - Resolves lookups with O(log N) messages
  - Updates with O(log^2 N) messages

Conclusion
- Chord provides distributed lookup
- Efficient, low-impact join and leave
- Flat key space allows flexible extensions
- Good foundation for peer-to-peer systems
http://www.pdos.lcs.mit.edu/chord

Have a Happy Thanksgiving!