Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5
DISTRIBUTED SYSTEMS Principles and Paradigms
Second Edition ANDREW S. TANENBAUM
MAARTEN VAN STEEN
Chapter 5 Naming
Plan
• Definitions and uses • Types of naming
– Flat naming – Structured naming – Attribute-based naming
Definitions
• Name – A string of bits or characters that is used to refer to an entity
• Address – A name that refers to an access point of an entity
• Identifier – A name that uniquely identifies an entity
• Human-friendly name – A character string that is understandable by a human
Naming
• Naming is fundamental in distributed systems
• Is it really?
What can we do without naming?
• Implies finite capacity of nodes in distributed systems – Otherwise we could assign an identifier (and a name)
• We could do broadcast or anycast – Broadcast is traffic-heavy – Anycast is equivalent to mobile computing
• A mobile computing example – Population protocols
A Motivating Example: Birds
• Strap tiny, identical sensors to many birds in a flock.
• Sensors on two birds can interact when the birds are close together.
• Want to detect when (at least) five birds have elevated body temperatures, indicating possible epidemic.
• (Material from Eric Ruppert)
Population Protocols: System Model
• Sophistication of mobile nodes – Identically programmed – Finite state machines
• Infrastructure – None; not even identities
• Synchrony – Totally asynchronous – No limit on time it takes for a message to arrive
• Communication range – When nodes get next to each other they may communicate – Anything that is always possible happens eventually (fairness)
Protocols
• A protocol consists of
– a finite set of states
– transition rules, mapping pairs of states to pairs of states
– input encoding • mapping from inputs to states
– output interpretation • mapping from states to outputs
• Remark – Protocol must be independent of size of population!
Simplest Example: Computing OR of Input Bits
• States – {0, 1}.
• One transition rule – 0, 1 → 1, 1.
• Input to a node is its state • Output of a node is its state
• Result – If all inputs are 0, all nodes will remain in state 0 – If some node has input 1, eventually all will have state 1
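A minimal simulation sketch of the OR protocol, assuming a uniformly random scheduler that lets two distinct agents meet per step (one simple way to get fairness with probability 1). The name `run_or`, the step budget and the seed are illustrative assumptions, not part of the slides.

```python
import random

def run_or(inputs, steps=10_000, seed=0):
    """Simulate the OR protocol: rule (0, 1) -> (1, 1), applied in either order."""
    rng = random.Random(seed)
    states = list(inputs)              # input encoding: a node's input is its state
    for _ in range(steps):
        i, j = rng.sample(range(len(states)), 2)   # two distinct agents meet
        if states[i] != states[j]:                 # one holds 0, the other 1
            states[i] = states[j] = 1
    return states                      # output interpretation: the state is the output
```

Note that the transition rule never looks at the population size, in line with the remark that a protocol must be independent of it.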
Example: Threshold Predicate
• Suppose each agent starts with input 0 or 1.
• Want to determine whether at least five nodes have input 1.
• Output convention: Each state has an associated output.
• Eventually, all nodes reach states with the correct output.
• What does a protocol for this look like? – (This was the problem of detecting bird flu epidemic.)
Example: Threshold Predicate
• States – {0, 1, 2, 3, 4, 5}
• Transition rules – x, y → 0, x+y (for x+y < 5) – x, y → 0, 5 (for x+y ≥ 5) – x, 5 → 5, 5
• Can we at any point in time be sure whether there is no bird flu?
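The transition rules can be tried out with a small simulation. This is a sketch under assumptions: a random scheduler stands in for fairness, the rule `x, 5 → 5, 5` is given precedence whenever either agent is in state 5, and the names `step` and `run_threshold` are made up for illustration.

```python
import random

def step(x, y):
    """Transition rules for the threshold-5 protocol (5-rule checked first, by assumption)."""
    if x == 5 or y == 5:
        return 5, 5            # x, 5 -> 5, 5: the alarm state spreads
    if x + y < 5:
        return 0, x + y        # pool both counts in one agent
    return 0, 5                # threshold reached

def run_threshold(inputs, steps=20_000, seed=0):
    """Return each agent's output after `steps` random pairwise interactions."""
    rng = random.Random(seed)
    states = list(inputs)      # input encoding: input bit is the initial state
    for _ in range(steps):
        i, j = rng.sample(range(len(states)), 2)
        states[i], states[j] = step(states[i], states[j])
    return [s == 5 for s in states]    # True means "at least five inputs were 1"
```

An all-False result only reflects the interactions that have happened so far: as long as the counts have not been pooled, a "no" output may still change, which is the point of the question on the slide.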
Example: Majority
• Every node is initially red or blue.
• Determine whether # reds > # blues
Example: Majority
• States: {red, blue, yes, no}.
• Rules:
– red, blue → no, no (eliminates all blues or all reds)
– red, no → red, yes (red changes answers to yes)
– blue, yes → blue, no (blue changes answers to no)
– yes, no → no, no (takes care of a tie)
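A sketch of the majority protocol as a simulation. Assumptions of this sketch: rules apply regardless of which agent is listed first, a random scheduler approximates fairness, and the run stops once no pair of agents can change state. The helper names are illustrative.

```python
import random

RULES = {
    ("red", "blue"): ("no", "no"),     # a red and a blue cancel out
    ("red", "no"): ("red", "yes"),     # a surviving red changes answers to yes
    ("blue", "yes"): ("blue", "no"),   # a surviving blue changes answers to no
    ("yes", "no"): ("no", "no"),       # takes care of a tie
}

def interact(a, b):
    """Apply the matching rule in either order; leave the pair unchanged otherwise."""
    if (a, b) in RULES:
        return RULES[(a, b)]
    if (b, a) in RULES:
        new_b, new_a = RULES[(b, a)]
        return new_a, new_b
    return a, b

def stable(states):
    """True when no pair of agents can change state any more."""
    n = len(states)
    return all(interact(states[i], states[j]) == (states[i], states[j])
               for i in range(n) for j in range(n) if i != j)

def run_majority(colors, max_steps=1_000_000, seed=0):
    rng = random.Random(seed)
    states = list(colors)
    for _ in range(max_steps):
        if stable(states):
            break
        i, j = rng.sample(range(len(states)), 2)
        states[i], states[j] = interact(states[i], states[j])
    return states
```

Reading the output: agents in state red or yes answer "yes" (more reds than blues), agents in state blue or no answer "no". With few surviving reds and many other agents, a random scheduler can take a long time to stabilize, so the sketch is meant for small populations.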
Example: Majority
• Is this execution a problem? – No, because of fairness we do not have an infinite cycle
More Generally: Predicates
• A predicate has yes/no output.
• Assume every agent should eventually produce correct output
• Predicate must be symmetric (order of inputs is unimportant).
• So, we can write predicate as $P(x_1, x_2, \ldots, x_k)$ where – $k$ = number of possible initial states, – $x_i$ = number of agents starting in the $i$th state.
Limits of Anonymity
• Theorem: A predicate is computable iff it is on the following list – where a, ci’s are integer constants
– where a, b and ci’s are constants
– Boolean combinations of the above predicates
• How do the two examples map to this? – Bird flu? – Majority of red?
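One possible reading of the two examples, assuming the first family on the list is the threshold form $\sum_i c_i x_i \ge a$ with integer constants (this form is an assumption of the sketch, based on the "a, ci's are integer constants" remark). Here $x_0, x_1$ count agents with input 0 and 1, and $x_{\mathrm{red}}, x_{\mathrm{blue}}$ count agents starting red and blue.

```latex
% Bird flu: at least five agents start with input 1
% (c_0 = 0, c_1 = 1, a = 5)
\[ 0 \cdot x_0 + 1 \cdot x_1 \ge 5 \]

% Majority of red: strictly more reds than blues
% (c_red = 1, c_blue = -1, a = 1)
\[ 1 \cdot x_{\mathrm{red}} + (-1) \cdot x_{\mathrm{blue}} \ge 1 \]
```

Under this reading both examples are single threshold predicates and need no Boolean combination.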
Back to Names…
• How do we map names to addresses so that we can refer to entities?
• Naming system – Maintains a name-to-address binding – E.g., www.daimi.au.dk -> IP no. 130.225.16.54
• Operation depends on naming type – Flat – Structured – Attribute-based
Flat Names
• Identifiers are often just “random” strings of bits
• Implies no information on location embedded in identifier – E.g., identifiers in Chord on the application layer – E.g., Media Access Control (MAC) addresses on the data link layer
ARP
• My IP address is currently 10.198.5.65
– My physical/MAC address for my wireless network card is 00:1f:f3:ba:59:7c
– How to map my IP address to my MAC address?
• Address Resolution Protocol (ARP)
– A machine broadcasts a packet on the local network
– Receivers check whether they are listening to the IP address
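The broadcast-and-check idea can be sketched as a toy model. This is not the real ARP packet format: the `Host` class, the `resolve` function and the second host's addresses are hypothetical, and the cache is included only to show where caching would fit.

```python
from dataclasses import dataclass, field

@dataclass
class Host:
    ip: str
    mac: str
    arp_cache: dict = field(default_factory=dict)   # IP address -> MAC address

    def on_arp_request(self, target_ip):
        """A receiver answers only if it is listening on the requested IP address."""
        return self.mac if self.ip == target_ip else None

def resolve(sender, target_ip, lan):
    """Broadcast a request on the local network unless the answer is already cached."""
    if target_ip in sender.arp_cache:
        return sender.arp_cache[target_ip]
    for host in lan:                      # broadcast: every host sees the request
        if host is sender:
            continue
        mac = host.on_arp_request(target_ip)
        if mac is not None:
            sender.arp_cache[target_ip] = mac
            return mac
    return None                           # nobody on this network owns the address
```

For example, a host on the same network asking for 10.198.5.65 would get 00:1f:f3:ba:59:7c back and keep it in its cache, so the next resolution needs no broadcast.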
An ARP Scenario
• (Caching not shown)
ARP
• Broadcast is inefficient for large networks
• Multicast may be used also in point-to-point networks – But rarely enabled
Mobility
• Problem – Name/identifier/address constant, location changes
• Solutions
– Multicast groups could be used with ARP-like protocol • Node multicasts new location to group when it has moved
– Forwarding pointers
– Home-based approaches
Forwarding Pointers
• Figure 5-1. The principle of forwarding pointers using (client stub, server stub) pairs.
Forwarding Pointers
• Figure 5-2. Redirecting a forwarding pointer by storing a shortcut in a client stub.
Forwarding Pointers
• Transparency as advantage – Specifically migration transparency
• Disadvantages – Chain may grow very large – Intermediary nodes may need to maintain links for a long time – Multiple-points-of-failure
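A small sketch of chain following with a shortcut stored in the client stub, in the spirit of Figures 5-1 and 5-2. The `Stub` class and the functions `follow` and `invoke` are hypothetical names; real (client stub, server stub) pairs would live in different processes.

```python
class Stub:
    """One location in the chain: holds the object itself or a pointer to the next location."""
    def __init__(self, name, obj=None, next_hop=None):
        self.name = name
        self.obj = obj
        self.next_hop = next_hop

def follow(start):
    """Walk the chain from `start`; return (object, stub holding it, hops taken)."""
    node, hops = start, 0
    while node.obj is None:
        if node.next_hop is None:          # a failed or forgotten intermediary
            raise LookupError(f"chain broken at {node.name}")
        node, hops = node.next_hop, hops + 1
    return node.obj, node, hops

def invoke(client):
    """Resolve through the chain, then store a shortcut in the client stub."""
    obj, holder, hops = follow(client)
    if holder is not client:
        client.next_hop = holder           # later calls skip the intermediaries
    return obj, hops
```

The sketch also shows the disadvantages listed above: the hop count grows with every move, and a single missing pointer makes the entity unreachable.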
Home-Based Approaches
• Node may want to maintain address while it moves – Fits well with, e.g., high mobility and DNS (more later)
• Mobile IP
– Home address • The address of the mobile node
– Home agent • Keeps current location information of the node • Tunnels datagrams to mobile node
– Care-of-address • Termination point of the tunnel to a mobile node
• Part of network/IP layer in IPv6
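The roles of home address, home agent and care-of address can be sketched as follows. This is a toy model, not the Mobile IP message format: the `HomeAgent` class, its methods, the dictionary-based datagrams and all addresses are assumptions for illustration.

```python
class HomeAgent:
    """Keeps the current location of mobile nodes and tunnels datagrams to them."""
    def __init__(self, address):
        self.address = address
        self.care_of = {}                 # home address -> current care-of address

    def register(self, home_address, care_of_address):
        """Called when the mobile node has moved to a new network."""
        self.care_of[home_address] = care_of_address

    def forward(self, datagram):
        """Deliver normally if the node is at home, otherwise tunnel to its care-of address."""
        coa = self.care_of.get(datagram["dst"])
        if coa is None:
            return datagram
        # Encapsulation: the original datagram travels as payload of the tunnel datagram.
        return {"src": self.address, "dst": coa, "payload": datagram}
```

Senders keep using the home address throughout; only the home agent needs to learn the new location.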
Home-Based Approaches
• Figure 5-3. The principle of Mobile IP.
Home-Based Approaches
• Disadvantages – Single-point-of-failure – Increased communication latency for first packets
Chord Revisited
• Distributed Hash Table
– Use hash function to map nodes and keys to an m-bit identifier • E.g., 160 bits from using SHA-1
– Each node should store keys (and values) for which its identifier is closest • Store roughly K/N keys (K keys, N nodes)
– Store and lookup key/value pairs whose identifier is close to key • Lookup in O(log N) time
• Consequences – Load balancing – Scalability – Decentralization – Availability
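A sketch of the hashing step, using Python's standard `hashlib`. Two assumptions: "closest" is taken to mean the successor on the ring (the first node identifier at or after the key, wrapping around), and the function names `chord_id` and `successor` are illustrative.

```python
import hashlib
from bisect import bisect_left

M = 160  # identifier length in bits when SHA-1 is used

def chord_id(name: str, m: int = M) -> int:
    """Map a node name or key to an m-bit identifier."""
    digest = hashlib.sha1(name.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (2 ** m)

def successor(node_ids, key_id):
    """First node identifier >= key_id on the ring, wrapping around at the top."""
    ring = sorted(node_ids)
    i = bisect_left(ring, key_id)
    return ring[i % len(ring)]
```

Because the hash spreads identifiers roughly uniformly over the ring, each of N nodes ends up responsible for about K/N of the K keys.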
Definitions
• Definitions of state for node n, using m-bit identifiers
A Chord Network
Simple Lookup
// forward the query around the circle
Scalable Lookup
Distributed Hash Tables General Mechanism
• Figure 5-4. Resolving key 26 from node 1 and key 12 from node 28 in a Chord system.
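A sketch of finger-table lookup on a small ring. Assumptions: m = 5, a hypothetical node set 1, 4, 9, 11, 14, 18, 20, 21, 28 chosen so that keys 26 and 12 can be resolved from nodes 1 and 28, finger[k] = succ(n + 2^(k-1)), and global knowledge of the node set to build the tables (a real node would only know its own fingers).

```python
M = 5
NODES = sorted([1, 4, 9, 11, 14, 18, 20, 21, 28])   # assumed example ring

def succ(ident):
    """First node with identifier >= ident (mod 2**M), wrapping around."""
    ident %= 2 ** M
    return next((n for n in NODES if n >= ident), NODES[0])

def fingers(n):
    """finger[k] = succ(n + 2**(k-1)) for k = 1..M; list index 0 holds finger[1]."""
    return [succ(n + 2 ** (k - 1)) for k in range(1, M + 1)]

def between(x, a, b):
    """True if x lies strictly inside the ring interval (a, b)."""
    return a < x < b if a < b else (x > a or x < b)

def lookup(start, key):
    """Return (node responsible for key, list of nodes the query visited)."""
    n, path = start, [start]
    while True:
        if key == n:
            return n, path
        s = fingers(n)[0]                         # immediate successor
        if between(key, n, s) or key == s:        # key in (n, s]
            return s, path
        # Forward to the highest finger strictly between n and key.
        n = next((f for f in reversed(fingers(n)) if between(f, n, key)), s)
        path.append(n)
```

With these assumptions, `lookup(1, 26)` visits 1, 18, 20, 21 and returns 28, and `lookup(28, 12)` visits 28, 4, 9, 11 and returns 14.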
Lookup Complexity
• Time is O(log N) (with high probability)
– $\mathit{id} - \mathit{finger}[k] \le 2^{k-1}$ • Otherwise id would not be between finger[k] and finger[k+1]
– $\mathit{id} - n = (\mathit{finger}[k] - n) + (\mathit{id} - \mathit{finger}[k]) \ge 2^{k-1} + (\mathit{id} - \mathit{finger}[k]) \ge 2 \cdot (\mathit{id} - \mathit{finger}[k])$
– After log N forwardings, distance will be at most $2^m / 2^{\log N} = 2^m / N$ • 1 node expected in that interval
• (Figure: identifier circle with n, finger[k], id and finger[k+1] marked)
Dynamic Operations
• p joining a network from node n
– n’ := n.lookup(p)
– p.predecessor := n’
– p.successor := n’.successor
– p.finger[1] := n’.finger[2]
– …
– p.finger[m] := n.lookup( )
– Copy data as necessary from n’
• Rest of the network does not learn about p from joining
• (Figure: ring showing n, n’ and p)
Dynamic Operations
• Run n.stabilize() occasionally – Check whether n.successor.predecessor == n – Update n and n.successor as necessary
• Two cases of nodes leaving – p.leave() – Failure
• Correctness depends on correctness of successor pointers
• Maintain list of successors • Modify stabilize() to handle this
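A sketch of the stabilization step, following the stabilize/notify scheme commonly given for Chord. Assumptions: nodes are plain in-memory objects, a joining node already knows its successor (found by a lookup), and failures and successor lists are left out. `run_rounds` is a hypothetical helper standing in for "run occasionally".

```python
def between(x, a, b):
    """True if x is strictly inside the ring interval (a, b); a == b means the whole ring minus a."""
    if a < b:
        return a < x < b
    return x > a or x < b

class Node:
    def __init__(self, ident):
        self.id = ident
        self.successor = self        # a lone node is its own successor
        self.predecessor = None

    def stabilize(self):
        """Check whether successor.predecessor is still this node; repair if not."""
        x = self.successor.predecessor
        if x is not None and between(x.id, self.id, self.successor.id):
            self.successor = x       # a new node has slipped in between
        self.successor.notify(self)

    def notify(self, candidate):
        """`candidate` believes it may be this node's predecessor."""
        if self.predecessor is None or between(candidate.id, self.predecessor.id, self.id):
            self.predecessor = candidate

def run_rounds(nodes, rounds):
    for _ in range(rounds):
        for node in nodes:
            node.stabilize()
```

After a node joins by setting only its own successor, a few rounds are enough in this sketch for the other nodes to learn about it through their successor and predecessor pointers.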
Chord and Multicasting
• Generate identifier for group – mid
• Find multicast root – root := lookup(mid)
• Nodes join group – Find forwarders throughout lookup(mid) – Forwarder notes children
• Multicast – lookup(mid) + data
Chord and Multicasting
• Example – mid = 13 – root = 11
• 4.join(13) – "4.lookup(13)" – 9 becomes forwarder
• 1.join(13) – "1.lookup(13)" – No need to send join request all the way to 11
• n.mcast(13) – n.lookup(13) – Send data from 11 to nodes in tree
• (Figure: Example multicast tree)
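A sketch of how join requests could build the multicast tree. The lookup paths are passed in as plain lists and are assumptions: `[4, 9, 11]` for 4.join(13) and `[1, 9, 11]` for 1.join(13), the latter chosen so that the request meets the existing forwarder 9. The names `join` and `mcast` mirror the slide but the signatures are made up.

```python
def join(children, path):
    """Add parent -> child links along a lookup path until the tree is reached.

    `children` maps each forwarder (and the root) to the set of its children;
    `path` lists the nodes visited by lookup(mid), joining node first, root last.
    """
    for child, parent in zip(path, path[1:]):
        already_in_tree = parent in children
        children.setdefault(parent, set()).add(child)   # forwarder notes its child
        if already_in_tree:
            break            # no need to send the join request any further

def mcast(children, root, data):
    """Send data from the root down the tree; return the (node, data) deliveries."""
    delivered, stack = [], [root]
    while stack:
        node = stack.pop()
        delivered.append((node, data))
        stack.extend(sorted(children.get(node, ())))
    return delivered
```

Starting from `{11: set()}` and applying the two joins gives a tree in which 11 has child 9 and 9 has children 4 and 1.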
Hierarchical Approaches
• Network is divided into domains – Single top-level/root domain – Multiple non-overlapping subdomains – Leaf-domains
• Each domain has associated directory node
Hierarchical Approaches: Globe
• Directory nodes have location records for their contents – In leaf nodes a record holds an address – In other nodes a record holds pointers
Hierarchical Approaches: Globe
• Look up the domain containing E using expanding ring search – Follow pointers from directory of that domain until addresses for E are found
Hierarchical Approaches: Globe
• Figure 5-8. (a) An insert request is forwarded to the first node that knows about entity E.
Hierarchical Approaches: Globe
• Figure 5-8. (b) A chain of forwarding pointers to the leaf node is created.
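A sketch of insert and lookup in a tree of directory nodes. Assumptions: the expanding search is modelled as climbing from the local leaf domain to ever larger enclosing domains, lookup follows the first pointer it finds, and the class and function names are hypothetical.

```python
class DirNode:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.address = {}     # leaf nodes: entity -> address
        self.pointers = {}    # other nodes: entity -> list of child directory nodes

    def knows(self, entity):
        return entity in self.address or entity in self.pointers

def insert(leaf, entity, address):
    """Store the address in the leaf, then add pointers upward until a node
    that already knows the entity (or the root) has been reached."""
    leaf.address[entity] = address
    child, node = leaf, leaf.parent
    while node is not None:
        already_known = node.knows(entity)
        node.pointers.setdefault(entity, []).append(child)
        if already_known:
            break
        child, node = node, node.parent

def lookup(leaf, entity):
    """Climb until some directory node knows the entity, then follow pointers down."""
    node = leaf
    while node is not None and not node.knows(entity):
        node = node.parent               # widen the search to the enclosing domain
    if node is None:
        return None                      # not even the root knows the entity
    while entity not in node.address:
        node = node.pointers[entity][0]  # follow a pointer towards a leaf
    return node.address[entity]
```

A lookup that starts close to the entity stops low in the tree, which is what keeps local lookups cheap.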
Summary
• Naming is fundamental to distributed systems
• Different types of names may be used
– Flat naming • E.g., DHT
– Structured naming • E.g., DNS
– Attribute-based naming • E.g., LDAP