examples of “overlay” networks 15-744: computer networkingprs/15-744-f12/lectures/12-dht.pdf ·...

16
1 15-744: Computer Networking L-12 DHT and DTN Examples of “Overlay” Networks Overlay routing infrastructure Distributed name look up Content delivery networks Peer-to-peer content sharing Distributed hash tables Disruption tolerant networks 2 3 Overview Routed Lookups – Chord Comparison of DHTs • I3 • DTNs 4 Routing: Structured Approaches Goal: make sure that an item (file) identified is always found in a reasonable # of steps Abstraction: a distributed hash-table (DHT) data structure insert(id, item); item = query(id); Note: item can be anything: a data object, document, file, pointer to a file… Many proposals CAN (ICIR/Berkeley) Chord (MIT/Berkeley) Pastry (Rice) Tapestry (Berkeley) •…

Upload: dinhhuong

Post on 11-May-2018

218 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Examples of “Overlay” Networks 15-744: Computer Networkingprs/15-744-F12/lectures/12-dht.pdf ·  · 2012-10-19Examples of “Overlay” Networks ... • Peer-to-peer content

1

15-744: Computer Networking

L-12 DHT and DTN

Examples of “Overlay” Networks

• Overlay routing infrastructure• Distributed name look up• Content delivery networks• Peer-to-peer content sharing • Distributed hash tables• Disruption tolerant networks

2

3

Overview

• Routed Lookups – Chord

• Comparison of DHTs

• I3

• DTNs

4

Routing: Structured Approaches

• Goal: make sure that an item (file) identified is always found in a reasonable # of steps

• Abstraction: a distributed hash-table (DHT) data structure • insert(id, item);• item = query(id);• Note: item can be anything: a data object, document, file, pointer

to a file…• Many proposals

• CAN (ICIR/Berkeley)• Chord (MIT/Berkeley)• Pastry (Rice)• Tapestry (Berkeley)• …

Page 2: Examples of “Overlay” Networks 15-744: Computer Networkingprs/15-744-F12/lectures/12-dht.pdf ·  · 2012-10-19Examples of “Overlay” Networks ... • Peer-to-peer content

2

5

Routing: Chord

• Associate to each node and item a unique idin an uni-dimensional space

• Properties • Routing table size O(log(N)) , where N is the

total number of nodes• Guarantees that a file is found in O(log(N))

steps

6

Aside: Hashing• Advantages

• Let nodes be numbered 1..m• Client uses a good hash function to map a URL to 1..m • Say hash (url) = x, so, client fetches content from node

x• One hop access

• Any problems?• What happens if a node goes down?• What happens if a node comes back up? • What if different nodes have different views?

7

Robust hashing• Assume 90 documents, node 1..9, and node 10

which was dead is alive again• % of documents in the wrong node?

• 10, 19-20, 28-30, 37-40, 46-50, 55-60, 64-70, 73-80, 82-90

• Disruption coefficient = ½• Unacceptable, use consistent hashing – idea behind

Akamai!

8

Consistent Hash

• “view” = subset of all hash buckets that are visible

• Desired features• Balanced – in any one view, load is equal

across buckets• Smoothness – little impact on hash bucket

contents when buckets are added/removed• Spread – small set of hash buckets that may

hold an object regardless of views • Load – across all views # of objects assigned to

hash bucket is small

Page 3: Examples of “Overlay” Networks 15-744: Computer Networkingprs/15-744-F12/lectures/12-dht.pdf ·  · 2012-10-19Examples of “Overlay” Networks ... • Peer-to-peer content

3

9

Consistent Hash – Example

• Smoothness addition of bucket does not cause much movement between existing buckets

• Spread & Load small set of buckets that lie near object• Balance no bucket is responsible for large number of

objects

• Construction• Assign each of C hash buckets to

random points on mod 2n circle, where, hash key size = n.

• Map object to random position on circle

• Hash of object = closest clockwise bucket

0

8

412Bucket

14

10

Routing: Chord Basic Lookup

N32

N90

N105

N60

N10N120

K80

“Where is key 80?”

“N90 has K80”

11

Routing: Finger table - Faster Lookups

N80

½¼

1/8

1/161/321/641/128

12

Routing: Chord Summary

• Assume identifier space is 0…2m

• Each node maintains• Finger table

• Entry i in the finger table of n is the first node that succeeds or equals n + 2i

• Predecessor node• An item identified by id is stored on the

successor node of id

Page 4: Examples of “Overlay” Networks 15-744: Computer Networkingprs/15-744-F12/lectures/12-dht.pdf ·  · 2012-10-19Examples of “Overlay” Networks ... • Peer-to-peer content

4

13

Routing: Chord Example

• Assume an identifier space 0..8

• Node n1:(1) joinsall entries in its finger table are initialized to itself

01

2

34

5

6

7i id+2i succ0 2 11 3 12 5 1

Succ. Table

14

Routing: Chord Example

• Node n2:(3) joins

01

2

34

5

6

7i id+2i succ0 2 21 3 12 5 1

Succ. Table

i id+2i succ0 3 11 4 12 6 1

Succ. Table

15

Routing: Chord Example

• Nodes n3:(0), n4:(6) join

01

2

34

5

6

7i id+2i succ0 2 21 3 62 5 6

Succ. Table

i id+2i succ0 3 61 4 62 6 6

Succ. Table

i id+2i succ0 1 11 2 22 4 0

Succ. Table

i id+2i succ0 7 01 0 02 2 2

Succ. Table

16

Routing: Chord Examples

• Nodes: n1:(1), n2(3), n3(0), n4(6)

• Items: f1:(7), f2:(1) 01

2

34

5

6

7 i id+2i succ0 2 21 3 62 5 6

Succ. Table

i id+2i succ0 3 61 4 62 6 6

Succ. Table

i id+2i succ0 1 11 2 22 4 0

Succ. Table7

Items 1

Items

i id+2i succ0 7 01 0 02 2 2

Succ. Table

Page 5: Examples of “Overlay” Networks 15-744: Computer Networkingprs/15-744-F12/lectures/12-dht.pdf ·  · 2012-10-19Examples of “Overlay” Networks ... • Peer-to-peer content

5

17

Routing: Query• Upon receiving a

query for item id, a node:

• Check whether stores the item locally

• If not, forwards the query to the largest node in its successor table that does not exceed id

01

2

34

5

6

7 i id+2i succ0 2 21 3 62 5 6

Succ. Table

i id+2i succ0 3 61 4 62 6 6

Succ. Table

i id+2i succ0 1 11 2 22 4 0

Succ. Table7

Items 1

Items

i id+2i succ0 7 01 0 02 2 2

Succ. Table

query(7)

18

What can DHTs do for us?

• Distributed object lookup• Based on object ID

• De-centralized file systems• CFS, PAST, Ivy

• Application Layer Multicast• Scribe, Bayeux, Splitstream

• Databases• PIER

19

Overview

• Routed Lookups – Chord

• Comparison of DHTs

• I3

• DTNs

20

Comparison• Many proposals for DHTs

• Tapestry (UCB) -- Symphony (Stanford) -- 1hop (MIT)

• Pastry (MSR, Rice) -- Tangle (UCB) -- conChord (MIT)

• Chord (MIT, UCB) -- SkipNet (MSR,UW) -- Apocrypha (Stanford)

• CAN (UCB, ICSI) -- Bamboo (UCB) -- LAND (Hebrew Univ.)

• Viceroy (Technion) -- Hieras (U.Cinn) -- ODRI (TexasA&M)

• Kademlia (NYU) -- Sprout (Stanford)

• Kelips (Cornell) -- Calot (Rochester)

• Koorde (MIT) -- JXTA’s (Sun)

• What are the right design choices? Effect on performance?

Page 6: Examples of “Overlay” Networks 15-744: Computer Networkingprs/15-744-F12/lectures/12-dht.pdf ·  · 2012-10-19Examples of “Overlay” Networks ... • Peer-to-peer content

6

21

Deconstructing DHTs

Two observations:1. Common approach

• N nodes; each labeled with a virtual identifier (128 bits)• define “distance” function on the identifiers• routing works to reduce the distance to the destination

2. DHTs differ primarily in their definition of “distance”• typically derived from (loose) notion of a routing geometry

22

DHT Routing Geometries

• Geometries: • Tree (Plaxton, Tapestry)• Ring (Chord)• Hypercube (CAN)• XOR (Kademlia)• Hybrid (Pastry)

• What is the impact of geometry on routing?

23

Tree (Plaxton, Tapestry)

Geometry• nodes are leaves in a binary tree• distance = height of the smallest common subtree • logN neighbors in subtrees at distance 1,2,…,logN

001000 011010 101100 111110

24

Hypercube (CAN)

000

100

001

010

110 111

011

101

Geometry• nodes are the corners of a hypercube• distance = #matching bits in the IDs of two nodes• logN neighbors per node; each at distance=1 away

Page 7: Examples of “Overlay” Networks 15-744: Computer Networkingprs/15-744-F12/lectures/12-dht.pdf ·  · 2012-10-19Examples of “Overlay” Networks ... • Peer-to-peer content

7

25

Ring (Chord)

Geometry• nodes are points on a ring• distance = numeric distance between two node IDs• logN neighbors exponentially spaced over 0…N

000

101 011

010

001

110

111

100

26

Hybrid (Pastry)

Geometry:• combination of a tree and ring• two distance metrics• default routing uses tree; fallback to ring under failures

• neighbors picked as on the tree

27

XOR (Kademlia)

00 01 1110

01 11 1000

Geometry:• distance(A,B) = A XOR B• logN neighbors per node spaced exponentially• not a ring because there is no single consistent

ordering of all the nodes

28

Geometry’s Impact on Routing• Routing

• Neighbor selection: how a node picks its routing entries• Route selection: how a node picks the next hop

• Proposed metric: flexibility• amount of freedom to choose neighbors and next-hop paths

• FNS: flexibility in neighbor selection• FRS: flexibility in route selection

• intuition: captures ability to “tune” DHT performance

• single predictor metric dependent only on routing issues

Page 8: Examples of “Overlay” Networks 15-744: Computer Networkingprs/15-744-F12/lectures/12-dht.pdf ·  · 2012-10-19Examples of “Overlay” Networks ... • Peer-to-peer content

8

29

FNS for Ring Geometry

• Chord algorithm picks ith neighbor at 2i distance

• A different algorithm picks ith neighbor from [2i , 2i+1)

000

101

100

011

010

001

110

111

30

FRS for Ring Geometry

• Chord algorithm picks neighbor closest to destination

• A different algorithm picks the best of alternate paths

000

101

100

011

010

001

110

111 110

31

Flexibility: at a Glance

Flexibility Ordering of Geometries

Neighbors

(FNS)

Hypercube << Tree, XOR, Ring, Hybrid

(1) (2i-1)

Routes

(FRS)

Tree << XOR, Hybrid < Hypercube < Ring(1) (logN/2) (logN/2) (logN)

33

Geometry Flexibility Performance?

Validate over three performance metrics:1. resilience2. path latency3. path convergence

Metrics address two typical concerns: • ability to handle node failure• ability to incorporate proximity into overlay

routing

Page 9: Examples of “Overlay” Networks 15-744: Computer Networkingprs/15-744-F12/lectures/12-dht.pdf ·  · 2012-10-19Examples of “Overlay” Networks ... • Peer-to-peer content

9

34

Analysis of Static ResilienceTwo aspects of robust routing• Dynamic Recovery : how quickly routing state is

recovered after failures• Static Resilience : how well the network routes before

recovery finishes• captures how quickly recovery algorithms need to work• depends on FRS

Evaluation:• Fail a fraction of nodes, without recovering any state• Metric: % Paths Failed

35

Does flexibility affect static resilience?

Tree << XOR ≈ Hybrid < Hypercube < RingFlexibility in Route Selection matters for Static Resilience

0

20

40

60

80

100

0 10 20 30 40 50 60 70 80 90% Failed Nodes

% F

aile

d Pa

ths

Ring

Hybrid

XORTreeHypercube

36

0

20

40

60

80

100

0 400 800 1200 1600 2000Latency (msec)

CD

F

FNS Ring

Plain Ring

FRS Ring

FNS + FRS Ring

Which is more effective, FNS or FRS?

Plain << FRS << FNS ≈ FNS+FRSNeighbor Selection is much better than

Route Selection39

Overview

• Routed Lookups – Chord

• Comparison of DHTs

• I3• Slides Yuan; following slides FYI

• DTNs

Page 10: Examples of “Overlay” Networks 15-744: Computer Networkingprs/15-744-F12/lectures/12-dht.pdf ·  · 2012-10-19Examples of “Overlay” Networks ... • Peer-to-peer content

10

Multicast

S1

C1 C2

S2

R RP RR

RR

RP: Rendezvous Point

40

Mobility

HA FA

Home Network

Network 5

5.0.0.1 12.0.0.4

Sender

Mobile Node

5.0.0.3

41

42

i3: Motivation• Today’s Internet based on point-to-point

abstraction

• Applications need more:• Multicast• Mobility• Anycast

• Existing solutions:• Change IP layer• Overlays

So, what’s the problem?A different solution for each service

The i3 solution• Solution:

• Add an indirection layer on top of IP• Implement using overlay networks

• Solution Components:• Naming using “identifiers”• Subscriptions using “triggers”• DHT as the gluing substrate

43

IndirectionEvery problem

in CS …

Only primitiveneeded

Page 11: Examples of “Overlay” Networks 15-744: Computer Networkingprs/15-744-F12/lectures/12-dht.pdf ·  · 2012-10-19Examples of “Overlay” Networks ... • Peer-to-peer content

11

i3: Rendezvous Communication

• Packets addressed to identifiers (“names”)• Trigger=(Identifier, IP address): inserted by

receiver

44

Sender Receiver (R)

ID R

trigger

send(ID, data)send(R, data)

Senders decoupled from receivers

45

i3: Service Model• API

• sendPacket(id, p);• insertTrigger(id, addr);• removeTrigger(id, addr); // optional

• Best-effort service model (like IP)• Triggers periodically refreshed by end-hosts• Reliability, congestion control, and flow-

control implemented at end-hosts

i3: Implementation

• Use a Distributed Hash Table • Scalable, self-organizing, robust• Suitable as a substrate for the Internet

46

Sender Receiver (R)

ID R

trigger

send(ID, data)send(R, data)

DHT.put(id)

IP.route(R)

DHT.put(id)

Mobility

• The change of the receiver’s address • from R to R’ is transparent to the sender

48

Page 12: Examples of “Overlay” Networks 15-744: Computer Networkingprs/15-744-F12/lectures/12-dht.pdf ·  · 2012-10-19Examples of “Overlay” Networks ... • Peer-to-peer content

12

Multicast• Every packet (id, data) is forwarded to each

receiver Ri that inserts the trigger (id, Ri)

49

Generalization: Identifier Stack• Stack of identifiers

• i3 routes packet through these identifiers

• Receivers• trigger maps id to <stack of ids>

• Sender can also specify id-stack in packet

• Mechanism:• first id used to match trigger• rest added to the RHS of trigger • recursively continued

51

Service Composition• Receiver mediated: R sets up chain and

passes id_gif/jpg to sender: sender oblivious

• Sender-mediated: S can include (id_gif/jpg, ID) in his packet: receiver oblivious

52

Sender(GIF)

Receiver R(JPG)

ID_GIF/JPG S_GIF/JPGID R

send((ID_GIF/JPG,ID), data)

S_GIF/JPG

send(ID, data) send(R, data)

56

Overview

• Routed Lookups – Chord

• Comparison of DHTs

• I3

• DTNs• Slides Sam; following slides FYI

Page 13: Examples of “Overlay” Networks 15-744: Computer Networkingprs/15-744-F12/lectures/12-dht.pdf ·  · 2012-10-19Examples of “Overlay” Networks ... • Peer-to-peer content

13

Delay-Tolerant Networking Architecture• Goals

• Support interoperability across ‘radically heterogeneous’ networks

• Tolerate delay and disruption• Acceptable performance in high

loss/delay/error/disconnected environments• Decent performance for low loss/delay/errors

• Components• Flexible naming scheme • Message abstraction and API• Extensible Store-and-Forward Overlay Routing• Per-(overlay)-hop reliability and authentication

57

Disruption Tolerant Networks

58

Disruption Tolerant Networks

59

Naming Data (DTN)• Endpoint IDs are processed as names

• refer to one or more DTN nodes• expressed as Internet URI, matched as strings

• URIs• Internet standard naming scheme [RFC3986]• Format: <scheme> : <SSP>

• SSP can be arbitrary, based on (various) schemes

• More flexible than DOT/DONA design but less secure/scalable• Data-centric networking approaches iscussed later in

the course

60

Page 14: Examples of “Overlay” Networks 15-744: Computer Networkingprs/15-744-F12/lectures/12-dht.pdf ·  · 2012-10-19Examples of “Overlay” Networks ... • Peer-to-peer content

14

62

Message Abstraction• Network protocol data unit: bundles

• “postal-like” message delivery• coarse-grained CoS [4 classes]• origination and useful life time [assumes sync’d clocks]• source, destination, and respond-to EIDs• Options: return receipt, “traceroute”-like function, alternative

reply-to field, custody transfer• fragmentation capability• overlay atop TCP/IP or other (link) layers [layer ‘agnostic’]

• Applications send/receive messages• “Application data units” (ADUs) of possibly-large size• Adaptation to underlying protocols via ‘convergence layer’• API includes persistent registrations

DTN Routing• DTN Routers form an overlay network

• only selected/configured nodes participate• nodes have persistent storage

• DTN routing topology is a time-varying multigraph• Links come and go, sometimes predictably• Use any/all links that can possibly help (multi)• Scheduled, Predicted, or Unscheduled Links

• May be direction specific [e.g. ISP dialup]• May learn from history to predict schedule

• Messages fragmented based on dynamics• Proactive fragmentation: optimize contact volume• Reactive fragmentation: resume where you failed

63

Example Routing Problem

64

Village

Internet

City

bike

2

3 1

Example Graph Abstraction

65

time (days)bike (data mule)

intermittent high capacityGeo satellite

medium/low capacitydial-up link

low capacity

City

Village 1

Village 2

Connectivity: Village 1 – City

band

wid

th

bikesatellitephone

Page 15: Examples of “Overlay” Networks 15-744: Computer Networkingprs/15-744-F12/lectures/12-dht.pdf ·  · 2012-10-19Examples of “Overlay” Networks ... • Peer-to-peer content

15

The DTN Routing Problem• Inputs: topology (multi)graph, vertex buffer limits, contact

set, message demand matrix (w/priorities)

• An edge is a possible opportunity to communicate:• One-way: (S, D, c(t), d(t))• (S, D): source/destination ordered pair of

contact• c(t): capacity (rate); d(t): delay• A Contact is when c(t) > 0 for some period

[ik,ik+1]

• Vertices have buffer limits; edges in graph if ever in any contact, multigraph for multiple physical connections

• Problem: optimize some metric of delivery on this structure• Sub-questions: what metric to optimize?,

efficiency?66

Knowledge-Performance Tradeoff

67

Use of Knowledge Oracles

Contacts+

Queuing+

Traffic

ContactsSummary

Contacts

Contacts+

Queuing(local)

Contacts+

Queuing(global)

None

Algorithm

LP

MED

ED

EDLQEDAQ

Oracle

FC

Knowledge-Performance Tradeoff

68

Routing Solutions - Replication• “Intelligently” distribute identical data copies to

contacts to increase chances of delivery• Flooding (unlimited contacts)• Heuristics: random forwarding, history-based forwarding,

predication-based forwarding, etc. (limited contacts)

• Given “replication budget”, this is difficult• Using simple replication, only finite number of copies in the

network [Juang02, Grossglauser02, Jain04, Chaintreau05]• Routing performance (delivery rate, latency, etc.) heavily

dependent on “deliverability” of these contacts (or predictability of heuristics)

• No single heuristic works for all scenarios!

69

Page 16: Examples of “Overlay” Networks 15-744: Computer Networkingprs/15-744-F12/lectures/12-dht.pdf ·  · 2012-10-19Examples of “Overlay” Networks ... • Peer-to-peer content

16

Using Erasure Codes• Rather than seeking particular “good” contacts,

“split” messages and distribute to more contacts to increase chance of delivery• Same number of bytes flowing in the network, now in

the form of coded blocks• Partial data arrival can be used to reconstruct the

original message• Given a replication factor of r, (in theory) any 1/r code blocks

received can be used to reconstruct original data• Potentially leverage more contacts opportunity that

result in lowest worse-case latency• Intuition:

• Reduces “risk” due to outlier bad contacts

70

Erasure Codes

71

Message n blocks

Encoding

Opportunistic Forwarding

Decoding

Message n blocks

DTN Security

• Payload Security Header(PSH) end-to-end security header

• Bundle Authentication Header (BAH) hop-by-hop security header

72

Bundle Agent

Bundle Application

Security Policy Router(may check PSH value)

Source

BAH

PSH

Sender Receiver/Sender

Receiver/Sender

Receiver/Sender

Destination

BAH BAH BAH

credit: MITRE

So, is this just e-mail?

• Many similarities to (abstract) e-mail service• Primary difference involves routing, reliability and security• E-mail depends on an underlying layer’s routing:

• Cannot generally move messages ‘closer’ to their destinations in a partitioned network

• In the Internet (SMTP) case, not disconnection-tolerant or efficient for long RTTs due to “chattiness”

• E-mail security authenticates only user-to-user

73

naming/ routing flow multi- security reliable prioritylate binding contrl app delivery

e-mail Y N (static) N(Y) N(Y) opt Y N(Y)DTN Y Y (exten) Y Y opt opt Y