Overlay Neighborhoods for Distributed Publish/Subscribe Systems Reza Sherafat Kazemzadeh Supervisor: Dr. Hans-Arno Jacobsen SGS PhD Thesis Defense University of Toronto September 5, 2012


TRANSCRIPT

Page 1: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

Overlay Neighborhoods for Distributed Publish/Subscribe Systems

Reza Sherafat Kazemzadeh
Supervisor: Dr. Hans-Arno Jacobsen

SGS PhD Thesis Defense
University of Toronto
September 5, 2012

Page 2: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

2

Content-Based Pub/Sub

[Diagram: publishers (P) publish into the Pub/Sub middleware, which delivers each publication to the matching subscribers (S).]

Page 3: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

3

Thesis Contributions

List of publications:
– [ACM Surveys] Dependable publish/subscribe systems (being submitted)
– [Middleware’12] Opportunistic Multi-Path Publication Forwarding in Pub/Sub Overlays
– [ICDCS’12] Publiy+: A Peer-Assisted Pub/Sub Service for Timely Dissemination of Bulk Content
– [SRDS’11] Partition-Tolerant Distributed Publish/Subscribe Systems
– [SRDS’09] Reliable and Highly Available Distributed Publish/Subscribe Service
– [ACM Transactions on Parallel and Distributed Systems] Reliable Message Delivery in Distributed Publish/Subscribe Systems Using Overlay Neighborhoods (being submitted)
– [Middleware Demos/Posters’12] Introducing Publiy (being submitted)

Dependability

Reliability

Ordered delivery

Fault-tolerance

Page 4: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

4

Thesis Contributions

Dependability

Reliability

Ordered delivery

Fault-tolerance

Multipath forwarding

Adaptive overlay mesh

Dynamic forwarding strategies

Efficient data structures


Page 5: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

5

Thesis Contributions

Dependability

Reliability

Ordered delivery

Fault-tolerance

Multipath forwarding

Adaptive overlay mesh

Dynamic forwarding strategies

Efficient data structures

Content Dissemination

Bulk content dissemination

Peer-assisted hybrid architecture

Dissemination strategies


Page 6: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

6

Overlay Neighborhoods

Thesis Contributions

Dependability

Reliability

Ordered delivery

Fault-tolerance

Multipath forwarding

Adaptive overlay mesh

Dynamic forwarding strategies

Efficient data structures

Content Dissemination

Bulk content dissemination

Peer-assisted hybrid architecture

Dissemination strategies


Page 7: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

7

Publications:

[SRDS’11] Partition-Tolerant Distributed Publish/Subscribe Systems

[SRDS’09] Reliable and Highly Available Distributed Publish/Subscribe Service

[ACM Transactions on Parallel and Distributed Systems] Reliable Message Delivery in Distributed Publish/Subscribe Systems Using Overlay Neighborhoods (being submitted)

[ACM Surveys] Dependable publish/subscribe systems (being submitted)

[Middleware Demos/Posters’12] Introducing Publiy (being submitted)

DEPENDABILITY IN PUB/SUB SYSTEMS

Part I

Page 8: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

8

Dependable pub/sub systems

Challenges of Dependability in Content-Based Pub/Sub Systems

The “end-to-end principle” is not applicable in a pub/sub system:
– Loose coupling between publishers and subscribers (the endpoints)
– An endpoint cannot distinguish message loss from filtered messages; this is especially true in content-based systems supporting flexible publication filtering

[Diagram: at the subscriber, lost publications cannot be differentiated from publications filtered out as non-matching.]

Page 9: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

9

Dependable pub/sub systems

Overlay Neighborhoods

Primary network: an initial spanning tree
– Brokers maintain neighborhood knowledge
– Allows brokers to transform the overlay in a controlled manner

d-Neighborhood knowledge (d is a configuration parameter):
– Knowledge of other brokers within distance d
– Knowledge of forwarding paths within the neighborhood

[Diagram: nested 1-, 2-, and 3-neighborhoods around a broker.]

Page 10: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

10

Dependable pub/sub systems

Publication Forwarding Algorithm
1. Received publications are placed on a FIFO message queue and kept until processing is complete
2. All known subscriptions having interest in p are identified after matching
3. Forwarding paths of the publication within downstream neighborhoods are identified
4. The publication is sent to the closest available brokers towards matching subscribers

[Diagram: a publication p travels from upstream to downstream through a broker's d-neighborhood towards subscribers S.]
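The four steps above can be sketched as a minimal loop. This is an illustrative sketch only: `subscriptions` (mapping subscriber IDs to attribute predicates) and `next_hop` (mapping each subscriber to its closest downstream broker) are hypothetical names, not the thesis's actual data structures.

```python
from collections import deque

def matches(publication, subscription):
    # A subscription is a dict of attribute -> predicate; all predicates must hold.
    return all(pred(publication.get(attr)) for attr, pred in subscription.items())

def forward_publication(pub, subscriptions, next_hop):
    """Steps 1-4: queue the publication, match it against known
    subscriptions, resolve downstream brokers, and send one copy
    per distinct next-hop broker."""
    queue = deque([pub])                      # 1. FIFO message queue
    sent = []
    while queue:
        p = queue.popleft()
        # 2. identify all known subscriptions interested in p
        interested = [s for s, sub in subscriptions.items() if matches(p, sub)]
        # 3./4. send p towards the closest broker for each matching subscriber
        for broker in sorted({next_hop[s] for s in interested}):
            sent.append((broker, p))
    return sent
```

A publication matching two subscribers behind different brokers is sent once per broker, not once per subscriber.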

Page 11: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

11

Dependable pub/sub systems

When There Are Failures
• The broker reconnects the overlay by creating new links to neighbors of the failed brokers
• Publications in the message queue are re-transmitted, bypassing failed neighbors
• Multiple concurrent failed neighbors (up to d-1) are bypassed similarly

[Diagram: the publication is rerouted around failed brokers towards subscribers S.]
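How a broker might pick bypass targets from its d-neighborhood knowledge can be sketched as follows (illustrative only; `neighborhood` is an adjacency map of the brokers within distance d, a hypothetical structure):

```python
def bypass_links(neighborhood, me, failed):
    """Live neighbors of the failed brokers (known from the broker's
    d-neighborhood) are the candidates for the new links that
    reconnect the overlay around the failures."""
    candidates = set()
    for broker in failed:
        for neighbor in neighborhood.get(broker, []):
            if neighbor not in failed and neighbor != me:
                candidates.add(neighbor)
    return candidates
```

With a chain A-B-C-D and B failed, broker A's candidate new link is C; with B and C both failed, it is D, which is why knowledge up to distance d tolerates up to d-1 adjacent failures.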

Page 12: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

12

Dependable pub/sub systems

Impact of Mass Failures on Throughput
Experiment setup:
• 500 brokers (failures injected at random brokers)
• Measurement interval of 2 mins (aggregate publish rate changes depending on the number of failures)

[Chart: actual deliveries with failures vs. expected number of deliveries without failures; deliveries are low with d=1.]


Page 19: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

19

Publications:

[Middleware’12] Opportunistic Multi-Path Publication Forwarding in Pub/Sub Overlays

OPPORTUNISTIC MULTI-PATH PUBLICATION FORWARDING

Part II

Page 20: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

20

Multi-path publication forwarding

Problems in Existing Pub/Sub Systems
– Forwarding paths in the overlay are constructed in a “fixed end-to-end” manner (no/little path diversity)
– This results in a high number of “pure forwarding” brokers
– Low yield (the ratio of messages delivered to messages sent is small), hence low efficiency

[Diagram: a publication crosses brokers A-E from publisher P to subscriber S; only some hops result in deliveries.]

Page 21: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

21

Multi-path publication forwarding

Multi-Path Forwarding in a Nutshell
• Actively utilize neighborhoods
• Monitor neighborhood traffic
• Selectively create additional soft links
• Apply different publication forwarding strategies

[Diagram: broker A augments its tree links with soft links within its neighborhood.]

Page 22: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

Different Forwarding Strategies
• Conventional systems (Strategy 0): total msgs: 6
• Forwarding strategy 1: total msgs: 5
• Forwarding strategy 2: total msgs: 3

[Diagram: the same publication p disseminated across brokers A, B, C under each strategy; fewer messages are sent as the strategy bypasses more pure forwarders.]

Page 23: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

23

Multi-path publication forwarding

Maximum System Throughput
Experiment setup:
• 250 brokers
• Publish rate of 72,000 msgs/min

S1 outperforms S0 by 60%; S2 outperforms S0 by 90%.

Page 24: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

24

Publications:

[ICDCS’12] Publiy+: A Peer-Assisted Publish/Subscribe Service for Timely Dissemination of Bulk Content

BULK CONTENT DISSEMINATION IN PUB/SUB SYSTEMS

Part III

Page 25: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

25

Bulk content dissemination

Application Scenarios Involving Bulk Content Dissemination

Fast replication of content (video clips, pics), requiring:
• Scalability
• Reactive delivery
• Selective delivery

Examples: distribution of software updates, P2P file sharing, file synchronization, replication within CDNs, social networks.

Page 26: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

Bulk content dissemination

A Case for a Peer-Assisted Design: Hybrid Architecture

Control layer (for metadata):
• Pub/Sub broker overlay
• Distributed repository maintaining users’ subscriptions

Data layer (for actual data):
• Form peer swarms
• Exchange blocks of data

[Diagram: subscribers issue subscriptions to the broker-based control layer and exchange data blocks among themselves in the data layer.]

Page 27: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

27

Bulk content dissemination

Scalability w.r.t. Number of Subscribers
Network setup:
• 300 and 1000 clients
• 1 source publishing 100 MB of content

Page 28: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

28

Conclusion
• We introduced the notion of overlay neighborhoods in distributed pub/sub systems
– Neighborhoods expose brokers’ knowledge of nearby neighbors and the publication forwarding paths that cross these neighborhoods
• We used neighborhoods in different ways
– Passive use of neighborhoods for ensuring reliable and ordered delivery
– Active use of neighborhoods for multipath publication forwarding
– Bulk content dissemination

Page 29: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

29

Thanks for your attention!

Page 30: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

30

EXTRAS: BONUS SLIDES (IF NEEDED)

Page 31: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

31

OVERLAY NEIGHBORHOODS

Page 32: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

32

Content-Based Publish/Subscribe

Stock quote dissemination application with traders in Toronto, London, and NY:
• sub = [STOCK=IBM]
• sub = [CHANGE>-8%]

[Diagram: publishers publish stock quotes into the pub/sub overlay, which delivers them to the traders with matching subscriptions.]
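The two subscriptions on this slide can be expressed as predicates over publication attributes; a minimal matching sketch (subscriber names are placeholders, not from the slide):

```python
# sub = [STOCK=IBM] and sub = [CHANGE>-8%], written as predicate functions.
subscriptions = {
    "trader_ibm": lambda pub: pub.get("STOCK") == "IBM",
    "trader_change": lambda pub: pub.get("CHANGE", 0.0) > -8.0,
}

def deliver(pub):
    """Return the subscribers whose predicates match this publication."""
    return sorted(name for name, pred in subscriptions.items() if pred(pub))
```

A quote {STOCK: IBM, CHANGE: -10%} matches only the first subscription; {STOCK: IBM, CHANGE: -2%} matches both.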

Page 33: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

33

Overlay neighborhoods

System Architecture

Tree dissemination networks: one path from source to destination
• Pros:
– Simple, loop-free
– Preserves publication order (difficult for non-tree content-based pub/sub)
• Cons:
– Trees are highly susceptible to failures

Primary tree: initial spanning tree formed as brokers join the system
– Brokers maintain neighborhood knowledge
– Allows brokers to reconfigure the overlay on the fly after failures

∆-Neighborhood knowledge (∆ is a configuration parameter): ensures handling of up to ∆-1 concurrent failures (worst case)
– Knowledge of other brokers within distance ∆ (join algorithm)
– Knowledge of routing paths within the neighborhood (subscription propagation algorithm)

[Diagram: nested 1-, 2-, and 3-neighborhoods around a broker.]
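The ∆-neighborhood of a broker can be sketched as a depth-limited breadth-first search over the overlay's adjacency lists (a sketch with hypothetical names, not the thesis's join algorithm):

```python
from collections import deque

def d_neighborhood(adj, root, d):
    """Map each broker within distance d of `root` to its hop count,
    via breadth-first search that stops expanding at depth d."""
    dist = {root: 0}
    frontier = deque([root])
    while frontier:
        broker = frontier.popleft()
        if dist[broker] == d:        # do not expand past distance d
            continue
        for nxt in adj.get(broker, []):
            if nxt not in dist:
                dist[nxt] = dist[broker] + 1
                frontier.append(nxt)
    del dist[root]                   # the neighborhood excludes the broker itself
    return dist
```

On a chain A-B-C-D-E, the 2-neighborhood of A is {B: 1, C: 2}, matching the nested rings in the diagram.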

Page 34: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

34

Dependable pub/sub systems

Overlay Disconnections

When there are d or more concurrent failures:
– Publication delivery may be interrupted
– No publication loss

[Diagram: a failed chain of d brokers disconnects two subtrees; each subtree remains internally connected.]

Page 35: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

35

Dependable pub/sub systems

Experimental Evaluation

Studied various aspects of the system’s operation:
– Impact of failures/recoveries on delivery delay
– Impact of failures on other brokers
– Size of d-neighborhoods
– Likelihood of disconnections
– Impact of disconnections on system throughput (discussed next)

Page 36: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

36

Dependable pub/sub systems

Publication Forwarding in Absence of Overlay Fragments

• Forwarding only uses subscriptions accepted by brokers.
• Steps in forwarding of publication p:
– Identify the anchors of accepted subscriptions that match p
– Determine active connections towards the matching subscriptions’ anchors
– Send p on those active connections and wait for confirmations
– If there are local matching subscribers, deliver to them
– If no downstream matching subscriber exists, issue a confirmation towards the publisher
– Once confirmations arrive, discard p and send a confirmation upstream towards the publisher

[Diagram: p flows from publisher P through brokers A-E to subscriber S; confirmations flow back hop by hop.]

Page 37: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

37

Dependable pub/sub systems

Publication Forwarding in Presence of Overlay Partitions

Key forwarding invariant to ensure reliability: no publication is delivered to a subscriber after being forwarded by brokers that have not accepted its subscription.

• Case 1: Subscription s has been accepted with no pid. It is safe to bypass intermediate brokers.

[Diagram: p bypasses a partitioned segment of brokers and is delivered to local subscribers; confirmations flow back.]

Page 38: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

38

Dependable pub/sub systems

Publication Forwarding (cont’d)
• Case 2: Subscription s has been accepted with some pid.
– Case 2a: The publisher’s local broker has accepted s and we ensure all intermediate forwarding brokers have also done so: it is safe to deliver publications from sources beyond the partition.

[Diagram: p is forwarded from P to S across the partition, with s accepted along the path; confirmations flow back.]

Page 39: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

39

Dependable pub/sub systems

Publication Forwarding (cont’d), Case 2a continued: depending on when this link has been established, either recovery or subscription propagation ensures that C accepts s prior to receiving p.

Page 40: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

40

Dependable pub/sub systems

Publication Forwarding (cont’d)
• Case 2: Subscription s is accepted with some pid tags.
– Case 2b: The publisher’s broker has not accepted s: it is unsafe to deliver publications from this publisher (by the invariant).

[Diagram: publications are tagged with a pid; s was accepted at S with the same pid tag, so deliveries from this publisher are withheld.]

Page 41: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

41

Dependable pub/sub systems

Overlay Fragments
• When the primary tree is set up, brokers communicate with their immediate neighbors in the primary tree through FIFO links.
• Overlay fragments: broker crashes or link failures create “fragments”, and some brokers “on the fragment” become unreachable from neighboring brokers.
• Active connections: at each point, each broker tries to maintain a connection to its closest reachable neighbor in the primary tree.
– Only active connections are used by brokers.

[Diagram: broker D fails; the fragment detector identifies pid1 = <C, {D}>, separating brokers on the fragment from brokers beyond it, and an active connection to E is established.]

Page 42: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

42

Dependable pub/sub systems

Overlay Fragments – 2 Adjacent Failures

• What if there are more failures, particularly adjacent failures?
• If ∆ is large enough, the same process can be used for larger fragments.

[Diagram: D and E fail; the fragment grows from pid1 = <C, {D}> to pid2 = <C, {D, E}>, and an active connection to F is established.]

Page 43: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

43

Dependable pub/sub systems

Overlay Fragments – ∆ Adjacent Failures

• Worst-case scenario: ∆-neighborhood knowledge is not sufficient to reconnect the overlay.
• Brokers “on” and “beyond” the fragment are unreachable; no new active connection can be formed.

[Diagram: D, E, and F fail; the fragment grows through pid2 = <C, {D, E}> to pid3 = <C, {D, E, F}>, exhausting C’s neighborhood knowledge.]

Page 44: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

44

Dependable pub/sub systems

Fragments

Brokers are connected to their closest reachable neighbors and are aware of nearby fragment identifiers.

• How does this affect end-to-end connectivity? For any pair of brokers, a fragment on the primary path between them is:
– An “island” if the end-to-end brokers are reachable through a sequence of active connections
– A “barrier” if the end-to-end brokers are unreachable through any sequence of active connections

[Diagram: source and destination brokers separated by a fragment that is either bridged by active connections (island) or not (barrier).]

Page 45: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

45

Dependable pub/sub systems

Store-and-Forward

• A copy is first preserved on disk
• Intermediate hops send an ACK to the previous hop after preserving the message
• ACKed copies can be dismissed from disk
• Upon failures, unacknowledged copies survive the failure and are re-transmitted after recovery
– This ensures reliable delivery but may cause delays while the machine is down

[Diagram: a message is persisted and ACKed hop by hop along the broker chain from source to destination.]

Page 46: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

46

Dependable pub/sub systems

Mesh-Based Overlay Networks [Snoeren et al., SOSP 2001]

• Use a mesh network to concurrently forward messages on disjoint paths
• Upon failures, the message is delivered using alternative routes
• Pros: minimal impact on delivery delay
• Cons: imposes additional traffic and the possibility of duplicate delivery

[Diagram: a message travels along multiple disjoint broker paths from source to destination.]

Page 47: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

47

Dependable pub/sub systems

Replica-Based Approach [Bhola et al., DSN 2002]

• Replicas are grouped into virtual nodes
• Replicas have identical routing information

[Diagram: physical machines grouped into a virtual node.]

Page 48: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

48

Dependable pub/sub systems

Replica-Based Approach [Bhola et al., DSN 2002] (cont’d)

• We compare against this approach

[Diagram: publisher replicas within a virtual node.]

Page 49: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

Multi-path publication forwarding

Problems with a Single Overlay Tree

• A tree provides no routing diversity
• Overloaded root: all traffic goes through a single broker
• Under-utilization: not all available capacity is effectively used

A tree’s single-path connectivity is not suitable for diverse forwarding patterns.

[Diagram: an overloaded root broker alongside unutilized bandwidth capacity elsewhere in the tree.]

Page 50: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

Multi-path publication forwarding

Related Work – Structured Topologies

• A topology is an interconnection between brokers:
– The topology is relatively stable: long-term connections
– Most commonly a global or per-publisher spanning tree
• Topology adaptation changes the topology based on:
– Traffic patterns [1,2]: optimize a cost function
– Maintain the acyclic property by adding and removing links
• Advantages:
– A fixed topology enables high-throughput connections
– Routes may be improved from a “coarse-grained” system-wide perspective
• Disadvantages:
– Routes may never be optimal for individual broker pairs
– Introduces pure forwarding brokers
– Diversity of routing is not accounted for

[1] Virgillito, A., Beraldi, R., Baldoni, R.: On event routing in content-based publish/subscribe through dynamic networks. In: FTDCS. (2003)
[2] Virgillito, A., Beraldi, R., Baldoni, R.: On event routing in content-based publish/subscribe through dynamic networks. In: FTDCS. (2003)

[Diagram: Tree A is re-configured into Tree A'.]

Page 51: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

Multi-path publication forwarding

Related Work – Unstructured Topologies

• No fixed topology exists:
– Short-term links are created based on message destination
– [3] uses dissemination trees computed at the publishers’ brokers
• Advantages:
– Routes may be optimal
– Zero pure forwarding brokers
• Disadvantages:
– Link maintenance is difficult and on-demand
– Global knowledge is required, and there is no support for subscription covering/merging
– Scalability problems

[3] Cao, F., Singh, J.: MEDYM: Match-early and dynamic multicast for content-based publish-subscribe service networks. ICDCSW (2005)

Page 52: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

52

Multi-path publication forwarding

Publication Forwarding Strategies

Strategy S1:
• The publication is sent on the intersection of primary paths towards matching subscribers
• Some pure forwarding brokers are bypassed
• The broker incurs no extra outgoing load

Strategy S2:
• The publication is sent as far as possible directly towards matching subscribers
• As many pure forwarding brokers as possible are bypassed
• The broker incurs high outgoing load

[Diagram: under S1 and S2, publication p from broker A reaches local matching subscribers at X, Y, and Z while bypassing pure forwarders among B, C, and D.]

Page 53: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

Master vs. Working Routing Data Structures

• Overlay views captured by brokers’ d-neighborhoods are relatively static: the Master Overlay Map (MOM)
• Brokers’ link connectivity changes dynamically; brokers need an efficient way to compute forwarding paths over the changing set of links: the Working Overlay Map (WOM), constructed via edge retraction
• The MOM only contains brokers with a direct link; it acts as a quick cache

[Diagram: the Working Overlay Map is constructed from the Master Overlay Map.]
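One plausible reading of constructing the WOM from the MOM is restricting the static map to the currently live links (a sketch under that assumption, with hypothetical names; the thesis's edge-retraction procedure may differ):

```python
def working_map(master_map, live_links):
    """Derive a Working Overlay Map from the static Master Overlay Map
    by keeping only edges whose link is currently live.
    `master_map` is an adjacency dict; `live_links` is a set of
    unordered broker pairs represented as tuples."""
    wom = {}
    for broker, neighbors in master_map.items():
        wom[broker] = [n for n in neighbors
                       if (broker, n) in live_links or (n, broker) in live_links]
    return wom
```

Recomputing the WOM on each connectivity change keeps path computation over live links cheap while the MOM itself never needs rebuilding.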

Page 54: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

54

Multi-path publication forwarding

Experimental Evaluation

Experimental setup:
– Various overlays
• Primary network size: 120, 250, 500 brokers
• Fanout parameter: 3 and 10
– Datasets with sparse or dense matching distributions
• Synthetic datasets based on Zipf distribution
• Real-world datasets constructed from social networking user traces
• Synthetic datasets with covering

Page 55: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

55

Multi-path publication forwarding

Overlay Reconfiguration

Page 56: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

56

Multi-path publication forwarding

Connectivity in the Overlay Mesh
Experiment setup:
• 120 and 250 brokers
• Fanout of 10

[Chart: number of pairwise forwarding paths, log scale from 1 to 1000.]

Page 57: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

57

Multi-path publication forwarding

Impact of Broker Fanout on Subscription Covering
Experiment setup:
• 500 brokers
• Fanout of 5-25


Page 62: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

62

Multi-path publication forwarding

Publication Hop Count
Experiment setup:
• 120 brokers
• Sparse publication/subscription workload
• Publish rate of 1,800 msgs/sec; deliveries: 73,000 in 5 min


Page 67: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

67

Multi-path publication forwarding

Publication Hop Count: Sparse Matching Workload vs. Dense Matching Workload

Multi-path forwarding is more effective in sparse workloads

Page 68: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

68

Multi-path publication forwarding

System Yield (measure of efficiency)

Sparse workload (73,000 delivered publications):
  Strategy 0: 91,000 pure-forwarded msgs, system yield 44%
  Strategy 1: 42,000 pure-forwarded msgs, system yield 63%
  Strategy 2: 29,000 pure-forwarded msgs, system yield 71%

Dense workload (284,000 delivered publications):
  Strategy 0: 195,000 pure-forwarded msgs, system yield 59%
  Strategy 1: 104,000 pure-forwarded msgs, system yield 73%
  Strategy 2:  69,000 pure-forwarded msgs, system yield 80%
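The yield column is consistent with yield = delivered / (delivered + pure-forwarded messages), truncated to a whole percent; a quick check:

```python
def system_yield(delivered, pure_forwards):
    """Yield = messages delivered / total messages sent, as a percentage
    (pure-forwarding hops add to traffic but not to deliveries)."""
    return 100.0 * delivered / (delivered + pure_forwards)
```

For the sparse workload under Strategy 0: 73,000 / (73,000 + 91,000) is roughly 44%, matching the table.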

Page 69: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

69

Multi-path publication forwarding

Maximum System Throughput
Experiment setup:
• 250 brokers
• Publish rate of 72,000 msgs/min

Page 70: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

Bulk content dissemination

Data Exchange Using Network Coding

• The file is segmented; each segment is a matrix of blocks (k × n)
• Random coefficients C are used for encoding the blocks B1..Bk into coded data packets Yi
• Decoding a segment uses k linearly independent coded blocks

[Diagram: segmentation of a file into segments, encoding of blocks B1..Bk into packets Yi with coefficients Ci, transfer, and decoding at the receiver.]
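The encode/decode steps can be sketched with random linear network coding over the prime field GF(257), a toy stand-in for the byte-oriented field a real implementation would use; the decoder assumes the k collected packets are linearly independent:

```python
import random

P = 257  # small prime field; a toy stand-in for GF(2^8)

def encode(blocks, rng):
    """One coded packet: random coefficients c plus the linear
    combination sum(c_i * B_i) mod P over the segment's k blocks."""
    k, n = len(blocks), len(blocks[0])
    coeffs = [rng.randrange(1, P) for _ in range(k)]
    payload = [sum(c * b[j] for c, b in zip(coeffs, blocks)) % P for j in range(n)]
    return coeffs, payload

def decode(packets, k):
    """Recover the k original blocks from k linearly independent packets
    by Gauss-Jordan elimination on the augmented matrix [C | Y] mod P."""
    rows = [list(c) + list(y) for c, y in packets]
    for col in range(k):
        piv = next(r for r in range(col, k) if rows[r][col])   # nonzero pivot
        rows[col], rows[piv] = rows[piv], rows[col]
        inv = pow(rows[col][col], P - 2, P)                    # Fermat inverse
        rows[col] = [x * inv % P for x in rows[col]]
        for r in range(k):
            if r != col and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(x - f * y) % P for x, y in zip(rows[r], rows[col])]
    return [row[k:] for row in rows]
```

Because any k independent packets suffice, peers can forward fresh random combinations instead of tracking which specific blocks each neighbor is missing.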

Page 71: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

Bulk content dissemination

Hybrid Architecture

• Regional dissemination: information is immediately available at the broker; the publisher’s PList of matching subscribers and the coded packets are coordinated between the control and data layers.
• Cross-regional dissemination: involves routing of notifications in the control layer.

[Diagram: control and data layers for regional vs. cross-regional dissemination, showing the publisher, PList, coded packets, and matching subscribers.]

Page 72: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

72

Bulk content dissemination

Evaluation Results

• Experimental setup:
– UofT’s SciNet cluster with up to 1000 nodes
– Peers have a capped uplink bandwidth (100-200 KB/s)
• Network setup:
– 5 regions
– 120, 300, or 1000 subscribers uniformly distributed among regions

Page 73: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

73

Bulk content dissemination

Content Serving Policy
Network setup:
• 300 clients
• 1 source publishes 100 MB of content


Page 77: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

77

Bulk content dissemination

Impact of Packet Loss
Network setup:
• 300 clients
• 1 source publishes 100 MB of content


Page 80: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

Bulk content dissemination

Impact of Source Fanout on Dissemination Time

Network setup:
• 300 clients
• 1 source publishes 100 MB of content



Page 87: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

Bulk content dissemination

Effectiveness of Traffic Shaping

Traffic exchanged between regions, as a percentage of total traffic (diagonal entries: regional traffic; off-diagonal entries: cross-regional traffic):

        REG 1    REG 2    REG 3    REG 4    REG 5
REG 1   19.57%   0.109%   0.109%   0.109%   0.108%
REG 2   0.110%   19.57%   0.109%   0.109%   0.109%
REG 3   0.110%   0.110%   19.55%   0.108%   0.108%
REG 4   0.114%   0.114%   0.114%   19.57%   0.114%
REG 5   0.110%   0.110%   0.110%   0.111%   19.49%

Experiment setup:
• 5 regions and 1000 clients (capped uplink bandwidth at 200 KB/s)
• 1 source publishes 100 MB
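As a quick sanity check on the table above, the diagonal and off-diagonal entries can be summed to see how strongly traffic shaping confines traffic to regions (illustrative Python; the matrix is copied verbatim from the slide):

```python
# Traffic matrix from the slide: entry [i][j] is the percentage of total
# traffic exchanged between region i+1 and region j+1.
traffic = [
    [19.57, 0.109, 0.109, 0.109, 0.108],
    [0.110, 19.57, 0.109, 0.109, 0.109],
    [0.110, 0.110, 19.55, 0.108, 0.108],
    [0.114, 0.114, 0.114, 19.57, 0.114],
    [0.110, 0.110, 0.110, 0.111, 19.49],
]

# Regional traffic is the diagonal; cross-regional is everything else.
regional = sum(traffic[i][i] for i in range(5))
cross = sum(traffic[i][j] for i in range(5) for j in range(5) if i != j)

print(f"regional: {regional:.2f}%, cross-regional: {cross:.2f}%")
```

The sums show roughly 97.8% of all traffic staying within regions and only about 2.2% crossing region boundaries, which is the point of the traffic-shaping mechanism.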

Page 88: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

Bulk content dissemination

Traffic Sharing Among Competing Content with Uniform Popularity

Experiment setup:
• 5 regions and 1000 clients (capped uplink bandwidth at 200 KB/s)
• 15 sources (3 in each region) publish 100 MB with uniform popularity

[Figure annotation: 1 TB of data]

Page 89: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

Bulk content dissemination

Traffic Sharing Among Competing Content with Different Popularity

Experiment setup:
• 5 regions and 1000 clients (capped uplink bandwidth at 200 KB/s)
• 15 sources (3 in each region) publish 100 MB
• Content has 1x, 2x, and 3x popularity

[Figure: download curves labeled most popular, medium popularity, and least popular]

Page 90: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

Bulk content dissemination

Contribution of Peers

• Contribution of the source: avg per segment is 136% of content size
• Contribution of subscribers: overall avg is 102% of download size

Network setup:
• 300 clients
• 1 source publishes 100 MB of content

Page 91: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

Bulk content dissemination

Comparison With BitTorrent

Experiment setup:
• 120 clients (capped uplink bandwidth at 200 KB/s)
• 1 source publishes 100 MB of content

Upon release, all clients start downloading; within 1300 s all downloads end.

Page 92: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

Bulk content dissemination

Comparison With BitTorrent

Experiment setup:
• 120 clients (capped uplink bandwidth at 200 KB/s)
• 1 source publishes 100 MB of content

[BT]: Polling interval of 10 minutes; within 1700 s downloads end.

Page 93: Overlay Neighborhoods for Distributed Publish/Subscribe Systems

Bulk content dissemination

Comparison With BitTorrent

Experiment setup:
• 120 clients (capped uplink bandwidth at 200 KB/s)
• 1 source publishes 100 MB of content

[BT]: Polling interval of 2 seconds; within 1600 s downloads end.