Analysis and algorithms of the construction of the minimum cost
content-based publish/subscribe overlay
Yaxiong Zhao and Jie [email protected]
Yaxiong Zhao will be graduating next summer!
Outline
• Introduction
• Analysis
  – Integer programming formulation
  – Two-stage approximation
  – Sub-channeling and multicast-based approximation
• Simulation results
• Q&A
Content-based pub/sub overlay
• Overlay networks built on content-based pub/sub principles
  – Brokers, publishers, and subscribers are connected by overlay links
  – Brokers are dedicated servers
    • They do not publish or subscribe
  – Publishers and subscribers are collectively called users
    • A user can publish and subscribe simultaneously
Problem formulation
• Given a set of brokers B, a large number of users U, and a 1-dimensional content space C
• Constraints
  – Message generating function defined on C
    • A density function
    • Gives the message rate of a publisher by integration
  – Users are not allowed to connect with each other
    • Privacy
  – Each user must connect with one and only one broker
    • Reduces cost and end-user complexity
Cont’d
• Objectives
  – Wire brokers and users into a connected overlay
  – Distribute traffic on overlay links
  – Achieve minimum cost for the bandwidth used
Complexity
• Reduction from the general Steiner tree problem
  – The Steiner tree problem can be seen as a special case of the above problem with the following settings
    • Identical fixed link costs
    • One publisher
    • All subscribers have an identical demand
• The general Steiner tree problem is NP-hard
  – This means our problem is unlikely to have an efficient algorithm that finds an optimal solution
Integer programming formulation
• Two parts of the optimization
  – Access: the traffic between brokers and users, C1
  – Core: the traffic between brokers, C2
• The approximation algorithms try to optimize these two parts
  – Separately or together
x_ij = 1 if user i connects to broker j
b_i(out): outgoing traffic of user i
c_ij: cost of the link between i and j
c'_ij = 1: cost of the link between brokers i and j
F_ij: flow between brokers i and j
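The transcript does not preserve the objective function itself, so the following is only a sketch of how the access cost C1 and core cost C2 might be evaluated from the listed variables. It assumes C1 sums c_ij times b_i(out) over the chosen user-to-broker links and C2 sums c'_ij times F_ij over broker links; the actual formulation may include other terms (for example incoming traffic). All names and numbers are hypothetical.

```python
def overlay_cost(assign, out_traffic, access_cost, core_cost, core_flow):
    """Return (C1, C2) under the assumed cost model.

    assign[i] = j encodes x_ij = 1 (user i is attached to broker j).
    """
    # Assumed access part: link cost times the user's outgoing traffic.
    c1 = sum(access_cost[(i, j)] * out_traffic[i] for i, j in assign.items())
    # Assumed core part: link cost times the flow between the two brokers.
    c2 = sum(core_cost[link] * flow for link, flow in core_flow.items())
    return c1, c2


assign = {"u1": "b1", "u2": "b2"}            # one broker per user
out_traffic = {"u1": 2.0, "u2": 1.0}         # b_i(out)
access_cost = {("u1", "b1"): 3.0, ("u2", "b2"): 4.0}   # c_ij
core_cost = {("b1", "b2"): 1.0}              # c'_ij
core_flow = {("b1", "b2"): 3.0}              # F_ij

c1, c2 = overlay_cost(assign, out_traffic, access_cost, core_cost, core_flow)
print(c1, c2, c1 + c2)
```

Representing the assignment as a dictionary enforces the "one and only one broker per user" constraint by construction.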
Two-stage greedy packing
• Each user connects to the broker with which it has the lowest-cost overlay link
  – Minimizes the peripheral cost
• Then connect all of the brokers using weighted shortest paths
  – With routing cost as the link cost
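The two stages can be sketched roughly as follows. This is an illustration under assumptions, not the authors' implementation: access costs are given as a nested dictionary, the broker graph is an adjacency dictionary weighted by routing cost, and stage 2 is shown as a standard Dijkstra search between two brokers. The names `attach_users` and `shortest_path` are made up for this example.

```python
import heapq


def attach_users(access_cost):
    """Stage 1: each user picks the broker with the lowest-cost access link."""
    return {user: min(costs, key=costs.get) for user, costs in access_cost.items()}


def shortest_path(graph, src, dst):
    """Stage 2 building block: Dijkstra over graph[a][b] = routing cost."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    done = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == dst:
            break
        for nxt, weight in graph.get(node, {}).items():
            nd = d + weight
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if dst not in dist:
        return float("inf"), []
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return dist[dst], path[::-1]


access_cost = {"u1": {"b1": 1.0, "b2": 5.0}, "u2": {"b1": 4.0, "b2": 2.0}}
brokers = {
    "b1": {"b2": 10.0, "b3": 2.0},
    "b2": {"b1": 10.0, "b3": 3.0},
    "b3": {"b1": 2.0, "b2": 3.0},
}

home = attach_users(access_cost)
cost, path = shortest_path(brokers, home["u1"], home["u2"])
print(home, cost, path)
```

Here the direct b1 to b2 link is more expensive than the detour through b3, so the weighted shortest path routes through b3.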
A sample network and the results
Two-stage clustering
• Cluster publisher and subscriber pairs that have the lowest cost-to-bandwidth ratio
• Process flows in decreasing order, starting with the biggest
  – Find the minimum cost path connecting the broker and the subscriber
  – Fix the links
  – Assign remaining flows
Sub-channeling and multicast
• We try to formulate the problem using multicast
  – This is achieved through sub-channeling
  – Use small sub-channels to approximate the event traffic on the entire content space
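One possible reading of sub-channeling is sketched below: the content space is cut into k equal sub-channels, and each sub-channel gets a multicast group made of the users whose interest overlaps it. Both the equal-width split and the representation of a user's interest as an interval are assumptions made for illustration; the slides do not specify either.

```python
def split_content_space(lo, hi, k):
    """Cut [lo, hi] into k equal-width sub-channels (assumed partitioning)."""
    width = (hi - lo) / k
    return [(lo + n * width, lo + (n + 1) * width) for n in range(k)]


def multicast_groups(interest, channels):
    """For each sub-channel, list the users whose interval overlaps it."""
    return [
        sorted(u for u, (a, b) in interest.items() if a < c_hi and b > c_lo)
        for c_lo, c_hi in channels
    ]


channels = split_content_space(0.0, 1.0, 4)
interest = {"u1": (0.1, 0.3), "u2": (0.6, 0.9)}   # hypothetical user intervals
groups = multicast_groups(interest, channels)
print(groups)
```

Smaller sub-channels track each user's interval more closely, at the price of more multicast groups to build.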
Cont’d
• Approximate the minimum-cost multicast on each sub-channel
  – Using a minimum spanning tree
• Obtain a network wiring for brokers and users on each sub-channel
  – For each user, the traffic volume passing from it to its chosen broker is recorded
• Choose a connection for each user according to the weighted probability obtained from the traffic volume
  – For each sub-channel, the traffic volume for link Li is Vi
  – The probability of choosing this link is Vi/∑Vi
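The final selection step, picking link Li with probability Vi/∑Vi, can be sketched as below. The recorded volumes are made-up numbers and `choose_link` is a hypothetical helper; only the probability rule comes from the slide.

```python
import random


def choose_link(volumes, rng):
    """Pick one candidate link with probability V_i / sum(V)."""
    links = list(volumes)
    total = sum(volumes.values())
    probs = {link: volumes[link] / total for link in links}
    chosen = rng.choices(links, weights=[probs[l] for l in links], k=1)[0]
    return chosen, probs


# Traffic volume recorded for one user toward each candidate broker,
# summed over the sub-channels (illustrative values).
volumes = {"b1": 6.0, "b2": 3.0, "b3": 1.0}
rng = random.Random(0)   # seeded only to make the sketch repeatable
chosen, probs = choose_link(volumes, rng)
print(chosen, probs)
```

Because the choice is randomized, a broker that carries most of a user's traffic is the most likely pick but is not guaranteed.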
Simulation settings
• 1,000 to 10,000 users
• 100 to 1,000 brokers
  – Keep a 10/1 user-to-broker ratio
  – A realistic setting in a cloud-computing era
• 100 networks of a given size
  – Obtain the average value
• Cost reduction ratio (CRR)
  – The cost achieved by random connection: CR
  – The cost achieved by our algorithms: CA
  – CRR = CR/CA
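As a small worked example of the metric, the sketch below computes CRR from per-network costs. It assumes CR and CA are each averaged over the generated networks before dividing; the slides do not say whether the ratio is taken before or after averaging. The cost values are placeholders, not simulation results.

```python
def cost_reduction_ratio(random_costs, algorithm_costs):
    """CRR = CR / CA, with CR and CA taken as averages over the networks."""
    cr = sum(random_costs) / len(random_costs)
    ca = sum(algorithm_costs) / len(algorithm_costs)
    return cr / ca


# Placeholder costs for two generated networks of the same size.
random_costs = [120.0, 80.0]      # random connection
algorithm_costs = [50.0, 50.0]    # an approximation algorithm
crr = cost_reduction_ratio(random_costs, algorithm_costs)
print(crr)
```

A CRR above 1 means the algorithm's overlay costs less than the random baseline.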
CRR vs. scale
CRR vs. Access-to-core link cost ratio
Q & A
Send an Email to [email protected] if your questions are not answered