Maximum Likelihood Network Topology Identification
Mark Coates, McGill University
Robert Nowak, Rui Castro, Rice University
DYNAMICS, May 5th, 2003
Network Tomography
• Inferring network topology based on “external” end-to-end measurements.
• Traceroute requires cooperation of routers, which may not be available in practice
• This paper assumes no internal network cooperation
• Solely host-based unicast measurements
How does it work?
Information we have
• End-to-end measurements that measure the degree of correlation between receivers
• Associate a metric $\gamma_{i,j}$ with each pair of receivers $i, j \in R$
Monotonicity property: let $p_i, p_j, p_k$ be the paths from the sender to $i, j, k$
If $p_i$ shares more links with $p_j$ than with $p_k$, then
$\gamma_{i,j} > \gamma_{i,k}$
An example
Here $\gamma_{18,19} > \gamma_{i,19}$ for all other $i$
Examples ?
Simple bottom-up merging algorithms can be used to identify the full logical topology
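As a rough illustration of bottom-up merging, the sketch below repeatedly joins the pair with the largest metric (the pair whose paths share the most links) into a new parent node. The function name `merge_bottom_up`, the dictionary layout, and the averaging rule for the merged node's metric are all assumptions made for this example; the slides do not specify them.

```python
from itertools import combinations

def merge_bottom_up(gamma):
    """Build a binary tree from pairwise metrics (hypothetical helper).

    gamma maps frozenset({a, b}) -> metric for each pair of receivers.
    Needs at least two receivers. Returns a nested tuple for the tree.
    """
    gamma = dict(gamma)
    nodes = set()
    for pair in gamma:
        nodes |= pair
    while len(nodes) > 1:
        # Largest metric = most shared links, so this pair is merged first.
        candidates = combinations(sorted(nodes, key=repr), 2)
        a, b = max(candidates, key=lambda p: gamma[frozenset(p)])
        parent = (a, b)
        nodes -= {a, b}
        for k in nodes:
            # Assumption: the merged node's metric is the average of its children's.
            gamma[frozenset((parent, k))] = (
                gamma[frozenset((a, k))] + gamma[frozenset((b, k))]
            ) / 2
        nodes.add(parent)
    return nodes.pop()

# Four receivers: 1 and 2 are closest, then 3 and 4.
example = {
    frozenset((1, 2)): 3.0, frozenset((3, 4)): 2.0,
    frozenset((1, 3)): 1.0, frozenset((1, 4)): 1.0,
    frozenset((2, 3)): 1.0, frozenset((2, 4)): 1.0,
}
tree = merge_bottom_up(example)
```

With noisy metrics a greedy merge like this can commit to a wrong pair early, which is one motivation for a likelihood-based search.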
Two-fold Contribution
• Novel measurement scheme:
– Sandwich probing
– Each probe: three packets
– Main idea: the small packets queue behind the large packet, inducing extra separation between the small packets on shared links
• A stochastic search method for topology identification
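Sandwich Probing
[Figure: sandwich probe sent through a tree with nodes 0 to 5, shown with no cross-traffic; small packets p1 and p2, initial spacing d]
• On link 0-1, p2 experiences a queuing delay; the metric $\gamma_{35}$ for receivers 3 and 5 equals this delay
• $\gamma_{ij}$: sum of the queuing delays on the links shared by the paths to receivers i and j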
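To make the "sum over shared links" idea concrete, here is a minimal sketch that computes the metric for a receiver pair from per-link queuing delays. The tree layout, the delay values, and the name `shared_delay_metric` are assumptions for illustration only; the layout is chosen so that receivers 3 and 5 share only link 0-1, as stated on the slide.

```python
def shared_delay_metric(parent, delay, i, j):
    """Sum the per-link delays over links shared by the paths to i and j.

    parent[v] is the parent of node v (the sender has no entry);
    delay[v] is the queuing delay on the link entering node v.
    """
    def links(v):
        out = set()
        while v in parent:
            out.add(v)        # a link is identified by its child endpoint
            v = parent[v]
        return out
    return sum(delay[v] for v in links(i) & links(j))

# Assumed topology: 0 -> 1, 1 -> {2, 5}, 2 -> {3, 4}
parent = {1: 0, 2: 1, 5: 1, 3: 2, 4: 2}
# Assumed per-link queuing delays (arbitrary units)
delay = {1: 2.0, 2: 1.0, 3: 0.5, 4: 0.5, 5: 0.5}

g35 = shared_delay_metric(parent, delay, 3, 5)  # only link 0-1 is shared
g34 = shared_delay_metric(parent, delay, 3, 4)  # links 0-1 and 1-2 are shared
```

In this example the pair (3, 4) shares more links than (3, 5), so its metric is larger, which matches the monotonicity property.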
Advantages over loss and delay based metrics
Probe loss is rare on the Internet, so loss-based metrics require a large number of measurements
Delay-based metrics require clock synchronization
With sandwich probing, each measurement contributes
Measurement framework
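• $x_{ij}$: measurement of $\gamma_{ij}$, contaminated by cross traffic
• Cross traffic: zero-mean effect on $x_{ij}$
• Multiple measurements (CLT): $x_{ij} \sim \mathcal{N}(\gamma_{ij}, \hat{\sigma}_{ij}^2 / n_{ij})$, where $n_{ij}$ is the number of measurements for the pair and $\hat{\sigma}_{ij}^2$ their sample variance
[Figure: tree topology with nodes 0 to 5]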
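A small sketch of the averaging step: given repeated measurements for one receiver pair, it returns the sample mean and the variance of that mean (sample variance divided by the number of measurements). The function name `estimate_metric` is hypothetical, and the sample values are made up for illustration.

```python
from statistics import mean, variance

def estimate_metric(samples):
    """Return (sample mean, variance of the mean) for one receiver pair."""
    n = len(samples)
    x_bar = mean(samples)
    # Sample variance divided by n: the variance of the averaged measurement.
    var_of_mean = variance(samples) / n
    return x_bar, var_of_mean

x_bar, var_of_mean = estimate_metric([1.0, 2.0, 3.0])
```

Because the variance of the mean shrinks as more probes are sent, pairs with more measurements get tighter estimates.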
Likelihood Formulation
• Estimated metrics $x$ are randomly distributed according to a density $p$
• $p$ is parameterized by the underlying topology $T$ and the set of true metric values $\gamma$
• When $p(x \mid T, \gamma)$ is viewed as a function of $T$ and $\gamma$, it is called the likelihood of $T$ and $\gamma$
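Under the assumption that each estimated metric is independently Gaussian around its true value (as the CLT slide suggests), the log likelihood of a candidate set of true metric values can be sketched as follows. The function name and dictionary layout are hypothetical, and the constraint that the values be consistent with a particular tree is not enforced here.

```python
import math

def gaussian_log_likelihood(x, gamma, var):
    """Log likelihood of candidate metric values under independent Gaussians.

    x, gamma, var map a receiver pair (i, j) to the estimated metric,
    the candidate true metric, and the variance of the estimate.
    """
    total = 0.0
    for pair, x_ij in x.items():
        v = var[pair]
        total += -0.5 * math.log(2 * math.pi * v) - (x_ij - gamma[pair]) ** 2 / (2 * v)
    return total

x = {(1, 2): 1.0, (1, 3): 0.4}
var = {(1, 2): 0.1, (1, 3): 0.1}
good = gaussian_log_likelihood(x, {(1, 2): 1.0, (1, 3): 0.4}, var)
bad = gaussian_log_likelihood(x, {(1, 2): 0.4, (1, 3): 1.0}, var)
```

Candidate values close to the estimates score higher, so `good` exceeds `bad` in this example.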
Likelihood Formulation
• Maximum Likelihood Tree is given by:
$$T^* = \arg\max_{T \in F} \max_{\gamma \in G} p(x \mid T, \gamma)$$
F denotes the forest of all possible trees
G denotes the set of all metrics satisfying the monotonicity property
Maximization involved is formidable
Brute force method: for N = 10, more than $1.8 \times 10^6$ trees
Simplifying the problem
• Parameters $\gamma$ are chosen to maximize the likelihood for a given tree T
• This gives the very best fit that T can provide to the data
• Log likelihood of T: $L(T) = \max_{\gamma \in G} \log p(x \mid T, \gamma)$
Maximum Likelihood Tree is the one in the forest that has the largest likelihood value
Stochastic Search
• Reversible jump Markov Chain Monte Carlo (MCMC) method
• Using above techniques, authors devise a rapid search method to find optimal trees.
• “Learning using Bayesian Statistics”
• Prior and posterior distributions
Main Idea: Posterior Distribution gives the region of high likelihood trees in F
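The idea of letting a Markov chain spend most of its time in high-likelihood regions can be illustrated with a plain Metropolis sampler over a small discrete state space. This is a generic sketch, not the authors' procedure: the states here are integers rather than trees, the proposal is assumed symmetric, and the names `metropolis_search`, `log_like`, and `neighbors` are hypothetical.

```python
import math
import random

def metropolis_search(start, log_like, neighbors, steps=1000, seed=0):
    """Random-walk Metropolis search; returns the best state visited."""
    rng = random.Random(seed)
    current = best = start
    for _ in range(steps):
        proposal = rng.choice(neighbors(current))
        diff = log_like(proposal) - log_like(current)
        # Accept with probability min(1, likelihood ratio); symmetric proposal assumed.
        if rng.random() < math.exp(min(0.0, diff)):
            current = proposal
        if log_like(current) > log_like(best):
            best = current
    return best

# Toy problem: ten states on a ring, log likelihood peaked at state 7.
toy_log_like = lambda s: -float((s - 7) ** 2)
toy_neighbors = lambda s: [(s - 1) % 10, (s + 1) % 10]
best = metropolis_search(2, toy_log_like, toy_neighbors)
```

For topology search, the states would be candidate trees and the neighbor moves would be local changes to a tree; moves that change the number of internal nodes need extra care in the acceptance rule.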
MCMC Algorithm
[Figure: true topology (left) and MCMC-estimated topology (right)]
Can Layer 2 branching points
High speed connections can fool tomography