Approximating the MST Weight in Sublinear Time

Bernard Chazelle (Princeton)

Ronitt Rubinfeld (NEC)

Luca Trevisan (U.C. Berkeley)


Sublinear Time Algorithms

• Make sense for problems on very large data sets

• Go contrary to common intuition that “an algorithm must be given at least enough time to read all the input”

• Must be probabilistic

• Must be approximate


Approximation

• For decision problems: the output is the correct answer either for the given input, or at least for some other input “close” to it. (Property Testing)

• For optimization problems: the output is a number that is close to the cost of the optimal solution for the given input. (There is not enough time to construct a solution)


Previous Examples

• The cost of the max cut in a graph with n nodes and $cn^2$ edges can be approximated to within a factor $1-\epsilon$ in time $2^{\mathrm{poly}(1/(\epsilon c))}$. (Goldreich, Goldwasser, Ron)

• Other results for “dense” instances of optimization problems, for low-rank approximation of matrices, . . .

• No results (that we know of) for problems on bounded-degree graphs.


Our Result

• Given a connected weighted graph G, with maximum degree d and with weights in the range {1, . . . , w},

• we can compute the weight of the minimum spanning tree of G to within a factor of $1+\epsilon$ in time $O(d w \epsilon^{-2} \log (w/\epsilon))$;

• we also prove that it is necessary to look at $\Omega(d w \epsilon^{-2})$ entries in the representation of G.

(We assume that G is represented using adjacency lists)


Main Intuition

• Suppose all weights are 1 or 2

• Then the MST weight is equal to

n – 2 + # of conn. comp. induced by weight-1 edges

[Figure: example graph showing weight-1 edges, weight-2 edges, the connected components induced by weight-1 edges, and the MST]
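One way to see why this identity holds is to count edges. The derivation below is an added sketch, not taken from the slides; it assumes G is connected and writes $c_1$ for the number of connected components induced by weight-1 edges. Kruskal's algorithm first uses $n - c_1$ weight-1 edges to span every such component, and then needs $c_1 - 1$ weight-2 edges to join the components together:

$$
\mathrm{MST}(G) = 1\cdot(n - c_1) + 2\cdot(c_1 - 1) = n - 2 + c_1 .
$$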


Algorithm


Algorithm for weights in {1,2}

• To approximate the MST weight to within a multiplicative factor $(1+\epsilon)$ it’s enough to approximate $c_1$ to within an additive error $\epsilon n$ ($c_1$ := # of connected components induced by weight-1 edges)

• To approximate $c_1$ we use ideas from Goldreich-Ron (property testing of connectivity)

• The algorithm runs in time $O(d \epsilon^{-2} \log \epsilon^{-1})$


Approximating # of connected components

• Given a graph G of max degree d with n nodes we want to compute c, the number of connected components of G, up to an additive error $\epsilon n$.

• For every vertex u, define $n_u := 1 / (\text{size of the component of } u)$

• Then $c = \sum_u n_u$

• And if we call $a_u := \max\{n_u, \epsilon\}$

• Then $c = \sum_u a_u \pm \epsilon n$
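The last step can be checked directly. The short derivation below is an added sketch that uses only the definitions of $n_u$ and $a_u$ above. Since $n_u \le a_u \le n_u + \epsilon$ for every vertex, summing over all n vertices gives

$$
c = \sum_u n_u \;\le\; \sum_u a_u \;\le\; \sum_u (n_u + \epsilon) = c + \epsilon n .
$$

The point of the truncation is that $a_u = \epsilon$ whenever the component of u has at least $1/\epsilon$ vertices, so a search from u never needs to visit more than about $1/\epsilon$ vertices to determine $a_u$.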


Wrapping up the analysis

• Can estimate the sum of the $a_u$ using sampling

• Once we pick a vertex u at random, the value $a_u$ can be computed in time $O(d/\epsilon)$

• We need to pick $O(1/\epsilon^2)$ vertices, so we get running time $O(d/\epsilon^3)$


Algorithm

CC-APPROX($\epsilon$)
    Repeat $O(1/\epsilon^2)$ times
        pick a random vertex v
        do a BFS from v, stopping after $2/\epsilon$ steps
        b := 1 / number of visited vertices
    return (average of the values b) * n
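The pseudocode above might be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the graph is an adjacency list `adj` (a list of neighbour lists), a BFS "step" is taken to mean one visited vertex, and `constant` is an assumed value for the constant hidden in $O(1/\epsilon^2)$. The helper name `truncated_bfs_size` is introduced here.

```python
import math
import random
from collections import deque


def truncated_bfs_size(adj, v, limit):
    """Count vertices reached by a BFS from v, stopping once `limit` are visited."""
    seen = {v}
    queue = deque([v])
    while queue and len(seen) < limit:
        u = queue.popleft()
        for x in adj[u]:
            if x not in seen:
                seen.add(x)
                queue.append(x)
                if len(seen) >= limit:
                    break
    return len(seen)


def cc_approx(adj, eps, rng=random, constant=4):
    """Estimate the number of connected components of the graph `adj`.

    `constant` is an assumption: the slides only say O(1/eps^2) repetitions.
    """
    n = len(adj)
    samples = math.ceil(constant / eps ** 2)
    limit = math.ceil(2 / eps)  # stop each BFS after 2/eps visited vertices
    total = 0.0
    for _ in range(samples):
        v = rng.randrange(n)  # pick a random vertex
        total += 1.0 / truncated_bfs_size(adj, v, limit)
    return n * total / samples
```

When every component is smaller than the BFS limit and all components have the same size, each sample returns the same value and the estimate is exact; in general it is a random quantity whose error the slides bound by about $\epsilon n$.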


Improved Algorithm

CC-APPROX($\epsilon$, W)
    Repeat $O(1/\epsilon^2)$ times
        pick a random vertex v
        do first step of a BFS from v
        b := 0; t := 1
        (*) flip a coin
        if heads, and visited < W nodes so far
            t := 2*t
            continue BFS until it ends or t nodes are visited
            if BFS ends, b := $2^{\#\text{coin flips}}$ / # nodes visited
            else go to (*)
    return (average of the values b) * n

• Inner procedure takes average $O(d \log W)$ time
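A possible Python rendering of the coin-flipping procedure is sketched below. It is an illustration only, with several assumptions: the graph is an adjacency list `adj`, the BFS is written as a resumable generator (`bfs_order`, a name introduced here), "BFS ends" is read as "the generator is exhausted", and the default number of samples uses an assumed constant of 4.

```python
import math
import random
from collections import deque


def bfs_order(adj, v):
    """Yield vertices in BFS order from v; pulling the next item resumes the search."""
    seen = {v}
    queue = deque([v])
    yield v
    while queue:
        u = queue.popleft()
        for x in adj[u]:
            if x not in seen:
                seen.add(x)
                queue.append(x)
                yield x


def sample_b(adj, v, W, rng):
    """One run of the inner procedure; returns the random value b."""
    bfs = bfs_order(adj, v)
    next(bfs)  # first BFS step: visit v itself
    visited, flips, t = 1, 0, 1
    while True:
        if rng.random() < 0.5 or visited >= W:
            return 0.0  # tails, or the component is too large
        flips += 1
        t *= 2
        while visited < t:
            if next(bfs, None) is None:
                return 2 ** flips / visited  # BFS ended inside the budget
            visited += 1


def cc_approx_improved(adj, eps, W, rng=None, num_samples=None):
    """Estimate the number of components; the constant 4 below is an assumption."""
    rng = rng or random.Random()
    n = len(adj)
    if num_samples is None:
        num_samples = math.ceil(4 / eps ** 2)
    total = sum(sample_b(adj, rng.randrange(n), W, rng) for _ in range(num_samples))
    return n * total / num_samples
```

Because the budget t doubles only after a head, a run that explores many vertices is exponentially unlikely, which is where the logarithmic average cost comes from.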


Analysis

• Main idea: if v is in a component of size c < W, then b is zero with probability about 1 – 1/c and about 1 with probability about 1/c. The average of b is 1/c.

• Setting $W := 2/\epsilon$ we get

– each time, the average of b is within $\epsilon/2$ from the average over v of $n_v$ (that is, (# conn. comp.)/n)

– Repeating $O(1/\epsilon^2)$ times, the probability of deviating by another factor $\epsilon/2$ is bounded by a constant

– The average running time is $O(d \epsilon^{-2} \log W)$, that is $O(d \epsilon^{-2} \log \epsilon^{-1})$.
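A worked version of the expectation claim, added here as a sketch. It assumes v lies in a component with $s < W$ vertices (called c on the slide) and that the BFS finishes after exactly k heads in a row; s and k are notation introduced for this example.

$$
\mathbb{E}[b] = \Pr[k \text{ heads in a row}] \cdot \frac{2^k}{s} + \Pr[\text{a tail comes first}] \cdot 0 = 2^{-k} \cdot \frac{2^k}{s} = \frac{1}{s} = n_v .
$$

For a component with at least W vertices the cutoff can only force b to zero, so $0 \le \mathbb{E}[b] \le n_v \le 1/W$. With $W = 2/\epsilon$ the bias per vertex is therefore at most $\epsilon/2$.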


General Weights

• Generalize argument for weight 1 and 2.

• Let $c_i$ = # of connected components induced by edges of weight at most i

• Then the MST weight is

$n - w + \sum_{i=1}^{w-1} c_i$
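The identity can be sanity-checked on a small graph by comparing it with Kruskal's algorithm. The sketch below is an added illustration; it assumes a connected graph given as a list of `(u, v, weight)` triples with integer weights in {1, ..., w}, and all function names are introduced here.

```python
def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def count_components(n, edges, max_weight):
    """c_i: number of components using only edges of weight <= max_weight."""
    parent = list(range(n))
    comps = n
    for u, v, wt in edges:
        if wt <= max_weight:
            ru, rv = find(parent, u), find(parent, v)
            if ru != rv:
                parent[ru] = rv
                comps -= 1
    return comps


def mst_weight_kruskal(n, edges):
    """Exact MST weight, for comparison."""
    parent = list(range(n))
    total = 0
    for u, v, wt in sorted(edges, key=lambda e: e[2]):
        ru, rv = find(parent, u), find(parent, v)
        if ru != rv:
            parent[ru] = rv
            total += wt
    return total


def mst_weight_from_components(n, edges, w):
    """MST weight via n - w + sum of c_i for i = 1, ..., w-1."""
    return n - w + sum(count_components(n, edges, i) for i in range(1, w))


edges = [(0, 1, 1), (1, 2, 3), (2, 3, 1), (3, 0, 2), (0, 2, 3)]
print(mst_weight_kruskal(4, edges), mst_weight_from_components(4, edges, 3))
```

Here $c_1 = 2$ and $c_2 = 1$, so the formula gives 4 – 3 + 2 + 1 = 4, matching the tree {0-1, 2-3, 3-0} of weight 1 + 1 + 2.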


Final Algorithm

• For i = 1, . . ., w-1, call CC-APPROX($\epsilon$, $2w/\epsilon$) on the subgraph of G obtained by removing edges of cost > i

• Get $a_i$, an approximation of $c_i$

• Return $n - w + \sum_{i=1}^{w-1} a_i$

• Average answer is within $\epsilon n/2$ from cost of MST, and variance is bounded

• Total running time $O(d w \epsilon^{-2} \log (w/\epsilon))$
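Putting the pieces together, the final algorithm could look roughly like the sketch below. It is a hedged illustration rather than the authors' code: the graph is assumed to be an adjacency list of `(neighbour, weight)` pairs, the subgraph for threshold i is never built but obtained by skipping heavier edges during the BFS, and the default sample count uses an assumed constant of 4. All names are introduced here.

```python
import math
import random
from collections import deque


def bfs_order(adj, v, max_weight):
    """Yield vertices in BFS order from v, using only edges of weight <= max_weight."""
    seen = {v}
    queue = deque([v])
    yield v
    while queue:
        u = queue.popleft()
        for x, wt in adj[u]:
            if wt <= max_weight and x not in seen:
                seen.add(x)
                queue.append(x)
                yield x


def sample_b(adj, v, max_weight, W, rng):
    """One run of the coin-flipping inner procedure on the thresholded subgraph."""
    bfs = bfs_order(adj, v, max_weight)
    next(bfs)  # first BFS step: visit v itself
    visited, flips, t = 1, 0, 1
    while True:
        if rng.random() < 0.5 or visited >= W:
            return 0.0  # tails, or the component is too large
        flips += 1
        t *= 2
        while visited < t:
            if next(bfs, None) is None:
                return 2 ** flips / visited  # BFS ended inside the budget
            visited += 1


def cc_approx(adj, max_weight, W, num_samples, rng):
    """Estimate c_i for i = max_weight."""
    n = len(adj)
    total = sum(sample_b(adj, rng.randrange(n), max_weight, W, rng)
                for _ in range(num_samples))
    return n * total / num_samples


def approx_mst_weight(adj, w, eps, rng, num_samples=None):
    """Return n - w + sum of the estimates a_i for i = 1, ..., w-1."""
    n = len(adj)
    if num_samples is None:
        num_samples = math.ceil(4 / eps ** 2)  # assumed constant
    W = math.ceil(2 * w / eps)
    estimates = [cc_approx(adj, i, W, num_samples, rng) for i in range(1, w)]
    return n - w + sum(estimates)
```

The running time of this sketch does not depend on n except through the random vertex choice, since each of the w-1 calls explores only a bounded neighbourhood per sample.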


Extensions

• Low average degree

• Non-integer weights


Lower Bound


Abstract sampling problem

• Fix p, $\epsilon$

• Define two binary distributions A, B

• Pr[A=1] = p, Pr[A=0] = 1 - p

• Pr[B=1] = p + $\epsilon$p, Pr[B=0] = 1 - p - $\epsilon$p

• Distinguishing A from B with constant probability requires $\Omega(1/(p\epsilon^2))$ samples
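A back-of-the-envelope calculation suggests where the $1/(p\epsilon^2)$ comes from. This is a heuristic added for intuition, not the proof of the lower bound; m denotes the number of samples and X the number of ones among them (both names introduced here).

$$
\mathbb{E}_B[X] - \mathbb{E}_A[X] = m \epsilon p, \qquad \sqrt{\mathrm{Var}_A[X]} = \sqrt{m p (1-p)} \le \sqrt{m p} .
$$

The gap between the means exceeds the standard deviation only when $m \epsilon p \gtrsim \sqrt{m p}$, that is, when $m \gtrsim 1/(p \epsilon^2)$.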


Reduction

• Fix p = 1/w

• We consider two distributions of weights over a cycle of length n

• In distribution G, for each edge we sample from A; if A=0 the edge gets weight 1, otherwise it gets weight w

• In distribution H, same with B

• H and G are likely to have MST costs that differ by about $\epsilon n$

• To distinguish them we need to look at $\Omega(w/\epsilon^2)$ edge weights
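The gap of about $\epsilon n$ can be checked with a short expectation calculation. This is an added sketch; it ignores the single heaviest edge that the MST of a cycle leaves out, which changes the cost by at most w. With $p = 1/w$, each edge has expected weight $1 + (w-1)p$ under A and $1 + (w-1)(1+\epsilon)p$ under B, so

$$
\mathbb{E}[\mathrm{cost}(H)] - \mathbb{E}[\mathrm{cost}(G)] \approx n (w-1)\, \epsilon p = \epsilon n \left(1 - \frac{1}{w}\right) \approx \epsilon n .
$$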


Higher Degree

• Sample from G or H as before,

– also add d-1 forward edges of weight w+1 from each vertex

– randomly permute names of vertices

• Now, on average, reading t edge weights gives us t/d samples from A or B, so $t = \Omega(dw/\epsilon^2)$


Conclusions

• A plausibility result showing that approximation of a standard graph problem on bounded-degree (and sparse) graphs can be achieved in time independent of the number of vertices

• Use of approximate cost without solution?

• More problems?

– The trivial Max SAT approximation algorithms can be implemented in constant time, and give (an implicit representation of) a solution

– Non-trivial Max SAT approximation? (say, 3/4)

– Something really useful?