Approximating a single component of the solution to a linear system · 2015-05-21
TRANSCRIPT
Approximating a single component of the solution to a linear system
1
Christina E. Lee, Asuman Ozdaglar, Devavrat Shah [email protected] [email protected] [email protected]
MIT LIDS
2
How do I compare to my competitors?
Bonacich Centrality Rank Centrality Pagerank
3
Stationary distribution of Markov chain
Q: Do I need to compute the entire solution vector in order to get an estimate of my centrality and the centrality of my competitors?
Eek, too expensive!!
Solution to Linear System
Q: How do we efficiently compute the solution for a single coordinate within a large network?
4
(1) Stationary distribution of Markov chain
(2) Solution to Linear System of Equations
• G = P (i.e. a stochastic matrix), z = 0
• Sample short random walks; relies on a sharp characterization of πi
• If A is positive definite, then we can choose G, z such that the spectral radius of G is < 1 and x = Gx + z is equivalent to the original system
• Expand computation in local neighborhood, use sparsity of G to bound size of neighborhood
Would like a “local” method, i.e. cheaper than computing full vector.
Overview of Literature
5
Stationary Distribution of Markov Chain: x = Gx + z, spectral radius(G) < 1
Probabilistic: Monte Carlo (MCMC methods, Ulam von Neumann)
• Samples long random walks
• Good if G has good spectral properties
• Challenge is to control the variance of the estimator
• Naturally parallelizable
Overview of Literature
6
Iterative/Algebraic: power method; linear iterative updates (Jacobi, Gauss-Seidel)
• Iterative matrix-vector multiplication
• Good if G is sparse and has good spectral properties
• Distributed, but synchronous
• Each multiplication is O(n²)
Overview of Literature
7
Other: SVD, Gaussian elimination
• Not designed for single-component computation
• Requires global computation
What if we specifically thought about solving for a single component?
Contributions
8
• New Monte Carlo method that samples "local" truncated returning random walks
• Convergence is a function of the mixing time
• The estimate is always guaranteed to be an upper bound
• Suggests and analyzes specific termination criteria, which give multiplicative error bounds for large πi and thresholds for small πi
[L., Ozdaglar, Shah NIPS 2013]
Contributions
9
• Can be implemented asynchronously
• Maintains sparse intermediate vectors
• Provides invariants which track the progress of the method
[L., Ozdaglar, Shah arXiv:1411.2647 2014]
Contributions
10
Both methods are inspired by previous algorithms designed for computing PageRank
[JW03, H03, FRCS05, ALNO07, BCG10, BBCT12]
[ABCHMT07]
Computing Stationary Probability
• Consider a Markov chain
– Finite state space
– Transition matrix P
– Irreducible, positive recurrent
• Goal: compute the stationary probability πi, focusing on nodes with higher values
11
Node-Centric Monte Carlo
• The standard Monte Carlo method samples long random walks until they converge to stationarity (ergodic theorem)
• Our method is instead based on the return-time property πi = 1 / Ei[Ti], where Ti is the first time a walk started at i returns to i
12
Intuition
Estimate: π̂i = 1 / (empirical mean return time to i). Truncating walks at length θ can only shorten them, so the estimate is an upper bound on πi in expectation.
13
[figure: a random walk leaving node i and returning to it]
Algorithm
Gather Samples (i, N, θ):
• Sample N return paths to i, truncated at length θ
• p̂ = fraction of samples truncated
[figure: sampled walks from i: some return to i ("Returned to i!"), some keep walking, and some exceed the path-length limit and are truncated]
Q: How do we choose appropriate N, θ?
14
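The Gather Samples routine described above can be sketched in Python (an illustrative reconstruction, not the authors' code; the toy two-state chain and function names are mine). The estimate is the reciprocal of the empirical truncated return time, so truncation can only push it upward:

```python
import random

def gather_samples(P, i, N, theta, rng):
    """Sample N return walks to state i, each truncated at length theta.

    Returns (pi_hat, p_hat): the estimate 1 / mean(truncated return time)
    and the fraction of samples that were truncated.
    """
    total_length = 0
    truncated = 0
    states = list(range(len(P)))
    for _ in range(N):
        state, steps = i, 0
        while steps < theta:
            state = rng.choices(states, weights=P[state])[0]
            steps += 1
            if state == i:
                break           # returned to i within theta steps
        else:
            truncated += 1      # hit the length cap without returning
        total_length += steps
    return N / total_length, truncated / N

# Toy 2-state chain with stationary distribution (5/6, 1/6).
P = [[0.9, 0.1],
     [0.5, 0.5]]
rng = random.Random(0)
pi_hat, p_hat = gather_samples(P, i=0, N=20000, theta=20, rng=rng)
```

On this chain the true stationary probability of state 0 is 5/6 ≈ 0.833, and the estimate comes out close to it; with θ = 20 almost no walks are truncated.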
Algorithm
Gather Samples (i, N(k), θ(k))
Terminate if satisfied
Otherwise compute θ(k+1) and N(k+1):
• Double θ: θ(k+1) = 2θ(k)
• Increase N so that, by Chernoff's bound, the estimate stays tightly concentrated
Increment k and repeat
15
Convergence
Q: Can we design a "smart" termination criterion?
16
Gather Samples (i, N(k), θ(k))
Terminate if Satisfied(Δ): output the current estimate if
(a) the stationary probability is small, or
(b) the fraction of truncated samples is small
Otherwise compute θ(k+1) and N(k+1), increment k, and repeat
17
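Putting the doubling loop and the two termination conditions together gives the following schematic sketch (the thresholds `delta` and `eps` stand in for the paper's Δ-dependent constants and are assumptions here, as is the inner sampling helper):

```python
import random

def truncated_return_time(P, i, theta, rng):
    """One random walk from i, stopped on return to i or at length theta.
    Returns (steps_taken, was_truncated)."""
    states = list(range(len(P)))
    state, steps = i, 0
    while steps < theta:
        state = rng.choices(states, weights=P[state])[0]
        steps += 1
        if state == i:
            return steps, False
    return steps, True

def estimate_pi(P, i, delta=0.01, eps=0.01, rng=None):
    """Adaptive loop: double theta and N each round; stop when either
    (a) the estimate is below the small-probability threshold delta, or
    (b) the fraction of truncated samples is below eps."""
    rng = rng or random.Random(0)
    theta, N = 2, 64
    while True:
        walks = [truncated_return_time(P, i, theta, rng) for _ in range(N)]
        pi_hat = N / sum(steps for steps, _ in walks)
        frac_truncated = sum(trunc for _, trunc in walks) / N
        if pi_hat <= delta:         # (a) stationary probability is small
            return pi_hat
        if frac_truncated <= eps:   # (b) few samples were truncated
            return pi_hat
        theta, N = 2 * theta, 2 * N  # double theta; grow N (Chernoff-style)

P = [[0.9, 0.1],
     [0.5, 0.5]]   # stationary distribution (5/6, 1/6)
pi_0 = estimate_pi(P, i=0)
```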
Results for Termination Criteria
18
Results extend to countable state space Markov chains.
• Termination condition 1 guarantees that πi is small
• Termination condition 2 guarantees a multiplicative error bound
• The bounds depend on the algorithm parameters, not on the input Markov chain
19
Simulation - Pagerank
[plot: stationary probability vs. nodes sorted by stationary probability; random graph generated with the configuration model and a power-law degree distribution]
Obtain close estimates for important nodes
20
Simulation - Pagerank
[plot: stationary probability vs. nodes sorted by stationary probability; random graph generated with the configuration model and a power-law degree distribution]
Obtain close estimates for important nodes
21
Simulation - Pagerank
[plot: stationary probability and the fraction of samples not truncated, vs. nodes sorted by stationary probability]
Scaling by the fraction of samples not truncated corrects for the bias!
22
Rank Centrality, Pagerank
Compute Stationary Probability by sampling local random walks
23
Pagerank, Bonacich Centrality
Q: Can we extend this to solving a general linear system? The Ulam von Neumann algorithm:
When the spectral radius of G < 1,
• Importance sampling: sample walks and reweight → unbiased estimator
• If the spectral radius of |G| is ≥ 1, there may not exist a sampling matrix such that the variance is finite
• Does not exploit sparsity
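For contrast, here is a minimal Ulam von Neumann sketch for estimating a single component x_i (uniform-proposal walks with a fixed stopping probability; the 2x2 system and all parameter choices are illustrative assumptions, not from the talk):

```python
import random

def ulam_von_neumann(G, z, i, num_samples=20000, q=0.3, rng=None):
    """Estimate x_i for x = G x + z via importance-sampled random walks.

    Each walk stops with probability q per step; otherwise it jumps to a
    uniformly random state v and reweights by G[u][v] / ((1 - q) * (1/n)),
    which makes the accumulated sum an unbiased estimator of x_i."""
    n = len(G)
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(num_samples):
        u, w, acc = i, 1.0, 0.0
        while True:
            acc += w * z[u]          # contribution of the current prefix
            if rng.random() < q:
                break                # walk terminates
            v = rng.randrange(n)     # uniform proposal with p_uv = 1/n
            w *= G[u][v] / ((1 - q) / n)
            u = v
        total += acc
    return total / num_samples

# Small system with solution x = (10/7, 10/7).
G = [[0.0, 0.3],
     [0.2, 0.1]]
z = [1.0, 1.0]
x0_hat = ulam_von_neumann(G, z, i=0)
```

Note the contrast with the slide's bullets: the estimator is unbiased, but its variance hinges on how fast the weights w shrink, and the walk wanders globally rather than exploiting sparsity around i.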
Contributions
24
Synchronous Algorithm
• Standard stationary linear iterative method: x(t+1) = G x(t) + z
• Relies upon the Neumann series representation x = Σ_{k≥0} G^k z
• Solving for a single coordinate: x_i = Σ_{k≥0} ⟨(Gᵀ)^k e_i, z⟩
25
Sparsity pattern of (Gᵀ)^k e_i is the k-hop neighborhood of i
x(t) is as dense as z
Synchronous Algorithm
26
Sparsity of p(t) grows as a breadth-first traversal from i
Initialize: p(0) = e_i, x̂(0) = 0
Update step: x̂(t+1) = x̂(t) + ⟨p(t), z⟩; p(t+1) = Gᵀ p(t)
Terminate if: ‖p(t)‖ < ε
Output: x̂(t)
Synchronous - Guarantee
27
If ρ(G) < 1, then the algorithm terminates with
(a) an estimate satisfying the target error bound, and
(b) a total number of multiplications that is independent of the size of the matrix
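The synchronous scheme from the previous slides can be written out as follows (my reconstruction from the slides' p(t) description, with the elided formulas filled in; the toy 2x2 system is an assumption):

```python
import numpy as np

def synchronous_local(G, z, i, eps=1e-8, max_iter=10000):
    """Single-coordinate synchronous solver for x = G x + z (rho(G) < 1).

    p(t) = (G^T)^t e_i has support equal to the t-hop neighborhood of i,
    and x_hat accumulates the Neumann series x_i = sum_t <p(t), z>.
    The remaining error is <p, x>, so |x_i - x_hat| <= ||p|| * ||x||."""
    p = np.zeros(len(z))
    p[i] = 1.0
    x_hat = 0.0
    for _ in range(max_iter):
        x_hat += p @ z            # add the next Neumann-series term
        p = G.T @ p               # advance to the (t+1)-hop neighborhood
        if np.linalg.norm(p) < eps:
            break
    return x_hat

# Small system with solution x = (10/7, 10/7).
G = np.array([[0.0, 0.3],
              [0.2, 0.1]])
z = np.array([1.0, 1.0])
x0 = synchronous_local(G, z, i=0)
```

Only entries of p(t) in the t-hop neighborhood of i are ever nonzero, which is what makes the per-step cost local for sparse G.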
Synchronous to Asynchronous
Initialize: r(0) = e_i, x̂(0) = 0
Update step: x̂(t+1) = x̂(t) + ⟨r(t), z⟩; r(t+1) = Gᵀ r(t)
Terminate if: ‖r(t)‖ < ε
Output: x̂(t)
What if we implemented the updates asynchronously instead? Update a single node u of r(t): fold r_u z_u into x̂, then redistribute r_u over row u of G.
28
Q: Does this converge?
For all t, for any chosen sequence of u, the invariant x_i = x̂(t) + ⟨r(t), x⟩ is maintained.
Asynchronous Algorithm
29
Initialize: r = e_i, x̂ = 0
Update step: choose a coordinate u; x̂ ← x̂ + r_u z_u; redistribute r_u over row u of G
Terminate if: ‖r‖ < ε
Output: x̂
Yes, this converges if ρ(G) < 1 and every u is chosen infinitely often. At what rate?
Asynchronous - Guarantee
30
If ρ(G) < 1, then the algorithm terminates with
(a) an estimate satisfying the target error bound, and
(b) with probability > 1 − δ, a total number of multiplications that is independent of the size of the matrix
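Likewise, a sketch of the asynchronous variant, making the invariant x_i = x̂ + ⟨r, x⟩ explicit (again a reconstruction under stated assumptions, not the authors' implementation; the uniform coordinate choice is one admissible selection rule):

```python
import numpy as np

def asynchronous_local(G, z, i, eps=1e-8, rng=None, max_updates=100000):
    """Asynchronous single-coordinate solver for x = G x + z (rho(G) < 1).

    Invariant: x_i = x_hat + <r, x> after every update, for any order of
    coordinate choices. Each update picks a coordinate u in the support
    of r, folds r_u * z_u into x_hat, and redistributes r_u onto the
    entries of row u of G."""
    rng = rng or np.random.default_rng(0)
    r = np.zeros(len(z))
    r[i] = 1.0
    x_hat = 0.0
    for _ in range(max_updates):
        support = np.flatnonzero(np.abs(r) > 0)
        u = rng.choice(support)    # any fair selection rule converges
        push = r[u] * G[u]         # mass pushed to u's neighbors
        x_hat += r[u] * z[u]
        r[u] = 0.0
        r += push
        if np.abs(r).sum() < eps:  # residual mass nearly exhausted
            break
    return x_hat

# Same small system; true solution x = (10/7, 10/7).
G = np.array([[0.0, 0.3],
              [0.2, 0.1]])
z = np.array([1.0, 1.0])
x0 = asynchronous_local(G, z, i=0)
```

The intermediate vector r stays sparse, and the invariant gives a running certificate of progress: the residual mass bounds the remaining error.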
Simulations
Compare different choices for
Bonacich centrality in an Erdos-Renyi graph: n = 1000 and p = 0.0276
31
32
Compute Stationary Probability by sampling local random walks
• New Monte Carlo method that samples "local" truncated returning random walks
• Convergence is a function of the mixing time
• The estimate is always guaranteed to be an upper bound
• Suggests and analyzes specific termination criteria, which give multiplicative error bounds for large πi and thresholds for small πi
[L., Ozdaglar, Shah NIPS 2013]
Compute Ax = b by expanding computation within the local neighborhood and utilizing sparsity
• Can be implemented asynchronously
• Maintains sparse intermediate vectors
• Provides invariants which track the progress of the method
[L., Ozdaglar, Shah arXiv:1411.2647 2014]