spectral graph sparsification in nearly-linear time...



Page 1: Spectral Graph Sparsification in Nearly-Linear Time ...zhuofeng/MTU_VLSI_DA_files/papers/DAC2016_poster.pdf

2D mesh grid (n = 40000) with unit edge weights

400 off-tree edges are added to the “hair comb” spanning tree

$\lambda_{\max}$ is reduced from 6E4 to 1E2: 640X reduction!

Spectral Graph Sparsification in Nearly-Linear Time Leveraging Efficient Spectral Perturbation Analysis

Zhuo Feng, Department of ECE, Michigan Technological University

Goal: find a sparse subgraph (graph sparsifier) to approximate the original graph [1]

The sparsifier should have the same set of vertices but far fewer edges

Graph sparsification is key to designing nearly-linear time numerical & graph algorithms [1][2]

Applications: solving PDEs and sparse matrices, graph partitioning (data clustering), semi-supervised learning (SSL), maximum flow of undirected graphs, etc.

(1) Graph Sparsification Basics

[Figure from [2]: the original graph and the sparsified graph]

[1] D. A. Spielman. Algorithms, graph theory, and linear equations in Laplacian matrices. ICM’10

[2] I. Koutis, G. L. Miller and R. Peng. A fast solver for a class of linear systems. Commun. ACM, 2012

Graph sparsifiers preserve either graph cuts or graph Laplacian spectra

Cut sparsifiers: preserve cuts between vertices (Benczúr & Karger. STOC’96)

Spectral sparsifiers: preserve eigenvalues & eigenvectors of graph Laplacians (Spielman & Teng. SIAM J. on Comp.’11)

(3) Two Types of Graph Sparsifiers

Graphs G and P are $\sigma$-spectrally similar if their Laplacian quadratic forms satisfy, for all real vectors $x$:

$$\frac{x^T L_P x}{\sigma} \le x^T L_G x \le \sigma\, x^T L_P x$$

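Preconditioned Conjugate Gradient (PCG) is widely used for solving SDD linear systems $Gx = b$ with a preconditioner $P$

PCG needs $O(\sqrt{\kappa}\,\log(1/\epsilon))$ iterations, where $\kappa = \kappa(P^{-1}G)$ is the condition number of the preconditioned matrix

A graph sparsifier should result in fast convergence when used as a preconditioner in CG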
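The effect of a sparsifier used as a preconditioner can be tried on a small case. The following is a minimal sketch, not the poster's solver: it assumes NumPy, uses edge lists read off the poster's 9-node Laplacian matrices G and P (including the extra diagonal weight of 1 at nodes 1 and 9), and applies the preconditioner with a dense solve instead of a linear-time tree solve. The names `laplacian`, `pcg`, and `apply_Minv` are illustrative.

```python
import numpy as np

# Edge lists (u, v, weight) read off the poster's 9-node matrices G and P.
TREE = [(1, 2, 4), (2, 3, 9), (3, 6, 4), (5, 6, 6), (4, 5, 5), (4, 7, 8), (7, 8, 4), (8, 9, 2)]
OFF_TREE = [(1, 4, 3), (2, 5, 1), (5, 8, 3), (6, 9, 1)]
GROUND = {1: 1.0, 9: 1.0}  # extra diagonal weight; makes both matrices nonsingular


def laplacian(edges, n=9):
    """Sum of w * (e_u - e_v)(e_u - e_v)^T over edges, plus the GROUND diagonal."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        d = np.zeros(n)
        d[u - 1], d[v - 1] = 1.0, -1.0
        L += w * np.outer(d, d)
    for node, g in GROUND.items():
        L[node - 1, node - 1] += g
    return L


def pcg(A, b, apply_Minv, tol=1e-10, max_iter=200):
    """Textbook preconditioned CG; returns the solution and the iteration count."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = apply_Minv(r)
    p = z.copy()
    rz = r @ z
    for k in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k
        z = apply_Minv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter


G = laplacian(TREE + OFF_TREE)
P = laplacian(TREE)
b = np.ones(9)

x_plain, it_plain = pcg(G, b, lambda r: r)                    # no preconditioner
x_pre, it_pre = pcg(G, b, lambda r: np.linalg.solve(P, r))    # tree preconditioner
print("CG iterations:", it_plain, " PCG iterations:", it_pre)
```

Since the condition number of $P^{-1}G$ is much smaller than that of $G$ in this example, one would expect the preconditioned run to need fewer iterations, although the exact counts depend on the tolerance and the right-hand side.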

(1) Graph Sparsifier for Solving SDD Matrices

Matrix    1st Eig. Val.  2nd Eig. Val.  3rd Eig. Val.  4th Eig. Val.  5th Eig. Val.  6th Eig. Val.  Cond. Num.
G         26.170         23.182         17.572         11.514         9.373          6.673          135.948
P         23.442         22.271         15.877         9.095          7.0266         4.557          139.643
P^{-1}G   3.459          2.881          1.431          1.180          1.000          1.000          3.459

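[Figure: 9-node example graph G with weighted edges (nodes 1 to 9), and its spanning-tree subgraph P]

Laplacian Matrix G

$$
G = \begin{pmatrix}
 8 & -4 &  0 & -3 &  0 &  0 &  0 &  0 &  0 \\
-4 & 14 & -9 &  0 & -1 &  0 &  0 &  0 &  0 \\
 0 & -9 & 13 &  0 &  0 & -4 &  0 &  0 &  0 \\
-3 &  0 &  0 & 16 & -5 &  0 & -8 &  0 &  0 \\
 0 & -1 &  0 & -5 & 15 & -6 &  0 & -3 &  0 \\
 0 &  0 & -4 &  0 & -6 & 11 &  0 &  0 & -1 \\
 0 &  0 &  0 & -8 &  0 &  0 & 12 & -4 &  0 \\
 0 &  0 &  0 &  0 & -3 &  0 & -4 &  9 & -2 \\
 0 &  0 &  0 &  0 &  0 & -1 &  0 & -2 &  4
\end{pmatrix}
$$

Laplacian Matrix P

$$
P = \begin{pmatrix}
 5 & -4 &  0 &  0 &  0 &  0 &  0 &  0 &  0 \\
-4 & 13 & -9 &  0 &  0 &  0 &  0 &  0 &  0 \\
 0 & -9 & 13 &  0 &  0 & -4 &  0 &  0 &  0 \\
 0 &  0 &  0 & 13 & -5 &  0 & -8 &  0 &  0 \\
 0 &  0 &  0 & -5 & 11 & -6 &  0 &  0 &  0 \\
 0 &  0 & -4 &  0 & -6 & 10 &  0 &  0 &  0 \\
 0 &  0 &  0 & -8 &  0 &  0 & 12 & -4 &  0 \\
 0 &  0 &  0 &  0 &  0 &  0 & -4 &  6 & -2 \\
 0 &  0 &  0 &  0 &  0 &  0 &  0 & -2 &  3
\end{pmatrix}
$$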

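Condition number of the preconditioned matrix:

$$\kappa(P^{-1}G) = \frac{\lambda_{\max}(P^{-1}G)}{\lambda_{\min}(P^{-1}G)}$$

PCG iteration count: $O\!\left(\sqrt{\kappa(P^{-1}G)}\,\log\frac{1}{\epsilon}\right)$

Preconditioned system: $P^{-1}Gx = P^{-1}b$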
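The condition number of the preconditioned matrix can be computed from the generalized eigenvalues of the pair (G, P). The sketch below is an illustration under stated assumptions: it uses NumPy, rebuilds the poster's 9-node G and P from edge lists read off the two Laplacian matrices, and obtains the generalized eigenvalues through a Cholesky factor of P. The function names are hypothetical.

```python
import numpy as np

# Edge lists (u, v, weight) read off the poster's 9-node matrices G and P.
TREE = [(1, 2, 4), (2, 3, 9), (3, 6, 4), (5, 6, 6), (4, 5, 5), (4, 7, 8), (7, 8, 4), (8, 9, 2)]
OFF_TREE = [(1, 4, 3), (2, 5, 1), (5, 8, 3), (6, 9, 1)]
GROUND = {1: 1.0, 9: 1.0}  # extra diagonal weight at nodes 1 and 9


def laplacian(edges, n=9):
    """Sum of w * (e_u - e_v)(e_u - e_v)^T over edges, plus the GROUND diagonal."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        d = np.zeros(n)
        d[u - 1], d[v - 1] = 1.0, -1.0
        L += w * np.outer(d, d)
    for node, g in GROUND.items():
        L[node - 1, node - 1] += g
    return L


def generalized_eigenvalues(G, P):
    """Eigenvalues of G u = lambda P u, in ascending order, for SPD G and P."""
    C = np.linalg.cholesky(P)                          # P = C C^T
    M = np.linalg.solve(C, np.linalg.solve(C, G).T)    # C^{-1} G C^{-T}
    return np.linalg.eigvalsh((M + M.T) / 2)


G = laplacian(TREE + OFF_TREE)
P = laplacian(TREE)

lam = generalized_eigenvalues(G, P)
kappa = lam[-1] / lam[0]          # lambda_max / lambda_min of P^{-1} G
print("generalized eigenvalues:", np.round(lam[::-1], 3))
print("condition number of P^-1 G:", round(float(kappa), 3))
```

Because G differs from P by four off-tree edges, five of the nine generalized eigenvalues equal 1 and only four are larger; the printed values can be compared with the $P^{-1}G$ row of the eigenvalue table.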

(2) Laplacian of A Resistor Network

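$$
G = \begin{pmatrix}
 3.5 & -1.5 &  0   &  0   & -2   \\
-1.5 &  4   & -2   & -0.5 &  0   \\
 0   & -2   &  3   & -1   &  0   \\
 0   & -0.5 & -1   &  3   & -1.5 \\
-2   &  0   &  0   & -1.5 &  3.5
\end{pmatrix}
$$

Symmetric Diagonally Dominant (SDD) Laplacian Matrix G

$$
G(u,v) =
\begin{cases}
-w(u,v) & \text{if } (u,v) \in E \\
\sum_{(u,t) \in E} w(u,t) & \text{if } u = v \\
0 & \text{otherwise}
\end{cases}
$$

Quadratic Laplacian Form (Joule Heat)

$$x^T G x = \sum_{(u,v) \in E} w(u,v)\,\bigl(x(u) - x(v)\bigr)^2$$

[Figure: a weighted graph $A = (V, E, w)$ with nodes 1 to 5 and edge weights 1.5, 2, 2, 1.5, 1, 0.5]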
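The Laplacian definition and its quadratic form can be checked numerically. The sketch below is a minimal pure-Python illustration, assuming the six-edge list that can be read off the 5-node resistor network matrix in this section. The helper names (`laplacian`, `quadratic_form`, `joule_heat`) and the test vector `x` are illustrative and not from the poster.

```python
# Edges (u, v, weight) of the 5-node resistor network, read off the 5x5 matrix.
EDGES = [(1, 2, 1.5), (1, 5, 2.0), (2, 3, 2.0), (2, 4, 0.5), (3, 4, 1.0), (4, 5, 1.5)]


def laplacian(n, edges):
    """G(u,v) = -w(u,v) on edges, G(u,u) = sum of incident weights, 0 otherwise."""
    G = [[0.0] * n for _ in range(n)]
    for u, v, w in edges:
        i, j = u - 1, v - 1
        G[i][j] -= w
        G[j][i] -= w
        G[i][i] += w
        G[j][j] += w
    return G


def quadratic_form(G, x):
    """Evaluate x^T G x with explicit double summation."""
    n = len(x)
    return sum(x[i] * G[i][j] * x[j] for i in range(n) for j in range(n))


def joule_heat(edges, x):
    """Sum over edges of w(u,v) * (x(u) - x(v))^2, with x read as node voltages."""
    return sum(w * (x[u - 1] - x[v - 1]) ** 2 for u, v, w in edges)


G = laplacian(5, EDGES)
x = [1.0, 0.0, -1.0, 2.0, 0.5]   # arbitrary node voltages

for row in G:
    print(row)
print("x^T G x    =", quadratic_form(G, x))
print("Joule heat =", joule_heat(EDGES, x))
```

Both printed quantities should agree, which is the identity stated by the quadratic Laplacian form.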

As shown below, tree preconditioners have very well separated large eigenvalues [1]

It is possible to dramatically reduce the largest eigenvalues by adding some off-tree edges

[Figure: the largest eigenvalues $\lambda_1, \lambda_2, \lambda_3, \ldots, \lambda_k$]

An open question: How to efficiently achieve this goal for large graphs?

[Figure: low-stretch spanning tree (spanning tree edges) plus off-tree edges gives an ultra-sparsifier?]

I. Background

II. Technical Challenges

(2) Spanning Tree as a Graph Sparsifier

III. Our Approach

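(1) Spectral Perturbation Approach

First-order generalized eigenvalue perturbation analysis:

$$G(u_i + \delta u_i) = (\lambda_i + \delta\lambda_i)(P + \delta P)(u_i + \delta u_i)$$

$$\Rightarrow\ G\,\delta u_i = \lambda_i\,\delta P\,u_i + \delta\lambda_i\,P u_i + \lambda_i\,P\,\delta u_i$$

Here $\lambda_i + \delta\lambda_i$ are the perturbed eigenvalues, $P$ contains the tree edges, and $\delta P$ contains the off-tree edges

Perturbed eigenvectors expanded using P-orthogonal eigenvectors:

$$\delta u_i = \sum_{j=1}^{n} \zeta_{ij}\,u_j, \qquad u_i^T P u_j = \begin{cases} 1, & i = j \\ 0, & i \ne j \end{cases}$$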

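$$G\,\delta u_i = \lambda_i\,\delta P\,u_i + \delta\lambda_i\,P u_i + \lambda_i\,P\,\delta u_i$$

$$\Rightarrow\ \sum_{j=1}^{n} \zeta_{ij}\,\lambda_j\,P u_j = \lambda_i\,\delta P\,u_i + \delta\lambda_i\,P u_i + \lambda_i \sum_{j=1}^{n} \zeta_{ij}\,P u_j$$

$$\Rightarrow\ \delta\lambda_i = -\lambda_i\,\frac{u_i^T\,\delta P\,u_i}{u_i^T P u_i} = -\lambda_i\,u_i^T\,\delta P\,u_i$$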

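Eigenvalue perturbation due to edges:

$$\delta P = \sum_{s=1}^{n_e} w_{p,q}\,(e_p - e_q)(e_p - e_q)^T, \quad \text{where } e_p = [0\ \cdots\ 0\ 1\ 0\ \cdots\ 0]^T$$

$$\Rightarrow\ \delta\lambda_i = -\lambda_i \sum_{s=1}^{n_e} w_{p,q}\,u_i^T (e_p - e_q)(e_p - e_q)^T u_i$$

Here $(p, q)$ are the end nodes of the $s$-th off-tree edge, and the single 1 in $e_p$ is at position $p$

The eigenvalue perturbation is proportional to the resistor’s Joule heat!

[Figure: resistor between nodes p and q with node voltages $V_p = u_i(p)$ and $V_q = u_i(q)$]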
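The first-order formula $\delta\lambda_i = -\lambda_i\,u_i^T\,\delta P\,u_i$ can be compared against a recomputed eigenvalue. The sketch below is a numerical check under stated assumptions, not part of the poster: it uses NumPy, the poster's 9-node G and P rebuilt from edge lists, and a hypothetical small scaling factor `eps` so that the off-tree edge is added to P as a small perturbation.

```python
import numpy as np

# Edge lists (u, v, weight) read off the poster's 9-node matrices G and P.
TREE = [(1, 2, 4), (2, 3, 9), (3, 6, 4), (5, 6, 6), (4, 5, 5), (4, 7, 8), (7, 8, 4), (8, 9, 2)]
OFF_TREE = [(1, 4, 3), (2, 5, 1), (5, 8, 3), (6, 9, 1)]
GROUND = {1: 1.0, 9: 1.0}  # extra diagonal weight at nodes 1 and 9


def laplacian(edges, n=9):
    """Sum of w * (e_u - e_v)(e_u - e_v)^T over edges, plus the GROUND diagonal."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        d = np.zeros(n)
        d[u - 1], d[v - 1] = 1.0, -1.0
        L += w * np.outer(d, d)
    for node, g in GROUND.items():
        L[node - 1, node - 1] += g
    return L


def generalized_eig(G, P):
    """Solve G u = lambda P u; eigenvalues ascending, eigenvectors P-orthonormal."""
    C = np.linalg.cholesky(P)                          # P = C C^T
    M = np.linalg.solve(C, np.linalg.solve(C, G).T)    # C^{-1} G C^{-T}
    lam, Y = np.linalg.eigh((M + M.T) / 2)
    return lam, np.linalg.solve(C.T, Y)                # u_i = C^{-T} y_i


G, P = laplacian(TREE + OFF_TREE), laplacian(TREE)
lam, U = generalized_eig(G, P)
lam_max, u = lam[-1], U[:, -1]

# Joule heat of each off-tree edge under the dominant eigenvector.
heats = [w * (u[p - 1] - u[q - 1]) ** 2 for p, q, w in OFF_TREE]
p, q, w = OFF_TREE[int(np.argmax(heats))]

eps = 1e-6                                   # hypothetical small scaling of the edge
d = np.zeros(9)
d[p - 1], d[q - 1] = 1.0, -1.0
dP = eps * w * np.outer(d, d)                # delta P = eps * w (e_p - e_q)(e_p - e_q)^T

predicted = -lam_max * (u @ dP @ u)          # first-order estimate
actual = generalized_eig(G, P + dP)[0][-1] - lam_max
print("edge:", (p, q), "predicted:", predicted, "actual:", actual)
```

For a small `eps` the two numbers should agree to first order, and both are negative: adding an off-tree edge to P lowers the eigenvalue.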

(2) Fixing the Largest Eigenvalue

1. Compute the largest generalized eigenvalue and its eigenvector: $G u_i = \lambda_i P u_i$

2. Compute the Joule heat of each off-tree edge with the eigenvector: $\delta\lambda_i = -\lambda_i\,u_i^T\,\delta P\,u_i$

3. Add dissimilar off-tree edges with large Joule heat to the initial tree

4. Repeat Steps 1-3 until all large eigenvalues are fixed

Challenge: There can be too many large eigenvalues!

[Figure: sequence for fixing the largest eigenvalues $\lambda_1, \lambda_2, \lambda_3, \ldots, \lambda_k$]
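The four steps above can be sketched as a small greedy loop. This is a simplified illustration, not the poster's implementation: it assumes NumPy, the poster's 9-node example rebuilt from edge lists, a dense eigen-solver in place of a scalable one, one edge added per round, and a hypothetical stopping threshold `target` on the largest generalized eigenvalue.

```python
import numpy as np

# Edge lists (u, v, weight) read off the poster's 9-node matrices G and P.
TREE = [(1, 2, 4), (2, 3, 9), (3, 6, 4), (5, 6, 6), (4, 5, 5), (4, 7, 8), (7, 8, 4), (8, 9, 2)]
OFF_TREE = [(1, 4, 3), (2, 5, 1), (5, 8, 3), (6, 9, 1)]
GROUND = {1: 1.0, 9: 1.0}  # extra diagonal weight at nodes 1 and 9


def laplacian(edges, n=9):
    """Sum of w * (e_u - e_v)(e_u - e_v)^T over edges, plus the GROUND diagonal."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        d = np.zeros(n)
        d[u - 1], d[v - 1] = 1.0, -1.0
        L += w * np.outer(d, d)
    for node, g in GROUND.items():
        L[node - 1, node - 1] += g
    return L


def generalized_eig(G, P):
    """Solve G u = lambda P u; eigenvalues ascending, eigenvectors P-orthonormal."""
    C = np.linalg.cholesky(P)
    M = np.linalg.solve(C, np.linalg.solve(C, G).T)
    lam, Y = np.linalg.eigh((M + M.T) / 2)
    return lam, np.linalg.solve(C.T, Y)


def fix_largest_eigenvalue(tree, off_tree, target=1.5):
    """Add the off-tree edge with the largest Joule heat until lambda_max <= target."""
    G = laplacian(tree + off_tree)
    chosen, remaining, history = [], list(off_tree), []
    while True:
        lam, U = generalized_eig(G, laplacian(tree + chosen))     # Step 1
        history.append(float(lam[-1]))
        if lam[-1] <= target or not remaining:
            return chosen, history
        u = U[:, -1]
        best = max(remaining,                                     # Step 2
                   key=lambda e: e[2] * (u[e[0] - 1] - u[e[1] - 1]) ** 2)
        remaining.remove(best)                                    # Step 3
        chosen.append(best)                                       # Step 4: loop again


chosen, history = fix_largest_eigenvalue(TREE, OFF_TREE)
print("edges added:", chosen)
print("lambda_max per round:", [round(h, 3) for h in history])
```

Each added edge can only lower the generalized eigenvalues, so the recorded `history` is non-increasing; how many edges are needed depends on the chosen `target`.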

(3) Fixing A Group of Eigenvalues

1. Perform 1-step generalized power iteration with a random vector

2. Rank edges according to Joule heat (spectral criticality) levels

3. Recover a few dissimilar “spectrally critical” off-tree edges

$$x_0 = \sum_{i=1}^{n} \alpha_i u_i \ \Rightarrow\ x_1 = P^{+} G x_0 = \sum_{i=1}^{n} \alpha_i \lambda_i u_i$$

$$x_1^T (G - P)\,x_1 = \sum_{i=1}^{n} (\alpha_i \lambda_i)^2 (\lambda_i - 1)$$

(4) Nearly-Linear Time Complexity

Low-stretch spanning trees can be extracted in nearly linear time [1]

Generalized power iteration for a spanning tree can be done in linear time
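The one-step generalized power iteration and the Joule-heat ranking can be sketched as follows. This is an illustration under stated assumptions: it uses NumPy and the poster's 9-node example rebuilt from edge lists, replaces the pseudo-inverse $P^{+}$ with a dense solve (P is nonsingular here because of the extra diagonal weight at nodes 1 and 9), and uses a fixed random seed. It also checks the identity $x_1^T (G - P) x_1 = \sum_i (\alpha_i \lambda_i)^2 (\lambda_i - 1)$ numerically.

```python
import numpy as np

# Edge lists (u, v, weight) read off the poster's 9-node matrices G and P.
TREE = [(1, 2, 4), (2, 3, 9), (3, 6, 4), (5, 6, 6), (4, 5, 5), (4, 7, 8), (7, 8, 4), (8, 9, 2)]
OFF_TREE = [(1, 4, 3), (2, 5, 1), (5, 8, 3), (6, 9, 1)]
GROUND = {1: 1.0, 9: 1.0}  # extra diagonal weight at nodes 1 and 9


def laplacian(edges, n=9):
    """Sum of w * (e_u - e_v)(e_u - e_v)^T over edges, plus the GROUND diagonal."""
    L = np.zeros((n, n))
    for u, v, w in edges:
        d = np.zeros(n)
        d[u - 1], d[v - 1] = 1.0, -1.0
        L += w * np.outer(d, d)
    for node, g in GROUND.items():
        L[node - 1, node - 1] += g
    return L


def generalized_eig(G, P):
    """Solve G u = lambda P u; eigenvalues ascending, eigenvectors P-orthonormal."""
    C = np.linalg.cholesky(P)
    M = np.linalg.solve(C, np.linalg.solve(C, G).T)
    lam, Y = np.linalg.eigh((M + M.T) / 2)
    return lam, np.linalg.solve(C.T, Y)


G, P = laplacian(TREE + OFF_TREE), laplacian(TREE)

# Step 1: one generalized power iteration from a random vector.
rng = np.random.default_rng(0)
x0 = rng.standard_normal(9)
x1 = np.linalg.solve(P, G @ x0)

# Step 2: Joule heat of every off-tree edge under x1, ranked from largest to smallest.
heats = [(w * (x1[p - 1] - x1[q - 1]) ** 2, (p, q)) for p, q, w in OFF_TREE]
ranking = sorted(heats, reverse=True)

# Total off-tree Joule heat, and the same quantity from the eigen-expansion of x0.
total_heat = x1 @ (G - P) @ x1
lam, U = generalized_eig(G, P)
alpha = U.T @ P @ x0                        # coefficients of x0 in the basis u_i
expansion = float(np.sum((alpha * lam) ** 2 * (lam - 1.0)))

print("ranked off-tree edges:", [edge for _, edge in ranking])
print("total heat:", total_heat, " eigen-expansion:", expansion)
```

Step 3 would then keep the top few entries of `ranking`; how many to keep, and how to filter for dissimilar edges, is left open in this sketch.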

IV. Experiment Results

[Figure: two 3D plots over the 200 × 200 mesh (axes x and y), annotated with $\lambda_{\max}$]

(1) Spectral Graph Sparsification Results

[Figure: $\sqrt{n} \times \sqrt{n}$ 2D mesh (axes x and y) with a “hair comb” spanning tree; total stretch: $O(n\sqrt{n})$]

[Figure panel: adding off-tree edges]

(2) SDD Solver for Power Grid Analysis

                                                              Ultra-Sparsifier
CKTs     # of Nodes  # of Non-zeros  Direct Method (Cholmod)  Iter. Num.  Off. Edg.  RT (Mem)     λmax Reduction
Thupg1   5.0E6       2.1E7           75s (4.0G)               27 iter.    2%         10s (0.8G)   34,047X
Thupg2   8.9E6       3.9E7           158s (7.6G)              32 iter.    2%         21s (1.5G)   39,426X
Thupg3   11.8E6      5.1E7           250s (10.0G)             32 iter.    2%         25s (1.9G)   101,052X
Thupg4   15.2E6      6.6E7           N/A                      32 iter.    2%         36s (2.5G)   97,550X
Thupg5   19.2E6      8.5E7           N/A                      33 iter.    2%         47s (3.1G)   136,678X

(3) SDD Solver for UFL Sparse Matrices

                                                                     Ultra-Sparsifier
Test Cases     # of Nodes  # of Non-zeros  Direct Method (Cholmod)  Iter. Num.  Off. Edg.  RT (Mem)      λmax Reduction
G3_circuit     1.6E6       7.7E6           45s (2.2G)               37 iter.    8%         5.2s (315M)   45,897X
thermal2       1.2E6       8.6E6           16.0s (0.9G)             34 iter.    10%        4.4s (235M)   1,582X
ecology2       1.0E6       5.0E6           12.5s (0.7G)             47 iter.    8%         3.6s (183M)   1,728X
tmt_sym        0.7E6       5.1E6           11.0s (599M)             30 iter.    10%        2.2s (136M)   796X
parabolic_fem  0.5E6       3.7E6           6.3s (481M)              25 iter.    8%         1.2s (97M)    120X

VI. Acknowledgements

NSF CCF CAREER Grant #1350206,

NSF CCF SHF Grant #1318694

MTU Research Enhancement Fund

PhD students: Xueqian Zhao, Lengfei Han

V. Conclusion

Spanning trees are critical for building high-quality spectral graph sparsifiers

Spectrally critical off-tree edges can be efficiently identified by the proposed spectral perturbation approach

Nearly-linear time algorithms for graph sparsification and for solving SDD matrices have been developed, with good results on power grid and UFL sparse matrix test cases