
Post on 15-Jan-2016

Page 1:

On the Eigenvalue Power Law

Milena Mihail, Georgia Tech

&

Christos Papadimitriou, U.C. Berkeley

Page 2:

Internet Measurement and Models

Network and application studies need properties and models of Internet graphs & Internet traffic.

Shift of networking paradigm: open, decentralized, dynamic.

Intense measurement efforts. Intense modeling efforts.

[Figure: Routers, WWW, P2P]

Page 3:

Internet & WWW Graphs

[Figure: web pages (http://www.XXX.net, http://www.YYY.com, http://www.ZZZ.edu, http://www.XXX.com, http://www.etc) joined by hyperlinks]

Routers exchanging traffic. Web pages and hyperlinks.

10K – 300K nodes. Average degree ~ 3.

Page 4:

Real Internet Graphs

CAIDA http://www.caida.org

Average Degree = Constant

A Few Degrees VERY LARGE

Degrees not sharply concentrated around their mean.

Page 5:

Degree-Frequency Power Law

[Figure: log-log plot of frequency against degree]

WWW measurement: Kumar et al 99

Internet measurement: Faloutsos et al 99

E[d] = const., but no sharp concentration

Page 6:

Degree-Frequency Power Law

[Figure: log-log plot of frequency against degree, shown against the Erdos-Renyi degree distribution]

E[d] = const., but no sharp concentration

Erdos-Renyi: sharp concentration

Models by Kumar et al 00, Bollobas et al 01, Fabrikant et al 02
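To make the Erdos-Renyi contrast concrete, the following is a minimal sketch that samples one G(n, p) graph with average degree about 3 and looks at how far the largest degree strays from the mean. The values of `n`, `avg_degree`, and the seed are illustrative assumptions, not figures from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed, chosen arbitrarily
n, avg_degree = 2000, 3.0                # hypothetical sizes for illustration

# Each of the n*(n-1)/2 possible edges appears independently
# with probability p = avg_degree / (n - 1).
upper = np.triu(rng.random((n, n)) < avg_degree / (n - 1), k=1)
adj = upper | upper.T                    # symmetric, no self-loops
degrees = adj.sum(axis=1)

print("mean degree:", degrees.mean())
print("max degree: ", degrees.max())
```

In such a sample the maximum degree stays within a small multiple of the mean, whereas the measured Internet graphs have a few degrees that are very large.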

Page 7:

Rank-Degree Power Law

[Figure: log-log plot of degree against rank; labeled points: UUNET, Sprint, C&W USA, AT&T, BBN]

Internet measurement: Faloutsos et al 99
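A rank-degree power law appears as a straight line on a log-log plot, so its exponent can be estimated by a least-squares fit of log(degree) against log(rank). The sketch below assumes synthetic degrees of the form c * rank^(-0.8). The helper name `fit_rank_exponent`, the constant 1000, and the exponent 0.8 are all hypothetical, not measured values.

```python
import numpy as np

def fit_rank_exponent(degrees):
    """Least-squares slope and intercept of log(degree) against log(rank)."""
    d = np.sort(np.asarray(degrees, dtype=float))[::-1]   # largest degree first
    ranks = np.arange(1, len(d) + 1, dtype=float)
    slope, intercept = np.polyfit(np.log(ranks), np.log(d), 1)
    return slope, intercept

# Synthetic, exactly Zipf-like degrees (illustrative only).
synthetic = 1000.0 * np.arange(1, 51, dtype=float) ** -0.8
slope, intercept = fit_rank_exponent(synthetic)
print("fitted slope:", slope)
```

On real measurements the fit would only be approximate, and typically only the top-ranked degrees follow the line.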

Page 8:

Eigenvalue Power Law

[Figure: log-log plot of eigenvalue against rank]

Internet measurement: Faloutsos et al 99

Page 9:

This Paper: Large Degrees & Eigenvalues

[Figure: eigenvalues against rank and degrees against rank, shown together; labeled points: UUNET, Sprint, C&W USA, AT&T, BBN; ranks 2, 3, 4 marked]

Page 10:

This Paper: Large Degrees & Eigenvalues

Page 11:

Principal Eigenvector of a Star

[Figure: star graph; each leaf labeled 1, center labeled with an expression in d]
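A star with d leaves has principal eigenvalue sqrt(d), with an eigenvector that puts weight proportional to sqrt(d) on the center and 1 on each leaf. The following is a small numerical check of that fact. The helper names and the choice d = 16 are illustrative assumptions.

```python
import numpy as np

def star_adjacency(d: int) -> np.ndarray:
    """Adjacency matrix of a star: vertex 0 is the center, 1..d are leaves."""
    a = np.zeros((d + 1, d + 1))
    a[0, 1:] = 1.0
    a[1:, 0] = 1.0
    return a

def principal_eigenpair(a: np.ndarray):
    """Largest eigenvalue and its unit eigenvector of a symmetric matrix."""
    vals, vecs = np.linalg.eigh(a)       # eigenvalues in ascending order
    v = vecs[:, -1]
    if v[0] < 0:                         # fix the sign so the center is positive
        v = -v
    return vals[-1], v

d = 16                                   # hypothetical degree
lam, v = principal_eigenpair(star_adjacency(d))
print("principal eigenvalue:", lam, "sqrt(d):", np.sqrt(d))
print("center / leaf weight:", v[0] / v[1])
```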

Page 12:

Large Degrees

[Figure: the large degrees, labeled 2, 3, 4]

Page 13:

Large Eigenvalues

[Figure: the large eigenvalues, labeled 2, 3, 4]

Page 14:

Main Result of the Paper

The largest eigenvalues of the adjacency matrix of a graph whose large degrees are power-law distributed (Zipf) are also power-law distributed.

Explains Internet measurements.

Negative implications for the spectral filtering method in information retrieval.
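One way to read the main result, sketched under the assumption (suggested by the star example) that the i-th largest eigenvalue tracks the square root of the i-th largest degree. The symbols d_i, \lambda_i, c, and \alpha are notation introduced here for illustration.

```latex
d_i \approx c \, i^{-\alpha}
\quad\Longrightarrow\quad
\lambda_i \approx \sqrt{d_i} \approx \sqrt{c} \; i^{-\alpha/2}
```

Under that reading, a Zipf exponent \alpha for the large degrees corresponds to an exponent \alpha/2 for the large eigenvalues.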

Page 15:

Random Graph Model

let

Connectivity analyzed by Chung & Lu ‘01
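The model's formula did not survive in this transcript. As a hedged sketch, the code below assumes it is the expected-degree random graph associated with Chung & Lu, in which vertices i and j are joined independently with probability min(1, d_i d_j / sum_k d_k). The function name `sample_expected_degree_graph` and the degree sequence are hypothetical.

```python
import numpy as np

def sample_expected_degree_graph(d, rng):
    """Sample a symmetric 0/1 adjacency matrix with
    P(edge i~j) = min(1, d_i * d_j / sum(d)) and no self-loops (assumed model)."""
    d = np.asarray(d, dtype=float)
    p = np.minimum(1.0, np.outer(d, d) / d.sum())
    upper = np.triu(rng.random(p.shape) < p, k=1)   # decide each pair once
    return (upper | upper.T).astype(float)

rng = np.random.default_rng(0)            # fixed seed, chosen arbitrarily
n = 500
d = np.ones(n)
d[:3] = [100.0, 60.0, 30.0]               # a few hypothetical large degrees

a = sample_expected_degree_graph(d, rng)
top_degrees = np.sort(a.sum(axis=1))[::-1][:3]
top_eigs = np.sort(np.linalg.eigvalsh(a))[::-1][:3]
print("sqrt of top degrees:", np.sqrt(top_degrees))
print("top eigenvalues:    ", top_eigs)
```

Comparing the two printed rows gives an informal feel for how closely the largest eigenvalues follow the square roots of the largest degrees in one sample.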

Page 16:

Random Graph Model

Page 17:

Random Graph Model

Page 18:

Theorem :

For large enough

With probability at least

Page 19:

Proof : Step 1. Decomposition

[Figure: the edges split into LL, RR, and LR, with LR made up of Vertex Disjoint Stars and LR-extra]

Page 20:

Proof: Step 2: Vertex Disjoint Stars

Degree of each Vertex Disjoint Star Sharply Concentrated around its Mean d_i.

Hence Principal Eigenvalue Sharply Concentrated around sqrt(d_i).

Page 21:

Proof: Step 3: LL, RR, LR-extra

LR-extra has max degree

LL has

edges

RR has max degree

Page 22:

Proof: Step 3: LL, RR, LR-extra

LR-extra has max degree

RR has max degree

LL has

edges

Page 23:

Proof: Step 4: Matrix Perturbation Theory

Vertex Disjoint Stars have principal eigenvalues

All other parts have max eigenvalue

QED
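The perturbation step can be illustrated with Weyl's inequality: for symmetric A and E, each sorted eigenvalue of A + E differs from the corresponding eigenvalue of A by at most the spectral norm of E. The sketch below is a stand-in, not the proof's actual bounds. It takes A to be a union of vertex-disjoint stars and E to be a sparse random symmetric 0/1 matrix. The degrees 64, 36, 16, the noise density 0.01, and the helper names are hypothetical.

```python
import numpy as np

def disjoint_stars(degrees):
    """Adjacency matrix of vertex-disjoint stars; star i has degrees[i] leaves."""
    k = len(degrees)
    n = k + sum(degrees)
    a = np.zeros((n, n))
    start = k                              # leaves are numbered after the centers
    for center, d in enumerate(degrees):
        a[center, start:start + d] = 1.0
        a[start:start + d, center] = 1.0
        start += d
    return a

def sorted_eigs(m):
    """Eigenvalues of a symmetric matrix, largest first."""
    return np.sort(np.linalg.eigvalsh(m))[::-1]

rng = np.random.default_rng(1)             # fixed seed, chosen arbitrarily
stars = disjoint_stars([64, 36, 16])
n = stars.shape[0]

# Sparse symmetric perturbation; it may overlap star edges, which is fine
# for Weyl's inequality since only symmetry is required.
noise = np.triu(rng.random((n, n)) < 0.01, k=1).astype(float)
noise = noise + noise.T

shift = np.abs(sorted_eigs(stars + noise) - sorted_eigs(stars)).max()
bound = np.linalg.norm(noise, 2)           # spectral norm of the perturbation
print("largest eigenvalue shift:", shift, "<= ||E||:", bound)
```

When the perturbation's norm is small compared with sqrt(d_i), the large eigenvalues of the whole graph stay close to those of the stars.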

Page 24:

Implication for Info Retrieval

Term-Norm Distribution Problem : Spectral filtering, without preprocessing, reveals only the large degrees.

Page 25:

Implication for Info Retrieval

Term-Norm Distribution Problem : Spectral filtering, without preprocessing, reveals only the large degrees.

Local information.

No “latent semantics”.

Page 26:

Implication for Information Retrieval

Term-Norm Distribution Problem :

Application-specific preprocessing (normalization of degrees) reveals clusters:

WWW: related to searching, Kleinberg 97

IR, collaborative filtering, …

Internet: related to congestion, Gkantsidis et al 02

Open : Formalize “preprocessing”.
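The slides leave "preprocessing" unformalized. As one possible reading, the sketch below uses the symmetric normalization D^(-1/2) A D^(-1/2), an assumption rather than the authors' definition, and shows its effect on vertex-disjoint stars: the raw spectrum simply lists sqrt(d_i), while the normalized spectrum no longer depends on the degrees. The helper names and degrees are hypothetical.

```python
import numpy as np

def disjoint_stars(degrees):
    """Adjacency matrix of vertex-disjoint stars; star i has degrees[i] leaves."""
    k = len(degrees)
    n = k + sum(degrees)
    a = np.zeros((n, n))
    start = k                              # leaves are numbered after the centers
    for center, d in enumerate(degrees):
        a[center, start:start + d] = 1.0
        a[start:start + d, center] = 1.0
        start += d
    return a

def normalized_adjacency(a):
    """D^(-1/2) A D^(-1/2); assumes every vertex has degree at least 1."""
    scale = 1.0 / np.sqrt(a.sum(axis=1))
    return a * scale[:, None] * scale[None, :]

a = disjoint_stars([64, 36, 16])
raw_top = np.sort(np.linalg.eigvalsh(a))[::-1][:3]
norm_top = np.sort(np.linalg.eigvalsh(normalized_adjacency(a)))[::-1][:3]
print("raw top eigenvalues:       ", raw_top)
print("normalized top eigenvalues:", norm_top)
```

After normalization every connected component contributes a top eigenvalue of 1, so the leading part of the spectrum reflects the graph's component and cluster structure rather than its largest degrees.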