Model Reduction for Edge-Weighted Personalized PageRank (bindel/present/2016-04-purdue.pdf)


Page 1: Model Reduction for Edge-Weighted Personalized PageRank

D. Bindel

11 Apr 2016

Page 2: The Computational Science & Engineering Picture

Application: MEMS, smart grids, networks

Analysis: linear algebra, approximation theory, symmetry + structure

Computation: HPC / cloud, simulators, solvers

Page 3: Collaborators

Wenlei Xie (LinkedIn), David Bindel (Cornell), Johannes Gehrke (Microsoft), Al Demers (Cornell)

Page 4: PageRank Problem

(Figure: unweighted, node-weighted, and edge-weighted networks)

Goal: Find “important” vertices in a network

Basic approach uses only topology

Weights incorporate prior info about important nodes/edges

Page 5: PageRank Model

(Figure: unweighted, node-weighted, and edge-weighted networks)

Random surfer model: x^(t+1) = α P x^(t) + (1 − α) v, where P = A D^(−1)

Stationary distribution: M x = b, where M = I − α P and b = (1 − α) v

Page 6: Edge Weight vs Node Weight Personalization

(Figure: node-weight personalization v_i = v_i(w) vs. edge-weight personalization, both driven by parameters w ∈ R^d)

Introduce personalization parameters w ∈ R^d in two ways:

Node weights: M x(w) = b(w)

Edge weights: M(w) x(w) = b

Page 7: Edge Weight vs Node Weight Personalization

Node weight personalization is well-studied

Topic-sensitive PageRank: fast methods based on linearity

Localized PageRank: fast methods based on sparsity

Some work on edge weight personalization

ObjectRank/ScaleRank: personalize weights for different edge types

But lots of work incorporates edge weights without personalization

Our goal: General, fast methods for edge weight personalization

Page 8: Edge Weight Parameterizations

Different ways to personalize ⇒ different algorithm options (a short code sketch of the first two follows this list):

1. Linear: take an edge of type i with probability α w_i:

   P(w) = Σ_{i=1}^{d} w_i P^(i)

2. Scaled linear: take an edge with probability ∝ (linear) edge weight:

   P(w) = A(w) D(w)^(−1),   A(w) = Σ_{i=1}^{d} w_i A^(i),   D(w) = Σ_{i=1}^{d} w_i D^(i)

3. Fully nonlinear: both A and P depend nonlinearly on w
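To make the first two parameterizations concrete, here is a minimal Python sketch (hypothetical code, not from the paper; it assumes the per-edge-type matrices are available as scipy.sparse matrices, with each P^(i) column-stochastic in the linear case):

import numpy as np
import scipy.sparse as sp

def linear_P(w, P_list):
    """Linear: P(w) = sum_i w_i P^(i); assumes each P^(i) is column-stochastic."""
    return sum(wi * Pi for wi, Pi in zip(w, P_list))

def scaled_linear_P(w, A_list):
    """Scaled linear: P(w) = A(w) D(w)^{-1} with A(w) = sum_i w_i A^(i)."""
    A = sum(wi * Ai for wi, Ai in zip(w, A_list))
    colsum = np.asarray(A.sum(axis=0)).ravel()
    # Simplistic guard for dangling nodes (zero column sums); a real implementation
    # would handle these however the chosen PageRank variant prescribes.
    dinv = sp.diags(1.0 / np.where(colsum > 0, colsum, 1.0))
    return A @ dinv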

Page 9: Model Reduction

(Schematic: expensive full model (M x = b) ≈ reduced basis U times reduced model (M̃ y = b̃); the approximation ansatz is x ≈ U y)

Model reduction procedure from physical simulation world:

Offline: Construct reduced basis U ∈ R^(n×k)

Offline: Choose ≥ k equations to pick approximation x̂ = U y

Online: Solve for y(w) given w and reconstruct x̂

Page 10: Reduced Basis Construction: SVD (aka POD/PCA/KL)

(Schematic: snapshot matrix [x_1 x_2 ... x_r], built from solutions at sample points w_1, w_2, ..., w_r, is factored as ≈ U Σ V^T)
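As an illustration of this construction, a minimal Python sketch (hypothetical code; a plain dense SVD is used for clarity, though for graphs of this size an economy or randomized SVD would be the practical choice):

import numpy as np

def reduced_basis(X, k):
    """X: n-by-r snapshot matrix with columns x(w_1), ..., x(w_r).
    Return the first k left singular vectors as the reduced basis U."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k]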

Page 11: Choosing Good Spaces

What is the best possible approximation x̂ = Uy?

min_y ‖U y − x(w)‖_2 ≤ σ_{k+1} ‖x‖_2 + e_interp(w)

where

e_interp(w) = ‖ x(w) − Σ_{j=1}^{r} x(w_j) c_j(w) ‖_2

is the error in an interpolant.

Pay attention where x has large derivatives!

Also suggests sampling strategies (sparse grids, adaptive methods)

Page 12: Approximation Ansatz

Want r = M U y − b ≈ 0. Consider two approximation conditions:

Method               Ansatz        Properties
Bubnov-Galerkin      U^T r = 0     Good accuracy empirically; fast for P(w) linear
DEIM (collocation)   min ‖r_I‖     Fast even for nonlinear P(w); complex cost/accuracy tradeoff

Petrov-Galerkin is a bit more accurate than Bubnov-Galerkin – future work.

Page 13: Bubnov-Galerkin Method

U^T (M U y − b) = 0

Linear case: w_i = probability of transition with edge type i

M(w) = I − α Σ_i w_i P^(i),   M̃(w) = I − α Σ_i w_i P̃^(i)

where we can precompute P̃^(i) = U^T P^(i) U

Nonlinear: Cost to form M̃(w) comparable to cost of PageRank!
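A minimal Python sketch of this Bubnov-Galerkin pipeline for the linear parameterization (hypothetical names; it assumes U has orthonormal columns, so U^T M(w) U = I − α Σ_i w_i P̃^(i)):

import numpy as np

def precompute_reduced(U, P_list):
    """Offline: P~^(i) = U^T P^(i) U for each edge type."""
    return [U.T @ (Pi @ U) for Pi in P_list]

def reduced_pagerank(w, Pt_list, U, b, alpha=0.85):
    """Online: solve (I - alpha * sum_i w_i P~^(i)) y = U^T b, return x_hat = U y."""
    k = U.shape[1]
    Mt = np.eye(k) - alpha * sum(wi * Pti for wi, Pti in zip(w, Pt_list))
    y = np.linalg.solve(Mt, U.T @ b)
    return U @ y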

Page 14: Discrete Empirical Interpolation Method (DEIM)

(Schematic: enforce (M U y − b)_I = 0, i.e., only the equations in the index set I)

Ansatz: minimize ‖r_I‖ for chosen indices I

Only need a few rows of M (and the associated rows of U)

If given A(w), also need column sums for normalization.

Difference from physics applications: high-degree nodes!
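A minimal sketch of the collocation solve itself (hypothetical code): given the selected rows of M(w) U and of b, the online stage is a small least-squares problem.

import numpy as np

def deim_solve(MU_I, b_I, U):
    """MU_I: |I|-by-k matrix holding rows I of M(w) U; b_I: entries of b at I.
    Minimize ||r_I|| = ||MU_I y - b_I|| and reconstruct x_hat = U y."""
    y, *_ = np.linalg.lstsq(MU_I, b_I, rcond=None)
    return U @ y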

Page 15: Error Behavior

Similar error analysis framework for both Galerkin and DEIM

Consistency + Stability = Accuracy

Consistency: Does the subspace contain good approximants?

Stability: Is the approximation subproblem far from singular?

Characterize stability by a quasi-optimality condition

‖x − U y‖ ≤ C min_z ‖x − U z‖

Page 16: Standard Quasi-Optimality Approach

Define a solution projector:

Πx = approximate solution when the true solution is x

Note that ΠU = U.

The error projector I − Π maps a true solution to its error:

e = x − Πx = (I − Π)x

Note that (I − Π)U = 0.

If e_min = x − Uz is the smallest-norm error in the space, then

e = (I − Π)x − (I − Π)Uz = (I − Π)e_min

Therefore, a bound ‖I − Π‖ ≤ 1 + ‖Π‖ establishes quasi-optimality.

Page 17: Quasi-Optimality: Galerkin and DEIM

Galerkin: Π = U M̃^(−1) W^T M,   where M̃ ≡ W^T M U

DEIM: Π = U M̃^† M_{I,:},   where M̃ ≡ M_{I,:} U

Key to stability: M̃ far from singular

Suggests pivoting schemes for “good” I in DEIM

Also helps to explicitly enforce Σ_i x̂_i = 1

Can bound ‖Π‖ offline for Galerkin + linear parameterization.

Page 18: Interpolation Costs

Consider subgraph relevant to one interpolation equation:

(Figure: a node i ∈ I together with its incoming neighbors and their edge transition probabilities, e.g. 1/3 and 1/50)

Really care about weights of edges incident on I

Need more edges to normalize (unless A(w) linear)

Cost to include i ∈ I: |{(j, k) : a_ij ≠ 0 and a_kj ≠ 0}|

High in/out degree are expensive but informative

Page 19: Interpolation Cost and Accuracy

Key question: how to choose I to balance cost vs accuracy?

Want to pick I once, so look at rows of

Z = [ M(w_1) U   M(w_2) U   ... ]

for the sample parameters w_j.

Pivoted QR-like greedy row selection with proxy measures for

Cost: Nonzeros in row (+ assoc columns if normalization required)

Accuracy: Residual when projecting row onto those previously selected

Several heuristics for the cost/accuracy tradeoff (see paper)
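One possible greedy selection in this spirit (a heuristic sketch under assumed proxies, not the paper's exact scheme): score each row of Z by the norm of what remains after projecting out previously chosen rows, minus a penalty proportional to its cost.

import numpy as np

def select_rows(Z, row_cost, m, gamma=1.0):
    """Z: stacked rows of [M(w_1)U  M(w_2)U  ...]; row_cost: per-row cost proxy.
    Greedily pick m row indices, pivoted-QR style with a cost penalty."""
    R = np.array(Z, dtype=float)
    cost = gamma * np.asarray(row_cost, dtype=float)
    chosen = []
    for _ in range(m):
        score = np.linalg.norm(R, axis=1) - cost
        if chosen:
            score[chosen] = -np.inf          # never re-pick a row
        i = int(np.argmax(score))
        chosen.append(i)
        q = R[i] / (np.linalg.norm(R[i]) + 1e-30)
        R = R - np.outer(R @ q, q)           # deflate the chosen direction (pivoted-QR flavor)
    return chosen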

Page 20: Online Costs

If ℓ = # PR components needed, the online costs are:

Form M̃        O(d k²) for B-G (more complex for DEIM)
Factor M̃      O(k³)
Solve for y    O(k²)
Form U y       O(k ℓ)

Online costs do not depend on graph size! (unless you want the whole PR vector)

Page 21: Example Networks

DBLP (citation network)
  3.5M nodes / 18.5M edges
  Seven edge types ⇒ seven parameters
  P(w) linear
  Competition: ScaleRank

Weibo (micro-blogging)
  1.9M nodes / 50.7M edges
  Weight edges by topical similarity of posts
  Number of parameters = number of topics (5, 10, 20)

(Studied global and local PageRank – see the paper for the latter.)

Page 22: Singular Value Decay

(Plot: ith largest singular value vs. i, up to i = 200, for DBLP-L, Weibo-S5, Weibo-S10, and Weibo-S20; r = 1000 samples, k = 100)

Page 23: DBLP Accuracy

(Plot: Kendall@100 and normalized L1 error, on a log scale, for Galerkin, DEIM-100, DEIM-120, DEIM-200, and ScaleRank)

Page 24: DBLP Running Times (All Nodes)

(Plot: running time in seconds, split into coefficient and construction phases, for Galerkin, DEIM-100, DEIM-120, DEIM-200, and ScaleRank)

Page 25: Weibo Accuracy

(Plot: Kendall@100 and normalized L1 error vs. number of parameters (5, 10, 20) for Weibo)

Page 26: Weibo Running Times (All Nodes)

(Plot: running time in seconds vs. number of parameters (5, 10, 20) for Weibo, split into coefficient and construction phases)

Page 27: Application: Learning to Rank

Goal: Given T = {(i_q, j_q)}_{q=1}^{|T|}, find w that mostly ranks i_q over j_q.

(cf. Backstrom and Leskovec, WSDM 2011)

Standard idea: Gradient descent

∂x/∂w_j = M(w)^(−1) α (∂P(w)/∂w_j) x(w)

Dominant cost: d + 1 solves with the PageRank system M(w)

One PageRank solve to evaluate a loss function

One PageRank solve per parameter to evaluate gradients (see the sketch below)
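A minimal sketch of this full-model step for the linear parameterization (hypothetical code; spsolve stands in for whichever PageRank solver is actually used):

import scipy.sparse as sp
import scipy.sparse.linalg as spla

def full_gradient_pieces(w, P_list, v, alpha=0.85):
    """One solve for x(w), then one solve per parameter for dx/dw_j."""
    n = P_list[0].shape[0]
    P = sum(wi * Pi for wi, Pi in zip(w, P_list))
    M = (sp.eye(n) - alpha * P).tocsc()
    x = spla.spsolve(M, (1 - alpha) * v)                       # objective: 1 solve
    dx = [spla.spsolve(M, alpha * (Pi @ x)) for Pi in P_list]  # gradient: d solves
    return x, dx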

Page 28: Application: Learning to Rank

Goal: Given T = {(i_q, j_q)}_{q=1}^{|T|}, find w that mostly ranks i_q over j_q.

(cf. Backstrom and Leskovec, WSDM 2011)

Standard: Gradient descent on full problem

One PR computation for objective

One PR computation for each gradient component

Costs d + 1 PR computations per step

With model reduction

Rephrase objective in reduced coordinate space

Use factorization to solve PR for objective

Re-use the same factorization for the gradient (a sketch follows below)
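A minimal sketch of the reduced counterpart (hypothetical code; assumes P̃^(i) = U^T P^(i) U and U^T b were precomputed offline): one small k-by-k LU factorization is reused for the objective and for all d gradient directions.

import numpy as np
import scipy.linalg as sla

def reduced_gradient_pieces(w, Pt_list, Ub, alpha=0.85):
    """Pt_list: precomputed P~^(i); Ub = U^T b. Return y(w) and dy/dw_j for all j."""
    k = Ub.shape[0]
    Mt = np.eye(k) - alpha * sum(wi * Pti for wi, Pti in zip(w, Pt_list))
    lu, piv = sla.lu_factor(Mt)                          # factor once
    y = sla.lu_solve((lu, piv), Ub)                      # objective
    dy = [sla.lu_solve((lu, piv), alpha * (Pti @ y)) for Pti in Pt_list]  # gradient
    return y, dy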

Page 29: DBLP Learning Task

(Plot: objective function value vs. iteration, 0 to 20, for Standard, Galerkin, and DEIM-200; 8 papers for training + 7 params)

Page 30: The Punchline

Test case: DBLP, 3.5M nodes, 18.5M edges, 7 params

Cost per Iteration:

Method       Standard   Bubnov-Galerkin   DEIM-200
Time (sec)   159.3      0.002             0.033

Page 31: Roads Not Taken

In the paper (but not the talk)

Selecting interpolation equations for DEIM

Localized PageRank experiments (Weibo and DBLP)

Comparison to BCA for localized PageRank

Room for future work! Analysis, applications, systems, ...

Page 32: Questions?

Edge-Weighted Personalized PageRank: Breaking a Decade-Old Performance Barrier

Wenlei Xie, David Bindel, Johannes Gehrke, and Al Demers

KDD 2015, paper 117

Sponsors:

NSF (IIS-0911036 and IIS-1012593)

iAd Project from the National Research Council of Norway

Page 33: Welcome to SIAM ALA in Hong Kong!

Hong Kong Baptist University, 4-8 May 2018
