Learning to Rank Typed Graph Walks: Local and Global Approaches

Einat Minkov and William W. Cohen

Language Technologies Institute and Machine Learning Department, School of Computer Science, Carnegie Mellon University

Did I forget to invite anyone for this meeting?

What is Jason’s personal email address?

Who is “Mike” who is mentioned in this email?

[Figure: example email graph. Node labels shown: proposal, CMU, CALO, graph, William, einat, “Jason”, Jason Ernst, Msg 2, Msg5, Msg18, 6/17/07, 6/18/07 and several email addresses. Edge labels shown: Sent To, Sent From Email, Sent To Email, Has Subject Term, Has Term (inverse), Similar To. Example query: “What are Jason’s email aliases?”]

Search via lazy random graph walks

An extended similarity measure via graph walks:


Propagate “similarity” from start nodes through edges in the graph – accumulating evidence of similarity over multiple connecting paths.


Fixed probability of halting the walk at every step – i.e., shorter connecting paths have greater importance (exponential decay)




Finite graph walk, applied through sparse matrix multiplication (estimated via sampling for large graphs)

The result is a list of nodes, sorted by “similarity” to an input node distribution (final node probabilities).


The graph

Graph nodes are typed.

Graph edges are directed and typed (adhering to the graph schema).

Multiple relations may hold between two given nodes.

Every edge type is assigned a fixed weight.

Graph walks

A graph walk is controlled by the edge weights Θ, the walk length K and the stay probability γ.

The probability of reaching y from x in one step: the sum of edge weights from x to y, out of the total outgoing weight from x.

The transition matrix assumes a stay probability at the current node at every time step.
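The two statements above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the dictionary-based sparse representation, and the choice to keep probability mass at nodes without outgoing edges are all assumptions. The weights 2, 3 and 3 mirror the slide's example of one node with three outgoing edges.

```python
from collections import defaultdict

def transition_probs(edges, theta):
    """edges: list of (source, edge_type, target); theta: edge_type -> weight.
    Returns source -> {target: probability of moving there in one step}."""
    out = defaultdict(lambda: defaultdict(float))
    for x, edge_type, y in edges:
        # Several relations may hold between x and y, so weights are summed.
        out[x][y] += theta[edge_type]
    probs = {}
    for x, targets in out.items():
        total = sum(targets.values())
        probs[x] = {y: w / total for y, w in targets.items()}
    return probs

def lazy_walk(probs, start, k, gamma):
    """start: node -> initial probability. Returns the distribution after k steps,
    staying at the current node with probability gamma at every step."""
    dist = dict(start)
    for _ in range(k):
        nxt = defaultdict(float)
        for x, p in dist.items():
            targets = probs.get(x)
            if not targets:
                nxt[x] += p  # assumption: a node with no outgoing edges keeps its mass
                continue
            nxt[x] += gamma * p
            for y, q in targets.items():
                nxt[y] += (1 - gamma) * p * q
        dist = dict(nxt)
    return dist

# Hypothetical toy graph: x has three outgoing edges with weights 2, 3 and 3.
edges = [("x", "blue", "y"), ("x", "red", "u"), ("x", "green", "v")]
theta = {"blue": 2.0, "red": 3.0, "green": 3.0}
probs = transition_probs(edges, theta)
print(probs["x"]["y"])  # 2 / (2 + 3 + 3) = 0.25
dist = lazy_walk(probs, {"x": 1.0}, k=2, gamma=0.5)
print(sorted(dist.items(), key=lambda item: -item[1]))
```

Iterating over dictionaries plays the role of the sparse matrix multiplication: only nodes that currently hold probability mass are expanded at each step.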

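A query language:

Q: { V_q, τ_out }

where V_q is the initial distribution over the query (start) nodes and τ_out is the desired output node type.

Returns a list of nodes (of type τ_out) ranked by the graph walk probabilities.

[Figure: legend of the graph showing nodes, node types, edge labels and edge weights. Example: node x has outgoing edges with weights 2, 3 and 3; the probability of following the blue edge (weight 2) out of x is 2 / (2 + 3 + 3) = 0.25.]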
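As a sketch of how such a query could be answered once the walk has finished, the snippet below filters the final node probabilities by the requested output type and sorts them. The function name `answer_query`, the node identifiers and the probabilities are hypothetical and only serve the illustration.

```python
def answer_query(final_probs, node_type, output_type):
    """Keep nodes of the requested type and rank them by walk probability.
    final_probs: node -> probability after the walk; node_type: node -> type."""
    candidates = [(node, p) for node, p in final_probs.items()
                  if node_type.get(node) == output_type]
    return sorted(candidates, key=lambda item: -item[1])

# Hypothetical walk result for a query starting at the term "Jason".
final_probs = {"msg5": 0.30, "addr_1": 0.20, "addr_2": 0.35, "Jason Ernst": 0.15}
node_type = {"msg5": "email-file", "addr_1": "email-address",
             "addr_2": "email-address", "Jason Ernst": "person"}
ranking = answer_query(final_probs, node_type, "email-address")
print(ranking)  # [('addr_2', 0.35), ('addr_1', 0.2)]
```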

Tasks

Person name disambiguation
Query: [ term “andy” file msgId ], output type “person”

Threading
What are the adjacent messages in this thread? A proxy for finding generally related messages.
Query: [ file msgId ], output type “email-file”

Alias finding
What are the email addresses of Jason?
Query: [ term Jason ], output type “email-address”

Learning to Rank Typed Graph Walks

Learning settings

[Figure: for a task T (query class), each example query (a, b, ..., q) is run through the graph walk, producing a ranked list of nodes (ranks 1 to 50); each query is paired with its relevant answers.]

Learning approaches

Edge weight tuning:

[Diagram: for a task, the graph walk and a weight update are iterated, yielding the tuned edge weights Θ*.]

Node re-ordering:

[Diagram: for a task, the graph walk is followed by feature generation and an update of the re-ranker, yielding a re-ranking function; to rank nodes for a query, the graph walk is followed by feature generation and scoring by the re-ranker.]

Learning approaches

Graph parameters’ tuning:

• Exhaustive local search over edge type (Nie et al., 05)

• Gradient descent (Chang et al., 2000)

• Hill climbing error backpropagation (Diligenti et al., IJCAI-05)

• Gradient descent approximation for partial order preferences (Agarwal et al., KDD-06)

Notes on graph parameters’ tuning:

• Can be adapted from extended PageRank settings to finite graph walks.

• Strong assumption of first-order Markov dependencies.

Node re-ordering:

• Re-ranking (Minkov, Cohen and Ng, SIGIR-06)

Notes on node re-ordering:

• A discriminative learner, using features that describe graph paths.

• Loses some quantitative data in feature decoding. However, it can represent edge sequences.

Error Backpropagation

Following Diligenti et al., 2005.

Cost function: [equation not recovered]

Weight updates: [equation not recovered]
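For orientation, a generic squared-error cost with a gradient-descent update on the edge weights can be written as below. This is an assumed, textbook-style form and not necessarily the exact formulas shown on the slide: $p_z$ denotes the graph walk score of node $z$, $p_z^{opt}$ its target score (for instance 1 for relevant nodes and 0 otherwise), $N$ the number of example nodes, $\theta_\ell$ the weight of edge type $\ell$, and $\eta$ a learning rate.

```latex
% Assumed squared-error cost over the N example nodes
E = \frac{1}{2N} \sum_{z} \left( p_z - p_z^{opt} \right)^2

% Assumed gradient-descent update for the weight of edge type \ell
\theta_\ell \leftarrow \theta_\ell - \eta \, \frac{\partial E}{\partial \theta_\ell}
```

Because $p_z$ is obtained from K steps of the walk, the derivative with respect to $\theta_\ell$ is propagated back through the walk steps, which is where the name of the method comes from.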

Re-ranking

Follows closely on (Collins and Koo, Computational Linguistics, 2005).

Scoring function: [equation not recovered]

Adapt weights to minimize (boosted version): [equation not recovered]
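A sketch of a re-ranking score and loss in the style of Collins and Koo is given below. The exact formulas of the slide are not reproduced here, so everything in the snippet is an assumption: a linear score that combines the log of the graph walk probability with weighted features, and an exponential loss over the margins between the correct candidate and every other candidate. The snippet also assumes one correct answer per query, listed first.

```python
import math

def score(log_prob, features, alpha0, alpha):
    """Linear score: alpha0 * log(walk probability) + sum of weighted features."""
    return alpha0 * log_prob + sum(alpha.get(name, 0.0) * value
                                   for name, value in features.items())

def exp_loss(queries, alpha0, alpha):
    """queries: list of candidate lists; each candidate is (log_prob, features).
    Assumption: the first candidate of every list is the correct answer."""
    loss = 0.0
    for candidates in queries:
        correct = score(candidates[0][0], candidates[0][1], alpha0, alpha)
        for log_prob, features in candidates[1:]:
            margin = correct - score(log_prob, features, alpha0, alpha)
            loss += math.exp(-margin)
    return loss

# Hypothetical query: the correct node has the lower walk probability
# but is reached from two different source nodes.
queries = [[(math.log(0.2), {"source_count": 2}),
            (math.log(0.3), {"source_count": 1})]]
loss_walk_only = exp_loss(queries, alpha0=1.0, alpha={})
loss_with_feature = exp_loss(queries, alpha0=1.0, alpha={"source_count": 1.0})
print(loss_walk_only, loss_with_feature)
```

In this toy case, giving the feature a positive weight lowers the loss, because it raises the correct candidate above the one the walk alone preferred.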

Path describing Features

[Figure: a graph walk unrolled over steps K=0, K=1 and K=2 on nodes x1 to x5.]

Paths [x3, K=2]:

x2 → x1 → x3

x4 → x1 → x3

x4 → x2 → x3

x2 → x3

Features:

‘Edge unigram’: was edge type l used in reaching x from Vq?

‘Edge (n-)bigram’: were edge types l1 and l2 traversed (in that order) in reaching x from Vq?

‘Top edge (n-)bigram’: same, where only the top k contributing paths are considered.

‘Source count’: indicates the number of different source nodes in the set of connecting paths.
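The path-describing features can be illustrated with a small sketch. It assumes each connecting path is available as a source node plus the sequence of edge types traversed; the function name, the feature naming scheme and the edge labels attached to the four paths are hypothetical. The ‘top edge bigram’ variant is left out, since it additionally needs per-path contributions.

```python
def path_features(paths):
    """paths: list of (source_node, [edge_type, ...]) for one target node.
    Returns binary edge unigram and bigram features plus the source count."""
    features = {}
    sources = set()
    for source, edge_types in paths:
        sources.add(source)
        for edge_type in edge_types:
            features["unigram:" + edge_type] = 1
        for first, second in zip(edge_types, edge_types[1:]):
            features["bigram:" + first + ">" + second] = 1
    features["source_count"] = len(sources)
    return features

# Four paths reaching x3 within two steps, with made-up edge labels.
paths_to_x3 = [
    ("x2", ["sent-from", "sent-to-inv"]),
    ("x4", ["has-term", "sent-to-inv"]),
    ("x4", ["has-term", "sent-to-inv"]),
    ("x2", ["sent-to-inv"]),
]
features = path_features(paths_to_x3)
print(features["source_count"])  # 2 distinct source nodes: x2 and x4
print(sorted(features))
```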

Learning to Rank Typed Graph Walks: Local vs. Global approaches

Experiments

Methods:

Gradient descent: Θ0 → ΘG

Reranking: R(Θ0)

Combined: R(ΘG)

Tasks & Corpora:

The results (MAP)

[Figure: bar charts of MAP (y-axis from 0.4 to 0.85) for the compared methods, one panel per task: name disambiguation (corpora M.game, Sager, Shapiro), threading (corpora M.game, Farmer, Germany) and alias finding (corpus Meetings). Some bars are marked with * or +.]
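MAP (mean average precision) is the measure reported in these results. As a reminder of how it is computed, here is a minimal sketch using the standard definition; the function names and the toy rankings are made up and unrelated to the reported numbers.

```python
def average_precision(ranked, relevant):
    """Average of the precision values at the ranks where relevant nodes appear."""
    hits = 0
    total = 0.0
    for rank, node in enumerate(ranked, start=1):
        if node in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """runs: list of (ranked list, set of relevant nodes), one entry per query."""
    return sum(average_precision(ranked, relevant) for ranked, relevant in runs) / len(runs)

# Two hypothetical queries.
runs = [(["a", "b", "c", "d"], {"a", "c"}),  # AP = (1/1 + 2/3) / 2
        (["x", "y"], {"y"})]                 # AP = 1/2
map_score = mean_average_precision(runs)
print(round(map_score, 4))  # 0.6667
```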

Our Findings

Re-ranking often preferable due to ‘global’ features:

Models relation sequences, e.g., threading: sent-from → sent-to-inv

Re-ranking rewards nodes for which the set of connecting paths is diverse.

source-count feature informative for complex queries

The approaches are complementary

Future work:

Re-ranking: large feature space.

Re-ranking requires decoding at run-time.

Domain specific features

Related papers

Einat Minkov, William W. Cohen, Andrew Y. Ng. Contextual Search and Name Disambiguation in Email using Graphs. SIGIR 2006.

Einat Minkov, William W. Cohen. An Email and Meeting Assistant using Graph Walks. CEAS 2006.

Alekh Agarwal, Soumen Chakrabarti. Learning Random Walks to Rank Nodes in Graphs. ICML 2007.

Hanghang Tong, Yehuda Koren, Christos Faloutsos. Fast Direction-Aware Proximity for Graph Mining. KDD 2007.

Thanks! Questions?
