cross-graph learning of multi-relational associationshanxiaol/slides/icml2016-liu.pdfstructure...

37
Cross-Graph Learning of Multi-Relational Associations Hanxiao Liu, Yiming Yang Carnegie Mellon University {hanxiaol, yiming}@cs.cmu.edu June 22, 2016 1 / 24

Upload: others

Post on 05-Oct-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Cross-Graph Learning of Multi-Relational

Associations

Hanxiao Liu, Yiming YangCarnegie Mellon University

{hanxiaol, yiming}@cs.cmu.edu

June 22, 2016

1 / 24

Page 2: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Outline

Task Description

New Contributions

Framework

Scalable Inference

Empirical Evaluation

Summary

2 / 24

Page 3: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Task Description

Goal: Predict associations among heterogeneous graphs.

ProteinCompound

Structure Similarity Sequence Similarity

Interact

(a) Drug-Target Interaction

Paper

Author Venue

Coauthorship

Citation

Shared Foci

WritePublish

Attend

(b) Citation Network

“John publish a reinforcement learning paper at ICML.”(John,RL Paper,ICML)

3 / 24

Page 4: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Outline

Task Description

New Contributions

Framework

Scalable Inference

Empirical Evaluation

Summary

4 / 24

Page 5: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

New Contributions

I A unified framework to integrating heterogeneousinformation in multiple graphs.

I Transductive learning to leverage both labeled data(sparse) and unlabeled data (massive).

I A convex approximation for the scalable inference overthe combinatorial number of possible tuples.

5 / 24

Page 6: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Outline

Task Description

New Contributions

Framework

Scalable Inference

Empirical Evaluation

Summary

6 / 24

Page 7: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Framework

Notation

I G(1), G(2), . . . , G(J) are individual graphs;

I nj is the #nodes in G(j);

I (i1, i2, . . . , iJ) is a tuple (multi-relation);

I fi1,i2,...,iJ is the predicted score for the tuple;

I f is a tensor in Rn1×n2×···×nJ .

7 / 24

Page 8: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Framework

Product Graph (P) induced from G(1), . . . , G(J).

P

(︸ ︷︷ ︸G(1)

, ︸ ︷︷ ︸G(2)

, ︸ ︷︷ ︸G(3)

)=

Tensor product: P(G(1), G(2), G(3)) = G(1) ⊗G(2) ⊗G(3)

8 / 24

Page 9: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Framework

Product Graph (P) induced from G(1), . . . , G(J).

P

(︸ ︷︷ ︸G(1)

, ︸ ︷︷ ︸G(2)

, ︸ ︷︷ ︸G(3)

)=

Tensor product: P(G(1), G(2), G(3)) = G(1) ⊗G(2) ⊗G(3)

8 / 24

Page 10: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Framework

Why product graph?

I Mapping heterogeneous graphs onto a unified graph forlabel propagation (transductive learning).

9 / 24

Page 11: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Framework

Assumingvec(f) ∼ N (0,P) (1)

which implies:

− log p (f |P) ∝ vec(f)>P−1vec(f) := ‖f‖2P (2)

Optimization problem

minf

`O(f) +γ

2‖f‖2P (3)

10 / 24

Page 12: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Framework

Assumingvec(f) ∼ N (0,P) (1)

which implies:

− log p (f |P) ∝ vec(f)>P−1vec(f) := ‖f‖2P (2)

Optimization problem

minf

`O(f) +γ

2‖f‖2P (3)

10 / 24

Page 13: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Framework

Assumingvec(f) ∼ N (0,P) (1)

which implies:

− log p (f |P) ∝ vec(f)>P−1vec(f) := ‖f‖2P (2)

Optimization problem

minf

`O(f) +γ

2‖f‖2P (3)

10 / 24

Page 14: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Framework

For computational tractability, we focus on the spectralgraph product family of P.

Spectral Graph Product (SGP)The eigensystem of Pκ

(G(1), . . . , G(J)

)is parametrized by

the eigensystems of individual graphs, i.e.,{κ(λi1 , . . . , λiJ

),⊗j

vij

}i1,...,iJ

(4)

λij/vij is the ij-th eigenvalue/eigenvector of the j-th graph.

11 / 24

Page 15: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Framework

Nice properties of SGP:

Subsuming basic operations

κ(x, y) = x× y =⇒ Pκ(G,H) = G⊗H Tensor (5)

κ(x, y) = x+ y =⇒ Pκ(G,H) = G⊕H Cartesian (6)

Supporting graph diffusions

σHeat(Pκ) =I + Pκ +1

2P2

κ + · · · = Peκ (7)

σvon−Neumann(Pκ) =I + Pκ + P2κ + · · · = P 1

1−κ(8)

Order-insensitive: If κ is commutative, then SGP iscommutative (up to graph isomorphism).

12 / 24

Page 16: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Framework

Nice properties of SGP:

Subsuming basic operations

κ(x, y) = x× y =⇒ Pκ(G,H) = G⊗H Tensor (5)

κ(x, y) = x+ y =⇒ Pκ(G,H) = G⊕H Cartesian (6)

Supporting graph diffusions

σHeat(Pκ) =I + Pκ +1

2P2

κ + · · · = Peκ (7)

σvon−Neumann(Pκ) =I + Pκ + P2κ + · · · = P 1

1−κ(8)

Order-insensitive: If κ is commutative, then SGP iscommutative (up to graph isomorphism).

12 / 24

Page 17: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Framework

Nice properties of SGP:

Subsuming basic operations

κ(x, y) = x× y =⇒ Pκ(G,H) = G⊗H Tensor (5)

κ(x, y) = x+ y =⇒ Pκ(G,H) = G⊕H Cartesian (6)

Supporting graph diffusions

σHeat(Pκ) =I + Pκ +1

2P2

κ + · · · = Peκ (7)

σvon−Neumann(Pκ) =I + Pκ + P2κ + · · · = P 1

1−κ(8)

Order-insensitive: If κ is commutative, then SGP iscommutative (up to graph isomorphism).

12 / 24

Page 18: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Outline

Task Description

New Contributions

Framework

Scalable Inference

Empirical Evaluation

Summary

13 / 24

Page 19: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Scalable Inference

For general GP, the semi-norm is computed as

‖f‖2P = vec(f)>P−1vec(f) (9)

For SGP, Pκ no longer has to be explicitly computed.

‖f‖2Pκ=

n1,n2,...,nJ∑i1,i2,...,iJ

f(vi1 , . . . , viJ

)2κ(λi1 , . . . , λiJ

) (10)

I f(vi1 , vi2 , . . . , viJ ) = f ×1 vi1 ×2 vi2 · · · ×J viJI However, even evaluating (10) is expensive.

14 / 24

Page 20: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Scalable Inference

For general GP, the semi-norm is computed as

‖f‖2P = vec(f)>P−1vec(f) (9)

For SGP, Pκ no longer has to be explicitly computed.

‖f‖2Pκ=

n1,n2,...,nJ∑i1,i2,...,iJ

f(vi1 , . . . , viJ

)2κ(λi1 , . . . , λiJ

) (10)

I f(vi1 , vi2 , . . . , viJ ) = f ×1 vi1 ×2 vi2 · · · ×J viJI However, even evaluating (10) is expensive.

14 / 24

Page 21: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Scalable Inference

For general GP, the semi-norm is computed as

‖f‖2P = vec(f)>P−1vec(f) (9)

For SGP, Pκ no longer has to be explicitly computed.

‖f‖2Pκ=

n1,n2,...,nJ∑i1,i2,...,iJ

f(vi1 , . . . , viJ

)2κ(λi1 , . . . , λiJ

) (10)

I f(vi1 , vi2 , . . . , viJ ) = f ×1 vi1 ×2 vi2 · · · ×J viJI However, even evaluating (10) is expensive.

14 / 24

Page 22: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Scalable Inference

Using low-rank SGP

I f lies in the linear span of the eigenvectors of P.

I Eigenvectors of high volatility can be pruned away.

Figure : Eigenvectors of G (blue), H (red) and P(G,H).

15 / 24

Page 23: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Scalable Inference

Using low-rank SGP

I f lies in the linear span of the eigenvectors of P.

I Eigenvectors of high volatility can be pruned away.

Figure : Eigenvectors of G (blue), H (red) and P(G,H).

15 / 24

Page 24: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Scalable Inference

Restrict f in the linear span of “smooth” bases of P.

f(α) =

d1,d2,··· ,dJ∑i1,i2,··· ,iJ=1

αi1,i2,··· ,iJ⊗j

vij (11)

where the core tensor α ∈ Rd1×d2×···×dJ , dj � nj.

The semi-norm becomes

‖f(α)‖2Pκ=

d1,d2,··· ,dJ∑i1,i2,...,iJ=1

α2i1,i2,··· ,iJ

κ(λi1 , λi2 , . . . , λiJ

) (12)

We then optimize w.r.t. α instead of f . Parameter size:∏j nj →

∏j dj.

16 / 24

Page 25: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Scalable Inference

Restrict f in the linear span of “smooth” bases of P.

f(α) =

d1,d2,··· ,dJ∑i1,i2,··· ,iJ=1

αi1,i2,··· ,iJ⊗j

vij (11)

where the core tensor α ∈ Rd1×d2×···×dJ , dj � nj.

The semi-norm becomes

‖f(α)‖2Pκ=

d1,d2,··· ,dJ∑i1,i2,...,iJ=1

α2i1,i2,··· ,iJ

κ(λi1 , λi2 , . . . , λiJ

) (12)

We then optimize w.r.t. α instead of f . Parameter size:∏j nj →

∏j dj.

16 / 24

Page 26: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Scalable Inference

Figure : Tucker Decomposition, where α is the core tensor.

17 / 24

Page 27: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Scalable Inference

Revised optimization objective

minα∈Rd1×d2···×dJ

`O (f(α)) +γ

2‖f(α)‖2Pκ

(13)

Ranking loss function

`O(f) =

∑(i1, . . . , iJ ) ∈ O(i′1, . . . , i

′J ) ∈ O

(fi1...iJ − fi′1...i′J

)2+

|O × O|(14)

∇α =∂`O∂f

(∂fi1,...,iJ∂α

−∂fi′1,...,i′J∂α

)+ γα� κ (15)

Tensor algebras are carried out on GPU.

18 / 24

Page 28: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Scalable Inference

Revised optimization objective

minα∈Rd1×d2···×dJ

`O (f(α)) +γ

2‖f(α)‖2Pκ

(13)

Ranking loss function

`O(f) =

∑(i1, . . . , iJ ) ∈ O(i′1, . . . , i

′J ) ∈ O

(fi1...iJ − fi′1...i′J

)2+

|O × O|(14)

∇α =∂`O∂f

(∂fi1,...,iJ∂α

−∂fi′1,...,i′J∂α

)+ γα� κ (15)

Tensor algebras are carried out on GPU.

18 / 24

Page 29: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Scalable Inference

Revised optimization objective

minα∈Rd1×d2···×dJ

`O (f(α)) +γ

2‖f(α)‖2Pκ

(13)

Ranking loss function

`O(f) =

∑(i1, . . . , iJ ) ∈ O(i′1, . . . , i

′J ) ∈ O

(fi1...iJ − fi′1...i′J

)2+

|O × O|(14)

∇α =∂`O∂f

(∂fi1,...,iJ∂α

−∂fi′1,...,i′J∂α

)+ γα� κ (15)

Tensor algebras are carried out on GPU.

18 / 24

Page 30: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Outline

Task Description

New Contributions

Framework

Scalable Inference

Empirical Evaluation

Summary

19 / 24

Page 31: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Empirical Evaluation

Datasets

Enzyme 445 compounds, 664 proteins.

DBLP 34K authors, 11K papers, 22 venues.

Representative Baselines

TF/GRTF Tensor Factorization/Graph-Regularized TF

NN One-class Nearest Neighbor

RSVM Ranking SVMs

LTKM Low-Rank Tensor Kernel Machines

20 / 24

Page 32: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Empirical Evaluation

Datasets

Enzyme 445 compounds, 664 proteins.

DBLP 34K authors, 11K papers, 22 venues.

Representative Baselines

TF/GRTF Tensor Factorization/Graph-Regularized TF

NN One-class Nearest Neighbor

RSVM Ranking SVMs

LTKM Low-Rank Tensor Kernel Machines

20 / 24

Page 33: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Empirical Evaluation

Our method: “TOP” (blue).

0.04

0.06

0.08

0.1

0.12

0.14

0.16

0.18

0.2

12.5 25 50 100

MAP

Training Size (%)

0.5

0.55

0.6

0.65

0.7

0.75

0.8

0.85

0.9

12.5 25 50 100AU

CTraining Size (%)

TOPLTKM

NNRSVM

TFGRTF

5

10

15

20

25

30

35

12.5 25 50 100

Hits

@5

(%)

Training Size (%)

Figure : Performance on Enzyme (above) and DBLP (below).

0

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

12.5 25 50 100

MAP

Training Size (%)

0.5 0.55

0.6 0.65

0.7 0.75

0.8 0.85

0.9 0.95

12.5 25 50 100

AUC

Training Size (%)

TOPLTKM

NNRSVM

TFGRTF

0

2

4

6

8

10

12

12.5 25 50 100

Hits

@5

(%)

Training Size (%)

21 / 24

Page 34: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Outline

Task Description

New Contributions

Framework

Scalable Inference

Empirical Evaluation

Summary

22 / 24

Page 35: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Summary

Contribution

I A unified framework to integrating heterogeneousinformation in multiple graphs.

I Transductive learning to leverage both labeled data(sparse) and unlabeled data (massive).

I A convex approximation for the scalable inference overthe combinatorial number of possible tuples.

Future/On-going Work

I Learning structured associations.

I Larger problems: Microsoft Academic Graph (37 GB).

23 / 24

Page 36: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Summary

Contribution

I A unified framework to integrating heterogeneousinformation in multiple graphs.

I Transductive learning to leverage both labeled data(sparse) and unlabeled data (massive).

I A convex approximation for the scalable inference overthe combinatorial number of possible tuples.

Future/On-going Work

I Learning structured associations.

I Larger problems: Microsoft Academic Graph (37 GB).

23 / 24

Page 37: Cross-Graph Learning of Multi-Relational Associationshanxiaol/slides/icml2016-liu.pdfStructure Similarity Sequence Similarity Interact (a)Drug-Target Interaction Paper Author Venue

Thank You

24 / 24