
Page 1: Machine Learning in Simple Networks · mmds.imm.dtu.dk/presentations/hansen.pdf · European Physical Journal B 60(1) 83-88 (2007). S. Lehmann, L.K. Hansen: Deterministic modularity optimization

Machine Learning in Simple Networks

Lars Kai Hansen
www.imm.dtu.dk/~lkh


Outline

• Communities and link prediction
• Modularity
  – Modularity as a combinatorial optimization problem
  – Gibbs sampling
• Detection threshold – a phase transition?
• Learning community parameters
  – The Hofman-Wiggins generative model
  – Is there a threshold for detection when you learn the parameters and complexity?


Muzeeker

• Wikipedia based common sense
• Wikipedia used as a proxy for the music user's mental model
• Implementation: Filter retrieval using Wikipedia's article/categories
• Muzeeker.com
• LINK PREDICTION to complete the ontological quality of Wikipedia


Network models

• Nodes/vertices and links/edges
  – Directed / undirected
  – Weighted / un-weighted
• Link distributions
  – Random
  – Long tail
  – Hubs and authorities
• Link induced correlations
  – The Rich club
• Communities
  – Link prediction


Motivation for community detection

• Community structure may mark a non-stationary link distribution with “high and low density” sub-networks, hence summarizing with a single “model” could be misleading


Modularity can be predictive for dynamics

M.E.J. Newman and M. Girvan, Finding and evaluating community structure in networks, Phys. Rev. E 69, 026113 (2004).


Modularity objective function

The modularity is expressed as a sum over links, such that we penalize missing links in communities; "missing" is measured relative to a null distribution $P^0_{ij}$.

$$Q = \sum_{ij} \left[ \frac{A_{ij}}{2m} - P_i P_j \right] \delta(c_i, c_j)$$

$c_i$ is the community assignment of node $i$, and $2m = \sum_{ij} A_{ij}$, $k_i = \sum_j A_{ij}$.

The null is a baseline distribution $P^0_{ij} = P_i P_j = k_i k_j/(2m)^2$, with $P_i = k_i/(2m)$.

The value of the modularity lies in the range [−1,1]. It is positive if the number of edges within groups exceeds the number expected on the basis of chance.

M.E.J. Newman and M. Girvan, Finding and evaluating community structure in networks, Physical Review E 69, 026113 (2004), cond-mat/0308217.
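A minimal sketch of the modularity sum above, in plain Python, may make the definition concrete. The function name `modularity`, the dense list-of-lists adjacency matrix, and the two-triangle toy graph are illustrative assumptions and do not come from the talk.

```python
def modularity(adj, communities):
    """Q = sum_ij [A_ij/(2m) - k_i k_j/(2m)^2] * delta(c_i, c_j)."""
    n = len(adj)
    degrees = [sum(row) for row in adj]           # k_i = sum_j A_ij
    two_m = sum(degrees)                          # 2m = sum_ij A_ij
    q = 0.0
    for i in range(n):
        for j in range(n):
            if communities[i] == communities[j]:  # delta(c_i, c_j)
                null = degrees[i] * degrees[j] / two_m ** 2
                q += adj[i][j] / two_m - null
    return q


# Hypothetical toy graph: two triangles (nodes 0-2 and 3-5) joined by edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1

q_split = modularity(adj, [0, 0, 0, 1, 1, 1])   # one community per triangle
q_single = modularity(adj, [0] * 6)             # everything in one community
print(q_split, q_single)
```

For this toy graph the two-triangle split gives Q = 5/14, while placing all nodes in a single community gives Q = 0, so the split is preferred.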


Potts representation

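Introduce 0,1 binary variables $S_{kj}$ coding the community assignment: "node $j$ is a member of community $k$".

$$\delta(c_i, c_j) = \sum_k S_{ki} S_{kj}$$

$$P(j, i) = \frac{A_{ij}}{2m}$$

$$Q = \sum_{ij} \left[ \frac{A_{ij}}{2m} - P_i P_j \right] \delta(c_i, c_j) = \sum_{ij} \left[ \frac{A_{ij}}{2m} - P_i P_j \right] \sum_k S_{ki} S_{kj}$$

$$Q = \frac{1}{2m} \sum_{ijk} B_{ij} S_{ki} S_{kj} = \frac{\mathrm{Tr}(S B S')}{2m}$$

Here $P_i = k_i/(2m)$, so the matrix $B$ collects the bracketed term: $B_{ij} = A_{ij} - k_i k_j/(2m)$.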
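The step from the indicator form to the trace form can be spelled out as follows. This is a sketch under two assumptions that the slide implies rather than states: $S$ is the $K \times N$ indicator matrix with exactly one nonzero entry per column, and $B_{ij} = A_{ij} - k_i k_j/(2m)$.

```latex
\begin{align*}
\delta(c_i, c_j) &= \sum_k S_{ki} S_{kj}, \\
Q &= \frac{1}{2m} \sum_{ij} B_{ij}\, \delta(c_i, c_j)
   = \frac{1}{2m} \sum_k \sum_{ij} S_{ki} B_{ij} S_{kj} \\
  &= \frac{1}{2m} \sum_k (S B S')_{kk}
   = \frac{\mathrm{Tr}(S B S')}{2m}.
\end{align*}
```

The first line holds because two nodes share a community exactly when their indicator columns have their single 1 in the same row.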


Spectral optimization

• Newman relaxes the optimization problem to the simplex

$$Q = \frac{1}{2m} \sum_{ijk} B_{ij} S_{ki} S_{kj} = \frac{\mathrm{Tr}(S B S')}{2m}$$

$$L = \frac{\mathrm{Tr}(S B S')}{2m} + \mathrm{Tr}(\Lambda S)$$

$$B S = \Lambda S$$


Combinatorial optimization

• We can use a physics analogy: Simulated Annealing (Kirkpatrick et al. 1983)

$$P(S \mid A, T) \propto \exp\left( \frac{Q(S)}{T} \right) = \exp\left( \frac{\mathrm{Tr}(S B S')}{2mT} \right)$$

• Gibbs sampling is a Monte Carlo realization of a Markov process in which each variable is randomly assigned according to its conditional distribution given all the other variables

$$P(S_j \mid S_{-j}, A, T) = \frac{P(S \mid A, T)}{\sum_{S_j} P(S \mid A, T)}$$

S. Geman, D. Geman, "Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration of Images", IEEE Transactions on Pattern Analysis and Machine Intelligence 6 (6): 721–741 (1984).


Potts model 1-node

• Discrete probability distribution on states k = 1,…,K

$$P(S \mid A, T) \propto \exp\left( \frac{1}{T} \sum_{k=1}^{K} S_k \varphi_k \right)$$

$$P(S \mid A, T) = \prod_k r_k^{S_k}$$

$$\langle S_k \rangle = r_k = \frac{\exp(\varphi_k / T)}{\sum_{k'} \exp(\varphi_{k'} / T)}$$
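The single-node Potts probabilities are a softmax of the fields at temperature T. The sketch below shows how T moves the distribution between nearly uniform and nearly deterministic; the function name `potts_probabilities` and the example field values are hypothetical.

```python
import math


def potts_probabilities(phi, temperature):
    """r_k = exp(phi_k / T) / sum_k' exp(phi_k' / T), computed stably."""
    scaled = [value / temperature for value in phi]
    shift = max(scaled)                       # subtract the max to avoid overflow
    weights = [math.exp(s - shift) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


phi = [0.2, 0.1, -0.3]                        # hypothetical fields for K = 3 states
hot = potts_probabilities(phi, temperature=10.0)   # close to uniform
cold = potts_probabilities(phi, temperature=0.01)  # concentrates on the largest field
print(hot, cold)
```

At high temperature all states are almost equally likely; as T goes to zero the distribution puts nearly all its mass on the state with the largest field, which is what annealing exploits.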


Gibbs sampling

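$$\varphi_{ki} = \sum_j \frac{B_{ij}}{2m} S_{kj} = \sum_j \frac{A_{ij}}{2m} S_{kj} - \frac{k_i}{2m} \sum_j \frac{k_j}{2m} S_{kj}$$

$$r_{ki} = \frac{\exp(\varphi_{ki}/T)}{\sum_{k'} \exp(\varphi_{k'i}/T)}$$

$$S_i \sim \mathrm{potts}(r_i)$$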
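One sweep of the sampler described by these equations might look like the following sketch. It is not the code used in the talk: the function name `gibbs_sweep`, the integer label encoding, the toy graph, and the choice to skip the j = i term (so that a node does not see its own current label) are all assumptions made for illustration.

```python
import math
import random


def gibbs_sweep(adj, labels, n_comm, temperature, rng):
    """Resample every node's community once, in place, from potts(r_i)."""
    n = len(adj)
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)
    for i in range(n):
        # phi_ki = sum_j (B_ij / 2m) S_kj, with B_ij/2m = A_ij/2m - k_i k_j/(2m)^2.
        # Assumption: the j = i term is left out.
        phi = [0.0] * n_comm
        for j in range(n):
            if j != i:
                b_ij = adj[i][j] / two_m - degrees[i] * degrees[j] / two_m ** 2
                phi[labels[j]] += b_ij
        top = max(phi)                         # stabilise the exponentials
        weights = [math.exp((p - top) / temperature) for p in phi]
        labels[i] = rng.choices(range(n_comm), weights=weights)[0]
    return labels


# Hypothetical toy graph: two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1

rng = random.Random(0)
labels = [0, 0, 0, 1, 1, 1]                    # start at the two-triangle split
for _ in range(5):
    gibbs_sweep(adj, labels, n_comm=2, temperature=0.001, rng=rng)
print(labels)
```

At this very low temperature the two-triangle split is effectively a fixed point of the sampler; raising the temperature lets nodes move between communities more freely.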


Deterministic annealing

• Instead of drawing Gibbs samples according to the marginals we can average; this provides a set of self-consistent equations for the means (for 0,1 Bernoulli variables the mean is the probability $\mu_{ki} = P(S_{ki} = 1)$, written $r_{ki}$ in the equations)

$$r_{ki} = \frac{\exp(\varphi_{ki}/T)}{\sum_{k'} \exp(\varphi_{k'i}/T)}$$

$$\varphi_{ki} = \sum_j \frac{B_{ij}}{2m} r_{kj} = \sum_j \frac{A_{ij}}{2m} r_{kj} - \sum_j P_i P_j r_{kj}$$

S. Lehmann, L.K. Hansen: Deterministic modularity optimization, European Physical Journal B 60(1) 83-88 (2007).
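The self-consistent equations can be iterated while the temperature is lowered. The sketch below is a simplified illustration rather than the algorithm of the cited paper: the synchronous update, the three-step cooling schedule, the omission of the j = i term, and the start near the two-triangle split (chosen so the demo is deterministic) are all assumptions.

```python
import math


def softmax(values, temperature):
    top = max(values)
    weights = [math.exp((v - top) / temperature) for v in values]
    total = sum(weights)
    return [w / total for w in weights]


def mean_field_pass(adj, r, temperature):
    """One synchronous update of all means r[i][k] from the current means."""
    n, n_comm = len(adj), len(r[0])
    degrees = [sum(row) for row in adj]
    two_m = sum(degrees)
    new_r = []
    for i in range(n):
        phi = [0.0] * n_comm
        for j in range(n):
            if j == i:
                continue                       # assumption: no self-interaction
            b_ij = adj[i][j] / two_m - degrees[i] * degrees[j] / two_m ** 2
            for k in range(n_comm):
                phi[k] += b_ij * r[j][k]       # phi_ki = sum_j (B_ij/2m) r_kj
        new_r.append(softmax(phi, temperature))
    return new_r


# Hypothetical toy graph: two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1

# Soft start near the two-triangle split, then cool.
r = [[0.9, 0.1] if i < 3 else [0.1, 0.9] for i in range(6)]
for temperature in (0.05, 0.02, 0.01):
    r = mean_field_pass(adj, r, temperature)
hard = [max(range(2), key=lambda k: row[k]) for row in r]
print(hard)
```

As the temperature drops the means sharpen towards 0 or 1, and taking the largest mean per node yields a hard assignment without any random draws.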


Experimental evaluation

• Create a simple testbed with link probability and “noise”

S. Lehmann, L.K. Hansen: Deterministic modularity optimization European Physical Journal B 60(1) 83-88 (2007).


S. Lehmann, L.K. Hansen: Deterministic modularity optimization European Physical Journal B 60(1) 83-88 (2007).


Generative community model (Hofman & Wiggins, 2008)

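$$P(A \mid S, p, q) = p^{c} (1 - p)^{d} q^{e} (1 - q)^{f}$$

$$c = \frac{1}{2} \sum_{i \neq j} A_{ij} \sum_k S_{ki} S_{kj}$$

$$d = \frac{1}{2} \sum_{i \neq j} (1 - A_{ij}) \sum_k S_{ki} S_{kj}$$

$$e = \frac{1}{2} \sum_{i \neq j} A_{ij} \left( 1 - \sum_k S_{ki} S_{kj} \right)$$

$$f = \frac{1}{2} \sum_{i \neq j} (1 - A_{ij}) \left( 1 - \sum_k S_{ki} S_{kj} \right)$$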
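The four exponents are simply counts of linked and unlinked node pairs within and between communities. The sketch below counts each unordered pair once, which matches a factor 1/2 in front of a sum over ordered pairs of distinct nodes; the function names and the toy graph are hypothetical.

```python
import math


def link_counts(adj, labels):
    """Return (c, d, e, f): links and non-links within and between communities."""
    n = len(adj)
    c = d = e = f = 0
    for i in range(n):
        for j in range(i + 1, n):              # each unordered pair once
            same = labels[i] == labels[j]
            if same and adj[i][j]:
                c += 1                         # link inside a community
            elif same:
                d += 1                         # missing link inside a community
            elif adj[i][j]:
                e += 1                         # link between communities
            else:
                f += 1                         # missing link between communities
    return c, d, e, f


def log_likelihood(adj, labels, p, q):
    """log P(A | S, p, q) = c log p + d log(1-p) + e log q + f log(1-q)."""
    c, d, e, f = link_counts(adj, labels)
    return (c * math.log(p) + d * math.log(1 - p)
            + e * math.log(q) + f * math.log(1 - q))


# Hypothetical toy graph: two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1

counts = link_counts(adj, [0, 0, 0, 1, 1, 1])
ll = log_likelihood(adj, [0, 0, 0, 1, 1, 1], p=0.9, q=0.1)
print(counts, ll)
```

For the two-triangle split the counts are c = 6, d = 0, e = 1, f = 8, which together cover all 15 node pairs.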


Learning parameters of the generative model

• Hofman & Wiggins (2008)
  – “Variational Bayes”
  – Dirichlet/beta prior and posterior distributions for the probabilities
  – Very well determined (overkill)
  – Independent binomials for the assignment variables (misses correlation)
• Here
  – Maximum likelihood for the parameters
  – Gibbs sampling for the assignments

Jake M. Hofman and Chris H. Wiggins, Bayesian Approach to Network Modularity, Phys. Rev. Lett. 100, 258701 (2008).
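For fixed assignments the likelihood is a product of Bernoulli terms, so the maximum-likelihood link probabilities are the observed link fractions, p = c/(c+d) within communities and q = e/(e+f) between them. The sketch below computes these for a toy graph; the function name is hypothetical, and how the talk interleaves this step with Gibbs sampling is not specified here.

```python
def estimate_link_probabilities(adj, labels):
    """Maximum-likelihood p (within) and q (between) for fixed assignments."""
    n = len(adj)
    within_links = within_pairs = between_links = between_pairs = 0
    for i in range(n):
        for j in range(i + 1, n):              # each unordered pair once
            if labels[i] == labels[j]:
                within_pairs += 1
                within_links += adj[i][j]
            else:
                between_pairs += 1
                between_links += adj[i][j]
    # Guard against partitions with no pairs of one kind.
    p_hat = within_links / within_pairs if within_pairs else 0.0
    q_hat = between_links / between_pairs if between_pairs else 0.0
    return p_hat, q_hat


# Hypothetical toy graph: two triangles joined by the edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
adj = [[0] * 6 for _ in range(6)]
for a, b in edges:
    adj[a][b] = adj[b][a] = 1

p_hat, q_hat = estimate_link_probabilities(adj, [0, 0, 0, 1, 1, 1])
print(p_hat, q_hat)
```

On the toy graph every within-community pair is linked and one of nine between-community pairs is linked, so the estimates are 1 and 1/9.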


The community detection threshold

How many links are needed to detect the structure?

$$P_{\mathrm{in}} = \frac{p}{q\,(C - 1)} = \frac{\mathrm{SNR}}{C - 1}$$

Jörg Reichardt and Michele Leone, (Un)detectable Cluster Structure in Sparse Networks, Phys. Rev. Lett. 101, 078701 (2008).


Experimental design

• Planted solution
  – N = 1000 nodes
  – Ctrue = 5
  – Quality: Mutual information between planted assignments and the best identified
• Gibbs sampling
  – No annealing
  – Burn-in 200 iterations
  – Averaging 800 iterations
• Parameter learning
  – Q = 10 iterations
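A planted-partition testbed and the mutual-information score can be sketched as follows. Several choices are assumptions made for illustration: equal-sized communities, the relation q = p / SNR between the two link probabilities, mutual information measured in bits, and a smaller graph than the N = 1000 of the design so that the plain-Python demo runs quickly.

```python
import math
import random
from collections import Counter


def planted_partition(n, n_comm, p, q, rng):
    """Random graph with link probability p within and q between communities."""
    labels = [i % n_comm for i in range(n)]    # equal-sized planted communities
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if labels[i] == labels[j] else q
            if rng.random() < prob:
                adj[i][j] = adj[j][i] = 1
    return adj, labels


def mutual_information(a, b):
    """Mutual information in bits between two label sequences of equal length."""
    n = len(a)
    joint = Counter(zip(a, b))
    count_a, count_b = Counter(a), Counter(b)
    mi = 0.0
    for (x, y), n_xy in joint.items():
        mi += (n_xy / n) * math.log2(n_xy * n / (count_a[x] * count_b[y]))
    return mi


rng = random.Random(1)
snr = 10.0                                     # assumed to mean p / q
adj, planted = planted_partition(200, 5, p=0.2, q=0.2 / snr, rng=rng)

relabeled = [(c + 1) % 5 for c in planted]     # same partition, permuted names
mi_perfect = mutual_information(planted, relabeled)
mi_none = mutual_information(planted, [0] * len(planted))
print(mi_perfect, mi_none)
```

Mutual information is invariant to renaming the communities, so a perfect recovery of five equal communities scores log2(5) bits regardless of label order, while an uninformative single-community answer scores 0.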


Community Detection – fully informed on number of communities and probabilities

[Figure: four panels, each plotting mutual information with the planted community (axis 0 to 2.5) against intra-community link probability P (axis 0 to 0.05). Panel titles: COMMUNITY DETECTION (N = 1000, C = 5, SNR = 5); COMMUNITY DETECTION (N = 1000, C = 5, SNR = 10); COMMUNITY DETECTION (N = 1000, C = 5, SNR = 50); COMMUNITY DETECTION (N = 1000, C = 10, SNR = 50).]


Now what happens to the phase transition if we learn the parameters … with a too complex model (C > Ctrue = 5)?

[Figure: two panels plotting mutual information with the planted community (axis 0 to 2.5) against intra-community link probability P (axis 0 to 0.1), titled COMMUNITY DETECTION (N = 1000, C = 10, SNR = 10) and COMMUNITY DETECTION (N = 1000, C = 10, SNR = 5); plus a bar plot of memberships (axis 0 to 200) per community (1 to 10).]


Conclusions

• Community detection can be formulated as an inference problem (Hofman & Wiggins, 2008)
• The sampling process for fixed SNR has a phase-transition-like detection threshold (Reichardt & Leone, 2008)
• The phase transition remains (sharpens?) if you learn the parameters of a generative model with unknown complexity
