jure leskovec, cornell/stanford university - cs.cmu.edujure/pub/ncp-cornell-oct08.pdfthe network...

Jure Leskovec, Cornell/Stanford University
Joint work with Kevin Lang, Anirban Dasgupta and Michael Mahoney, Yahoo! Research

Network: an interaction graph:
• Nodes represent “entities”
• Edges represent “interaction” between pairs of entities

Are there natural clusters, communities, partitions, etc.?

Concept-based clusters, link-based clusters, density-based clusters, …


Bid, click and impression information for “keyword x advertiser” pair

Mine information at query-time to provide new ads

Maximize CTR, RPS, advertiser ROI


Find micro-markets by partitioning the “query x advertiser” graph:

[Figure: “query x advertiser” matrix; axes: advertiser, query]

Linear (low-rank) methods: If Gaussian, then low-rank space is good

Kernel (non-linear) methods: If low-dimensional manifold, then kernels are good

Hierarchical methods: Top-down and bottom-up, common in social sciences

Graph partitioning methods: Define an “edge counting” metric (conductance, expansion, modularity, etc.) and optimize!

“It is a matter of common experience that communities exist in networks ... Although not precisely defined, communities are usually thought of as sets of nodes with better connections amongst its members than with the rest of the world.”


Communities:

Sets of nodes with lots of connections inside and few to outside (the rest of the network)

Assumption:

Networks are (hierarchically) composed of communities

Communities, clusters, groups, modules


Question: Are large networks really like this?

Hierarchical community structure

Let A be the adjacency matrix of G=(V,E).

The conductance of a set S of nodes is:

The Network Community Profile (NCP) plot of the graph is:
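The formulas that followed these two sentences are not present in this transcript. A commonly used form, written with the adjacency matrix A introduced above, is sketched here. The symbols φ, A(S) and the complement notation are assumptions and may differ from the notation on the original slide.

```latex
% Conductance of a node set S (assumed notation):
\phi(S) = \frac{\sum_{i \in S,\; j \notin S} A_{ij}}{\min\{A(S),\, A(\bar{S})\}},
\qquad
A(S) = \sum_{i \in S} \sum_{j \in V} A_{ij}

% Network Community Profile: best (lowest) conductance over all sets of size k
\Phi(k) = \min_{S \subset V,\; |S| = k} \phi(S)
```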

How community-like is a set of nodes?

[Figure: node sets S and S’]

Score: Φ(S) = # edges cut / # edges inside
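As a minimal sketch, the score on this slide can be computed directly from an edge list. The function name `slide_score` and the toy graph (two triangles joined by one edge) are hypothetical and are not taken from the talk.

```python
def slide_score(edges, S):
    """Score from the slide: (# edges cut) / (# edges inside S)."""
    S = set(S)
    # An edge is cut when exactly one endpoint lies in S.
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    # An edge is inside when both endpoints lie in S.
    inside = sum(1 for u, v in edges if u in S and v in S)
    if inside == 0:
        return float("inf")  # no internal edges: worst possible score
    return cut / inside


# Hypothetical toy graph: two triangles joined by a single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
print(slide_score(edges, {0, 1, 2}))  # 1 cut edge over 3 inside edges
```

Lower values mean a better community: few edges leave the set relative to the edges it contains.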

What is the “best” community of 5 nodes?

Bad community: Φ = 5/6 = 0.83




Bad community: Φ = 5/7 = 0.7

Better community: Φ = 2/5 = 0.4

Best community: Φ = 2/8 = 0.25

Network community profile (NCP) plot

Plot the score of best community of size k

[Figure: NCP plot; x-axis: community size, log k; y-axis: log Φ(k); marked points: Φ(5) = 0.25 at k = 5, Φ(7) = 0.18 at k = 7]
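For a very small graph, the NCP can be computed exactly by trying every node set of each size. This is only a sketch: the helper names and the toy graph are hypothetical, the score is the simplified cut/inside ratio from the earlier slide, and exhaustive search is exponential, so it is not how large networks would be handled.

```python
from itertools import combinations


def score(edges, S):
    """Simplified score: (# edges cut) / (# edges inside S)."""
    cut = sum(1 for u, v in edges if (u in S) != (v in S))
    inside = sum(1 for u, v in edges if u in S and v in S)
    return cut / inside if inside else float("inf")


def ncp_brute_force(nodes, edges, max_k):
    """Return {k: lowest score over all node sets of size k}."""
    profile = {}
    for k in range(1, max_k + 1):
        profile[k] = min(score(edges, set(S)) for S in combinations(nodes, k))
    return profile


# Hypothetical toy graph: two triangles joined by a single edge.
nodes = list(range(6))
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
profile = ncp_brute_force(nodes, edges, max_k=3)
for k, phi in profile.items():
    print(k, phi)
```

Plotting `profile` on log-log axes gives the NCP plot for this toy graph.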

Idea: Use approximation algorithms for NP-hard graph partitioning problems as experimental probes of network structure.

Spectral (quadratic approx): confuses “long paths” with “deep cuts”

Multi-commodity flow (log(n) approx): difficulty with expanders

SDP (sqrt(log(n)) approx): best in theory

Metis (multi-resolution heuristic): common in practice

X+MQI: MQI applied as a post-processing step on the output of X, e.g., Metis+MQI

Local Spectral: connected and tighter sets (empirically)

Metis+MQI: best conductance (empirically)

We are not interested in partitions per se, but in probing network structure

[Figure panels: d-dimensional meshes; California road network]

Zachary’s university karate club social network

During the study, the club split into two groups

The split (squares vs. circles) corresponds to cut B

Collaborations between scientists in network science [Newman, 2005]

[Ravasz&Barabasi, 2003]

[Clauset,Moore&Newman, 2008]

Previously, researchers examined the community structure of small networks (~100 nodes)

We examined more than 100 different large networks

Large networks look very different!

Typical example: General relativity collaboration network (4,158 nodes, 13,422 edges)

[Figure: NCP plot; x-axis: k (community size); y-axis: Φ(k) (conductance)]

Better and better communities

Communities get worse and worse

Best community has ~100 nodes


Definition: A whisker is a maximal set of nodes connected to the network by a single edge

Whiskers are responsible for the downward slope of the NCP plot
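The definition above can be sketched as a bridge search: every edge whose removal disconnects the graph separates a candidate whisker from the rest. This is a minimal illustration, not the procedure used in the talk. It assumes a simple connected graph, assumes the smaller side of each bridge is the whisker side, and uses a quadratic-time bridge test that only suits small graphs. All names are hypothetical.

```python
from collections import defaultdict, deque


def component(adj, start, banned_edge):
    """Nodes reachable from start without crossing banned_edge."""
    a, b = banned_edge
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if (u, v) in ((a, b), (b, a)) or v in seen:
                continue
            seen.add(v)
            queue.append(v)
    return seen


def whiskers(edges):
    """Maximal node sets attached to the rest of the graph by one edge."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    candidates = []
    for u, v in edges:
        side_u = component(adj, u, (u, v))
        if v in side_u:
            continue  # not a bridge: u and v stay connected without it
        side_v = component(adj, v, (u, v))
        # Assumption: the smaller side is the whisker, the larger the network.
        candidates.append(frozenset(min(side_u, side_v, key=len)))
    # Keep only maximal candidates (drop sets strictly inside another one).
    return [c for c in candidates if not any(c < d for d in candidates)]


# Hypothetical toy graph: a 4-clique core with two whiskers hanging off it.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (3, 4), (4, 5), (4, 6), (0, 7)]
found = whiskers(edges)
print(sorted(sorted(w) for w in found))
```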

[Figure: NCP plot; annotation: Largest whisker]

Network structure: Core-periphery (jellyfish, octopus)

Whiskers are responsible for good communities

Denser and denser core of the network

Core contains ~60% of the nodes and ~80% of the edges

Each new edge inside the community costs more

Each node has twice as many children

Φ = 1/3 = 0.33

Φ = 2/4 = 0.5

Φ = 8/6 = 1.3

Φ = 64/14 ≈ 4.6

[Figure: NCP plot]
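The four scores on this slide follow a simple recurrence, sketched below. The starting community (3 inside edges, 1 cut edge) is read off the slide; the branching factors 2, 4, 8 are an inferred reading of “each node has twice as many children”, and the function name is hypothetical.

```python
def grow_community(inside, cut, children_per_node):
    """Yield (cut, inside, score) as the community absorbs one tree level at a time."""
    yield cut, inside, cut / inside
    for k in children_per_node:
        # Absorbing the boundary nodes turns every cut edge into an inside edge,
        # and each absorbed node contributes k new cut edges to its children.
        inside, cut = inside + cut, cut * k
        yield cut, inside, cut / inside


steps = list(grow_community(inside=3, cut=1, children_per_node=[2, 4, 8]))
for cut, inside, phi in steps:
    print(f"{cut}/{inside} = {phi:.2f}")
```

The cut grows multiplicatively while the inside grows only additively, so the score rises with every level the community absorbs.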

Whiskers:

Edge to cut

Whiskers in real networks are non-trivial (richer than trees)


Whiskers

Whiskers in real networks are larger than expected based on density and degree sequence
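One common way to build a baseline with the same density and degree sequence is to rewire the network with degree-preserving edge swaps. The sketch below illustrates that general technique; it is an assumption that the rewired networks in this talk were produced in a comparable way. The function names and the toy graph are hypothetical.

```python
import random
from collections import Counter


def degrees(edges):
    """Degree of every node in an undirected edge list."""
    return Counter(node for edge in edges for node in edge)


def rewire(edges, n_swaps, seed=0):
    """Degree-preserving randomization by double-edge swaps."""
    rng = random.Random(seed)
    edges = [tuple(e) for e in edges]
    present = {frozenset(e) for e in edges}
    done = attempts = 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        # Proposed swap: (a, b), (c, d) -> (a, d), (c, b)
        if len({a, b, c, d}) < 4:
            continue  # the two edges share a node: skip to avoid self-loops
        if frozenset((a, d)) in present or frozenset((c, b)) in present:
            continue  # swap would create a duplicate edge
        present -= {frozenset((a, b)), frozenset((c, d))}
        present |= {frozenset((a, d)), frozenset((c, b))}
        edges[i], edges[j] = (a, d), (c, b)
        done += 1
    return edges


# Hypothetical toy graph: a 10-node ring with two chords.
original = [(i, (i + 1) % 10) for i in range(10)] + [(0, 5), (2, 7)]
rewired = rewire(original, n_swaps=20)
print(degrees(rewired) == degrees(original))
```

Whisker sizes measured on the rewired graph can then be compared against those of the original network.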

Nothing happens! Now we have 2-edge-connected whiskers to deal with. This indicates the recursiveness of our core-periphery structure: as we remove the periphery, the core itself breaks into a core and a periphery.

What if we allow cuts that give disconnected communities?

• Cut all whiskers
• Compose communities out of whiskers
• How good a “community” do we get?

[Figure: NCP plot for LiveJournal; curves: Rewired network, Local spectral, Bag-of-whiskers, Metis+MQI]

Regularization properties: spectral embeddings stretch along directions in which the random walk mixes slowly

Resulting hyperplane cuts have “good” conductance, but may not be the optimal cuts

[Figure panels: spectral embedding; flow-based embedding]

Metis+MQI (red) gives sets with better conductance.

Local Spectral (blue) gives tighter and more well-rounded sets.

[Figure: y-axis: ext/int; dots are connected clusters]

Two ca. 500 node communities from Local Spectral:

Two ca. 500 node communities from Metis+MQI:


... can be computed from:

Spectral embedding (independent of balance)

SDP-based methods (for volume-balanced partitions)


What is a good model that explains such network structure?

None of the existing models work

[Figure panels: Pref. attachment; Small World; Geometric Pref. Attachment. NCP shape annotations, in extraction order: Flat; Down and Flat; Flat and Down]

Note: Sparsity is the issue, not heavy tails per se. (Power laws with an exponent between 2 and 3 give us the appropriate sparsity.)

Forest Fire [LKF05]: connections spread like a fire

New node joins the network

Selects a seed node

Connects to some of its neighbors

Continue recursively

Notes:
• Preferential attachment flavor: second neighbor is not uniform at random.
• Copying flavor: since we burn the seed’s neighbors.
• Hierarchical flavor: seed is parent.
• “Local” flavor: burn “near” (in a diffusion sense) the seed vertex.
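The steps listed above can be sketched as a small simulation. This is a simplified, undirected variant written for illustration only, not the exact model of [LKF05]. The burning probability `p`, the geometric number of burned neighbors, and all names are assumptions.

```python
import random


def forest_fire(n, p=0.35, seed=0):
    """Simplified undirected Forest Fire sketch; returns {node: set(neighbors)}."""
    rng = random.Random(seed)
    adj = {0: set()}
    for new in range(1, n):
        seed_node = rng.randrange(new)  # select a seed node uniformly at random
        burned = {seed_node}
        frontier = [seed_node]
        while frontier:  # continue recursively from every burned node
            u = frontier.pop()
            fresh = sorted(v for v in adj[u] if v not in burned)
            rng.shuffle(fresh)
            # Assumption: burn a geometric number of neighbors, mean p / (1 - p).
            k = 0
            while rng.random() < p:
                k += 1
            for v in fresh[:k]:
                burned.add(v)
                frontier.append(v)
        # The new node links to the seed and to every node the fire reached.
        adj[new] = set(burned)
        for v in burned:
            adj[v].add(new)
    return adj


graph = forest_fire(200, p=0.35, seed=1)
print(sum(len(nbrs) for nbrs in graph.values()) // 2, "edges")
```

Larger values of `p` let the fire spread further, so each new node attaches to more of the seed's neighborhood.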

As a community grows it blends into the core of the network

[Figure: NCP plot; curves: rewired network, Bag of whiskers]

Whiskers:

Largest whisker has ~100 nodes

Whisker size is independent of network size

Core:

60% of the nodes, 80% edges

Core has little structure (hard to cut)

Still more structure than the random network


The Dunbar number: 150 individuals is the maximum community size

On-line communities have 60 members and break down at around 80; military, churches, divisions, etc. are all close to Dunbar's 150

Common bond vs. common identity theory

Common bond (people are attached to individual community members): communities are smaller and more cohesive

Common identity (people are attached to the group as a whole): communities are focused around a common interest and tend to be larger and more diverse

What edges “mean” and community identification

Social networks: reasons an individual adds a link to a friend can vary enormously

Citation networks or web graphs: links are more “expensive” and are more semantically uniform

Networks with “ground truth” communities:

LiveJournal12: users create and explicitly join on-line groups

DBLP co-authorships: publication venues can be viewed as communities

Amazon product co-purchasing: each item belongs to one or more hierarchically organized categories, as defined by Amazon

IMDB collaboration: countries of production and languages may be viewed as communities

[Figure panels: LiveJournal, DBLP, Amazon, IMDB; curves: Rewired network, Ground truth]

NCP plot is a way to analyze network community structure

Our results agree with previous work on small networks (people did not hit Dunbar’s limit)

But large networks are different:

Whiskers + Core (core-periphery) structure

Small, well-isolated communities blend into the core of the network as they grow

Assume a recursive Kronecker model. Fit it to G. We get

$$K = \begin{pmatrix} 0.9 & 0.5 \\ 0.5 & 0.1 \end{pmatrix}$$

What does this tell about the network structure?

Core: 0.9 edges

Periphery: 0.1 edges

Between core and periphery: 0.5 edges

Core-periphery: no communities, no good cuts

As opposed to:

$$K = \begin{pmatrix} 0.9 & 0.1 \\ 0.1 & 0.9 \end{pmatrix}$$

which gives a hierarchy
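To see what the two initiator matrices imply, one can take Kronecker powers and read each entry as an edge probability, as in the stochastic Kronecker graph model. The sketch below uses plain nested lists. The helper names are hypothetical and the power k = 2 is chosen only to keep the output small.

```python
def kron(A, B):
    """Kronecker product of two square matrices given as nested lists."""
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]


def kron_power(K, k):
    """k-th Kronecker power of K (k >= 1)."""
    P = K
    for _ in range(k - 1):
        P = kron(P, K)
    return P


K_core_periphery = [[0.9, 0.5], [0.5, 0.1]]
K_hierarchy = [[0.9, 0.1], [0.1, 0.9]]

P = kron_power(K_core_periphery, 2)  # 4x4 matrix of edge probabilities
H = kron_power(K_hierarchy, 2)
for row in P:
    print([round(x, 2) for x in row])
```

In `P` the probabilities fall off steadily from the top-left (core) corner to the bottom-right (periphery) corner, while `H` keeps a high-probability diagonal with weak links between blocks, which is the nested-block pattern the slide calls a hierarchy.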

