dynamic models of on-line social networks

56
On-line Social Networks - Anthony Bonato 1 Dynamic Models of On-line Social Networks Anthony Bonato Ryerson University ICMCM’09 December, 2009

Upload: derex

Post on 24-Feb-2016

60 views

Category:

Documents


0 download

DESCRIPTION

ICMCM’09 December, 2009. Dynamic Models of On-line Social Networks. Anthony Bonato Ryerson University. Toronto in December…. Complex Networks. web graph, social networks, biological networks, internet networks , …. nodes : web pages edges : links - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

1

Dynamic Models of On-line Social Networks

Anthony BonatoRyerson University

ICMCM’09 December, 2009

Page 2: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

2

Toronto in December…

Page 3: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

3

Complex Networks• web graph, social networks, biological networks, internet

networks, …

Page 4: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

4

The web graph

• nodes: web pages

• edges: links

• over 1 trillion nodes, with billions of nodes added each day

Page 5: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

5

Social Networks

nodes: people

edges: social interaction(eg friendship)

Page 6: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

6

On-line Social Networks (OSNs)Facebook, Twitter, Orkut, LinkedIn, GupShup…

Page 7: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

7

A new paradigm• half of all users of internet

on some OSN– 250 million users on

Facebook, 45 million on Twitter

• unprecedented, massive record of social interaction

• unprecedented access to information/news/gossip

Page 8: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

8

Properties of Complex Networks• observed properties:

– massive, power law, small world, decentralized

(Broder et al, 01)

Page 9: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

9

Small World Property

• small world networks introduced by social scientists Watts & Strogatz in 1998– low diameter/average

distance (“6 degrees of separation”)

– globally sparse, locally dense (high clustering coefficient)

Page 10: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

10

Paths in Twitter

Dalai Lama

Ashton Kutcher

Christianne Amanpour

Queen Rania of Jordan

Arnold Schwarzenegger

Page 11: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

11

Why model complex networks?• uncover the generative mechanisms

underlying complex networks• models are a predictive tool• nice mathematical challenges• models can uncover the hidden reality of

networks– in OSNs:

• community detection• advertising• security

Page 12: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

12

Many different models

Page 13: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

13

Social network analysis• Milgram (67): average distance between two Americans is 6• Watts and Strogatz (98): introduced small world property • Adamic et al. (03): early study of on-line social networks• Liben-Nowell et al. (05): small world property in LiveJournal• Kumar et al. (06): Flickr, Yahoo!360; average distances

decrease with time• Golder et al. (06): studied 4 million users of Facebook• Ahn et al. (07): studied Cyworld in South Korea, along with

MySpace and Orkut• Mislove et al. (07): studied Flickr, YouTube, LiveJournal,

Orkut• Java et al. (07): studied Twitter: power laws, small world

On-line

Page 14: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

14

Key parameters• power law degree

distributions:

• average distance:

• clustering coefficient:

tiNb bti

, ,2

)(,

1

2),()(

GVvu

tvudGL

)(

1-1

)()( ,2

)deg(|))((|)(

GVxxctGC

xxNExc

Wiener index, W(G)

Page 15: Dynamic  Models of  On-line  Social Networks

Power laws in OSNs

On-line Social Networks - Anthony Bonato

15

Page 16: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

16

Flickr and Yahoo!360

• (Kumar et al,06): shrinking diameters

Page 17: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

17

Sample data: Flickr, YouTube, LiveJournal, Orkut

• (Mislove et al,07): short average distances and high clustering coefficients

Page 18: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

18

(Leskovec, Kleinberg, Faloutsos,05):– many complex networks (including on-line social

networks) obey two additional laws:1. Densification Power Law

– networks are becoming more dense over time; i.e. average degree is increasing

e(t) ≈ n(t)a

where 1 < a ≤ 2: densification exponent– a=1: linear growth – constant average degree, such

as in web graph models– a=2: quadratic growth – cliques

Page 19: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

19

Densification – Physics Citations

n(t)

e(t)

1.69

Page 20: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

20

Densification – Autonomous Systems

n(t)

e(t)

1.18

Page 21: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

21

2. Decreasing distances

• distances (diameter and/or average distances) decrease with time

– noted by Kumar et al. in Flickr and Yahoo!360• Preferential attachment model (Barabási, Albert, 99),

(Bollobás et al, 01)– diameter O(log t)

• Random power law graph model (Chung, Lu, 02)– average distance O(log log t)

Page 22: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

22

Diameter – ArXiv citation graph

time [years]

diameter

Page 23: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

23

Diameter – Autonomous Systems

number of nodes

diameter

Page 24: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

24

Models for the laws• (Leskovec, Kleinberg, Faloutsos,

05, 07):– Forest Fire model

• stochastic• densification power law,

decreasing diameter, power law degree distribution

• (Leskovec, Chakrabarti, Kleinberg,Faloutsos, 05, 07):– Kronecker Multiplication

• deterministic• densification power law,

decreasing diameter, power law degree distribution

Page 25: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

25

Models of OSNs

• many models exist for general complex networks

• few models for on-line social networks

• goal: find a model which simulates many of the observed properties of OSNs– must be simple and evolve in a natural way– must be different than previous complex network

models: densification and constant diameter!

Page 26: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

26

“All models are wrong, but some are more useful.” – G.P.E. Box

Page 27: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

27

Iterated Local Transitivity (ILT) model(Bonato, Hadi, Horn, Prałat, Wang, 08)

• key paradigm is transitivity: friends of friends are more likely friends; eg (Girvan and Newman, 03)– iterative cloning of closed neighbour sets

• deterministic: amenable to analysis • local: nodes often only have local influence• evolves over time, but retains memory of initial

graph

Page 28: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

28

ILT model

• parameter: finite simple undirected graph G = G0

• to form the graph Gt+1 for each vertex x from time t, add a vertex x’, the clone of x, so that xx’ is an edge, and x’ is joined to each neighbour of x

• order of Gt is 2tn0

Page 29: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

29

G0 = C4

Page 30: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

30

Properties of ILT model• average degree increasing to ∞ with time• average distance bounded by constant and

converging, and in many cases decreasing with time; diameter does not change

• clustering higher than in a random generated graph with same average degree

• bad expansion: small gaps between 1st and 2nd eigenvalues in adjacency and normalized Laplacian matrices of Gt

Page 31: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

31

Densification

• nt = order of Gt, et = size of Gt

Lemma: For t > 0, nt = 2tn0, et = 3t(e0+n0) - 2tn0.

→ densification power law:et ≈ nt

a, where a = log(3)/log(2).

Page 32: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

32

Average distanceTheorem 2: If t > 0, then

• average distance bounded by a constant, and converges; for many initial graphs (large cycles) it decreases

• diameter does not change from time 0

Page 33: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

33

Clustering Coefficient

Theorem 3: If t > 0, thenc(Gt) = nt

log(7/8)+o(1).• higher clustering than in a random graph

G(nt,p) with same order and average degree as Gt, which satisfies

c(G(nt,p)) = ntlog(3/4)+o(1)

Page 34: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

34

Sketch of proof of lower bound• each node x at time t has a binary sequence

corresponding to descendants from time 0, with a clone indicated by 1

• let e(x,t) be the number of edges in N(x) at time t• we may show that

e(x,t+1) = 3e(x,t) + 2degt(x)e(x’,t+1) = e(x,t) + degt(x)

• if there are k many 0’s in the binary sequence of x, then

e(x,t) ≥ 3k-2e(x,2) = Ω(3k)

Page 35: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

35

Sketch of proof, continued• there are many nodes with k many 0’s in their binary sequence• hence,

kt

n0

2

2

0

20 0

87

2431

2

43

)( tt

n

tkt

n

GCt

t

t

t

kt

k

t

Page 36: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

36

• Wayne Zachary’s Ph.D. thesis (1970-72): observed social ties and rivalries in a university karate club (34 nodes,78 edges)

• during his observation, conflicts intensified and group split

Example of community structure

Page 37: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

37

Adjacency matrix, A

eigenvalue spectrum: (-2)41531

Page 38: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

38

Spectral results• the spectral gap λ of G is defined by

min{λ1, 2 - λn-1}, where 0 = λ0 ≤ λ1 ≤ … ≤ λn-1 ≤ 2 are the eigenvalues of

the normalized Laplacian of G: I-D-1/2AD1/2 (Chung, 97)• for random graphs, λ tends to 1 as order grows• in the ILT model, λ < ½• bad expansion/small spectral gaps in the ILT model

found in social networks but not in the web graph (Estrada, 06) – in social networks, there are a higher number of intra-

rather than inter-community links

Page 39: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

39

Random ILT model• randomize the ILT model

– add random edges independently to new nodes, with probability a function of t

– makes densification tunable• densification exponent becomes

log(3 + ε) / log(2),where ε is any fixed real number in (0,1)

– gives any exponent in (log(3)/log(2), 2)• similar (or better) distance, clustering and spectral

results as in deterministic case

Page 40: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

40

Degree distribution–generate power

law graphs from ILT?• deterministic

ILT model gives a binomial-type distribution

Page 41: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

41

Geometric model for social networks

• OSNs live in social space: proximity of nodes depends on common attributes (such as geography, gender, age, etc.)

• IDEA: embed OSN in m-dimensional Euclidean space

Page 42: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

42

Dimension of an OSN• dimension of OSN: minimum number of

attributes needed to classify nodes

• like game of “20 Questions”: each question narrows range of possibilities

• what is a credible mathematical formula for the dimension of an OSN?

Page 43: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

43

Random geometric graphs• nodes are randomly

distributed in Euclidean space according to a given distribution

• nodes are joined by an edge if and only if their distance is less than a threshold value

(Penrose, 03)

Page 44: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

44

Spatial model for OSNs• we consider a spatial model

of OSNs, where– nodes are embedded in

m-dimensional Euclidean space

– number of nodes is static– threshold value variable:

a function of ranking of nodes

Page 45: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

45

Prestige-Based Spatial (PBS) Model(Bonato, Janssen, Prałat, 09)

• parameters: α, β in (0,1), α+β < 1; positive integer m• nodes live in hypercube of dimension m, measure 1• each node is ranked 1,2, …, n by some function r

– 1 is best, n is worst – we use random initial ranking

• at each time-step, one new node v is born, one node chosen u.a.r. dies (and ranking is updated)

• each existing node u has a region of influence with volume

• add edge uv if v is in the region of influence of u nr

Page 46: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

46

Notes on PBS model

• models uses both geometry and ranking• dynamical system: gives rise to ergodic

(therefore, convergent) Markov chain– users join and leave OSNs

• number of nodes is static: fixed at n– order of OSNs has ceiling

• top ranked nodes have larger regions of influence

Page 47: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

47

Properties of the PBS model (Bonato, Janssen, Prałat, 09)

• with high probability, the PBS model generates graphs with the following properties:– power law degree distribution with exponent b = 1+1/α– average degree d = (1+o(1))n(1-α-β)/21-α

• dense graph• tends to infinity with n

– diameter D = (1+o(1))nβ/(1-α)m

• depends on dimension m• m = clog n, then diameter is a constant

Page 48: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

48

Dimension of an OSN, continued

• given the order of the network n, power law exponent b, average degree d, and diameter D, we can calculate m

• gives formula for dimension of OSN:

Dd

n

mbb

log2

log21

Page 49: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

49

Uncovering the hidden reality• reverse engineering approach

– given network data (n, b, d, D), dimension of an OSN gives smallest number of attributes needed to identify users

• that is, given the graph structure, we can (theoretically) recover the social space

Page 50: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

50

Examples

OSN Dimension

Facebook 6MySpace 8

Twitter 4Flickr 4

Page 51: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

51

Future directions• what is a community in an OSN?

– (Porter, Onnela, Mucha,09): a set of graph partitions obtained by some “reasonable” iterative hierarchical partitioning algorithm

– motifs– Pott’s method from statistical mechanics– betweeness centrality

• lack of a formal definition, and few theorems

Page 52: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

52

Spatial ranking models• rigorously analyze spatial

model with ranking by– age– degree

• simulate PBS model– fit model to data– is theoretical estimate

of the dimension of an OSN accurate?

Page 53: Dynamic  Models of  On-line  Social Networks

Who is popular?• how to find popular users?• PageRank in OSNs• domination number

– constant in ILT model– in OSN data, domination

number is large (end-vertices)

– which is the correct graph parameter to consider?

On-line Social Networks - Anthony Bonato

53

Page 54: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

54

• preprints, reprints, contact:Google: “Anthony Bonato”

Page 55: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

55

WOSN’2010

Page 56: Dynamic  Models of  On-line  Social Networks

On-line Social Networks - Anthony Bonato

56