dynamic models of on-line social networks
DESCRIPTION
ICMCM’09 December, 2009. Dynamic Models of On-line Social Networks. Anthony Bonato Ryerson University. Toronto in December…. Complex Networks. web graph, social networks, biological networks, internet networks , …. nodes : web pages edges : links - PowerPoint PPT PresentationTRANSCRIPT
On-line Social Networks - Anthony Bonato
1
Dynamic Models of On-line Social Networks
Anthony BonatoRyerson University
ICMCM’09 December, 2009
On-line Social Networks - Anthony Bonato
2
Toronto in December…
On-line Social Networks - Anthony Bonato
3
Complex Networks• web graph, social networks, biological networks, internet
networks, …
On-line Social Networks - Anthony Bonato
4
The web graph
• nodes: web pages
• edges: links
• over 1 trillion nodes, with billions of nodes added each day
On-line Social Networks - Anthony Bonato
5
Social Networks
nodes: people
edges: social interaction(eg friendship)
On-line Social Networks - Anthony Bonato
6
On-line Social Networks (OSNs)Facebook, Twitter, Orkut, LinkedIn, GupShup…
On-line Social Networks - Anthony Bonato
7
A new paradigm• half of all users of internet
on some OSN– 250 million users on
Facebook, 45 million on Twitter
• unprecedented, massive record of social interaction
• unprecedented access to information/news/gossip
On-line Social Networks - Anthony Bonato
8
Properties of Complex Networks• observed properties:
– massive, power law, small world, decentralized
(Broder et al, 01)
On-line Social Networks - Anthony Bonato
9
Small World Property
• small world networks introduced by social scientists Watts & Strogatz in 1998– low diameter/average
distance (“6 degrees of separation”)
– globally sparse, locally dense (high clustering coefficient)
On-line Social Networks - Anthony Bonato
10
Paths in Twitter
Dalai Lama
Ashton Kutcher
Christianne Amanpour
Queen Rania of Jordan
Arnold Schwarzenegger
On-line Social Networks - Anthony Bonato
11
Why model complex networks?• uncover the generative mechanisms
underlying complex networks• models are a predictive tool• nice mathematical challenges• models can uncover the hidden reality of
networks– in OSNs:
• community detection• advertising• security
On-line Social Networks - Anthony Bonato
12
Many different models
On-line Social Networks - Anthony Bonato
13
Social network analysis• Milgram (67): average distance between two Americans is 6• Watts and Strogatz (98): introduced small world property • Adamic et al. (03): early study of on-line social networks• Liben-Nowell et al. (05): small world property in LiveJournal• Kumar et al. (06): Flickr, Yahoo!360; average distances
decrease with time• Golder et al. (06): studied 4 million users of Facebook• Ahn et al. (07): studied Cyworld in South Korea, along with
MySpace and Orkut• Mislove et al. (07): studied Flickr, YouTube, LiveJournal,
Orkut• Java et al. (07): studied Twitter: power laws, small world
On-line
On-line Social Networks - Anthony Bonato
14
Key parameters• power law degree
distributions:
• average distance:
• clustering coefficient:
tiNb bti
, ,2
)(,
1
2),()(
GVvu
tvudGL
)(
1-1
)()( ,2
)deg(|))((|)(
GVxxctGC
xxNExc
Wiener index, W(G)
Power laws in OSNs
On-line Social Networks - Anthony Bonato
15
On-line Social Networks - Anthony Bonato
16
Flickr and Yahoo!360
• (Kumar et al,06): shrinking diameters
On-line Social Networks - Anthony Bonato
17
Sample data: Flickr, YouTube, LiveJournal, Orkut
• (Mislove et al,07): short average distances and high clustering coefficients
On-line Social Networks - Anthony Bonato
18
(Leskovec, Kleinberg, Faloutsos,05):– many complex networks (including on-line social
networks) obey two additional laws:1. Densification Power Law
– networks are becoming more dense over time; i.e. average degree is increasing
e(t) ≈ n(t)a
where 1 < a ≤ 2: densification exponent– a=1: linear growth – constant average degree, such
as in web graph models– a=2: quadratic growth – cliques
On-line Social Networks - Anthony Bonato
19
Densification – Physics Citations
n(t)
e(t)
1.69
On-line Social Networks - Anthony Bonato
20
Densification – Autonomous Systems
n(t)
e(t)
1.18
On-line Social Networks - Anthony Bonato
21
2. Decreasing distances
• distances (diameter and/or average distances) decrease with time
– noted by Kumar et al. in Flickr and Yahoo!360• Preferential attachment model (Barabási, Albert, 99),
(Bollobás et al, 01)– diameter O(log t)
• Random power law graph model (Chung, Lu, 02)– average distance O(log log t)
On-line Social Networks - Anthony Bonato
22
Diameter – ArXiv citation graph
time [years]
diameter
On-line Social Networks - Anthony Bonato
23
Diameter – Autonomous Systems
number of nodes
diameter
On-line Social Networks - Anthony Bonato
24
Models for the laws• (Leskovec, Kleinberg, Faloutsos,
05, 07):– Forest Fire model
• stochastic• densification power law,
decreasing diameter, power law degree distribution
• (Leskovec, Chakrabarti, Kleinberg,Faloutsos, 05, 07):– Kronecker Multiplication
• deterministic• densification power law,
decreasing diameter, power law degree distribution
On-line Social Networks - Anthony Bonato
25
Models of OSNs
• many models exist for general complex networks
• few models for on-line social networks
• goal: find a model which simulates many of the observed properties of OSNs– must be simple and evolve in a natural way– must be different than previous complex network
models: densification and constant diameter!
On-line Social Networks - Anthony Bonato
26
“All models are wrong, but some are more useful.” – G.P.E. Box
On-line Social Networks - Anthony Bonato
27
Iterated Local Transitivity (ILT) model(Bonato, Hadi, Horn, Prałat, Wang, 08)
• key paradigm is transitivity: friends of friends are more likely friends; eg (Girvan and Newman, 03)– iterative cloning of closed neighbour sets
• deterministic: amenable to analysis • local: nodes often only have local influence• evolves over time, but retains memory of initial
graph
On-line Social Networks - Anthony Bonato
28
ILT model
• parameter: finite simple undirected graph G = G0
• to form the graph Gt+1 for each vertex x from time t, add a vertex x’, the clone of x, so that xx’ is an edge, and x’ is joined to each neighbour of x
• order of Gt is 2tn0
On-line Social Networks - Anthony Bonato
29
G0 = C4
On-line Social Networks - Anthony Bonato
30
Properties of ILT model• average degree increasing to ∞ with time• average distance bounded by constant and
converging, and in many cases decreasing with time; diameter does not change
• clustering higher than in a random generated graph with same average degree
• bad expansion: small gaps between 1st and 2nd eigenvalues in adjacency and normalized Laplacian matrices of Gt
On-line Social Networks - Anthony Bonato
31
Densification
• nt = order of Gt, et = size of Gt
Lemma: For t > 0, nt = 2tn0, et = 3t(e0+n0) - 2tn0.
→ densification power law:et ≈ nt
a, where a = log(3)/log(2).
On-line Social Networks - Anthony Bonato
32
Average distanceTheorem 2: If t > 0, then
• average distance bounded by a constant, and converges; for many initial graphs (large cycles) it decreases
• diameter does not change from time 0
On-line Social Networks - Anthony Bonato
33
Clustering Coefficient
Theorem 3: If t > 0, thenc(Gt) = nt
log(7/8)+o(1).• higher clustering than in a random graph
G(nt,p) with same order and average degree as Gt, which satisfies
c(G(nt,p)) = ntlog(3/4)+o(1)
On-line Social Networks - Anthony Bonato
34
Sketch of proof of lower bound• each node x at time t has a binary sequence
corresponding to descendants from time 0, with a clone indicated by 1
• let e(x,t) be the number of edges in N(x) at time t• we may show that
e(x,t+1) = 3e(x,t) + 2degt(x)e(x’,t+1) = e(x,t) + degt(x)
• if there are k many 0’s in the binary sequence of x, then
e(x,t) ≥ 3k-2e(x,2) = Ω(3k)
On-line Social Networks - Anthony Bonato
35
Sketch of proof, continued• there are many nodes with k many 0’s in their binary sequence• hence,
kt
n0
2
2
0
20 0
87
2431
2
43
)( tt
n
tkt
n
GCt
t
t
t
kt
k
t
On-line Social Networks - Anthony Bonato
36
• Wayne Zachary’s Ph.D. thesis (1970-72): observed social ties and rivalries in a university karate club (34 nodes,78 edges)
• during his observation, conflicts intensified and group split
Example of community structure
On-line Social Networks - Anthony Bonato
37
Adjacency matrix, A
eigenvalue spectrum: (-2)41531
On-line Social Networks - Anthony Bonato
38
Spectral results• the spectral gap λ of G is defined by
min{λ1, 2 - λn-1}, where 0 = λ0 ≤ λ1 ≤ … ≤ λn-1 ≤ 2 are the eigenvalues of
the normalized Laplacian of G: I-D-1/2AD1/2 (Chung, 97)• for random graphs, λ tends to 1 as order grows• in the ILT model, λ < ½• bad expansion/small spectral gaps in the ILT model
found in social networks but not in the web graph (Estrada, 06) – in social networks, there are a higher number of intra-
rather than inter-community links
On-line Social Networks - Anthony Bonato
39
Random ILT model• randomize the ILT model
– add random edges independently to new nodes, with probability a function of t
– makes densification tunable• densification exponent becomes
log(3 + ε) / log(2),where ε is any fixed real number in (0,1)
– gives any exponent in (log(3)/log(2), 2)• similar (or better) distance, clustering and spectral
results as in deterministic case
On-line Social Networks - Anthony Bonato
40
Degree distribution–generate power
law graphs from ILT?• deterministic
ILT model gives a binomial-type distribution
On-line Social Networks - Anthony Bonato
41
Geometric model for social networks
• OSNs live in social space: proximity of nodes depends on common attributes (such as geography, gender, age, etc.)
• IDEA: embed OSN in m-dimensional Euclidean space
On-line Social Networks - Anthony Bonato
42
Dimension of an OSN• dimension of OSN: minimum number of
attributes needed to classify nodes
• like game of “20 Questions”: each question narrows range of possibilities
• what is a credible mathematical formula for the dimension of an OSN?
On-line Social Networks - Anthony Bonato
43
Random geometric graphs• nodes are randomly
distributed in Euclidean space according to a given distribution
• nodes are joined by an edge if and only if their distance is less than a threshold value
(Penrose, 03)
On-line Social Networks - Anthony Bonato
44
Spatial model for OSNs• we consider a spatial model
of OSNs, where– nodes are embedded in
m-dimensional Euclidean space
– number of nodes is static– threshold value variable:
a function of ranking of nodes
On-line Social Networks - Anthony Bonato
45
Prestige-Based Spatial (PBS) Model(Bonato, Janssen, Prałat, 09)
• parameters: α, β in (0,1), α+β < 1; positive integer m• nodes live in hypercube of dimension m, measure 1• each node is ranked 1,2, …, n by some function r
– 1 is best, n is worst – we use random initial ranking
• at each time-step, one new node v is born, one node chosen u.a.r. dies (and ranking is updated)
• each existing node u has a region of influence with volume
• add edge uv if v is in the region of influence of u nr
On-line Social Networks - Anthony Bonato
46
Notes on PBS model
• models uses both geometry and ranking• dynamical system: gives rise to ergodic
(therefore, convergent) Markov chain– users join and leave OSNs
• number of nodes is static: fixed at n– order of OSNs has ceiling
• top ranked nodes have larger regions of influence
On-line Social Networks - Anthony Bonato
47
Properties of the PBS model (Bonato, Janssen, Prałat, 09)
• with high probability, the PBS model generates graphs with the following properties:– power law degree distribution with exponent b = 1+1/α– average degree d = (1+o(1))n(1-α-β)/21-α
• dense graph• tends to infinity with n
– diameter D = (1+o(1))nβ/(1-α)m
• depends on dimension m• m = clog n, then diameter is a constant
On-line Social Networks - Anthony Bonato
48
Dimension of an OSN, continued
• given the order of the network n, power law exponent b, average degree d, and diameter D, we can calculate m
• gives formula for dimension of OSN:
Dd
n
mbb
log2
log21
On-line Social Networks - Anthony Bonato
49
Uncovering the hidden reality• reverse engineering approach
– given network data (n, b, d, D), dimension of an OSN gives smallest number of attributes needed to identify users
• that is, given the graph structure, we can (theoretically) recover the social space
On-line Social Networks - Anthony Bonato
50
Examples
OSN Dimension
Facebook 6MySpace 8
Twitter 4Flickr 4
On-line Social Networks - Anthony Bonato
51
Future directions• what is a community in an OSN?
– (Porter, Onnela, Mucha,09): a set of graph partitions obtained by some “reasonable” iterative hierarchical partitioning algorithm
– motifs– Pott’s method from statistical mechanics– betweeness centrality
• lack of a formal definition, and few theorems
On-line Social Networks - Anthony Bonato
52
Spatial ranking models• rigorously analyze spatial
model with ranking by– age– degree
• simulate PBS model– fit model to data– is theoretical estimate
of the dimension of an OSN accurate?
Who is popular?• how to find popular users?• PageRank in OSNs• domination number
– constant in ILT model– in OSN data, domination
number is large (end-vertices)
– which is the correct graph parameter to consider?
On-line Social Networks - Anthony Bonato
53
On-line Social Networks - Anthony Bonato
54
• preprints, reprints, contact:Google: “Anthony Bonato”
On-line Social Networks - Anthony Bonato
55
WOSN’2010
On-line Social Networks - Anthony Bonato
56