05-smallworlds - stanford universitysnap.stanford.edu › class › cs224w-2015 › slides ›...
TRANSCRIPT
Small world networks
CS 224W
Outline
¤ Small world phenomenon
  ¤ Milgram’s small world experiment
¤ Local structure
  ¤ clustering coefficient
  ¤ motifs
¤ Small world network models:
  ¤ Watts & Strogatz (clustering & short paths)
  ¤ Kleinberg (geographical)
  ¤ Kleinberg, Watts/Dodds/Newman (hierarchical)
¤ Small world networks: why do they arise?
Small world phenomenon: Milgram’s experiment
[Figure: map with Nebraska (NE) and Massachusetts (MA) marked]
¤ “Six degrees of separation”
Instructions: Given a target individual (stockbroker in Boston), pass the message to a person you correspond with who is “closest” to the target.
Milgram’s experiment
Outcome:
20% of initiated chains reached target
average chain length = 6.5
email experiment Dodds, Muhamad, Watts, Science 301, (2003)
(optional reading)
• 18 targets
• 13 different countries
• 60,000+ participants
• 24,163 message chains
• 384 reached their targets
• average path length 4.0
Source: NASA, U.S. Government; http://visibleearth.nasa.gov/view_rec.php?id=2429
Milgram’s experiment repeated
Interpreting Milgram’s experiment
¤ Is 6 a surprising number?
  ¤ In the 1960s? Today? Why?
¤ Pool and Kochen (1978) established that the average person has between 500 and 1500 acquaintances
Quiz Q:
¤ Ignore for the time being the fact that many of your friends’ friends are your friends as well. If everyone has 500 friends, the average person would have how many friends of friends?
  ¤ 500
  ¤ 1,000
  ¤ 5,000
  ¤ 250,000
Quiz Q:
¤ With an average degree of 500, a node in a random network would have this many friends-of-friends-of-friends (3rd degree neighbors):
  ¤ 5,000
  ¤ 500,000
  ¤ 1,000,000
  ¤ 125,000,000
Interpreting Milgram’s experiment
¤ Is 6 a surprising number?
  ¤ In the 1960s? Today? Why?
¤ If social networks were random… ?
  ¤ Pool and Kochen (1978): ~500-1500 acquaintances/person
  ¤ ~500 choices for the 1st link
  ¤ ~500^2 = 250,000 potential 2nd degree neighbors
  ¤ ~500^3 = 125,000,000 potential 3rd degree neighbors
¤ If networks are completely cliquish?
  ¤ all my friends’ friends are my friends
  ¤ what would happen?
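The random-network estimate above is just repeated multiplication by the average degree. A minimal sketch, assuming no overlap between friend circles (the function name `potential_neighbors` is made up for illustration):

```python
def potential_neighbors(avg_degree: int, hops: int) -> int:
    """Upper bound on the number of people exactly `hops` links away,
    assuming nobody's friends overlap (a purely random network)."""
    return avg_degree ** hops


if __name__ == "__main__":
    for hops in (1, 2, 3):
        print(hops, potential_neighbors(500, hops))
```

In a completely cliquish network the count would stop growing after the first hop, which is the other extreme discussed on the slide.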
Quiz Q:
¤ If the network were completely cliquish, that is all of your friends of friends were also directly your friends, what would be true:
  ¤ (a) None of your friendship edges would be part of a triangle (closed triad)
  ¤ (b) It would be impossible to reach any node outside the clique by following directed edges
  ¤ (c) Your shortest path to your friends’ friends would be 2
complete cliquishness
¤ If all your friends of friends were also your friends, you would be part of an isolated clique.
Uncompleted chains and distance
¤ Is 6 an accurate number?
¤ What bias is introduced by uncompleted chains?
  ¤ are longer or shorter chains more likely to be completed?
[Figure: probability of passing on message vs. position in chain; average with 95% confidence interval]
Source: An Experimental Study of Search in Global Social Networks: Peter Sheridan Dodds, Roby Muhamad, and Duncan J. Watts (8 August 2003); Science 301 (5634), 827.
Attrition
Quiz Q:
¤ If each intermediate person in the chain has 0.5 probability of passing the letter on, what is the likelihood of a chain being completed
  ¤ of length 2?
  ¤ of length 5?
[Figure: chain of length 2: the source sends for sure, the intermediate person passes on with probability 0.5, the target receives]
Quiz Q:
¤ If each intermediate person in the chain has 0.5 probability of passing the letter on, what is the likelihood of a chain of length 5 being completed?
  ¤ (a) ½
  ¤ (b) ¼
  ¤ (c) 1/8
  ¤ (d) 1/16
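The attrition arithmetic can be sketched as follows. This assumes, as the length-2 diagram suggests, that a chain of length L has L-1 intermediate people, each passing the message on independently; the function name is illustrative only.

```python
def completion_probability(chain_length: int, pass_prob: float = 0.5) -> float:
    """Probability that a chain of the given length is completed.

    Assumption: the source always sends and the target only receives, so a
    chain of length L has L - 1 intermediate people who must each pass it on.
    """
    intermediaries = chain_length - 1
    return pass_prob ** intermediaries


if __name__ == "__main__":
    for length in (2, 5):
        print(length, completion_probability(length))
```

Because the completion probability shrinks with length, long chains are under-represented among completed ones, which is the bias the slides ask about.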
‘recovered’ histogram of path lengths
Source: An Experimental Study of Search in Global Social Networks: Peter Sheridan Dodds, Roby Muhamad, and Duncan J. Watts (8 August 2003); Science 301 (5634), 827.
Estimating the true distance
[Figure: observed chain lengths, inter-country vs. intra-country]
¤ Is 6 an accurate number?
¤ Do people find the shortest paths?
  ¤ Killworth, McCarty, Bernard, & House (2005):
    ¤ a less than optimal choice for the next link in the chain is made ½ of the time
Navigation and accuracy
Small worlds & networking
What does it mean to be 1, 2, 3 hops apart on Facebook, Twitter, LinkedIn, Google Plus?
Transitivity, triadic closure, clustering
¤ Transitivity:
  ¤ if A is connected to B and B is connected to C, what is the probability that A is connected to C?
  ¤ my friends’ friends are likely to be my friends
[Figure: triad A, B, C with the edge A-C in question]
Clustering
¤ Global clustering coefficient
$$C = \frac{3 \times \text{number of triangles in the graph}}{\text{number of connected triples of vertices}}$$
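A minimal sketch of the global clustering coefficient for a small undirected graph. The adjacency-dict representation and the function name are assumptions made for illustration. Counting closed triples by their centre vertex counts every triangle three times, which supplies the factor 3 in the numerator.

```python
from itertools import combinations


def global_clustering(adj: dict) -> float:
    """3 * triangles / connected triples, for an undirected graph given as
    {node: set_of_neighbours}."""
    triples = 0  # paths of length 2, counted once per centre vertex
    closed = 0   # triples whose end points are also linked (= 3 * triangles)
    for centre, nbrs in adj.items():
        for u, v in combinations(nbrs, 2):
            triples += 1
            if v in adj[u]:
                closed += 1
    return closed / triples if triples else 0.0


if __name__ == "__main__":
    # Triangle a-b-c with a pendant vertex d attached to c.
    graph = {
        "a": {"b", "c"},
        "b": {"a", "c"},
        "c": {"a", "b", "d"},
        "d": {"c"},
    }
    print(global_clustering(graph))
```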
Local clustering coefficient (Watts&Strogatz 1998)
¤ For a vertex i
  ¤ The fraction of pairs of neighbors of the node that are themselves connected
  ¤ Let $n_i$ be the number of neighbors of vertex i
$$C_i = \frac{\#\text{ of connections between } i\text{'s neighbors}}{\text{max } \#\text{ of possible connections between } i\text{'s neighbors}}$$
$$C_i^{\text{directed}} = \frac{\#\text{ directed connections between } i\text{'s neighbors}}{n_i (n_i - 1)}$$
$$C_i^{\text{undirected}} = \frac{\#\text{ undirected connections between } i\text{'s neighbors}}{n_i (n_i - 1)/2}$$
Local clustering coefficient (Watts&Strogatz 1998)
¤ Average over all n vertices
$$C = \frac{1}{n} \sum_i C_i$$
Example for vertex i:
¤ $n_i = 4$
¤ max number of connections: 4*3/2 = 6
¤ 3 connections present
¤ $C_i = 3/6 = 0.5$
[Figure legend: link absent, link present]
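The undirected local coefficient and its average can be sketched as below. The graph representation and names are illustrative, and treating vertices with fewer than two neighbours as having coefficient 0 is an assumed convention, not something the slides specify.

```python
from itertools import combinations


def local_clustering(adj: dict, i) -> float:
    """Fraction of pairs of i's neighbours that are themselves connected."""
    nbrs = adj[i]
    n_i = len(nbrs)
    if n_i < 2:
        return 0.0  # assumed convention for vertices with < 2 neighbours
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return links / (n_i * (n_i - 1) / 2)


def average_clustering(adj: dict) -> float:
    """Mean of the local coefficients over all n vertices."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)


if __name__ == "__main__":
    # Vertex "i" has 4 neighbours with 3 links among them (a-b, b-c, c-d).
    graph = {
        "i": {"a", "b", "c", "d"},
        "a": {"i", "b"},
        "b": {"i", "a", "c"},
        "c": {"i", "b", "d"},
        "d": {"i", "c"},
    }
    print(local_clustering(graph, "i"))
```

The example graph reproduces the slide's case of 4 neighbours with 3 of the 6 possible links present.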
Quiz Q:
¤ The clustering coefficient for vertex i is:
  (a) 0
  (b) 1/3
  (c) 1/2
  (d) 2/3
Explanation
¤ $n_i = 3$
¤ there are 2 connections present out of a max of 3 possible
¤ $C_i = 2/3$
Small world phenomenon: beyond social networks
¤ high clustering: $C_{\text{network}} \gg C_{\text{random graph}}$
¤ low average shortest path: $l_{\text{network}} \approx \ln(N)$
what other networks can you think of with these characteristics?
Comparison with “random graph” used to determine whether real-world network is “small world”

Network                 | size      | av. shortest path | shortest path in fitted random graph | clustering (averaged over vertices) | clustering in random graph
Film actors             | 225,226   | 3.65              | 2.99                                 | 0.79                                | 0.00027
MEDLINE co-authorship   | 1,520,251 | 4.6               | 4.91                                 | 0.56                                | 1.8 x 10^-4
E. coli substrate graph | 282       | 2.9               | 3.04                                 | 0.32                                | 0.026
C. elegans              | 282       | 2.65              | 2.25                                 | 0.28                                | 0.05
Reconciling two observations:
• High clustering: my friends’ friends tend to be my friends
• Short average paths
Small world phenomenon: Watts/Strogatz model
Source: Watts, D.J., Strogatz, S.H.(1998) Collective dynamics of 'small-world' networks. Nature 393:440-442.
¤ As in many network generating algorithms
  ¤ Disallow self-edges
  ¤ Disallow multiple edges
Select a fraction p of edges
Reposition one of their endpoints
Add a fraction p of additional edges, leaving the underlying lattice intact
Watts-Strogatz model: Generating small world graphs
Source: Watts, D.J., Strogatz, S.H.(1998) Collective dynamics of 'small-world' networks. Nature 393:440-442.
¤Each node has K>=4 nearest neighbors (local)
¤tunable: vary the probability p of rewiring any given edge
¤small p: regular lattice
¤ large p: classical random graph
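A minimal sketch of the rewiring variant, assuming a ring lattice where each node links to its K nearest neighbours and each lattice edge has one endpoint repositioned with probability p. The function name and the choice to pick the new endpoint uniformly among non-neighbours are illustrative assumptions; self-edges and multiple edges are disallowed, as on the slide.

```python
import random


def watts_strogatz(n: int, k: int, p: float, seed=None) -> dict:
    """Ring lattice on n nodes (k nearest neighbours each, k even), with each
    lattice edge rewired with probability p. Returns {node: set_of_neighbours}."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    # Regular ring lattice: link every node to k/2 neighbours on each side.
    for v in range(n):
        for j in range(1, k // 2 + 1):
            w = (v + j) % n
            adj[v].add(w)
            adj[w].add(v)
    # Rewire: keep endpoint v, move the other endpoint to a random non-neighbour.
    for j in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + j) % n
            if w in adj[v] and rng.random() < p:
                candidates = [u for u in range(n) if u != v and u not in adj[v]]
                if candidates:
                    new = rng.choice(candidates)
                    adj[v].remove(w)
                    adj[w].remove(v)
                    adj[v].add(new)
                    adj[new].add(v)
    return adj


if __name__ == "__main__":
    lattice = watts_strogatz(20, 4, 0.0, seed=1)
    rewired = watts_strogatz(20, 4, 0.1, seed=1)
    print(sorted(lattice[0]), sorted(rewired[0]))
```

Rewiring keeps the number of edges fixed at nK/2, so differences in path length and clustering between small and large p come from where the edges go, not from how many there are.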
Watts-Strogatz model: Generating small world graphs
Quiz question:
¤ Which of the following is a result of a higher rewiring probability?
(a) Left (b) Right (c) insufficient information
What happens in between?
¤Small shortest path means low clustering?
¤Large shortest path means high clustering?
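¤ Through numerical simulation
  ¤ As we increase p from 0 to 1
    ¤ Fast decrease of mean distance
    ¤ Slow decrease in clustering
Clust coeff. and ASP as rewiring increases
[Figure: clustering coefficient and average shortest path vs. rewiring probability; the points with 1% and 10% of links rewired are marked]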
Source: Watts, D.J., Strogatz, S.H.(1998) Collective dynamics of 'small-world' networks. Nature 393:440-442.
Trying this with NetLogo
http://web.stanford.edu/class/cs224w/NetLogo/SmallWorldWS.nlogo
WS model clustering coefficient
¤ The probability that a connected triple stays connected after rewiring
  ¤ probability that none of the 3 edges were rewired: $(1-p)^3$
  ¤ probability that edges were rewired back to each other is very small, can ignore
¤ Clustering coefficient: $C(p) = C(p=0)\,(1-p)^3$
[Figure: $C(p)/C(0)$ vs. $p$]
Source: Watts, D.J., Strogatz, S.H.(1998) Collective dynamics of 'small-world' networks. Nature 393:440-442.
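The approximation C(p)/C(0) = (1-p)^3 is easy to tabulate. A small sketch (the function name is illustrative) showing how much clustering survives at the rewiring levels mentioned earlier:

```python
def ws_clustering_ratio(p: float) -> float:
    """Approximate C(p)/C(0): a triangle survives only if none of its three
    edges is rewired (rewiring back into a triangle is ignored)."""
    return (1.0 - p) ** 3


if __name__ == "__main__":
    for p in (0.01, 0.1, 0.5, 1.0):
        print(p, round(ws_clustering_ratio(p), 4))
```

With 1% of links rewired about 97% of the clustering remains, and with 10% rewired about 73% remains, which is the "slow decrease in clustering" side of the small-world regime.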
Quiz Q
¤ Which of the following is a description matching a small-world network?
  (a) Its average shortest path is close to that of an Erdos-Renyi graph
  (b) It has many closed triads
  (c) It has a high clustering coefficient
  (d) It has a short average path length
WS Model: What’s missing?
¤ Long range links not as likely as short range ones
¤ Hierarchical structure / groups
¤ Hubs
Ties and geography
“The geographic movement of the [message] from Nebraska to Massachusetts is striking. There is a progressive closing in on the target area as each new person is added to the chain”
S.Milgram ‘The small world problem’, Psychology Today 1,61,1967
[Figure: map with Nebraska (NE) and Massachusetts (MA) marked]
nodes are placed on a lattice and connect to nearest neighbors
additional links placed with $p(\text{link between } u \text{ and } v) = (\text{distance}(u,v))^{-r}$
Kleinberg’s geographical small world model
Source: Kleinberg, ‘The Small World Phenomenon, An Algorithmic Perspective’ (Nature 2000).
r: exponent that will determine navigability
NetLogo demo
¤how does the probability of long-range links affect search?
http://web.stanford.edu/class/cs224w/NetLogo/SmallWorldSearch.nlogo
geographical search when network lacks locality
¤ When r=0, links are randomly distributed, ASP ~ log(n), n size of grid
¤ When r=0, the expected time of any decentralized algorithm is at least $a_0 n^{2/3}$
¤ When r<2, expected time at least $\alpha_r n^{(2-r)/3}$
¤ [Figure label: $p \sim p_0$]
Overly localized links on a lattice
¤ When r>2, expected search time ~ $N^{(r-2)/(r-1)}$
¤ [Figure label: $p \sim 1/d^4$]
Just the right balance
¤ When r=2, expected time of a decentralized algorithm (DA) is at most $C (\log N)^2$
¤ [Figure label: $p \sim 1/d^2$]
Navigability
[Figure: source S, target T, and groups R and R’]
$\lambda^2 |R| < |R'| < \lambda |R|$
$k = c \log^2 n$: calculate probability that s fails to have a link in R’
Quiz Q:
¤ What is true about a network where the probability of a tie falls off as distance^-2?
  (a) Large networks cannot be navigated
  (b) A simple greedy strategy (pass the message to the neighbor who is closest to the target) is sufficient
  (c) There are fewer long range ties than short range ones
  (d) If the number of nodes doubles, the average shortest path will be twice as long
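The greedy strategy from option (b) can be sketched on a small grid. This is an illustrative simplification, not Kleinberg's exact construction: each node gets its four lattice neighbours plus a single long-range contact drawn with probability proportional to distance^-r, and all function names are made up here.

```python
import random


def lattice_distance(u, v):
    """Manhattan distance between two grid points."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def build_long_range_contacts(n, r, rng):
    """One long-range contact per node of an n x n grid, chosen with
    probability proportional to distance ** (-r)."""
    nodes = [(x, y) for x in range(n) for y in range(n)]
    contacts = {}
    for u in nodes:
        others = [v for v in nodes if v != u]
        weights = [lattice_distance(u, v) ** (-r) for v in others]
        contacts[u] = rng.choices(others, weights=weights, k=1)[0]
    return contacts


def greedy_route(source, target, n, contacts):
    """Pass the message to the known contact closest to the target.
    Returns the number of steps taken."""
    current, steps = source, 0
    while current != target:
        x, y = current
        options = [
            (x + dx, y + dy)
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < n and 0 <= y + dy < n
        ]
        options.append(contacts[current])
        current = min(options, key=lambda v: lattice_distance(v, target))
        steps += 1
    return steps


if __name__ == "__main__":
    rng = random.Random(0)
    n = 10
    contacts = build_long_range_contacts(n, 2.0, rng)
    print(greedy_route((0, 0), (n - 1, n - 1), n, contacts))
```

Some lattice neighbour always lies strictly closer to the target, so the route never takes more steps than the lattice distance; long-range contacts can only shorten it.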
Origins of small worlds: group affiliations
Source: Kleinberg, ‘Small-World Phenomena and the Dynamics of Information’ NIPS 14, 2001.
Hierarchical network models:
Individuals classified into a hierarchy, $h_{ij}$ = height of the least common ancestor.
$p_{ij} \sim b^{-\alpha h_{ij}}$
[Figure: hierarchy of height h with branching factor b=3]
Group structure models:
Individuals belong to nested groups
q = size of smallest group that v, w belong to
$f(q) \sim q^{-\alpha}$
e.g. state-county-city-neighborhood, industry-corporation-division-group
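A minimal sketch of the hierarchical link probability, assuming individuals are the leaves of a complete b-ary tree numbered left to right (an indexing scheme chosen here for illustration). The weight is unnormalised; only the dependence on the height of the least common ancestor is shown.

```python
def lca_height(i: int, j: int, b: int) -> int:
    """Height of the least common ancestor of leaves i and j in a complete
    b-ary tree whose leaves are numbered 0, 1, 2, ... left to right."""
    h = 0
    while i != j:
        i //= b  # move both leaves up one level
        j //= b
        h += 1
    return h


def link_weight(i: int, j: int, b: int, alpha: float) -> float:
    """Unnormalised link probability b ** (-alpha * h_ij)."""
    return b ** (-alpha * lca_height(i, j, b))


if __name__ == "__main__":
    b = 3
    for j in (1, 3, 9):
        print(j, lca_height(0, j, b), link_weight(0, j, b, alpha=1.0))
```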
hierarchical small-world models: Kleinberg
Watts, Dodds, Newman (Science, 2002)
individuals belong to hierarchically nested groups
multiple independent hierarchies h=1,2,..,H coexist, corresponding to occupation, geography, hobbies, religion…
$p_{ij} \sim \exp(-\alpha x)$
Source: Identity and Search in Social Networks: Duncan J. Watts, Peter Sheridan Dodds, and M. E. J. Newman; Science 17 May 2002 296: 1302-1305. < http://arxiv.org/abs/cond-mat/0205383v1 >
hierarchical small-world models: WDN
Navigability and search strategy: Reverse small world experiment
¤ Killworth & Bernard (1978):
  ¤ Given hypothetical targets (name, occupation, location, hobbies, religion…) participants choose an acquaintance for each target
  ¤ based on (most often) occupation, geography
  ¤ only 7% because they “know a lot of people”
  ¤ Simple greedy algorithm: most similar acquaintance
  ¤ two-step strategy rare
Source: 1978 Peter D. Killworth and H. Russell Bernard. The Reverse Small World Experiment Social Networks 1:159–92.
Successful chains disproportionately used
• weak ties (Granovetter)
• professional ties (34% vs. 13%)
• ties originating at work/college
• target's work (65% vs. 40%)
. . . and disproportionately avoided
• hubs (8% vs. 1%) (+ no evidence of funnels)
• family/friendship ties (60% vs. 83%)
Strategy: Geography -> Work
Navigability and search strategy: Small world experiment @ Columbia
Motivation
Power-law (PL) networks, social and P2P
Analysis of scaling of search strategies in PL networks
Simulation
artificial power-law topologies, real Gnutella networks
Search in power-law networks
[Figure: Mary, Bob, Jane: “Who could introduce me to Richard Gere?”]
How do we search?
AT&T Call Graph
Aiello et al. STOC ‘00
[Figure: # of telephone numbers from which calls were made vs. # of telephone numbers called, log-log axes]
[Figure: proportion of nodes vs. number of neighbors; data and power-law fit, τ = 2.07]
Gnutella network
power-law link distribution
summer 2000, data provided by Clip2
Preferential attachment model
Nodes join at different times
The more connections a node has, the more likely it is to acquire new connections
Growth process produces power-law network
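The growth process can be sketched as follows. This is one common way to implement preferential attachment (in the style of the Barabási-Albert model); the seed clique, the parameter m, and the function name are assumptions, not details given on the slide.

```python
import random


def preferential_attachment(n: int, m: int, seed=None) -> list:
    """Grow a graph to n nodes; each new node attaches to m distinct existing
    nodes chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    edges = []
    # Each node appears in `endpoints` once per incident edge, so a uniform
    # draw from this list is a degree-proportional draw over nodes.
    endpoints = []
    # Assumed starting point: a small clique on m + 1 nodes.
    for u in range(m + 1):
        for v in range(u + 1, m + 1):
            edges.append((u, v))
            endpoints.extend((u, v))
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        for target in chosen:
            edges.append((new, target))
            endpoints.extend((new, target))
    return edges


if __name__ == "__main__":
    graph_edges = preferential_attachment(1000, 2, seed=42)
    degree = {}
    for u, v in graph_edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    print("max degree:", max(degree.values()))
```

Nodes that join early have more time to accumulate links, which is where the heavy tail of the degree distribution comes from.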
[Figure: a node pinging a host cache]
file sharing w/o a central index
queries broadcast to every node within radius ttl ⇒ as network grows, encounter a bandwidth barrier (dial up modems cannot keep up with query traffic, fragmenting the network)
Gnutella and the bandwidth barrier
Clip 2 report, Gnutella: To the Bandwidth Barrier and Beyond, http://www.clip2.com/gnutella.html#q17
[Figure: number of nodes found by the search, power-law graph vs. Poisson graph]
Search with knowledge of 2nd neighbors
Outline of search strategy
pass query onto only one neighbor at each step
requires that nodes sign query - avoid passing message onto a node twice
requires knowledge of one’s neighbors’ degrees - pass to the highest degree node
requires knowledge of one’s neighbors’ neighbors - route to 2nd degree neighbors
OPTIONS
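The strategy outlined above can be sketched as a single-message walk. In this illustrative version (names and fallback behaviour are assumptions) the set `visited` plays the role of the signatures on the query, each node knows its own neighbours and their degrees, and the walk prefers the highest-degree neighbour not yet visited.

```python
def high_degree_search(adj: dict, start, target, max_steps: int = 10_000):
    """Pass the query to one neighbour per step, preferring the
    highest-degree neighbour that has not handled it yet.

    Returns the number of steps until the current node is the target or has
    the target as a neighbour, or None if the step budget runs out."""
    visited = {start}  # nodes that have "signed" the query
    current, steps = start, 0
    while steps <= max_steps:
        if current == target or target in adj[current]:
            return steps
        unvisited = [v for v in adj[current] if v not in visited]
        # Assumed fallback: if every neighbour was visited, revisit anyway.
        candidates = unvisited or list(adj[current])
        if not candidates:
            return None
        current = max(candidates, key=lambda v: len(adj[v]))
        visited.add(current)
        steps += 1
    return None


if __name__ == "__main__":
    graph = {0: {1, 2, 3}, 1: {0}, 2: {0, 4}, 3: {0}, 4: {2}}
    print(high_degree_search(graph, start=1, target=4))
```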
Generating functions
¤ M.E.J. Newman, S.H. Strogatz, and D.J. Watts
¤ ‘Random graphs with arbitrary degree distributions and their applications’, PRE, cond-mat/0007235
¤ Generating functions for degree distributions
¤ Useful for computing moments of degree distribution, component sizes, and average path lengths
$$G_0(x) = \sum_{k=0}^{\infty} p_k x^k$$
Introducing cutoffs
$k_{\max} < N - 1$: a node cannot have more connections than there are other nodes
This is important for exponents close to 2
For τ = 2: $\sum_{k=1}^{\infty} p_k = C \sum_{k=1}^{\infty} \frac{1}{k^2} = 1$, so $C = \frac{6}{\pi^2}$
$p(k > 1000, \tau = 2) = \sum_{k=1000}^{\infty} p_k \sim 0.001$
Probability that none of the nodes in a 1,000 node graph has 1000 or more neighbors:
$(1 - p(k > 1000, \tau = 2))^{1000} \sim 0.36$
without a cutoff, for τ = 2, have > 50% chance of observing a node with more neighbors than there are nodes
for τ = 2.1, have a 25% chance
[Figure: proportion of sites w/ so many links vs. # of sites linking to the site]
Selecting from a variety of cutoffs
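1. $k_{\max} < N$
2. $p_k = C k^{-\tau} e^{-k/\kappa}$ (Newman et al.)
3. $p_k = C k^{-\tau}$ for $k < (CN)^{1/\tau}$, and $p_k = 0$ otherwise (Aiello et al.)
Generating Function (for cutoff 3):
$$G_0(x) = \sum_{k=1}^{(CN)^{1/\tau}} C k^{-\tau} x^k$$
[Figure caption: 1 million websites (~ 1997)]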
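Aiello’s ‘conservative’ vs. Havlin’s ‘natural’ cutoff
Aiello: cutoff where expected number of nodes of degree k is 1
$$N p_{k^*} = 1, \qquad C (k^*)^{-\tau} N \sim 1, \qquad k^* \sim N^{1/\tau}$$
Havlin: cutoff so that expected number of nodes of degree > k is 1
$$N \sum_{k=k_{\max}}^{\infty} p_k = 1, \qquad N \int_{k_{\max}}^{\infty} c\,k^{-\tau}\,dk \sim 1, \qquad k_{\max}^{1-\tau} N \sim 1, \qquad k_{\max} \sim N^{1/(\tau-1)}$$
[Figures: n(k) vs. k, one sketch per cutoff]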
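To see how far apart the two cutoffs are, the scaling relations k ~ N^(1/τ) and k ~ N^(1/(τ-1)) can be evaluated directly. A small sketch that ignores constant prefactors (function names are illustrative):

```python
def conservative_cutoff(n: int, tau: float) -> float:
    """Degree at which the expected number of nodes of exactly that degree
    is about 1: k ~ N ** (1 / tau)."""
    return n ** (1.0 / tau)


def natural_cutoff(n: int, tau: float) -> float:
    """Degree above which about one node is expected:
    k ~ N ** (1 / (tau - 1))."""
    return n ** (1.0 / (tau - 1.0))


if __name__ == "__main__":
    for tau in (2.0, 2.1, 2.5, 3.0):
        print(tau, round(conservative_cutoff(1000, tau), 1),
              round(natural_cutoff(1000, tau), 1))
```

For N = 1000 and τ = 2 the two choices differ by more than an order of magnitude (roughly 32 versus 1000).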
The imposed cutoff can have a dramatic effect on the properties of the graph
degrees drawn at random, for τ = 2, and N = 1000
$G_0(x) = \sum_{k=0}^{\infty} p_k x^k$ is a generating function
$p_k \sim k^{-\tau}$ is the probability that a randomly chosen vertex has degree k
$\langle k \rangle = \sum_k k\,p_k = G_0'(1)$ is the expected degree of a randomly chosen vertex
$G_1(x) = \frac{G_0'(x)}{G_0'(1)}$ is the distribution of remaining outgoing edges following an edge
$z_2 = G_0'(1)\,G_1'(1)$ is the expected number of second degree neighbors, assuming neighbors don’t share edges
Generating functions for degree distributions
Random graphs with arbitrary degree distributions and their applications, by Newman, Strogatz & Watts
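The quantities defined through G_0 and G_1 can be evaluated numerically from any degree distribution, since G_0'(1) = Σ k p_k and G_1'(1) = G_0''(1)/G_0'(1) = Σ k(k-1) p_k / Σ k p_k. A minimal sketch with illustrative function names, using a truncated power law as the example distribution:

```python
def truncated_power_law(tau: float, k_max: int) -> dict:
    """Normalised p_k proportional to k ** (-tau) for k = 1 .. k_max."""
    weights = {k: k ** (-tau) for k in range(1, k_max + 1)}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def mean_degree(pk: dict) -> float:
    """G_0'(1): expected degree of a randomly chosen vertex."""
    return sum(k * p for k, p in pk.items())


def mean_excess_degree(pk: dict) -> float:
    """G_1'(1): expected number of remaining edges after following an edge."""
    return sum(k * (k - 1) * p for k, p in pk.items()) / mean_degree(pk)


def second_neighbors(pk: dict) -> float:
    """z_2 = G_0'(1) * G_1'(1), assuming neighbours do not share edges."""
    return mean_degree(pk) * mean_excess_degree(pk)


if __name__ == "__main__":
    pk = truncated_power_law(tau=2.1, k_max=1000)
    print(mean_degree(pk), mean_excess_degree(pk), second_neighbors(pk))
```

For a power law with exponent near 2 the excess degree is far larger than the mean degree, which is why following edges leads to hubs quickly.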
search with knowledge of first neighbors
Generating function with cutoff:
$$G_0(x) = c \sum_{k=1}^{k_{\max}} k^{-\tau} x^k$$
$$G_0'(x) = \frac{\partial}{\partial x} G_0(x) = c \sum_{k=1}^{k_{\max}} k^{1-\tau} x^{k-1}$$
Average degree of vertex:
$$\langle k \rangle = G_0'(1) \sim \int_1^{k_{\max}} k^{1-\tau}\,dk = \frac{1}{\tau-2}\left(1 - \frac{1}{k_{\max}^{\tau-2}}\right)$$
constant in N
for 2<τ<3, and $k_{\max} \sim N^a$, decreases with N
Average number of neighbors following an edge:
$$G_1(x) = \frac{G_0'(x)}{G_0'(1)} = \frac{c}{G_0'(1)} \sum_{k=1}^{k_{\max}} k^{1-\tau} x^{k-1}$$
$$G_1'(x) = \frac{\partial}{\partial x} G_1(x) = \frac{c}{G_0'(1)} \sum_{k=2}^{k_{\max}} k^{1-\tau}(k-1)\,x^{k-2}$$
$$G_1'(1) \sim \frac{1}{G_0'(1)}\,\frac{k_{\max}^{3-\tau}(\tau-2) - 2^{2-\tau}(\tau-1) + 3 - \tau}{(\tau-2)(3-\tau)}$$
search with knowledge of first neighbors (cont’d)
$$z_{1B} = G_1'(1) \sim \frac{1}{G_0'(1)}\,\frac{k_{\max}^{3-\tau}}{3-\tau} = \frac{\tau-2}{1-k_{\max}^{2-\tau}}\,\frac{k_{\max}^{3-\tau}}{3-\tau}$$
In the limit τ→2, $G_1'(1) \sim \frac{k_{\max}}{\log(k_{\max})}$
Let’s for the moment ignore the fact that as we do a random walk, we encounter neighbors that we’ve seen before
$$s = \text{number of steps} = \frac{N}{z_{1B}}$$
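Search time with different cutoffs
If $k_{\max} = N$:
$$s \sim \frac{N}{k_{\max}^{3-\tau}} = N^{\tau-2}, \quad 2 < \tau < 3$$
$$s(\tau = 2.1) \sim N^{0.1}$$
$$s(\tau = 2) \sim \frac{N \log(k_{\max})}{k_{\max}} = \log(N)$$
If $k_{\max} = N^{1/(\tau-1)}$:
$$s \sim \frac{N}{k_{\max}^{3-\tau}} = N^{2(\tau-2)/(\tau-1)}, \quad 2 < \tau < 3$$
$$s(\tau = 2.1) \sim N^{0.18}$$
$$s(\tau = 2) \sim \frac{N \log(k_{\max})}{k_{\max}} = \log(N)$$
If $k_{\max} = N^{1/\tau}$:
$$s \sim \frac{N}{k_{\max}^{3-\tau}} = N^{2 - 3/\tau}, \quad 2 < \tau < 3$$
So the best we can do is for exponents close to 2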
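The exponents above all come from the same relation: if s ~ N / k_max^(3-τ) and the cutoff scales as k_max = N^a, then s ~ N^(1 - a(3-τ)). A small sketch, with an illustrative function name, that tabulates the exponent for the three cutoffs:

```python
def first_neighbor_search_exponent(tau: float, cutoff_exponent: float) -> float:
    """Exponent x in s ~ N ** x, assuming s ~ N / k_max ** (3 - tau)
    and k_max = N ** cutoff_exponent, for 2 < tau < 3."""
    return 1.0 - cutoff_exponent * (3.0 - tau)


if __name__ == "__main__":
    tau = 2.1
    cutoffs = {
        "k_max = N": 1.0,
        "k_max = N^(1/(tau-1))": 1.0 / (tau - 1.0),
        "k_max = N^(1/tau)": 1.0 / tau,
    }
    for label, a in cutoffs.items():
        print(label, round(first_neighbor_search_exponent(tau, a), 2))
```

The smaller the cutoff, the fewer very high degree nodes exist, and the worse the scaling of the search.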
2nd neighbor random walk, ignoring overlap:
$$z_{2B}\,s \sim N \;\Rightarrow\; s \sim \frac{N}{z_{2B}}$$
$$S(N, \tau) \sim N^{3(1 - 2/\tau)}, \qquad S(N, \tau = 2.1) \sim N^{0.15}$$
search with knowledge of first neighbors (cont’d)
$$z_{2B} = \left[\frac{\partial}{\partial x} G_1(G_1(x))\right]_{x=1} = \left[G_1'(1)\right]^2 = \left[\frac{\tau-2}{1-k_{\max}^{2-\tau}}\,\frac{k_{\max}^{3-\tau}}{3-\tau}\right]^2$$
Following the degree sequence
Go to highest degree node, then next highest, … etc.
a ~ s = # of steps taken
$$z_{1D} = \int_{k_{\max}-a}^{k_{\max}} N k^{1-\tau}\,dk \sim N a\,k_{\max}^{1-\tau}$$
2nd neighbors, ignoring overlap:
$$z_{2D} = z_{1D}\,G_1'(1) \sim N a\,k_{\max}^{2(2-\tau)}$$
$$s \sim k_{\max}^{2(\tau-2)} \sim N^{2 - 4/\tau}$$
$$S_{\deg}(N, \tau = 2.1) \sim N^{0.1}$$
[Figure: x-axis: degree of node (0 to 100); one curve per exponent τ = 2.00, 2.25, 2.50, 2.75, 3.00, 3.25, 3.50, 3.75]
Ratio of the degree of a node to the expected degree of its highest degree neighbor for 10,000 node power-law graphs of varying exponents
Actor collaboration graph (imdb database), τ ~ 2.0-2.2
Exponents τ close to 2 required to search effectively
World Wide Web, τ ~ 2.0-2.3, high degree nodes: directories, search engines
Social networks, AT&T call graph τ ~ 2.1
Gnutella
[Figure: number of actors/actresses vs. number of costars, log-log axes; actors τ = 2, actresses τ = 2.1]
Following the degree sequence
Complications
¤Should not visit same node more than once
¤Many neighbors of current node being visited were also neighbors of previously visited nodes, and there is a bias toward high degree nodes being ‘seen’ over and over again
[Figure: degree of node vs. step; legend: not visited, visited, neighbors visited]
Status and degree of node visited
[Figure: proportion of nodes found at step vs. step, log-log axes; random walk and degree sequence]
[Figure: cumulative nodes found at step vs. step; random walk and degree sequence]
seeking high degree nodes speeds up the search process
about 50% of a 10,000 node graph is explored in the first 12 steps
Progress of exploration in a 10,000 node graph knowing 2nd degree neighbors
[Figure: cover time for half the nodes vs. size of graph, log-log axes; random walk (α = 0.37 fit) and degree sequence (α = 0.24 fit)]
Scaling of search time with size of graph
Comparison with a Poisson graph
[Figure: degree of current node vs. step; Poisson and power-law]
[Figure: cover time for 1/2 of graph vs. number of nodes in graph; constant av. deg. = 3.4, γ = 1.0 fit]
$$G_0(x) = e^{z(x-1)}$$
$$G_1(x) = \frac{G_0'(x)}{z} = G_0(x)$$
expected degree and expected degree following a link are equal
scaling is linear
[Figure: cumulative nodes found at step vs. step; high degree seeking 1st neighbors and high degree seeking 2nd neighbors]
50% of the files in a 700 node network can be found in < 8 steps
Gnutella network
Expander graphs
Time permitting
Def: Random k-Regular Graphs
¤We need to define two concepts
¤ 1) Define: Random k-Regular graph
  ¤ Assume each node has k spokes (half-edges)
  ¤ Randomly pair them up!
¤ 2) Define: Expansion
  ¤ Graph G(V, E) has expansion α: if ∀ S ⊆ V: #edges leaving S ≥ α⋅ min(|S|,|V\S|)
  ¤ Or equivalently:
$$\alpha = \min_{S \subseteq V} \frac{\#\text{edges leaving } S}{\min(|S|, |V \setminus S|)}$$
[Figure: cut between S and V \ S]
Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu
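For very small graphs the expansion can be computed by brute force straight from the definition. This sketch (function name and graph representation are illustrative) enumerates every subset S with at most half the nodes, which suffices because S and V \ S have the same edges leaving and the same denominator. It is exponential in the number of nodes and only meant to make the definition concrete.

```python
from itertools import combinations


def expansion(adj: dict) -> float:
    """min over non-empty S of (#edges leaving S) / min(|S|, |V \\ S|),
    for an undirected graph given as {node: set_of_neighbours}."""
    nodes = list(adj)
    n = len(nodes)
    best = float("inf")
    for size in range(1, n // 2 + 1):
        for subset in combinations(nodes, size):
            s = set(subset)
            leaving = sum(1 for u in s for v in adj[u] if v not in s)
            best = min(best, leaving / min(len(s), n - len(s)))
    return best


if __name__ == "__main__":
    cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
    complete = {v: {u for u in range(4) if u != v} for v in range(4)}
    print(expansion(cycle), expansion(complete))
```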
Expansion: Intuition
(A big) graph with “good” expansion
[Figure: S nodes with ≥ α·S edges leaving; S’ nodes with ≥ α·S’ edges leaving]
Expansion: k-Regular Graphs
¤ k-regular graph (every node has degree k):
  ¤ Expansion is at most k (when S is a single node)
¤ Is there a graph on n nodes (n→∞), of fixed max deg. k, so that expansion α remains const?
Examples:
¤ n×n grid: k=4: $\alpha = 2n/(n^2/4) \to 0$ (S = n/2 × n/2 square in the center)
¤ Complete binary tree: α→0 for |S|=(n/2)-1
¤ Fact: For a random 3-regular graph on n nodes, there is some const α (α>0, independent of n) such that w.h.p. the expansion of the graph is ≥ α (In fact, α=d/2 as d→∞)
Diameter of 3-Regular Rnd. Graph
¤ Fact: In a graph on n nodes with expansion α, for all pairs of nodes s and t there is a path of O((log n) / α) edges connecting them.
¤ Proof:
  ¤ Proof strategy:
    ¤ We want to show that from any node s there is a path of length O((log n)/α) to any other node t
    ¤ Let $S_j$ be a set of all nodes found within j steps of BFS from s.
    ¤ How does $S_j$ increase as a function of j?
[Figure: BFS layers S0, S1, S2 around s]
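The growth of $S_j$ can be observed empirically. The sketch below builds a random k-regular graph by pairing up spokes and then records how many nodes BFS has found within j steps. Retrying until the pairing has no self-loops or repeated edges is an implementation choice made here, and the function names are illustrative.

```python
import random
from collections import deque


def random_regular_graph(n: int, k: int, seed=None) -> dict:
    """Give each node k spokes (half-edges) and pair them up at random.
    Pairings with self-loops or repeated edges are discarded and retried."""
    if (n * k) % 2:
        raise ValueError("n * k must be even")
    rng = random.Random(seed)
    while True:
        spokes = [v for v in range(n) for _ in range(k)]
        rng.shuffle(spokes)
        adj = {v: set() for v in range(n)}
        simple = True
        for a, b in zip(spokes[::2], spokes[1::2]):
            if a == b or b in adj[a]:
                simple = False
                break
            adj[a].add(b)
            adj[b].add(a)
        if simple:
            return adj


def bfs_ball_sizes(adj: dict, s) -> list:
    """sizes[j] = number of nodes found within j steps of BFS from s."""
    dist = {s: 0}
    queue = deque([s])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    radius = max(dist.values())
    return [sum(1 for d in dist.values() if d <= j) for j in range(radius + 1)]


if __name__ == "__main__":
    graph = random_regular_graph(200, 3, seed=0)
    print(bfs_ball_sizes(graph, 0))
```

On a typical run the sizes grow geometrically for the first several steps and then saturate once a constant fraction of the graph has been reached.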
Diameter of 3-Regular Rnd. Graph
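¤ Proof (continued):
  ¤ Let $S_j$ be a set of all nodes found within j steps of BFS from s.
  ¤ We want to relate $S_j$ and $S_{j+1}$
$$|S_{j+1}| \ge |S_j| + \frac{\alpha |S_j|}{k} = |S_j|\left(1 + \frac{\alpha}{k}\right)$$
¤ Expansion: at least $\alpha |S_j|$ edges leave $S_j$
¤ At most k edges “collide” at a node (each node is of degree k)
$$|S_{j+1}| \ge |S_j|\left(1 + \frac{\alpha}{k}\right) \ge |S_0|\left(1 + \frac{\alpha}{k}\right)^{j+1} = \left(1 + \frac{\alpha}{k}\right)^{j+1}, \quad \text{where } |S_0| = 1$$
[Figure: BFS layers S0, S1, S2 around s; $|S_j|$ nodes, at least $\alpha|S_j|$ edges, $|S_{j+1}|$ nodes]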
Diameter of 3-Regular Rnd. Graph
¤ Proof (continued):
  ¤ In how many steps of BFS do we reach >n/2 nodes?
  ¤ Need j so that: $|S_j| = \left(1 + \frac{\alpha}{k}\right)^j \ge \frac{n}{2}$
  ¤ Let’s set: $j = \frac{k \log_2 n}{\alpha}$
  ¤ Then: $\left(1 + \frac{\alpha}{k}\right)^{\frac{k \log_2 n}{\alpha}} \ge 2^{\log_2 n} = n > \frac{n}{2}$
  ¤ In 2k/α·log n steps $|S_j|$ grows to Θ(n). So, the diameter of G is O(log(n)/α)
Claim:
$$\left(1 + \frac{\alpha}{k}\right)^{\frac{k \log_2 n}{\alpha}} \ge 2^{\log_2 n}$$
Remember n>0, α ≤ k, then:
¤ if α = k: $(1 + 1)^{\log_2 n} = 2^{\log_2 n}$
¤ if α → 0, then $x = \frac{k}{\alpha} \to \infty$ and $\left(1 + \frac{1}{x}\right)^{x \log_2 n} = e^{\log_2 n} > 2^{\log_2 n}$, using $e = \lim_{x \to \infty} \left(1 + \frac{1}{x}\right)^x$
In j steps, we reach >n/2 nodes ⇒ Diameter = 2·j
In log(n) steps, we reach >n/2 nodes ⇒ Diameter = 2 log(n)
Summary
¤ Small world phenomenon:
  ¤ Local structure (e.g. clustering)
  ¤ Short average shortest path
¤ The Watts-Strogatz model captures both
¤ Other models create navigable small-world networks
¤ Power-law networks are navigable due to presence of hubs