05-smallworlds - stanford universitysnap.stanford.edu › class › cs224w-2015 › slides ›...

87
Small world networks CS 224W

Upload: others

Post on 26-Jun-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Small world networks

CS 224W

Page 2: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Outline

¤ Small world phenomenon¤ Milgram’s small world experiment

¤ Local structure¤ clustering coefficient¤ motifs

¤ Small world network models:¤ Watts & Strogatz (clustering & short paths)¤ Kleinberg (geographical)¤ Kleinberg, Watts/Dodds/Newman (hierarchical)

¤ Small world networks: why do they arise?

Page 3: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

NE

MA

Small world phenomenon:Milgram’s experiment

Page 4: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

¤ “Six degrees of separation”

Instructions:Given a target individual (stockbroker in Boston), pass the message to a person you correspond with who is “closest” to the target.

Milgram’s experiment

Outcome:

20% of initiated chains reached targetaverage chain length = 6.5

Page 5: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

email experiment Dodds, Muhamad, Watts, Science 301, (2003)

(optional reading)

•18 targets•13 different countries

•60,000+ participants•24,163 message chains •384 reached their targets•average path length 4.0

Source: NASA, U.S. Government;; http://visibleearth.nasa.gov/view_rec.php?id=2429

Milgram’s experiment repeated

Page 6: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Interpreting Milgram’s experiment

n Is 6 is a surprising number?n In the 1960s? Today? Why?

n Pool and Kochen in (1978 established that the average person has between 500 and 1500 acquaintances)

Page 7: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Quiz Q:

¤ Ignore for the time being the fact that many of your friends’ friends are your friends as well. If everyone has 500 friends, the average person would have how many friends of friends?¤ 500¤ 1,000¤ 5,000¤ 250,000

Page 8: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Quiz Q:

¤With an average degree of 500, a node in a random network would have this many friends-of-friends-of-friends (3rd

degree neighbors):¤ 5,000¤ 500,000¤ 1,000,000¤ 125,000,000

Page 9: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Interpreting Milgram’s experiment

n Is 6 is a surprising number?n In the 1960s? Today? Why?

n If social networks were random… ?n Pool and Kochen (1978) -­ ~500-­1500 acquaintances/personn ~ 500 choices 1st linkn ~ 5002 = 250,000 potential 2nd degree neighborsn ~ 5003 = 125,000,000 potential 3rd degree neighbors

n If networks are completely cliquish?n all my friends’ friends are my friendsn what would happen?

Page 10: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Quiz Q:

¤ If the network were completely cliquish, that is all of your friends of friends were also directly your friends, what would be true:¤ (a) None of your friendship edges would be

part of a triangle (closed triad)¤ (b) It would be impossible to reach any node

outside the clique by following directed edges

¤ (c) Your shortest path to your friends’ friends would be 2

Page 11: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

complete cliquishness

¤ If all your friends of friends were also your friends, you would be part of an isolated clique.

Page 12: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Uncompleted chains and distance

n Is 6 an accurate number?

n What bias is introduced by uncompleted chains?n are longer or shorter chains more likely to be completed?

Page 13: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

average

95 % confidence interval

probability of passing on message

position in chain

Source: An Experimental Study of Search in Global Social Networks: Peter Sheridan Dodds, Roby Muhamad, and Duncan J. Watts (8 August 2003); Science 301 (5634), 827.

Attrition

Page 14: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Quiz Q:

n if each intermediate person in the chain has 0.5 probability of passing the letter on, what is the likelihood of a chain being completedn of length 2?n of length 5?

chain of length 2sends for sure receives

passes on with probability 0.5

Page 15: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Quiz Q:

n if each intermediate person in the chain has 0.5 probability of passing the letter on, what is the likelihood of a chain of length 5 being completed¤ (a) ½¤ (b) ¼¤ (c) 1/8¤ (d) 1/16

Page 16: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

‘recovered’histogram of path lengths

Source: An Experimental Study of Search in Global Social Networks: Peter Sheridan Dodds, Roby Muhamad, and Duncan J. Watts (8 August 2003); Science 301 (5634), 827.

Estimating the true distance

inter-­countryintra-­country

observed chain lengths

Page 17: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

¤Is 6 an accurate number?

¤Do people find the shortest paths?¤ Killworth, McCarty ,Bernard, & House (2005):¤ less than optimal choice for next link in chain is made ½ of the time

Navigation and accuracy

Page 18: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Small worlds & networking

What does it mean to be 1, 2, 3 hops apart on Facebook, Twitter, LinkedIn, Google Plus?

Page 19: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Transitivity, triadic closure, clustering

¤Transitivity: ¤ if A is connected to B and B is connected to C

what is the probability that A is connected to C?

¤ my friends’ friends are likely to be my friends

A

B

C?

Page 20: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Clustering

C =

¤Global clustering coefficient3 x number of triangles in the graphnumber of connected triples of vertices

3 x number of triangles in the graph

number of connected triples

Page 21: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Local clustering coefficient (Watts&Strogatz 1998)

¤For a vertex i¤ The fraction pairs of neighbors of the node that

are themselves connected¤ Let ni be the number of neighbors of vertex i

Ci =

Ci directed =

Ci undirected =

# of connections between i’s neighborsmax # of possible connections between i’s neighbors

# directed connections between i’s neighborsni * (ni -1)

# undirected connections between i’s neighborsni * (ni -1)/2

Page 22: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Local clustering coefficient (Watts&Strogatz 1998)

¤Average over all n vertices

∑=i

iCnC 1

i

ni = 4max number of connections:4*3/2 = 63 connections presentCi = 3/6 = 0.5

link absentlink present

Page 23: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Quiz Q:

¤The clustering coefficient for vertex i is:

i

(a)0(b)1/3(c)1/2(d)2/3

Page 24: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Explanation

¤ni = 3

¤there are 2 connections present out of max of 3 possible

¤Ci = 2/3

i

Page 25: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Small world phenomenon:

high clustering

low average shortest path

beyond social networks

)ln(network Nl ≈

graph randomnetwork CC >>

what other networks can you think of with these characteristics?

Page 26: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Comparison with “random graph” used to determine whether real-world network is “small world”

Network size av. shortest path

Shortest path in fitted random graph

Clustering(averaged over vertices)

Clustering in random graph

Film actors 225,226 3.65 2.99 0.79 0.00027

MEDLINE co-­authorship

1,520,251 4.6 4.91 0.56 1.8 x 10-­4

E.Coli substrate graph

282 2.9 3.04 0.32 0.026

C.Elegans 282 2.65 2.25 0.28 0.05

Page 27: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Reconciling two observations:• High clustering: my friends’ friends tend to be my friends• Short average paths

Small world phenomenon:Watts/Strogatz model

Source: Watts, D.J., Strogatz, S.H.(1998) Collective dynamics of 'small-world' networks. Nature 393:440-442.

Page 28: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

n As in many network generating algorithmsn Disallow self-edgesn Disallow multiple edges

Select a fraction p of edgesReposition on of their endpoints

Add a fraction p of additionaledges leaving underlying latticeintact

Watts-­Strogatz model:Generating small world graphs

Source: Watts, D.J., Strogatz, S.H.(1998) Collective dynamics of 'small-world' networks. Nature 393:440-442.

Page 29: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

¤Each node has K>=4 nearest neighbors (local)

¤tunable: vary the probability p of rewiring any given edge

¤small p: regular lattice

¤ large p: classical random graph

Watts-­Strogatz model:Generating small world graphs

Page 30: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Quiz question:

¤ Which of the following is a result of a higher rewiring probability?

(a) Left (b) Right (c) insufficient information

Page 31: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

What happens in between?

¤Small shortest path means low clustering?

¤Large shortest path means high clustering?

¤Through numerical simulation¤ As we increase p from 0 to 1

¤ Fast decrease of mean distance¤ Slow decrease in clustering

Page 32: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Clust coeff. and ASP as rewiring increases

10% of links rewired1% of links rewired

Source: Watts, D.J., Strogatz, S.H.(1998) Collective dynamics of 'small-world' networks. Nature 393:440-442.

Page 33: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Trying this with NetLogohttp://web.stanford.edu/class/cs224w/NetLogo/SmallWorldWS.nlogo

Page 34: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

WS model clustering coefficient

¤ The probability that a connected triple stays connected after rewiring¤ probability that none of the 3 edges were rewired (1-­p)3¤ probability that edges were rewired back to each other very small, can ignore

¤ Clustering coefficient = C(p) = C(p=0)*(1-­p)3

0.2 0.4 0.6 0.8 1

0.2

0.4

0.6

0.8

1

C(p)/C(0)

pSource: Watts, D.J., Strogatz, S.H.(1998) Collective dynamics of 'small-world' networks. Nature 393:440-442.

Page 35: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Quiz Q

nWhich of the following is a descriptionmatching a small-­world network?

(a)Its average shortest path is close to that of an Erdos-Renyi graph

(b)It has many closed triads(c)It has a high clustering coefficient(d)It has a short average path length

Page 36: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

WS Model: What’s missing?

n Long range links not as likely as short range ones

nHierarchical structure / groupsnHubs

Page 37: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Ties and geography

“The geographic movement of the [message] from Nebraska to Massachusetts is striking. There is a progressive closing in on the target area as each new person is added to the chain”

S.Milgram ‘The small world problem’, Psychology Today 1,61,1967

NE

MA

Page 38: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

nodes are placed on a lattice andconnect to nearest neighbors

additional links placed withp(link between u and v) = (distance(u,v))-­r

Kleinberg’s geographical small world model

Source: Kleinberg, ‘The Small World Phenomenon, An Algorithmic Perspective’ (Nature 2000).

exponent that will determine navigability

Page 39: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

NetLogo demo

¤how does the probability of long-­range links affect search?

http://web.stanford.edu/class/cs224w/NetLogo/SmallWorldSearch.nlogo

Page 40: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

When r=0, links are randomly distributed, ASP ~ log(n), n size of gridWhen r=0, any decentralized algorithm is at least a0n2/3

geographical search when network lacks locality

When r<2, expected time at least αrn(2-r)/3

0~p p

Page 41: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Overly localized links on a latticeWhen r>2 expected search time ~ N(r-2)/(r-1)

41~pd

Page 42: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

When r=2, expected time of a DA is at most C (log N)2

21~pd

Just the right balance

Page 43: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Navigability

T

S

Rλ2|R|<|R’|<λ|R|

k = c log2n calculate probability that s fails to have a link in R’

R’

Page 44: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Quiz Q:

¤ What is true about a network where the probability of a tie falls off as distance-2

(a)Large networks cannot be navigated(b)A simple greedy strategy (pass the message to the

neighbor who is closest to the target) is sufficient(c)There are fewer long range ties than short range

ones(d)If the number of nodes doubles, the average

shortest path will be twice as long

Page 45: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Origins of small worlds:group affiliations

Page 46: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Source: Kleinberg, ‘Small-­World Phenomena and the Dynamics of Information’ NIPS 14, 2001.

Hierarchical network models:

Individuals classified into a hierarchy, hij = height of the least common ancestor.

Group structure models:Individuals belong to nested groupsq = size of smallest group that v,w belong to

f(q) ~ q-­α

ijhijp b α−:

h b=3

e.g. state-­county-­city-­neighborhoodindustry-­corporation-­division-­group

hierarchical small-world models: Kleinberg

Page 47: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Watts, Dodds, Newman (Science, 2001)individuals belong to hierarchically nested groups

multiple independent hierarchies h=1,2,..,H coexist corresponding to occupation, geography, hobbies, religion…

pij ~ exp(-α x)

Source: Identity and Search in Social Networks: Duncan J. Watts, Peter Sheridan Dodds, and M. E. J. Newman; Science 17 May 2002 296: 1302-1305. < http://arxiv.org/abs/cond-mat/0205383v1 >

hierarchical small-world models: WDN

Page 48: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Navigability and search strategy:Reverse small world experiment

¤ Killworth & Bernard (1978):¤ Given hypothetical targets (name, occupation, location, hobbies, religion…) participants choose an acquaintance for each target

¤ based on (most often) occupation, geography¤ only 7% because they “know a lot of people”¤ Simple greedy algorithm: most similar acquaintance¤ two-­step strategy rare

Source: 1978 Peter D. Killworth and H. Russell Bernard. The Reverse Small World Experiment Social Networks 1:159–92.

Page 49: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Successful chains disproportionately used• weak ties (Granovetter)• professional ties (34% vs. 13%)• ties originating at work/college• target's work (65% vs. 40%)

. . . and disproportionately avoided• hubs (8% vs. 1%) (+ no evidence of funnels)• family/friendship ties (60% vs. 83%)

Strategy: Geography -> Work

Navigability and search strategy:Small world experiment @ Columbia

Page 50: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

MotivationPower-law (PL) networks, social and P2P

Analysis of scaling of search strategies in PL networks

Simulationartificial power-law topologies, real Gnutella networks

2

Search in power-law networks

Page 51: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Mary

Bob

Jane

Who couldintroduce me toRichard Gere?

How do we search?

Page 52: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

AT&T Call Graph

Aiello et al. STOC ‘00

# o

f tel

epho

ne n

umbe

rsfro

m w

hich

cal

ls w

ere

mad

e

# of telephone numbers called

Page 53: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

100 101

100

101

102

number of neighbors

prop

ortio

n of

nod

es datapower-law fit τ = 2.07

Gnutella network

power-law link distribution

summer 2000,data provided by Clip2

Page 54: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Preferential attachment model

Nodes join at different times

The more connections a node has, the more likely it is to acquirenew connections

Growth process produces power-law network

host cache

pingping

Page 55: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

file sharing w/o a central index

queries broadcast to every node within radius ttl⇒ as network grows, encounter a bandwidth barrier (dial up modems cannot keep up with query traffic, fragmenting the network)

Gnutella and the bandwidth barrier

Clip 2 reportGnutella: To the Bandwidth Barrier and Beyondhttp://www.clip2.com/gnutella.html#q17

Page 56: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

16

54

6367

2

94

number ofnodes found

power-law graph

Page 57: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

93

number ofnodes found

13

711

1519

Poisson graph

Page 58: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Search with knowledge of 2nd neighbors

Page 59: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Outline of search strategy

pass query onto only one neighbor at each step

requires that nodes sign query- avoid passing message onto a node twice

requires knowledge of one’s neighbors degree- pass to the highest degree node

requires knowledge of one’s neighbors neighbors- route to 2nd degree neighbors

OPTIONS

Page 60: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Generating functions

¤M.E.J. Newman, S.H. Strogatz, and D.J. Watts

¤‘Random graphs with arbitrary degree distributions and their applications’, PRE, cond-mat/0007235

¤Generating functions for degree distributions

¤Useful for computing moments of degree distribution,

¤component sizes, and average path lengths

=

=∑00

( ) kk

kG x p x

Page 61: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Introducing cutoffs

< −max 1k N a node cannot have more connections than there are other nodes

This is important for exponents close to 2

τ τ

∞ ∞

= =∑ ∑1 1

1 1kp Cx π=2 2

6C

τ∞

∑> = =1000

( 1000, 2) ~ 0.001kp k pProbability that none of the nodes in a 1,000 node graph has 1000 or more neighbors:

τ− > = 1000(1 ( 1000, 2)) ~ 0.36p kwithout a cutoff, for τ = 2have > 50% chance of observing a node with more neighbors than there are nodes

for τ = 2.1, have a 25% chance

Page 62: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

# of sites linking to the site

prop

ortio

n of

site

s w/ s

o m

any

links

1000

Selecting from a variety of cutoffs

Nk <max1.

2. κτ /kk eCkp −−= Newman et al.

3.⎩⎨⎧

=−

0

τCkpk

( ) τ1CNk <

otherwise

Generating Function

( )( )

kCN

kxkCxG ∑

=

−=τ

τ

1

10

Aiello et al.

1 million websites (~ 1997)

N

Page 63: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Aiello’s ‘conservative’ vs. Havlin’s ‘natural’ cutoff

τ

τ

− −

=

= 1

1

* 1

~

kN p

Ck N

k N

cutoff where expectednumber of nodes of degreek is 1

k

n(k)

k

n(k)

1

1

cutoff so thatexpected number of nodesof degree > k is 1

τ

τ

τ

=

∞− −

=

− −

=∑

max

max

1

1 1max

11

max

* 1

~

~

~

kk k

k k

N p

ck N

k N

k N

Page 64: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

The imposed cutoff can have a dramaticeffect on the properties of the graphdegrees drawn at random, for τ = 2, and N = 1000

Page 65: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

=

=∑00

( ) kk

kG x p x

is the probability that a randomlychosen vertex has degree k

~kp k τ−

is a generating function

'0(1)k

kk kp G< >= =∑ is the expected degree of a

randomly chosen vertex

( ) ( )( )

'0

1 '0 1

G xG x

G= is the distribution of remaining

outgoing edges following and edge

assuming neighbors don’t share edges

( ) ( )11 '1

'02 GGz = is the expected number of second

degree neighbors

22

2

2 2

2

1

11

Generating functions for degree distributionsRandom graphs with arbitrary degree distributions and their applicationsby Newman, Strogatz & Watts

Page 66: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

search with knowledge of first neighbors

( )

max

max

maxmax

max

max

01

' 1 10 0

1

' 1 1 20 max

1 1'

' 1 101 ' '

10 0

1 2'

203 2

' max1 '

0

( )

( ) ( )

1(1) 12

( )( )(1) (1)

( 1)(1)

( 2) 21(1)(1)

kk

kk

kk

kk

kk

G x c k x

G x G x c k xx

G k c k k dk k

G x cG x k xG G x

c k k xG

kGG

τ

τ

τ τ τ

τ

τ

τ τ

τ

τ

− −

− − −

− −

− −

− −

=

∂= =∂

=< >= = −−

∂= =

= −

− −=

∑ ∫

:

2max( 1) (3 )

( 2)(3 )k ττ τ

τ τ

−− + −

− −

Generating function with cutoff

Average degree of vertex

constant in Nfor 2<τ<3, and kmax~Na, decreaseswith N

Average number of neighborsfollowing an edge

Page 67: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

search with knowledge of first neighbors (cont’d)

τ ττ

τ

ττ τ

− −−

−= =

− − −: :

3 3' 3max max

1 1 max' 20 max

1 2(1)(1) (3 ) 1 (3 )B

k kz G kG k

In the limit τ->2, ' max1

max

(1)log( )kGk

:

Let’s for the moment ignore the fact that as we do a random walk, we encounter neighborsthat we’ve seen before

s = number of steps =1B

Nz

Page 68: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Search time with different cutoffs

: 0.18(2.1)s N

τ τττ τ

−−

−= = < <: 3 3

m

2

ax

( ) ,2 3N Nsk N

NIf kmax = N,

τ= =: max

max

log(log( ) , 2)N ks Nk

τττ

τττ τ

−−−

−−= = < <:221

33max 1

( ) ,2 3NN Nsk

NIf kmax = N1/(τ-1),

=: max

max

log( )(2) log( )N ksk

N

: 0.1(2.1) Ns

Page 69: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

2 3 /133max

,2 3( )

N Ns Nk

N

ττ

ττ

τ−−

= = < <:If kmax = N1/τ,

So the best we can do is for exponents close to 2 N

2nd neighbor random walk, ignoring overlap:

( ) 15.0~1.2, NNS =τ( ) ( )ττ 213~, −NNS

( )

( )

= 2

2

~

s B

B

n z NNS

z N

search with knowledge of first neighbors (cont’d)

τ

τ

ττ

−=

⎡ ⎤∂ −⎡ ⎤ ⎡ ⎤= = = ⎢ ⎥⎢ ⎥ ⎣ ⎦∂ − −⎣ ⎦ ⎣ ⎦

232' max2 1 1 1 2

1 max

2( ( )) (1)1 (3 )B

x

kz G G x Gx k

Page 70: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Following the degree sequence

a ~ s = # of steps taken

( ) 1.0deg 1.2, NNS ==τ

2nd neighbors, ignoring overlap:

Go to highest degree node, then next highest, … etc.

τ τ− −

−= ∫

max

max

1 11 max~

k

D k az Nk dk Nak

τ

τ τ

− −

max

max

' 2(2 )1 1

2( 2) 2 4 /

( ) ~

~ ~Dz G x Nak

s k N

Page 71: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

0 10 20 30 40 50 60 70 80 90 100

1

τ = 2.00τ = 2.25τ = 2.50τ = 2.75τ = 3.00τ = 3.25τ = 3.50τ = 3.75

degree of node

degr

ee o

f nei

ghbo

r -1

degr

ee o

f nod

e

2

20

10

5

Ratio of the degree of a node to the expected degree of its highest degree neighbor for 10,000 node power-law graphs of varying exponents

Page 72: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Actor collaboration graph(imdb database)

τ ~ 2.0-2.2

Exponents τ close to 2 required to search effectively

World Wide Web, τ ~ 2.0-2.3,high degree nodes: directories, search engines

Social networks, AT&T call graph τ ~ 2.1

Gnutella

100 101 102 103 104100

101

102

103

104

105

number of costars

num

ber o

f act

ors/

actre

sses

actors, τ = 2actresses, τ = 2.1

Page 73: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

15

50

18 17

8

109

6

Following the degree sequence

Page 74: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Complications

¤Should not visit same node more than once

¤Many neighbors of current node being visited were also neighbors of previously visited nodes, and there is a bias toward high degree nodes being ‘seen’ over and over again

Page 75: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

0 100 200 300 400 500 6000

5

10

15

20

25

30

step

degr

ee o

f nod

e

not visitedvisitedneighborsvisited

Status and degree of node visited

Page 76: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

1

10-2

0.1

1

step

prop

ortio

n of

nod

es fo

und

at st

ep

random walk

10 102 103 104 105 106

10-3

10-4

0 20 40 60 80 100

0.2

0.4

0.6

0.8

1

step

cum

ulat

ive

node

s fou

nd a

t ste

p

random walkdegree sequence

seeking high degree nodesspeeds up the search process

about 50% of a 10,000 node graphis explored in the first 12 steps

Progress of exploration in a 10,000 node graph knowing2nd degree neighbors

12

degree sequence

Page 77: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

101 102 103 104 105100

101

102

103

size of graph

cove

rtim

e fo

r hal

f the

nod

es

random walkα = 0.37 fitdegree sequenceα = 0.24 fit

Scaling of search time with size of graph

Page 78: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Comparison with a Poisson graph

100

101

102

103

100

101

102

degr

ee o

f cur

rent

nod

e

step

Poissonpower-law

101

102

104

106

100

101

102

103

104

105

number of nodes in graph

cove

r tim

e fo

r 1/2

of g

raph

constant av. deg. = 3.4γ = 1.0 fit

( ) ( )10

−= xzexG

( ) ( ) ( )xGxGzxxG 001 =ʹ′=

expected degree and expecteddegree following a link are equal

scaling is linear

Page 79: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

0 20 40 60 80 1000

0.2

0.4

0.6

0.8

1

step

cum

ulat

ive

node

s fou

nd a

t ste

p

high degree seeking 1st neighborshigh degree seeking 2nd neighbors

50% of the files in a 700 node network can be found in < 8 steps

Gnutella network

Page 80: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Expander graphsTime permitting

Page 81: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Def: Random k-Regular Graphs

¤We need to define two concepts

¤1) Define: Random k-Regular graph¤ Assume each node has k spokes (half-edges)¤ Randomly pair them up!

¤2) Define: Expansion¤ Graph G(V, E) has expansion α:

if∀ S ⊆ V: #edges leaving S ≥ α⋅ min(|S|,|V\S|)

¤ Or equivalently:

Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 81

|)\||,min(|#min SVS

SleavingedgesVS⊆

S V \ S

Page 82: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Expansion: Intuition

82

S nodes ≥ α·S edges

S’ nodes ≥ α·S’ edges

(A big) graph with “good” expansion|)\||,min(|

#min SVSSleavingedges

VS⊆=α

Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Page 83: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Expansion: k-Regular Graphs

¤ k-regular graph (every node has degree k):¤ Expansion is at most k (when S is a single node)

¤ Is there a graph on n nodes (n→∞), of fixed max deg. k, so that expansion α remains const?

Examples:¤ n×n grid: k=4: α =2n/(n2/4)→0

(S=n/2 × n/2 square in the center)

¤ Complete binary tree:α →0 for |S|=(n/2)-1

¤ Fact: For a random 3-regular graph on n nodes, there is some const α (α >0, independent. of n) such that w.h.p. the expansion of the graph is ≥ α (In fact, α=d/2 as d→∞)

83

S

S

|)\||,min(|#min SVS

SleavingedgesVS⊆

Make this into 6x6 grid!

Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Page 84: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Diameter of 3-Regular Rnd. Graph

¤Fact: In a graph on n nodes with expansion α, for all pairs of nodes s and t there is a path of O((log n) / α) edges connecting them.

¤Proof:¤ Proof strategy:

¤ We want to show that from any node s there is a path of length O((log n)/α) to any other node t

¤ Let Sj be a set of all nodes found within j steps of BFS from s.

¤ How does Sj increase as a function of j?

84

s

S0

S1

S2

Make this into a 3-ary tree

Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Page 85: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Diameter of 3-Regular Rnd. Graph

¤Proof (continued):¤ Let Sj be a set of all nodes found

within j steps of BFS from s. ¤ We want to relate Sj and Sj+1

85

=+≥+ kS

SS jjj

α1

At most k edges “collide” at a node

Expansion

1

01 11+

+ ⎟⎠

⎞⎜⎝

⎛ +=⎟⎠

⎞⎜⎝

⎛ +≥j

jj kS

kSS αα

s

S0

S1

S2

|Sj|nodes

At least α|Sj| edges

|Sj+1|nodes

Each ofdegree k

where S0=1

Make this into a 3-ary tree

Page 86: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Diameter of 3-Regular Rnd. Graph

¤Proof (continued):¤ In how many steps of BFS

do we reach >n/2 nodes?¤ Need j so that:

¤ Let’s set:¤ Then:

¤ In 2k/α·log n steps |Sj| grows to Θ(n). So, the diameter of G is O(log(n)/ α)

86

s21 nk

Sj

j ≥⎟⎠

⎞⎜⎝

⎛ +=α

t

αnkj 2log

=

221 2

2

log

log

nnk

n

nk

>=≥⎟⎠

⎞⎜⎝

⎛ +αα

n

nk

k2

2

log

log

21 ≥⎟⎠

⎞⎜⎝

⎛ +αα

Claim:

Remember n>0, α ≤ k then:

In log(n) steps, we reach >n/2 nodes

In log(n) steps, wereach >n/2 nodes

( )

nnnx

nn

ex

xk

22

2

22

logloglog

loglog11

211 and

: then 0 if

211:k if

>=⎟⎠

⎞⎜⎝

⎛ +

∞→=→

=+=

αα

α

In j steps, we reach >n/2 nodes

In j steps, wereach >n/2 nodes ⇒ Diameter = 2·j

⇒ Diameter = 2 log(n)

x

x xe ⎟

⎞⎜⎝

⎛ +=∞→

11lim

Make this into a 3-ary tree

Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu

Page 87: 05-smallworlds - Stanford Universitysnap.stanford.edu › class › cs224w-2015 › slides › 05-smallworlds.pdf · Quiz Q:! What is true about a network where the probability of

Summary

¤Small world phenomenon:¤ Local structure (e.g. clustering)¤ Short average shortest path

¤The Watts-Strogatz captures both

¤Other models create navigable small-world models

¤Power-law networks are navigable due to presence of hubs