ecs289m Spring, 2008 Social Network Models S. Felix Wu Computer Science Department University of California, Davis [email protected] http://www.cs.ucdavis.edu/ ~wu/

Post on 21-Dec-2015


ecs289m Spring, 2008

Social Network Models

S. Felix Wu
Computer Science Department
University of California, Davis

[email protected]
http://www.cs.ucdavis.edu/~wu/

SOURCE: Brandes, Raab and Wagner (2001)

<http://www.inf.uni-konstanz.de/~brandes/publications/brw-envsd-01.pdf>

Organization Chart

Activities of Actual Advice Seeking

Who is the most powerful? Can you determine that for an OSN?

Real Social Organization

OECD Trade Flows 1981-1992

SOURCE: Lothar Krempel http://www.mpi-fg-koeln.mpg.de/~lk/netvis.html

9-11 Hijackers Network

SOURCE: Valdis Krebs http://www.orgnet.com/

03/14/2008 Davis Social Links 7

The Web ???

Social Network Analysis

“Structural relationships” as explanations:

• Network

• Formation

• Influence and collective actions

Social Network Analysis

1. Degree Centrality: The number of direct connections a node has. A high degree alone is not decisive: what really matters is where those connections lead and how they connect the otherwise unconnected.

2. Betweenness Centrality: A node with high betweenness lies on many of the paths between other nodes, so it has great influence over what flows in the network. High betweenness marks important links and single points of failure.

3. Closeness Centrality: How close a node is to everyone else. The pattern of its direct and indirect ties lets such a node reach any other node in the network more quickly than anyone else can: it has the shortest paths to all others.

4. Eigenvector Centrality: It assigns relative scores to all nodes in the network based on the principle that connections to high-scoring nodes contribute more to the score of the node in question than equal connections to low-scoring nodes.
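The centrality measures above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the graph `example`, the function names, and the normalization of closeness as (n - 1) / (sum of hop distances) are assumptions made here, the graph is assumed to be connected and undirected, and betweenness is left out for brevity. The eigenvector score is computed by power iteration on A + I (each node keeps its own score plus its neighbors' scores), a common trick that has the same leading eigenvector as A but also converges on bipartite graphs.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distance from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def degree_centrality(adj):
    """Number of direct connections of each node."""
    return {u: len(adj[u]) for u in adj}


def closeness_centrality(adj):
    """(n - 1) / total hop distance to all other nodes (connected graph assumed)."""
    n = len(adj)
    return {u: (n - 1) / sum(bfs_distances(adj, u).values()) for u in adj}


def eigenvector_centrality(adj, iterations=100):
    """Power iteration on A + I, rescaled so the top score is 1."""
    score = {u: 1.0 for u in adj}
    for _ in range(iterations):
        raw = {u: score[u] + sum(score[v] for v in adj[u]) for u in adj}
        top = max(raw.values())
        score = {u: s / top for u, s in raw.items()}
    return score


# Hypothetical 5-node example: node 0 is a hub, node 3 bridges to node 4.
example = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4}, 4: {3}}
degree = degree_centrality(example)
closeness = closeness_centrality(example)
eigen = eigenvector_centrality(example)
```

In this toy graph node 0 comes out on top under all three measures, while node 3 outranks the other low-degree nodes because it is attached to the hub and also reaches node 4.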


Random Graphs

• G(n, p): n nodes and each edge with prob p

• When p < 1/n, only small disconnected components

• When p is sufficiently large (p > 1/n), one giant component

• How about diameter?
– The maximum distance (in hops) between any two nodes.
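The component behaviour of G(n, p) is easy to observe experimentally. The sketch below is only an illustration: the function names, the choice n = 1000, the seed, and the two edge probabilities 0.3/n and 3/n are assumptions picked to sit clearly on either side of the 1/n threshold.

```python
import random


def gnp(n, p, seed=None):
    """Sample G(n, p): every one of the n*(n-1)/2 possible edges appears with probability p."""
    rng = random.Random(seed)
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def largest_component_size(adj):
    """Size of the largest connected component, found by depth-first search."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            u = stack.pop()
            size += 1
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        best = max(best, size)
    return best


n = 1000
below = largest_component_size(gnp(n, 0.3 / n, seed=1))  # p < 1/n
above = largest_component_size(gnp(n, 3.0 / n, seed=1))  # p > 1/n
```

With these settings `below` stays tiny compared to n, while `above` covers most of the graph.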

Random Graph (Erdős/Rényi)

• On average, each node has $Z \approx (N-1)p$ direct neighbors

• $Z^D = N$ (D is the diameter)

• $D = \log N / \log Z$

• In two hops, will each node have $Z^2$ neighbors with (equal) probability?
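As a quick worked example of the estimate D = log N / log Z, the sketch below plugs in numbers. The function name and the sample values (one million nodes, about 100 neighbors each) are assumptions chosen for illustration; the formula is the heuristic from the slide, not an exact diameter.

```python
import math


def estimated_diameter(n, p):
    """Heuristic diameter of G(n, p): solve Z**D = n with Z = (n - 1) * p."""
    z = (n - 1) * p
    if z <= 1:
        raise ValueError("the estimate needs an average degree Z > 1")
    return math.log(n) / math.log(z)


# One million nodes with roughly 100 neighbors each: about 3 hops.
d = estimated_diameter(10**6, 1e-4)
```

The logarithm is what makes the diameter small: multiplying N by 100 adds only one more hop when Z is around 100.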

Small World Model

• Low Diameter
– Logarithmic or poly-logarithmic in N

• “High” Cluster Coefficient
– cluster coefficient: the fraction of pairs of X’s neighbors that are directly connected to each other

Cluster Coefficient

• Mesh network: $C_{cluster} = 1$

• Lattice Network (with degree K): $C_{cluster} = 0$
– E.g., a linear chain, where no two neighbors of a node are adjacent

• How about $C_{cluster}$ for Random Graph?
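The two extreme cases can be checked directly. The sketch below uses the common pair-based definition (fraction of pairs of a node's neighbors that are themselves linked) and assigns 0 to nodes with fewer than two neighbors; both choices, and all function names, are assumptions of this example.

```python
from itertools import combinations


def local_clustering(adj, x):
    """Fraction of pairs of x's neighbors that are directly connected."""
    neighbors = list(adj[x])
    k = len(neighbors)
    if k < 2:
        return 0.0  # convention assumed here for nodes with fewer than 2 neighbors
    linked = sum(1 for a, b in combinations(neighbors, 2) if b in adj[a])
    return linked / (k * (k - 1) / 2)


def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, x) for x in adj) / len(adj)


def complete_graph(n):
    """Full mesh: every node is linked to every other node."""
    return {u: {v for v in range(n) if v != u} for u in range(n)}


def path_graph(n):
    """Linear chain 0 - 1 - ... - (n-1)."""
    return {u: {v for v in (u - 1, u + 1) if 0 <= v < n} for u in range(n)}


mesh_c = average_clustering(complete_graph(6))  # full mesh
chain_c = average_clustering(path_graph(6))     # linear chain
```

For a G(n, p) random graph, two neighbors of a node are linked with probability p like any other pair, so the expected coefficient is about p, which is small for sparse graphs.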

Re-wiring (Watts/Strogatz)

Trade off between D and Ccluster !

Structured/Clustered

A Cycle plus a Random Matching

• A dual combinatorial problem:
– For given integers n and k, find a graph on n vertices with maximum degree k and diameter as small as possible.
– For given integers k and D, find a graph with bounded degree k and diameter at most D, having as many vertices as possible.

• How?

A Cycle plus a Random Matching

• Cycle & random “disjoint” matching

Bollobás/Chung: $\log N < D(G) < \log N + \log\log N$ (base-2 logarithms, up to additive constants, with high probability)
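The effect of the random matching on the diameter can be seen in a small experiment. This is a sketch under assumptions: the function names, n = 512, and the seed are arbitrary, the matching is drawn by shuffling the nodes and pairing them up, and a matching edge that happens to coincide with a cycle edge is simply absorbed.

```python
import random
from collections import deque


def cycle_graph(n):
    """Plain n-cycle: node u is linked to u-1 and u+1 (mod n)."""
    return {u: {(u - 1) % n, (u + 1) % n} for u in range(n)}


def cycle_plus_matching(n, seed=None):
    """n-cycle plus a random perfect matching (n must be even); degree is at most 3."""
    if n % 2:
        raise ValueError("n must be even for a perfect matching")
    rng = random.Random(seed)
    adj = cycle_graph(n)
    nodes = list(range(n))
    rng.shuffle(nodes)
    for a, b in zip(nodes[::2], nodes[1::2]):
        adj[a].add(b)
        adj[b].add(a)
    return adj


def diameter(adj):
    """Largest hop distance between any two nodes, via BFS from every node."""
    best = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        best = max(best, max(dist.values()))
    return best


n = 512
d_cycle = diameter(cycle_graph(n))                     # n / 2 for a plain cycle
d_matched = diameter(cycle_plus_matching(n, seed=0))   # on the order of log n
```

Adding just one extra edge per node collapses the diameter from linear in n to roughly logarithmic, which is the point of the Bollobás/Chung bound.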

Degree Centrality

• Degree distribution and the expected number of neighbors
– Random graph (Poisson distribution)

• Power-law tail for real-world networks
– $P(k) \sim k^{-r}$
– Scale-free: invariant to the network size N

Exponential Distribution

Power Law (function or dist.)

$f(x) = a x^k + o(x^k)$

$f(cx) = ?$
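One way to answer the question about f(cx) is to substitute cx into the definition. The short derivation below assumes c > 0 is a fixed constant, so that $o((cx)^k)$ is still $o(x^k)$:

```latex
\begin{align*}
f(cx) &= a\,(cx)^k + o\big((cx)^k\big) \\
      &= c^k \, a\, x^k + o(x^k) \\
      &= c^k f(x) + o(x^k)
\end{align*}
```

Rescaling the argument therefore only multiplies the leading term by the constant $c^k$; the functional form stays the same, which is the scale invariance behind the term "scale-free".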

Zipf’s law

• Discrete power law

• Ranking in the frequency table
– {“the” (7%), “of” (3.5%), “and”, …}

• $f(k; s, N) = \dfrac{k^{-s}}{\sum_{n=1}^{N} n^{-s}}$
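The Zipf probability mass function can be evaluated directly from the formula. In this sketch the function name and the vocabulary size of 1000 are assumptions; s = 1 is chosen because it reproduces the 2:1 ratio between the first and second rank seen in the “the” / “of” example.

```python
def zipf_pmf(k, s, n):
    """Probability of rank k under Zipf's law with exponent s over ranks 1..n."""
    if not 1 <= k <= n:
        raise ValueError("rank k must lie in 1..n")
    normalizer = sum(m ** -s for m in range(1, n + 1))
    return k ** -s / normalizer


# With s = 1 the top-ranked item is twice as frequent as the second one.
ratio = zipf_pmf(1, 1, 1000) / zipf_pmf(2, 1, 1000)
total = sum(zipf_pmf(k, 1, 1000) for k in range(1, 1001))
```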


Two Issues about Low Diameters

• Why should there exist short chains of acquaintances linking together arbitrary pairs of strangers?

• Why should arbitrary pairs of strangers be able to find the short chains of acquaintances that link them together?

Kleinberg’s Basic setting

p, q, r

• p: each node has local contacts to all nodes within lattice distance p

• q: number of long-range contacts per node

• r: each long-range contact v of u is chosen with probability proportional to $[d(u,v)]^{-r}$
– What is the intuition about r?
– What about r = 0?
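To build intuition for r, the sketch below samples long-range contacts on a small 2D grid. The function names, the 11 x 11 grid, the sample size, and the seeds are assumptions of this example; contacts are drawn with replacement using `random.choices`, which is a simplification.

```python
import random


def lattice_distance(u, v):
    """Manhattan (lattice) distance between two grid points."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def long_range_contacts(u, side, q, r, rng):
    """Draw q long-range contacts of u, each v with weight d(u, v) ** -r."""
    others = [(i, j) for i in range(side) for j in range(side) if (i, j) != u]
    weights = [lattice_distance(u, v) ** -r for v in others]
    return rng.choices(others, weights=weights, k=q)


def mean_contact_distance(r, seed):
    """Average lattice distance of 200 sampled contacts of the grid centre."""
    centre = (5, 5)
    contacts = long_range_contacts(centre, 11, 200, r, random.Random(seed))
    return sum(lattice_distance(centre, v) for v in contacts) / len(contacts)


uniform = mean_contact_distance(0, seed=0)  # r = 0: every node equally likely
local = mean_contact_distance(4, seed=0)    # large r: contacts cluster nearby
```

With r = 0 the long-range contacts ignore geography entirely, while a large r makes them almost as local as the lattice edges; r controls how strongly distance discourages a link.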

Kleinberg’s results

A decentralized routing problem
– For nodes s, t with known lattice coordinates, find a short path from s to t.
– At any step, can only use local information.
– Kleinberg suggests a simple greedy algorithm and analyzes it:
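A greedy router of this kind can be sketched as follows: at every step the message is forwarded to whichever contact of the current node is closest to the target in lattice distance. The grid size, the seed, the parameter values p = q = 1 and r = 2, and all function names are assumptions made for this example.

```python
import random


def dist(u, v):
    """Lattice (Manhattan) distance between two grid points."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])


def build_contacts(side, r, seed=None):
    """Each node gets its grid neighbors (p = 1) plus one long-range contact (q = 1)."""
    rng = random.Random(seed)
    nodes = [(i, j) for i in range(side) for j in range(side)]
    contacts = {}
    for u in nodes:
        i, j = u
        local = [v for v in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1))
                 if 0 <= v[0] < side and 0 <= v[1] < side]
        others = [v for v in nodes if v != u]
        weights = [dist(u, v) ** -r for v in others]
        contacts[u] = local + rng.choices(others, weights=weights, k=1)
    return contacts


def greedy_route(contacts, s, t):
    """Forward to the contact closest to t until t is reached."""
    path = [s]
    while path[-1] != t:
        # Some grid neighbor is always one step closer, so progress is guaranteed.
        path.append(min(contacts[path[-1]], key=lambda v: dist(v, t)))
    return path


contacts = build_contacts(20, r=2, seed=0)
path = greedy_route(contacts, (0, 0), (19, 19))
```

The router only looks at the current node's own contacts and the target's coordinates, so it never does worse than walking the grid, and long-range contacts can only shorten the trip.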

Local Information

• Local contacts

• Coordinates of the target

• The locations and long-range contacts of all nodes that have come in contact with the message.

Results

• If r = 0, expected delivery time is at least $a_0 n^{2/3}$.
– Lower bound

• If r = 2, p = q = 1: expected delivery time is at most $a_2 (\log n)^2$
– Martel/Nguyen’s newer results

• $0 \le r < 2$: $\sim a_r n^{(2-r)/3}$

• $r > 2$: $\sim a_r n^{(r-2)/(r-1)}$

Skip Lists

• The basic idea:

• Keep a doubly-linked list of elements
– Min, max, successor, predecessor: O(1) time
– Delete is O(1) time, Insert is O(1) + Search time

• During insert, add each level-i element to level i+1 with probability p (e.g., p = 1/2 or p = 1/4)

[Figure: skip list with levels 1 to 3; level 1 holds the keys 3, 9, 12, 18, 29, 35, 37]
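The promotion rule can be made concrete with a small skip list. This sketch is simplified relative to the slide: it uses singly-linked levels rather than a doubly-linked list, caps the tower at an assumed `max_height` of 16, and the class and method names are invented for the example.

```python
import random


class Node:
    def __init__(self, key, height):
        self.key = key
        self.next = [None] * height  # next[i] is the successor on level i


class SkipList:
    def __init__(self, p=0.5, max_height=16, seed=None):
        self.p = p
        self.max_height = max_height
        self.rng = random.Random(seed)
        self.head = Node(None, max_height)  # sentinel present on every level

    def _random_height(self):
        # Promote to the next level with probability p, one level at a time.
        height = 1
        while height < self.max_height and self.rng.random() < self.p:
            height += 1
        return height

    def _predecessors(self, key):
        # Walk down from the sparsest level, moving right while keys are smaller.
        preds = [self.head] * self.max_height
        node = self.head
        for level in reversed(range(self.max_height)):
            while node.next[level] is not None and node.next[level].key < key:
                node = node.next[level]
            preds[level] = node
        return preds

    def insert(self, key):
        preds = self._predecessors(key)
        node = Node(key, self._random_height())
        for level in range(len(node.next)):
            node.next[level] = preds[level].next[level]
            preds[level].next[level] = node

    def contains(self, key):
        candidate = self._predecessors(key)[0].next[0]
        return candidate is not None and candidate.key == key

    def keys(self):
        out, node = [], self.head.next[0]
        while node is not None:
            out.append(node.key)
            node = node.next[0]
        return out


skip = SkipList(p=0.5, seed=0)
for key in [29, 3, 37, 9, 18, 12, 35]:
    skip.insert(key)
```

The bottom level always holds every key in sorted order; the sparser upper levels only act as express lanes that let a search skip over long runs of the bottom list.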

Skip Graphs

• Based on “skip list”:
– A randomized balanced tree structure organized as a tower of increasingly sparse linked lists
– All nodes join the linked list of level 0
– For other levels, each node joins with a fixed probability p
– Each node has $2/(1-p)$ pointers on average
– Average search time: $O\!\left(\dfrac{\log n}{(1-p)\log(1/p)}\right)$

Skip Graph:

• Skip List is not suitable for a P2P environment
– No redundancy, hotspot problem
– Vulnerable to failure and contention

• Skip Graph: extension of Skip List
– Level 0 linked list builds a Chord ring
– Multiple (max $2^i$) lists for level i (i = 1, …, log n)
– Each node participates in all levels, but in different lists
– Membership vector m(x): decides which list to join
– Every node sees its own skip list

Degree Optimal P2P Routing

• Different routing schemes
– Viceroy [MNR02]: emulates the butterfly network
  • Constant degree
  • O(log n) hops for routing
– Constructions emulating De Bruijn graphs
  • Can achieve any degree/number of hops tradeoff
  • In particular degree O(log n) and O(log n / log log n) hops

• Routing is not greedy
– Recent construction [AM] fixes that.
– Even if target and source are close in label space, the message might be routed away

• No (natural) prefix search
– Random keys are necessary.

Skip-Graphs [AS02], [HDJ+03]

• Each node (resource) has a name.

• Nodes are arranged on a line sorted by name.

• Each node chooses a random string of bits.

• An edge is established if two nodes share a prefix which is not shared by the nodes between them.

• Allows prefix search.

[Figure: nodes a to f on a line, each labeled with its random bit string]

Routing in Skip-Graphs

• Greedy routing: use the longest edge possible.

• Path length is Θ(log n) w.h.p.

• The NoN algorithm optimizes over two hops.

Theorem: Using the NoN algorithm, the expected path length of any lookup is .

Kleinberg’s Lattice Model

• Graph embedded in a metric space (e.g., 2D lattice)

• “Search efficiently” using only local information + long-range contact(s)
– Long-range contact chosen with probability $\sim [d(u,v)]^{-r}$
– r = 2, a special case

Some Extensions

• Hierarchical Network Models

• Group Structure Models

• Constant Number of Out-Links

“Small World Phenomena and the Dynamics of Information” by J. Kleinberg, NIPS, 2001

Generation & Search

• There is a data structure behind and among all the social peers
– Lattice, Tree, Group/Community

• The link probability depends on this “social data structure”
– And it is used to generate the social network

• Searching may use “direct contacts” plus the knowledge about the social data structure

Hierarchical Network Models

• Representation
– a complete b-ary tree, T
– All social nodes are “leaves”

• Distance and Link Probability
– $h(v,w)$ = the height of the least common ancestor of v and w in T
– link probability proportional to $f(h(v,w))$
– normalization in probability: $\dfrac{f(h(v,w))}{\sum_{x \ne v} f(h(v,x))}$
– out-degree in graph: $k = c \log^2 n$

the Critical Value

$\lim_{h \to \infty} \dfrac{f(h)}{b^{-\alpha' h}} = 0, \quad \forall \alpha' < \alpha$

$\lim_{h \to \infty} \dfrac{b^{-\alpha'' h}}{f(h)} = 0, \quad \forall \alpha'' > \alpha$

$f(h(v,w)) \sim b^{-\alpha h(v,w)}$
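The hierarchical link probabilities can be computed explicitly for a small tree. In this sketch the leaves are numbered 0..n-1 from left to right, so the least common ancestor is found by repeated integer division by b; that labeling, the function names, and the example with b = 2 and n = 8 are assumptions, and f is taken to be exactly $b^{-\alpha h}$ rather than only asymptotically so.

```python
def lca_height(v, w, b):
    """Height of the least common ancestor of leaves v and w in a complete b-ary tree."""
    height = 0
    while v != w:
        v //= b  # move both leaves up to their parents
        w //= b
        height += 1
    return height


def link_probabilities(v, n, b, alpha=1.0):
    """Normalized probability that an out-link of leaf v ends at each other leaf x."""
    weights = {x: b ** (-alpha * lca_height(v, x, b)) for x in range(n) if x != v}
    z = sum(weights.values())  # normalizing constant: sum over all x != v
    return {x: weight / z for x, weight in weights.items()}


# Binary tree with 8 leaves, seen from leaf 0.
probs = link_probabilities(0, 8, 2)
```

From leaf 0 the sibling (leaf 1) is the single most likely endpoint, yet each level of the tree receives the same total probability when alpha = 1, which is what gives distant subtrees a fair chance of being reached.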

Interpretation (1)

• /Science/Computer_Science/Algorithms

• /Arts/Music/Opera

• /Science/Computer_Science/Machine_Learning

Interpretation (2)

• Target: “stock broker @ Boston, MA”

• Next hop:
– “bishop @ Cambridge, MA”
– “banker @ New York City, NY”

Results

• $\alpha = 1 \Rightarrow O(\log n)$

• Otherwise, no polylogarithmic search

How to Search in HNM??

• Link probability with $\alpha = 1$: $f(h(v,w)) \sim b^{-h(v,w)}$

• Normalized: $\dfrac{f(h(v,w))}{\sum_{x \ne v} f(h(v,x))}$

• Out-degree: $k = c \log^2 n$

Useful Neighbor

$v \to t$

$v, t \in T$

$\mathrm{commonAncestor}(v, t) = u$

$\mathrm{Height}(T') = i,\ u \in T',\ \mathrm{root}(T') = u$

$\mathrm{Height}(T'') = i - 1,\ t \in T'' \wedge v \notin T''$

Is “v” useful to reach “t”?

[Figure: tree T with leaves v and t]


Useful Neighbor Recursively

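$v \to t$

$v, t \in T$

$\mathrm{commonAncestor}(v, t) = u$

$\mathrm{Height}(T') = i,\ u \in T',\ \mathrm{root}(T') = u$

$\mathrm{Height}(T'') = i - 1,\ t \in T'' \wedge v \notin T''$

Is “v” useful to reach “t”?

[Figure: tree T with nested subtrees T′ (rooted at u) and T″; nodes v, w, t]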

Search

• Find one “useful” neighbor in G as the next step

• What happens if there is NO useful neighbor?

• Expected steps to reach “t”.

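Probability to have 1 U.N.

$Z = \sum_{x \ne v} b^{-h(v,x)} = \sum_{j=1}^{\log n} (b-1)\, b^{j-1}\, b^{-j} \le \log n$

$T''$ has $b^{i-1}$ leaves.

One leaf: a given out-link of v ends at a particular leaf of $T''$ with probability $\dfrac{b^{-i}}{Z} \ge \dfrac{b^{-i}}{\log n}$, so it ends somewhere in $T''$ with probability at least

$\dfrac{b^{i-1} \times b^{-i}}{\log n} = \dfrac{1}{b \log n}$

All out-links: none of the $c \log^2 n$ out-links of v ends in $T''$ with probability at most

$\left(1 - \dfrac{1}{b \log n}\right)^{c \log^2 n} \le n^{-\theta}$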

HNM

• High probability to be useful

• How about “constant links”?

Group Structures

• R is a group; R′ is a strictly smaller subgroup:

$q = |R| \ge 2,\ v \in R \Rightarrow \exists R' : (v \in R' \subseteq R) \wedge (q = |R| > |R'| > \lambda q)$

• $R_1, R_2, R_3, \ldots$ all contain v, then:

$\forall i,\ (|R_i| \le q) \wedge (v \in R_i) \Rightarrow \left|\bigcup_i R_i\right| \le \beta q$

• q(v,w): minimum size of a group containing both v and w

How to Search in Group Structure??

• Link probability: $f(q(v,w)) \sim q(v,w)^{-\alpha}$

• Normalized: $\dfrac{f(q(v,w))}{\sum_{x \ne v} f(q(v,x))}$

• Out-degree: $k = c \log^2 n$

Idea

• (v, t)

• R is the minimum-sized group containing both v and t.

• With property (1):

$q = |R| \ge 2,\ v \in R \Rightarrow \exists R' : (v \in R' \subseteq R) \wedge (q = |R| > |R'| > \lambda q)$

• Then:

$\exists R' : (t \in R') \wedge (\lambda^2 |R| < |R'| < \lambda |R|)$

How to define “usefulness” of v?

Usefulness of v

• (v, t)

• R is the minimum-sized group containing both v and t.

• With property (1):

$q = |R| \ge 2,\ v \in R \Rightarrow \exists R' : (v \in R' \subseteq R) \wedge (q = |R| > |R'| > \lambda q)$

• Then:

$\exists R' : (t \in R') \wedge (\lambda^2 |R| < |R'| < \lambda |R|)$

$\exists x,\ (l(v, x) = 1) \wedge (x \in R')$


Probability to have 1 U.N.

$Z = \sum_{x \ne v} \dfrac{1}{q(v,x)} \le \sum_{j=1}^{\log_\beta n} \beta^{j+1}\, \beta^{-(j-1)} = \beta^2 \log_\beta n$

$\left(1 - \dfrac{\lambda^2}{\beta^2 \log_\beta n}\right)^{c \log^2 n} \le n^{-\theta}$

Results

• $\alpha = 1 \Rightarrow O(\log n)$

• Otherwise, no polylogarithmic search

Fixed Number of Out-Links

• Relax “t” to “a cluster of t”

[Figure: tree T whose leaves are clusters (Cl); nodes v, w, t, x]

$m = |L|$

$r = |\mathrm{Cluster}|$

$n = m \times r$

r: Resolution

Question #1

• Why can’t we just treat “Cluster” as “Super Node” and we go home (by applying the HNM results)?

[Figure: tree T whose leaves are clusters (Cl); nodes v, w, t, x]

$m = |L|$, $r = |\mathrm{Cluster}|$, $n = m \times r$

Not necessarily

[Figure: three clusters (Cl); nodes v, w, t, x, p, q]

Probability

$f(h(v,w)) \sim (h(v,w) + 1)^{-2}\, b^{-h(v,w)}$

$Z \le 2r$

Question #2

• For any out-link of v, what is the probability that the end point of the out-link is in the same cluster as v?

Answer

$(0 + 1)^{-2}\, b^{-0} = 1$

$\dfrac{1 \times r}{Z} \ge \dfrac{r}{2r} = \dfrac{1}{2}$

Results

• If the resolution is polylogarithmic, then the search is polylogarithmic if $\alpha = 1$.

A “Similar” Process

[Figure: tree T with nested subtrees T′ (rooted at u) and T″; nodes v, w, t]

Coloring the Links

Reading

• “Small World Phenomena and the Dynamics of Information” by J. Kleinberg, NIPS, 2001