ecs289m Spring, 2008
Social Network Models
S. Felix Wu
Computer Science Department
University of California, Davis
[email protected]
http://www.cs.ucdavis.edu/~wu/
SOURCE: Brandes, Raab and Wagner (2001)
<http://www.inf.uni-konstanz.de/~brandes/publications/brw-envsd-01.pdf>
Organization Chart
Social Network Analysis
“Structural relationships” as explanations:
• Network
• Formation
• Influence and collective actions
03/14/2008 Davis Social Links 9
Social Network Analysis
1. Degree Centrality: The number of direct connections a node has. Raw counts are only part of the picture: what really matters is where those connections lead and how they connect the otherwise unconnected.
2. Betweenness Centrality: A node with high betweenness lies on many of the paths between other nodes, so it has great influence over what flows in the network. High betweenness indicates important links and potential single points of failure.
3. Closeness Centrality: Measures how close a node is to everyone else. The pattern of direct and indirect ties lets such nodes reach any other node in the network more quickly than anyone else; they have the shortest paths to all others.
4. Eigenvector Centrality: It assigns relative scores to all nodes in the network based on the principle that connections to high-scoring nodes contribute more to the score of the node in question than equal connections to low-scoring nodes.
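The measures above can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the graph is assumed to be undirected and connected and is stored as a dict of neighbor sets, all function names are made up for this example, and betweenness is left out for brevity. The eigenvector score uses power iteration on A + I (each node keeps its own score), an assumed variant that has the same ranking but avoids oscillation on bipartite graphs such as the star below.

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def degree_centrality(adj):
    """Number of direct connections of each node."""
    return {v: len(adj[v]) for v in adj}

def closeness_centrality(adj):
    """(n - 1) / (sum of distances to all others); assumes a connected graph."""
    n = len(adj)
    result = {}
    for v in adj:
        total = sum(bfs_distances(adj, v).values())
        result[v] = (n - 1) / total if total > 0 else 0.0
    return result

def eigenvector_centrality(adj, iterations=100):
    """Power iteration on A + I, rescaled so the top score is 1."""
    score = {v: 1.0 for v in adj}
    for _ in range(iterations):
        new = {v: score[v] + sum(score[w] for w in adj[v]) for v in adj}
        top = max(new.values()) or 1.0
        score = {v: s / top for v, s in new.items()}
    return score

# Example: a star with hub "h" and leaves a, b, c.
star = {"h": {"a", "b", "c"}, "a": {"h"}, "b": {"h"}, "c": {"h"}}
```

On the star, the hub comes out on top under all three measures, which matches the intuition that it is both well connected and close to everyone.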
Random Graphs
• G(n, p): n nodes, and each possible edge is present independently with probability p
• When p < 1/n: only small, disconnected components
• When p is sufficiently large (p > 1/n): one giant component
• How about the diameter?
– The maximum distance (in hops) between any two nodes.
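A small experiment makes the component threshold visible. The sketch below is an assumed, unoptimized G(n, p) sampler (O(n^2) coin flips) plus a BFS that returns component sizes; the function names and the seed parameter are illustrative choices.

```python
import random
from collections import deque

def gnp(n, p, seed=None):
    """Sample G(n, p): each of the n*(n-1)/2 possible edges appears with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def component_sizes(adj):
    """Sizes of all connected components, largest first."""
    seen = set()
    sizes = []
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        sizes.append(size)
    return sorted(sizes, reverse=True)

if __name__ == "__main__":
    n = 500
    # Compare the largest component below and above the p = 1/n threshold.
    for p in (0.5 / n, 2.0 / n, 4.0 / n):
        print(p, component_sizes(gnp(n, p, seed=1))[0])
```

Running it for a few values of p around 1/n shows the largest component jumping from a handful of nodes to a constant fraction of the graph.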
Random Graph (Erdős/Rényi)
• Probabilistically, each node has $(N-1)p$ direct neighbors ~ $Z$
• $Z^D = N$ (D is the diameter)
• $D = \log N / \log Z$
• In two hops, each node will have $Z^2$ neighbors with (equal) probability?
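As a quick worked example of the estimate D = log N / log Z, the helper below plugs in numbers. The function name is hypothetical, and the estimate is only the back-of-the-envelope argument from the slide (it assumes Z > 1 and ignores overlap between neighborhoods).

```python
import math

def estimated_diameter(n, p):
    """Heuristic diameter of G(n, p): solve Z**D = N with Z = (N - 1) * p."""
    z = (n - 1) * p
    if z <= 1:
        raise ValueError("the estimate needs an average degree Z > 1")
    return math.log(n) / math.log(z)

# About 1000 nodes with roughly 10 neighbors each: around 3 hops.
print(estimated_diameter(1001, 0.01))
```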
Small World Model
• Low Diameter
– Logarithmic or poly-logarithmic in N
• “High” Cluster Coefficient
– cluster coefficient: the fraction of pairs of X’s neighbors that are directly connected to each other
Cluster Coefficient
• Mesh network (full mesh): $C_{cluster} = 1$
• Lattice Network (with degree K): $C_{cluster} = 0$
– E.g., a linear line
• How about $C_{cluster}$ for a Random Graph?
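The two extreme cases above can be checked directly. The sketch below assumes the usual local definition (fraction of a node's neighbor pairs that are themselves linked), averages it over all nodes, and treats nodes with fewer than two neighbors as having coefficient 0; the names are illustrative.

```python
from itertools import combinations

def local_clustering(adj, v):
    """Fraction of pairs of v's neighbors that are directly connected."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0  # convention assumed here for degree 0 or 1
    linked = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return linked / (k * (k - 1) / 2)

def average_clustering(adj):
    """Mean of the local coefficients over all nodes."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)

# A full mesh on 5 nodes and a 4-node line.
complete = {v: {w for w in range(5) if w != v} for v in range(5)}
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

The full mesh gives 1 and the line gives 0, as stated on the slide.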
Re-wiring (Watts/Strogatz)
Trade-off between D and $C_{cluster}$!
Structured/Clustered
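The re-wiring idea can be sketched as follows: start from a ring lattice and move one endpoint of each edge to a random node with probability beta. This is a simplified variant written for illustration (the parameter names `k` and `beta` and the exact re-wiring rule are assumptions, not taken from the slides); it keeps the number of edges fixed and avoids self-loops and duplicate edges.

```python
import random

def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbors (k even, k < n)."""
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    return adj

def rewire(adj, beta, seed=None):
    """With probability beta, replace edge (u, v) by (u, w) for a random new w."""
    rng = random.Random(seed)
    n = len(adj)  # nodes are assumed to be numbered 0 .. n-1
    new = {v: set(nbrs) for v, nbrs in adj.items()}
    edges = [(u, v) for u in adj for v in adj[u] if u < v]
    for u, v in edges:
        if rng.random() < beta:
            candidates = [w for w in range(n) if w != u and w not in new[u]]
            if candidates:
                w = rng.choice(candidates)
                new[u].discard(v)
                new[v].discard(u)
                new[u].add(w)
                new[w].add(u)
    return new
```

With beta = 0 the lattice is returned unchanged (structured, clustered, large D); as beta grows, shortcuts reduce D while clustering decays.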
A Cycle plus a Random Matching
• A dual combinatorial problem:
– For given integers n and k, find a graph on n vertices with maximum degree k and the smallest possible diameter.
– For given integers k and D, find a graph with bounded degree k and diameter at most D, having as many vertices as possible.
• How?
A Cycle plus a Random Matching
• Cycle & Random “disjoint” matching
• Bollobás/Chung: $\log N < D(G) < \log N + \log\log N$
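To get a feel for the construction, the sketch below builds a cycle on n nodes, adds a random perfect matching, and measures the diameter by BFS from every node. It is an illustrative experiment only: function names are made up, n is assumed even, and a matching edge that happens to coincide with a cycle edge is simply absorbed (so a few nodes may end up with degree 2 instead of 3).

```python
import random
from collections import deque

def cycle_plus_matching(n, seed=None):
    """Cycle on n nodes (n even) plus one random perfect matching."""
    rng = random.Random(seed)
    adj = {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}
    nodes = list(range(n))
    rng.shuffle(nodes)
    for i in range(0, n, 2):
        a, b = nodes[i], nodes[i + 1]
        adj[a].add(b)
        adj[b].add(a)
    return adj

def diameter(adj):
    """Largest BFS distance over all start nodes (graph assumed connected)."""
    best = 0
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        best = max(best, max(dist.values()))
    return best

if __name__ == "__main__":
    # Diameter of the plain cycle would be n / 2; the matching shrinks it a lot.
    for n in (64, 256, 1024):
        print(n, diameter(cycle_plus_matching(n, seed=1)))
```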
Degree Centrality
• Degree distribution and the expected number of neighbors
– Random graph (Poisson Distribution)
• Power-law tail for real-world networks
– $P(k) \sim k^{-r}$
– Scale-free: invariant to the size of N
Zipf’s law
• Discrete Power-Law
• Ranking in the frequency table
– {“the” (7%), “of” (3.5%), “and”, …}
• $f(k; s, N) = \frac{k^{-s}}{\sum_{n=1}^{N} n^{-s}}$
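The formula is easy to evaluate directly. The helper below is a plain transcription of f(k; s, N); the function name is an illustrative choice.

```python
def zipf_pmf(k, s, n):
    """Zipf probability of rank k among n ranks with exponent s."""
    norm = sum(i ** -s for i in range(1, n + 1))
    return k ** -s / norm

# With s = 1, rank 1 is twice as frequent as rank 2, matching the
# "the" (7%) versus "of" (3.5%) pattern.
ratio = zipf_pmf(1, 1.0, 1000) / zipf_pmf(2, 1.0, 1000)
```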
Two Issues about Low Diameters
• Why should there exist short chains of acquaintances linking together arbitrary pairs of strangers?
• Why should arbitrary pairs of strangers be able to find the short chains of acquaintances that link them together?
p, q, r
• p: lattice distance within which a node is linked to all its local neighbors
• q: number of long-range contacts
• r: exponent of the inverse-distance probability, $[d(u,v)]^{-r}$
– What is the intuition about r?
– What about r = 0?
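One way to build intuition for r is to compute the long-range contact distribution explicitly. The sketch below assumes a square side x side grid with Manhattan distance and picks v with probability proportional to d(u,v)^(-r); the grid shape and all names are assumptions made for this example.

```python
import random

def lattice_distance(u, v):
    """Manhattan distance between two grid points."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])

def long_range_distribution(u, side, r):
    """Probability of each node v != u being u's long-range contact."""
    weights = {}
    for x in range(side):
        for y in range(side):
            v = (x, y)
            if v != u:
                weights[v] = lattice_distance(u, v) ** -r
    total = sum(weights.values())
    return {v: w / total for v, w in weights.items()}

def sample_long_range(u, side, r, rng):
    """Draw one long-range contact for u."""
    dist = long_range_distribution(u, side, r)
    nodes = list(dist)
    return rng.choices(nodes, weights=[dist[v] for v in nodes], k=1)[0]
```

With r = 0 every node is equally likely (distance is ignored); larger r concentrates the contacts near u.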
Kleinberg’s results
A decentralized routing problem
– For nodes s, t with known lattice coordinates, find a short path from s to t.
– At any step, can only use local information.
– Kleinberg suggests a simple greedy algorithm and analyzes it:
Local Information
• Local contacts
• Coordinates of the target
• The locations and long-range contacts of all nodes that have come in contact with the message.
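A greedy router that uses only this local information can be sketched as below. It assumes a square grid with p = 1 (four nearest neighbors as local contacts) and takes the long-range contacts as a given dict; at each step the message goes to the known contact closest to the target. Names and the data layout are illustrative.

```python
def lattice_distance(u, v):
    """Manhattan distance between two grid points."""
    return abs(u[0] - v[0]) + abs(u[1] - v[1])

def local_contacts(u, side):
    """Grid neighbors at lattice distance 1 (the p = 1 case)."""
    x, y = u
    candidates = [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]
    return [(a, b) for a, b in candidates if 0 <= a < side and 0 <= b < side]

def greedy_route(s, t, side, long_range):
    """Forward to the contact nearest to t; long_range maps node -> list of contacts."""
    path = [s]
    current = s
    while current != t:
        options = local_contacts(current, side) + list(long_range.get(current, []))
        # Some local contact is always one step closer, so the distance shrinks.
        current = min(options, key=lambda v: lattice_distance(v, t))
        path.append(current)
    return path
```

Without long-range contacts the route length equals the lattice distance; each useful shortcut can only make it shorter.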
Results
• If r = 0, expected delivery time is at least $a_0 n^{2/3}$.
– Lower bound
• If r = 2, p = q = 1: at most $a_2 (\log n)^2$
– Martel/Nguyen’s newer results
• 0 <= r < 2: at least $a_r n^{(2-r)/3}$
• r > 2: at least $a_r n^{(r-2)/(r-1)}$
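To see why r = 2 is special, it helps to tabulate the exponent of n in these bounds. The helper below (a hypothetical name) returns that exponent, using 0 for r = 2 where the delivery time is polylogarithmic rather than polynomial.

```python
def delivery_exponent(r):
    """Exponent of n in the delivery-time bound for a given r >= 0."""
    if r < 0:
        raise ValueError("r must be non-negative")
    if r < 2:
        return (2 - r) / 3
    if r == 2:
        return 0.0  # (log n)^2: no polynomial factor
    return (r - 2) / (r - 1)

if __name__ == "__main__":
    for r in (0, 1, 1.5, 2, 2.5, 3, 4):
        print(r, round(delivery_exponent(r), 3))
```

The exponent falls to 0 as r approaches 2 from either side and grows again beyond it.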
Skip Lists
• The basic idea:
• Keep a doubly-linked list of elements
– Min, max, successor, predecessor: O(1) time
– Delete is O(1) time, Insert is O(1)+Search time
• During insert, add each level-i element to level i+1 with probability p (e.g., p = 1/2 or p = 1/4)
[Figure: a skip list with levels 1 to 3; level 1 holds the keys 3, 9, 12, 18, 29, 35, 37]
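A compact skip list along these lines might look as follows. This is a simplified sketch, not the structure from the slide: it uses singly-linked forward pointers instead of a doubly-linked list, numbers levels from 0, caps the height at an assumed `max_height`, and omits deletion.

```python
import random

class Node:
    def __init__(self, key, height):
        self.key = key
        self.forward = [None] * height  # forward[i]: next node on level i

class SkipList:
    def __init__(self, p=0.5, max_height=16, seed=None):
        self.p = p
        self.max_height = max_height
        self.rng = random.Random(seed)
        self.head = Node(None, max_height)  # sentinel placed before all keys

    def _random_height(self):
        """Promote to the next level with probability p, repeatedly."""
        height = 1
        while height < self.max_height and self.rng.random() < self.p:
            height += 1
        return height

    def _predecessors(self, key):
        """Per level, the last node whose key is smaller than key (top-down walk)."""
        preds = [self.head] * self.max_height
        node = self.head
        for level in range(self.max_height - 1, -1, -1):
            while node.forward[level] is not None and node.forward[level].key < key:
                node = node.forward[level]
            preds[level] = node
        return preds

    def search(self, key):
        candidate = self._predecessors(key)[0].forward[0]
        return candidate is not None and candidate.key == key

    def insert(self, key):
        preds = self._predecessors(key)
        node = Node(key, self._random_height())
        for level in range(len(node.forward)):
            node.forward[level] = preds[level].forward[level]
            preds[level].forward[level] = node

    def keys(self):
        """All keys in sorted order, read off the bottom level."""
        out = []
        node = self.head.forward[0]
        while node is not None:
            out.append(node.key)
            node = node.forward[0]
        return out
```

Inserting the keys from the figure in any order leaves the bottom level sorted, and a search skips along the sparse upper levels before dropping down.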
Skip Graphs
• Based on “skip list”:
– A randomized balanced tree structure organized as a tower of increasingly sparse linked lists
– All nodes join the linked list of level 0
– For other levels, each node joins with a fixed probability p
– Each node has 2/(1-p) pointers on average
– Average search time: $O\left(\frac{\log n}{(1-p)\log(1/p)}\right)$
Skip Graph:
• Skip List is not suitable for P2P environment
– No redundancy, Hotspot problem
– Vulnerable to failure and contention
• Skip Graph: Extension of Skip List
– Level 0 linked list builds a Chord ring
– Multiple (max $2^i$) lists for level i (i = 1, …, log n)
– Each node participates in all levels, but in different lists
– Membership vector m(x): decides which list to join
– Every node sees its own skip list
Degree Optimal P2P Routing
• Different routing schemes
– Viceroy [MNR02]: emulates the butterfly network
  • Constant degree
  • O(log n) hops for routing
– Constructions emulating De Bruijn graphs
  • Can achieve any degree/number-of-hops tradeoff
  – In particular degree O(log n) and O(log n / log log n) hops
• Routing is not greedy
– Recent construction [AM] fixes that.
  • Even if target and source are close in label space, the message might be routed away
• No (natural) prefix search
– Random keys are necessary.
Skip – Graphs [AS02],[HDJ+03]
• Each node (resource) has a name.
• Nodes are arranged on a line sorted by name.
• Each node chooses a random string of bits.
• An edge is established if two nodes share a prefix which is not shared by the nodes between them.
• Allows prefix search.
[Figure: nodes a to f on a line, each labeled with its random bit string]
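The edge rule can be turned into a short routine: scan the nodes in name order and, for every prefix length, link each node to the previous node carrying the same prefix. The sketch below assumes equal-length bit strings and returns edges tagged with the prefix length (level) that created them; the names and the tiny example are made up.

```python
def skip_graph_edges(names, vectors):
    """names: list sorted by name; vectors: name -> random bit string."""
    edges = set()
    depth = len(next(iter(vectors.values()))) if vectors else 0
    for level in range(depth + 1):
        last_seen = {}  # prefix -> most recent node (in name order) with that prefix
        for name in names:
            prefix = vectors[name][:level]
            if prefix in last_seen:
                # No node between them shares this prefix, so they are linked.
                edges.add((last_seen[prefix], name, level))
            last_seen[prefix] = name
    return edges

example = skip_graph_edges(["a", "b", "c"], {"a": "00", "b": "10", "c": "01"})
```

Level 0 (empty prefix) links every pair of adjacent names; at level 1, a and c share the prefix "0" while b does not, which creates the longer edge a to c.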
Routing in Skip – Graphs
• Greedy Routing – use longest edge possible.
• Path length is Θ(log n) w.h.p.
• The NoN algorithm optimizes over two hops.
[Figure: skip graph nodes labeled with their random bit strings]
Theorem: Using the NoN algorithm, the expected path length of any lookup is .
Kleinberg’s Lattice Model
• Graph embedded in a metric space (e.g., 2D lattice)
• “Search efficiently” using only local information + long-range contact(s)
– ~ inverse probability $[d(u,v)]^{-r}$
– r = 2, a special case
Some Extensions
• Hierarchical Network Models
• Group Structure Models
• Constant Number of Out-Links
“Small World Phenomena and the Dynamics of Information” by J. Kleinberg, NIPS, 2001
Generation & Search
• There is a data structure behind and among all the social peers
– Lattice, Tree, Group/Community
• The link probability depends on this “social data structure”
– And it is used to generate the social network
• Searching may use “direct contacts” plus the knowledge about the social data structure
Hierarchical Network Models
• Representation
– a complete b-ary tree, T
– All social nodes are “leaves”
• Distance and Link Probability
– $h(v,w)$ = the height of the least common ancestor of v and w in T
– probability proportional to $f(h(v,w))$
– normalization in probability: $\frac{f(h(v,w))}{\sum_{x \ne v} f(h(v,x))}$
– out-degree in graph: $k = c \log^2 n$
the Critical Value
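• $\lim_{h \to \infty} \frac{f(h)}{b^{-\alpha' h}} = 0, \quad \forall \alpha' < \alpha$
• $\lim_{h \to \infty} \frac{b^{-\alpha'' h}}{f(h)} = 0, \quad \forall \alpha'' > \alpha$
• $f(h(v,w)) \sim b^{-\alpha h(v,w)}$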
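For a concrete feel of the model, the sketch below numbers the leaves of a complete b-ary tree 0 .. n-1 from left to right, computes h(v,w), and normalizes the weights b^(-alpha*h) into link probabilities. The leaf numbering, the exact choice f(h) = b^(-alpha*h), and the function names are assumptions for illustration.

```python
def lca_height(v, w, b):
    """Height of the least common ancestor of leaves v and w in a complete b-ary tree."""
    h = 0
    while v != w:
        v //= b  # move both leaves up to their parents
        w //= b
        h += 1
    return h

def link_probabilities(v, n, b, alpha=1.0):
    """Probability that a single out-link of v ends at each other leaf x."""
    weights = {x: b ** (-alpha * lca_height(v, x, b)) for x in range(n) if x != v}
    total = sum(weights.values())
    return {x: wt / total for x, wt in weights.items()}

# Binary tree with 8 leaves: leaf 1 is a sibling of leaf 0, leaf 4 is in the other half.
probs = link_probabilities(0, 8, 2)
```

Close relatives in the tree get the largest share, but each farther level contains more leaves, so with alpha = 1 every level receives a comparable total weight.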
Interpretation (1)
• /Science/Computer_Science/Algorithms
• /Arts/Music/Opera
• /Science/Computer_Science/Machine_Learning
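Reading the three topics as leaves of a directory tree, the tree distance h is the number of levels one has to climb to reach the first shared ancestor. The helper below is an illustrative sketch (hypothetical name) that assumes all topics sit at the same depth.

```python
def topic_height(path_a, path_b):
    """Levels to climb from two same-depth topics to their common ancestor."""
    a = path_a.strip("/").split("/")
    b = path_b.strip("/").split("/")
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return max(len(a), len(b)) - common

algorithms = "/Science/Computer_Science/Algorithms"
opera = "/Arts/Music/Opera"
learning = "/Science/Computer_Science/Machine_Learning"
```

Here Algorithms and Machine_Learning are at height 1 (same parent), while either of them and Opera only meet at the root, at height 3.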
Interpretation (2)
• Target: “stock broker @ Boston, MA”
• Next hop:
– “bishop @ Cambridge, MA”
– “banker @ New York City, NY”
How to Search in HNM??
• Link probability: $f(h(v,w)) \sim b^{-h(v,w)}$
• Normalization: $\frac{f(h(v,w))}{\sum_{x \ne v} f(h(v,x))}$
• Distance: $h(v,w)$
• Out-degree: $k = c \log^2 n$
Useful Neighbor
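• $v \to t$
• $v, t \in T$
• $\mathrm{commonAncestor}(v, t) = u$
• $\mathrm{Height}(T') = i$, $u \in T'$, $\mathrm{root}(T') = u$
• $\mathrm{Height}(T'') = i - 1$, $t \in T''$, $v \notin T''$
• Is “v” useful to reach “t”?
[Figure: tree T with leaves v, w, and t; subtree T′ rooted at u; subtree T″]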
Useful Neighbor Recursively
[Figure: the same tree T with subtrees T′ (rooted at u) and T″; leaves v, w, and t, with w drawn next to t]
Search
• Find one “useful” neighbor in G as the next step
• What happens if there is NO useful neighbor?
• Expected steps to reach “t”.
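The search procedure can be simulated on a small tree. The sketch below is one assumed reading of it: leaves of a complete b-ary tree are numbered 0 .. n-1, each node draws k = c log^2 n out-links with probability proportional to b^(-alpha*h), and a neighbor counts as useful if its common ancestor with t is strictly lower than the current node's. The search reports None when it gets stuck; all names are illustrative.

```python
import math
import random

def lca_height(v, w, b):
    """Height of the least common ancestor of leaves v and w."""
    h = 0
    while v != w:
        v //= b
        w //= b
        h += 1
    return h

def build_out_links(n, b, c, alpha, rng):
    """Give every leaf k = c * log2(n)^2 out-links, sampled with replacement."""
    k = max(1, round(c * math.log2(n) ** 2))
    links = {}
    for v in range(n):
        others = [x for x in range(n) if x != v]
        weights = [b ** (-alpha * lca_height(v, x, b)) for x in others]
        links[v] = rng.choices(others, weights=weights, k=k)
    return links

def greedy_search(s, t, links, b):
    """Follow useful neighbors toward t; return the step count, or None if stuck."""
    steps = 0
    current = s
    while current != t:
        here = lca_height(current, t, b)
        useful = [x for x in links[current] if lca_height(x, t, b) < here]
        if not useful:
            return None  # no useful neighbor at this node
        current = min(useful, key=lambda x: lca_height(x, t, b))
        steps += 1
    return steps
```

Because every step lowers the height of the common ancestor with t by at least one, a successful search takes at most log_b n steps in this sketch.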
Probability to have 1 U.N.
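• Normalization: $Z = \sum_{x \ne v} b^{-h(v,x)} = \sum_{j=1}^{\log n} (b-1)\, b^{j-1}\, b^{-j} \le \log n$
• $T''$ contains $b^{i-1}$ leaves
• One leaf: an out-link of v ends at a given leaf of $T''$ with probability at least $\frac{b^{-i}}{\log n}$
• One out-link ends somewhere in $T''$ with probability at least $\frac{b^{i-1} \times b^{-i}}{\log n} = \frac{1}{b \log n}$
• All out-links: they all miss $T''$ with probability at most $\left(1 - \frac{1}{b \log n}\right)^{c \log^2 n} \le n^{-\theta}$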
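The last inequality follows from the standard bound $1 - x \le e^{-x}$. A short derivation, assuming for simplicity that $\log$ is the natural logarithm (another base only changes $\theta$ by a constant factor):

```latex
\begin{align*}
\left(1 - \frac{1}{b \ln n}\right)^{c \ln^2 n}
  &\le \exp\!\left(-\frac{c \ln^2 n}{b \ln n}\right)
     && \text{since } 1 - x \le e^{-x} \\
  &= \exp\!\left(-\frac{c}{b} \ln n\right) \\
  &= n^{-c/b}.
\end{align*}
```

Under that assumption $\theta = c/b$, so choosing the constant $c$ large enough makes the chance of having no useful neighbor polynomially small.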
Group Structures
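• R is a group; $R'$ is a strictly smaller subgroup:
$q = |R| \ge 2,\ v \in R \Rightarrow (v \in R' \subseteq R) \wedge (q = |R| > |R'| > \lambda q)$
• $R_1, R_2, R_3, \ldots$ all contain v, then:
$\forall i,\ (|R_i| \le q) \wedge (v \in R_i) \Rightarrow \left|\bigcup_i R_i\right| \le \beta q$
• q(v,w): minimum size of a group containing both v and w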
How to Search in Group Structure??
• Link probability: $f(q(v,w)) \sim q(v,w)^{-\alpha}$
• Normalization: $\frac{f(q(v,w))}{\sum_{x \ne v} f(q(v,x))}$
• Distance: $q(v,w)$
• Out-degree: $k = c \log^2 n$
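The same recipe as in the tree model works with groups in place of subtrees. The sketch below represents groups as plain Python sets, computes q(v,w) as the size of the smallest group containing both nodes, and normalizes q^(-alpha) into link probabilities. Names and the toy groups are illustrative, and pairs that share no group simply get no link here.

```python
def min_group_size(v, w, groups):
    """q(v, w): size of the smallest group containing both v and w (None if none does)."""
    sizes = [len(g) for g in groups if v in g and w in g]
    return min(sizes) if sizes else None

def group_link_probabilities(v, nodes, groups, alpha=1.0):
    """Probability that a single out-link of v ends at each other node."""
    weights = {}
    for x in nodes:
        if x == v:
            continue
        q = min_group_size(v, x, groups)
        if q is not None:
            weights[x] = q ** -alpha
    total = sum(weights.values())  # assumes v shares a group with at least one node
    return {x: wt / total for x, wt in weights.items()}

# Toy example: a pair {a, b} inside a larger group {a, b, c, d}.
groups = [{"a", "b"}, {"a", "b", "c", "d"}]
p = group_link_probabilities("a", ["a", "b", "c", "d"], groups)
```

In the toy example, b shares a group of size 2 with a and is therefore twice as likely to be linked as c or d, which only share the group of size 4.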
Idea
• (v, t)
• R is the minimum-sized group containing both v and t.
• With property (1):
$q = |R| \ge 2,\ v \in R \Rightarrow (v \in R' \subseteq R) \wedge (q = |R| > |R'| > \lambda q)$
• Then:
$\exists R' \Rightarrow (t \in R') \wedge (\lambda^2 |R| < |R'| < \lambda |R|)$
How to define “usefulness” of v?
Usefulness of v
• With R and $R'$ defined as on the previous slide, v is useful if:
$\exists x,\ (l(v, x) = 1) \wedge (x \in R')$
Probability to have 1 U.N.
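• $Z = \sum_{x \ne v} \frac{1}{q(v,x)} \le \sum_{j=1}^{\log_\beta n} \beta^{j+1} \beta^{-(j-1)} = \beta^2 \log_\beta n$
• $\left(1 - \frac{\lambda^2}{\beta^2 \log_\beta n}\right)^{c \log^2 n} \le n^{-\theta}$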
Fixed Number of Out-Links
• Relax “t” to “a cluster of t”
[Figure: tree T with clusters (Cl) at the leaves; nodes v, w, t, x]
• $m = |L|$, $r = |\mathrm{Cluster}|$, $n = m \times r$
• r: Resolution
Question #1
• Why can’t we just treat “Cluster” as “Super Node” and we go home (by applying the HNM results)?
Question #2
• For any out-link of v, what is the probability that the end point of the out-link is in the same cluster of v?
Results
• If the resolution is polylogarithmic and α = 1, then the search is polylogarithmic.