A Mini-Course on Network Science

Pavel Loskot, ©2014

Posted on 23-Jan-2017



Course Outline

1. Introduction

• fundamentals of complex systems and graph theory

2. Structure

• sub-graphs, centrality measures, weighted networks, community

3. Random Models

• random, small world and scale free networks

4. Robustness

• some definitions and metrics

5. Processes

• epidemic spreading and information cascades

6. Algorithms

• max flow and min cut, routing, search and navigation

7. Software

• using Matlab and Python, available software, a few demos from YouTube


Used Resources

Ernesto Estrada

The Structure of Complex Networks: Theory and Applications

Oxford University Press, 2011

Cecilia Mascolo

Social and Technological Network Analysis

Course at University of Cambridge, UK

Jari Saramaki

Introduction to Complex Networks

Aalto University, Finland

Animesh Mukherjee

Complex Network Theory

IIT Kharagpur, India


Robert Leese

An Introduction to Clustering

Industrial Mathematics Knowledge Transfer Network

Kevin Wayne

Max Flow, Min Cut

Princeton University, USA

James F. Kurose and Keith W. Ross

Computer Networking, A Top-Down Approach

Pearson Education, 2012


various topics

Networks: Introduction

Complex Systems

Emergence of complexity:

• locally simple rules, and yet globally complex behavior

• systems evolve, are dynamic and adapt to the environment

• infinitely many possibilities

• normally data-driven, but what data to collect?

Emergence of stochasticity:

God doesn’t play dice with the world.

• many entities, complex interactions

• often useful to describe observations statistically (joint PDF, correlations)

• human beings live at the edge of the stochastic and deterministic worlds

Illustration of Complexity

Simple idea:

• send packets between two nodes

• how to distinguish end-nodes?

• how to find the route?

• how to share network (resources) among billions of end-nodes?

• how to deal with lost and delayed packets?

• how to deal with mobility and nodes leaving and arriving?

• evolution - solve problems iteratively

• separation - divide and conquer

• new problems emerge as the network grows: scalability, stability, security


Emergence of Order

Differing (spatial-temporal) perspectives

• insider: interacting with immediate neighbors (immediate, local)

• outsider: system level perception (average, global)

Description of Networks

1. Complete: everybody connected with everybody else

2. Random: connections selected arbitrarily at random

3. Random tree: connections selected arbitrarily at random, no cycles allowed

4. Real-world networks:

• exponential degree distribution and strongly disassortative

• small average path length and high clustering coefficient

• several nodes with high (degree, closeness and betweenness) centrality

• several main communities

• . . . and many other distinctive characteristics

Challenge: how to synthesize real-world networks with all these properties?

Formal Definitions

Network

• graph model of functional and/or structural relationships of a complex system

Time-invariant network

• graph $G = (V, E)$ with the set of nodes $V = \{v_1, \ldots, v_N\}$ and the set of edges (links) $E = \{e_1, \ldots, e_L\} \subseteq V \otimes V$, i.e., every edge $e_l \in E$ is associated with one pair $(v_i, v_j) \in V \otimes V$; in other words, $E$ is a set of (un)ordered pairs from $V$

• let’s not allow self-edges ($[v_i, v_i] \notin E$) or duplicate edges ($E$ has unique elements)

• nodes and edges are objects, but for analysis and evaluation purposes we need numbers, i.e., we assign numbers (called weights) to nodes and edges

$v_n \to W_v(n)$, $e_l \to W_e(l)$, $e_l = [v_i, v_j] \to W_e(i, j)$

Dynamic networks

• graphs (nodes, edges) as well as weights can vary over time

Formal Definitions (cont.)

Graph edges (in structural models)

• exist only if two nodes communicate; this communication can be implemented in many different ways (radiation, material transport flows, . . .)

• communicating nodes interact, i.e., influence each other’s behavior

• communications are, first of all, information flows

Two nodes communicate

• if enough information is delivered (just being sent is not enough) over a given time window, i.e., communication is an integral (average) quantity

• delivered information may be ignored, not recognized, or misinterpreted

Fundamentals (of Graph Theory)

(Un)directed graphs

• for directed graphs, $E$ is a set of ordered pairs $[u, v] \in V \otimes V$

Neighbors, degrees

• $u$ is a neighbor of $v$ if $(u, v) \in E$; then $u$ and $v$ are said to be adjacent nodes

• $(u,w)$, $(v,w)$, $(y,w)$ and $(x,w)$ are adjacent edges which are incident at $w$

• in-degree $k_{in}$, out-degree $k_{out}$ and degree $k = k_{in} + k_{out}$; degree distributions are important statistics (this assumes all edges are counted with unit weights)

Isomorphic graphs

• $G_1$ and $G_2$ are isomorphic if there is a one-to-one mapping of their vertices that preserves the (possibly directed) edges (i.e., they are different visualizations of the same graph)

Edge (or connection) density

$\rho = \frac{|E|}{\binom{|V|}{2}} = \frac{2|E|}{|V|(|V|-1)}$, where $\binom{|V|}{2} = \frac{|V|(|V|-1)}{2}$ is the maximum possible number of edges

• ρ = 1 if fully connected, real-world network ρ ≪ 1 (i.e. sparse)

• graph is sparse if $|E| \approx |V|$, graph is dense if $|E| \approx |V|^2$


• $G' = (V', E')$ is a subgraph of $G = (V, E)$ if $V' \subseteq V$ and $E' \subseteq E$

• clique is a maximal, completely connected subgraph of the graph

• N-clique is a fully connected subgraph with N vertices

• clique number is the size of the largest clique in the graph

Path, walk, trail

• path from $v_1$ to $v_L$ is an ordered sequence of edges between an ordered list of vertices such that no vertex is visited twice

• length of path is the number of its edges (i.e. assuming edges of unit length)

• if there is no path between two vertices, their path length is infinite

• distance of two vertices is their shortest path (having the smallest length)

• walk of length $L$ from $v_1$ to $v_{L+1}$ is a sequence $[v_1, \ldots, v_{L+1}]$ where consecutive vertices (and only those) are required to be different

• trail is a walk with no repeated edge

• cycle is a path that starts and ends at the same vertex

Diameter (d) and radius (r) of a graph

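• the largest and the smallest eccentricity, respectively, where the eccentricity of a vertex $u$ is its largest distance to any other vertex:

$d = \max_{u \in V} \max_{v \in V} \mathrm{distance}(u, v)$, $r = \min_{u \in V} \max_{v \in V} \mathrm{distance}(u, v)$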

Average path length

• it is the average shortest path between all pairs of vertices

$\bar{d} = \frac{1}{\binom{|V|}{2}} \sum_{\{u,v\} \subseteq V,\ u \neq v} \mathrm{distance}(u, v)$

• if some vertices $u$ and $v$ are disconnected (i.e., no path connecting $u$ and $v$), the average path length is the harmonic mean instead (the reciprocal of the average reciprocal distance)

$\bar{d} = \left( \frac{1}{\binom{|V|}{2}} \sum_{\{u,v\} \subseteq V,\ u \neq v} \frac{1}{\mathrm{distance}(u, v)} \right)^{-1}$
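The distance-based quantities above can be checked on a small example. The following is a minimal sketch, not part of the original slides: it assumes an unweighted graph stored as a hypothetical adjacency dictionary with at least two vertices, uses breadth-first search for the shortest paths, and takes the radius as the smallest eccentricity (the usual textbook definition).

```python
from collections import deque

def bfs_distances(adj, source):
    """Hop distances from source; unreachable vertices are simply absent."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def path_statistics(adj):
    """Return (diameter, radius, average path length, harmonic-mean path length)."""
    nodes = list(adj)
    n = len(nodes)
    inf = float("inf")
    eccentricity = {}
    total, inverse_total = 0.0, 0.0
    for u in nodes:
        dist = bfs_distances(adj, u)
        eccentricity[u] = max(dist.get(v, inf) for v in nodes)
        for v in nodes:
            if v != u:
                d = dist.get(v, inf)
                total += d
                inverse_total += 1.0 / d  # 1/inf evaluates to 0.0
    pairs = n * (n - 1)  # ordered pairs; same average as over unordered pairs
    harmonic = pairs / inverse_total if inverse_total > 0 else inf
    return (max(eccentricity.values()), min(eccentricity.values()),
            total / pairs, harmonic)

# Five-node chain A-B-C-D-E (illustrative data).
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
diameter, radius, average, harmonic = path_statistics(chain)
```

For a disconnected graph the plain average becomes infinite, while the harmonic-mean version stays finite, which is the reason the slide switches to it.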


Graph coloring

• assign labels to vertices, so that no adjacent vertices get the same label

• chromatic number is the minimum number of colors to solve the coloring

• connected component is a subgraph in which there is a path between every pair of vertices; for directed graphs, the directions can be ignored

• a graph is connected if there is a path between every pair of its vertices; in other words, the graph contains a single connected component

• (sub)graphs that are not connected are disconnected

• node connectivity is the smallest number of vertices whose removal disconnects the graph

• edge connectivity is the smallest number of edges whose removal disconnects the graph

• a component is strongly connected if each of its vertices is reachable from every other one of its vertices (i.e., edge directions matter here)

• vertex cutset is a set of vertices whose removal disconnects the graph (i.e., increases the number of graph components); such vertices are also known as articulation points or brokers (in social networks)

• edge cutset is a set of edges whose removal disconnects the graph

• vertices G and H in graph below are cutset vertices

• bridges: edges whose removal increases the number of graph components

Basic graphs

Tree

• connected graph with no cycles (adding only one link creates a cycle)

• becomes disconnected by removing any single link

• any pair of nodes is connected by exactly one path

• spanning tree is subgraph of a network including all its nodes and it is a tree

R-regular graph

• all vertices have degree R and there are |E| = R|V |/2 edges

Planar graph

• can be drawn in a 2D plane such that no two edges intersect

• among all complete graphs Cn, only C1, C2, C3 and C4 are planar

• example of embeddings of C4

Basic Graphs

Bipartite networks

• two sets of vertices; edges are allowed only between vertices in different sets

• graph is bipartite if and only if it does not contain any odd cycles

• generalization to more than two sets of vertices is possible

Graph matching

• given graph $G = (V, E)$, a matching $M \subseteq E$ in $G$ is a set of edges not sharing any common vertex

• maximal matching: any edge added to $M$ will violate the matching property

• maximum matching: contains the largest possible number of edges

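Adjacency Matrix

• (binary) adjacency matrix: $[A]_{ij} = 1$ if $[v_i, v_j] \in E$, and $[A]_{ij} = 0$ otherwise

• for undirected graphs, $A$ is symmetric (i.e. $A = A^T$)

$\mathbf{k} = (\mathbf{1}^T A)^T = A\mathbf{1}$ (node degrees, giving the degree distribution)

• for directed graphs, $A$ is asymmetric

$\mathbf{k}_{in} = (\mathbf{1}^T A)^T$ (in-degrees, giving the in-degree distribution)

$\mathbf{k}_{out} = A\mathbf{1}$ (out-degrees, giving the out-degree distribution)

• average degree of a graph

$\bar{k} = \frac{2|E|}{|V|} = \frac{\mathbf{1}^T A \mathbf{1}}{|V|} = \frac{\mathbf{1}^T \mathbf{k}}{|V|} = \frac{\sum_{u \in V} k(u)}{|V|}$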
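As a quick illustration of the row-sum and column-sum formulas, the sketch below uses a small made-up directed graph stored as a nested list; the graph and the variable names are assumptions chosen only for this example.

```python
# Directed toy graph with edges 0->1, 0->2, 1->2, 2->0 (illustrative data).
A = [
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 0],
]
n = len(A)

# Out-degrees are row sums (A 1); in-degrees are column sums ((1^T A)^T).
k_out = [sum(A[i][j] for j in range(n)) for i in range(n)]
k_in = [sum(A[i][j] for i in range(n)) for j in range(n)]

# Undirected version: symmetrize A, then degrees are row sums of S.
S = [[1 if A[i][j] or A[j][i] else 0 for j in range(n)] for i in range(n)]
k = [sum(row) for row in S]
num_edges = sum(k) // 2            # every undirected edge is counted twice
average_degree = 2 * num_edges / n
```

Here the symmetrized graph is a triangle, so every node has degree 2 and the average degree equals $2|E|/|V| = 2$.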

• for undirected bipartite graphs with vertices $V = V_1 \cup V_2$, $|V_1| = n_1$, $|V_2| = n_2$,

$A = \begin{bmatrix} 0_{n_1 \times n_1} & A_{12} \\ A_{12}^T & 0_{n_2 \times n_2} \end{bmatrix}$

where the $n_1 \times n_2$ block $A_{12}$ holds the edges between $V_1$ and $V_2$

• generally, $A^n$, $n = 1, 2, \ldots$ counts the walks of length $n$ in the graph, i.e., $[A^n]_{ij}$ is the number of distinct $n$-hop walks between vertices $i$ and $j$ (vertices may repeat)

• $[A^T A]_{ij}$ is the number of vertices with links to both $v_i$ and $v_j$, and $[A A^T]_{ij}$ is the number of vertices with links from both $v_i$ and $v_j$

• $\mathrm{tr}\{A^3\}/6$ is the number of triangles in the graph

• open and closed triangles

• closed triangle represents 6 closed triplets (starting at each of 3 vertices in 2 directions)

Incidence matrix

• $|V| \times |E|$ matrix: $[B]_{ij} = 1$ if $v_i \in e_j$, and $[B]_{ij} = 0$ otherwise

• degree matrix is a diagonal matrix $D = \mathrm{diag}(k_1, \ldots, k_{|V|})$

• adjacency matrix can also be expressed as $A = BB^T - D$
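The matrix identities on this slide can be verified numerically. The sketch below is an illustration under assumptions: the edge list (a triangle with one pendant node) and the helper names `matmul` and `transpose` are made up for this example, and plain nested lists are used instead of a linear-algebra library.

```python
def matmul(X, Y):
    """Product of two matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(column) for column in zip(*X)]

# Triangle 0-1-2 with a pendant node 3 attached to node 2 (illustrative data).
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
n = 4

A = [[0] * n for _ in range(n)]           # adjacency matrix
B = [[0] * len(edges) for _ in range(n)]  # unsigned incidence matrix
for e, (u, v) in enumerate(edges):
    A[u][v] = A[v][u] = 1
    B[u][e] = B[v][e] = 1

A2 = matmul(A, A)
A3 = matmul(A2, A)
triangles = sum(A3[i][i] for i in range(n)) // 6   # tr{A^3}/6

degrees = [sum(row) for row in A]
BBt = matmul(B, transpose(B))
# A = B B^T - D: subtract the degree from each diagonal entry.
A_from_B = [[BBt[i][j] - (degrees[i] if i == j else 0) for j in range(n)]
            for i in range(n)]
```

The entry `A2[0][3]` equals 1 because the only 2-hop walk from node 0 to node 3 goes through node 2.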

Graph spectrum

• recall that for an undirected graph, the adjacency matrix is symmetric, so its eigenvalues are real-valued and referred to as the graph spectrum

• eigenvalue $\lambda$ and eigenvector $v$ satisfy $Av = \lambda v$, i.e., $(\lambda I - A)v = 0$

• characteristic polynomial $p_A(t) = \det(tI - A) = \prod_i (t - \lambda_i)$ of matrix $A$ has the eigenvalues of $A$ as its roots

• Laplacian matrix of graph $G$ is $L = D - A$ (degree minus adjacency matrix):

$[L]_{ij} = \begin{cases} [D]_{ii} = k(i) & i = j \\ -1 & i \neq j \text{ and } [v_i, v_j] \in E \\ 0 & \text{otherwise} \end{cases}$

and the spectrum of graph $G$ is then given by the eigenvalues of $L$ (rather than of $A$)

Properties of Laplacian

• multiplicity of the eigenvalue $\lambda_0 = 0$ of $L$ is the number of connected components of $G$

• eigenvalue $\lambda_0 = 0$ corresponds to eigenvector $v_0 = [1, \ldots, 1]^T$, i.e., $Lv_0 = 0$

• $L = BB^T$ where $B$ is the oriented $|V| \times |E|$ incidence matrix of graph $G = (V, E)$ (entries $+1$ and $-1$ at the two end-vertices of each edge; for the unsigned incidence matrix, $BB^T = D + A$)

Power-Law Distribution

• long tail (right) with many low-connected vertices (left) (80-20 rule)

• many real-world networks exhibit this degree distribution, so they have a star-like topology

• also known as the scale-free distribution of scale-free networks; these networks are self-similar at different (spatial-temporal) scales

$p(k) = A k^{-\gamma} \;\Rightarrow\; p(c \cdot k) = A (ck)^{-\gamma} = c^{-\gamma} \cdot p(k)$

• cumulative degree distribution (CDD)

$P(k) = \sum_{k'=k}^{\infty} p(k') \propto k^{-(\gamma-1)}$

(the probability that the degree is at least $k$)

Analyzing Degree Distributions

Degree-degree correlations

• assortativity coefficient or Pearson correlation coefficient ($r$)

Assortative mixing ($r > 0$)

• bias towards connections between nodes with similar characteristics (hubs tend to connect to each other)

• useful, e.g. to understand spread of diseases and their treatment

Disassortative mixing (r < 0)

• dissimilar nodes tend to connect to each other (hubs avoid each other)

Neutral mixing (r = 0)

• connections follow some probability distribution

• let $p_{ij}$ be the probability that an edge has degrees $k_i$ and $k_j$ at its two ends

$\sum_{i,j} p_{ij} = 1$, $\sum_{i} p_{ij} = q_j = \frac{k_j p_j}{\sum_j k_j p_j}$

• perfectly assortative networks have $p_{ij} = q_i \delta_{ij}$ (only nodes of the same degree connect)

• if degrees are independent, then $p_{ij} = q_i q_j$

• Pearson coefficient, $-1 \le r \le 1$

$r = \frac{E[(k_i - E[k_i])(k_j - E[k_j])]}{\sqrt{\sigma^2(k_i)\,\sigma^2(k_j)}} = \frac{\sum_{i,j} k_i k_j (p_{ij} - q_i q_j)}{\left. \sum_{i,j} k_i k_j (p_{ij} - q_i q_j) \right|_{p_{ij} = q_i \delta_{ij}}} = \frac{\sum_{i,j} k_i k_j (p_{ij} - q_i q_j)}{\sum_{i,j} k_i k_j (\delta_{ij} - q_i) q_j}$
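To make the Pearson coefficient concrete, the following sketch estimates $r$ directly from an edge list by correlating the degrees at the two ends of every edge, with each undirected edge counted in both orientations. The function name and the test graphs are assumptions made for illustration; this is a direct empirical estimate rather than the $p_{ij}$, $q_i$ formulation.

```python
def degree_assortativity(edges):
    """Pearson correlation of end-point degrees over all (undirected) edges."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    xs, ys = [], []
    for u, v in edges:
        # Each undirected edge contributes both orientations.
        xs += [degree[u], degree[v]]
        ys += [degree[v], degree[u]]
    m = len(xs)
    mean = sum(xs) / m  # xs and ys contain the same values, so one mean suffices
    covariance = sum((x - mean) * (y - mean) for x, y in zip(xs, ys)) / m
    variance = sum((x - mean) ** 2 for x in xs) / m
    # Note: undefined (division by zero) if all end-point degrees are equal.
    return covariance / variance

star = [(0, 1), (0, 2), (0, 3), (0, 4)]   # hub 0 with four leaves
path = [(0, 1), (1, 2), (2, 3)]           # four-node chain
r_star = degree_assortativity(star)
r_path = degree_assortativity(path)
```

A star is perfectly disassortative (the hub connects only to degree-1 leaves), so the estimate is $-1$; the four-node chain gives $-0.5$.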


Degree Distributions

Degree-degree correlation

• directed graphs (networks)


• for large graphs, edges (topology) can be considered statistically

• degree distribution is partial statistical description (of topology)

• degree-degree correlation is more informative, but still incomplete info

Take-Home Messages

Complex Systems

• consist of a large number of interacting components

• graphs are very good mathematical models of these systems; they are very generic objects with many specific instances (trees, lists, tables etc.)

• availability of observations (measurement data) is a strong driving force

• a common systematic framework to study these systems: Network Science

History of modern science

• Problems of simplicity (1600-1800): understanding the influence of one variable over another

• Problems of disorganized complexity (1900-1950): the number of variables is very large, but the system as a whole has well-defined average behavior

• Problems of organized complexity (1950-today): simultaneously dealing with a number of factors forming the whole system

- W. Weaver, 1948

Networks: Structure

Similarity of Networks

• Nature is built up of complex networks

• there is a need for a common framework for systematically describing, analyzing and eventually synthesizing networks to mimic Nature

Comparing Networks

Similarity of (static) networks

1. calculate and compare (a vector of) metrics for each network; N.B. we can only compare scalar values (e.g. Euclidean distances between vectors), OR

2. identify distinctive subgraphs at certain granularity, and compare those

Graphlets [Przulj, 2004]

• pictured right: 30 subgraphs of 2-5 nodes of 73 possible types

• generalizes the vector of node degrees to graphlet degrees; it is a vector of 73 components giving the number of nodes of each given type in the network

• quantitative analysis relies on correlations between fragment statistics in the network and the network properties

Motifs [Milo, 2002]

• subgraphs whose occurrence is statistically much more significant than if the network was created completely at random

• network randomization: 1. select two links at random, 2. exchange their end-points, 3. repeat

• a motif in the real network occurs much more often than (on average) in an ensemble of random networks having the same degree distribution

• we require that the probability of the motif appearing in an ensemble of random networks at least as many times as in the real network is small

• this is quantified by the Z-score ($N$ denotes the number of occurrences)

$Z = \frac{N_{real} - E[N_{random}]}{\sqrt{E\left[(N_{random} - E[N_{random}])^2\right]}}$

• motifs are network specific, although families of networks can share the same motifs

• importance of motifs can be evaluated as the significance profile (SP) vector

$SP = \left[ \frac{Z_1}{\sqrt{\sum_i Z_i^2}}, \frac{Z_2}{\sqrt{\sum_i Z_i^2}}, \cdots \right]$
Page 32: Minicourse on Network Science

Pavel Loskot c©2014 4/32

Comparing Networks

Motif examples

Relative abundance of fragments

• assume that ensemble of random networks has the same nodes degrees asthe real-world network

α =Nreal − E[Nrandom]

Nreal + E[Nrandom]

Page 33: Minicourse on Network Science

Pavel Loskot c©2014 5/32

Comparing Networks

Relative abundance examples

• ratios of the number of fragments occurrences are also useful to characterizethe network structure as shown next

Transitivity Measures

Clustering coefficient

• recall that every triangle represents three connected (open or closed) triples

• let $|T_3|$ be the number of triangles and $|P_2|$ the number of 2-paths (connected triples with 2 or 3 ties); the clustering coefficient (a.k.a. network transitivity):

$C_3 = \frac{3|T_3|}{|P_2|}$, where $|T_3| = \mathrm{tr}\{A^3\}/6$ and $|P_2| = \frac{1}{2}\left(\sum_{i,j} [A^2]_{ij} - \mathrm{tr}\{A^2\}\right)$

• a network can be highly clustered locally, but not globally (i.e., considering the average of local clusterings across all nodes is not sufficient)

• clustering tends to be much larger for real-world than random networks

• A’s friends: B, C, D and E

• all possible edges among A’s friends: B-C, B-D, B-E, C-D, C-E, D-E, i.e., 6 in total, out of which only 1 (C-D) exists

• thus, clustering coefficient of A is 1/6

• for any subgraph, ratio of actual to maximum possible number of its occurrences: $C_n = \frac{n|T_n|}{|P_n|}$
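The "A's friends" example can be reproduced in a few lines. This is a minimal sketch assuming the graph consists only of A, its four friends and the single C-D link; the adjacency dictionary and the function names are illustrative and not taken from the slides.

```python
from itertools import combinations

def local_clustering(adj, node):
    """Fraction of pairs of the node's neighbors that are themselves linked."""
    friends = adj[node]
    possible = len(friends) * (len(friends) - 1) // 2
    if possible == 0:
        return 0.0
    actual = sum(1 for u, v in combinations(friends, 2) if v in adj[u])
    return actual / possible

def transitivity(adj):
    """Global clustering C3 = 3 |T3| / |P2| (closed over all connected triples)."""
    triples = sum(len(nb) * (len(nb) - 1) // 2 for nb in adj.values())
    # Each triangle is found once per center node, i.e. three times in total.
    closed = sum(1 for nb in adj.values()
                 for u, v in combinations(nb, 2) if v in adj[u])
    return closed / triples if triples else 0.0

adj = {
    "A": {"B", "C", "D", "E"},
    "B": {"A"},
    "C": {"A", "D"},
    "D": {"A", "C"},
    "E": {"A"},
}
c_A = local_clustering(adj, "A")
c_global = transitivity(adj)
```

Node A has 6 possible friend pairs and 1 existing link, giving 1/6; the whole graph has one triangle and 8 connected triples, so its transitivity is 3/8.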

Centrality Measures

Aim

• quantify importance of nodes in a network (so-called positional advantage), i.e. how nodes contribute to the overall structural properties of the network

• e.g. important nodes disseminate information faster, can stop spreading epidemics, can protect network from breaking and so on

Degree centrality

• hubs are likely to have the largest influence (e.g. number of friends to help)

• a transitivity measure since it is a ratio of single (neighboring) node fragments

• for a network of $N$ nodes, the $i$-th node of degree $k_i$ has degree centrality

$C_1(i) = \frac{|T_1|}{|P_1|} = C_D(i) = \frac{k_i}{N - 1}$

Network centralization (centrality) ($\sigma^2_{C_1}$)

$\sigma^2_{C_1} = \frac{1}{N - 1} \sum_{i=1}^{N} \left( C_1(i) - \bar{C}_1 \right)^2$ where $\bar{C}_1 = \frac{1}{N} \sum_{i=1}^{N} C_1(i)$

• star topology has the maximum while line topology has the minimum centralization

Freeman’s degree centrality

• quantify variations in node degree centrality in the whole network

$C_D = \frac{\sum_{i=1}^{N} (k^* - k_i)}{(N - 1)(N - 2)}$

where $k^* = \max_i k_i$, and $\max \sum_{i=1}^{N} (k^* - k_i) = (N - 1)(N - 2)$ for a star network

Betweenness centrality (beyond nearest neighbors)

• quantify node importance in communications between pairs of other nodes

• ability to broker between groups, likelihood of intercepting information etc.

• thus, it is the likelihood of node $w$ being involved in communications

$C_B(w) = \frac{1}{\binom{N-1}{2}} \sum_{\{u,v\}:\ u \neq w \neq v} \frac{\rho(u, w, v)}{\rho(u, v)}$ (normalization optional)

where, counting shortest paths,

$\rho(u, w, v)$ number of shortest paths between $u$ and $v$ via $w$

$\rho(u, v)$ number of all shortest paths between $u$ and $v$

or, counting flows,

$\rho(u, w, v)$ maximum flow from $u$ to $v$ through $w$

$\rho(u, v)$ total maximum flow from $u$ to $v$
Page 37: Minicourse on Network Science

Pavel Loskot c©2014 9/32

Centrality Measures

Example (betweenness centrality)

• A and E are not in-between any pairs, B and D are in-between 3 pairs, andC is in-between 4 pairs
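The pair counts in this example are consistent with a five-node chain A-B-C-D-E; that topology is an assumption here, since the original figure is not available. The sketch below computes unnormalized shortest-path betweenness by brute force over all node pairs, which is fine for small undirected graphs but is not the efficient algorithm used in practice.

```python
from collections import deque
from itertools import combinations

def shortest_path_counts(adj, source):
    """BFS returning hop distances and numbers of shortest paths from source."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma

def betweenness(adj):
    """Unnormalized betweenness: sum of rho(u,w,v)/rho(u,v) over unordered pairs."""
    info = {s: shortest_path_counts(adj, s) for s in adj}
    score = {w: 0.0 for w in adj}
    for u, v in combinations(adj, 2):
        dist_u, sigma_u = info[u]
        dist_v, sigma_v = info[v]
        if v not in dist_u:
            continue  # u and v are disconnected
        for w in adj:
            if w in (u, v) or w not in dist_u or w not in dist_v:
                continue
            # w lies on a shortest u-v path iff the two legs add up exactly.
            if dist_u[w] + dist_v[w] == dist_u[v]:
                score[w] += sigma_u[w] * sigma_v[w] / sigma_u[v]
    return score

chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
scores = betweenness(chain)
```

Dividing each score by $\binom{N-1}{2} = 6$ would give a normalized value between 0 and 1.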

Closeness centrality

• measure of how much the node is in the “middle of things”

• let $d(u, v)$ be the shortest path length between nodes $u$ and $v$

$C_C(u) = \left( \frac{1}{N - 1} \sum_{v \neq u} d(u, v) \right)^{-1}$ (normalization optional)

Example (closeness centrality)



$C_C = \left( \frac{1 + 2 + 3 + 4}{4} \right)^{-1} = 0.4$ (e.g. an end node of a five-node chain)
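The value 0.4 matches the closeness of an end node in a five-node chain, which is the topology assumed in this sketch (the original figure is not available). The function below uses the normalized form, i.e. the inverse of the average distance to all other nodes, and returns 0 when some node is unreachable; both choices are assumptions of this illustration.

```python
from collections import deque

def closeness(adj, u):
    """Inverse of the average hop distance from u to all other nodes."""
    dist = {u: 0}
    queue = deque([u])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    others = len(adj) - 1
    if others == 0 or len(dist) - 1 < others:
        return 0.0  # isolated graph or unreachable nodes (infinite distance)
    return others / sum(dist.values())

chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
closeness_A = closeness(chain, "A")   # distances 1, 2, 3, 4
closeness_C = closeness(chain, "C")   # distances 2, 1, 1, 2
```

The middle node C is closest to everything else (4/6), while the end node A is the most peripheral (4/10).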

Information centrality

$C_{IC}(i) = \left( \frac{1}{N} \sum_{j} \frac{1}{I_{ij}} \right)^{-1}$


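Eigenvector centrality ($x_u$)

• accounts for whether connections are isolated or not; important nodes are likely connected to other important nodes

• let $B(u)$ be the neighbors of node $u$

$x_u = \frac{1}{\lambda} \sum_{v \in B(u)} x_v = \frac{1}{\lambda} \sum_{v \in V} [A]_{u,v}\, x_v \;\Rightarrow\; Ax = \lambda x$

algorithm: initialize $x_u = 1\ \forall u$, re-calculate $x_u\ \forall u$, set $\lambda = \max_u x_u$ and divide all $x_u$ by $\lambda$, repeat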
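The iteration described above is the power method. The sketch below is one possible reading of it: the fixed number of iterations and the test graph (a triangle with one pendant node) are assumptions, and the method is only expected to converge for connected, non-bipartite graphs.

```python
def eigenvector_centrality(A, iterations=200):
    """Power iteration: x <- A x, rescaled so that the largest entry equals 1."""
    n = len(A)
    x = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        y = [sum(A[u][v] * x[v] for v in range(n)) for u in range(n)]
        lam = max(y)          # current estimate of the largest eigenvalue
        if lam == 0:
            break             # graph without edges
        x = [value / lam for value in y]
    return lam, x

# Triangle 0-1-2 with pendant node 3 attached to node 0 (illustrative data).
A = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
lam, x = eigenvector_centrality(A)
residual = max(abs(sum(A[u][v] * x[v] for v in range(4)) - lam * x[u])
               for u in range(4))
```

After convergence the residual of $Ax = \lambda x$ is close to zero, and node 0, which is attached to every other node, has the largest score.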

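Katz centrality

• instead of counting shortest paths (as in closeness centrality), count all walks, with longer walks weighted less

• let $\alpha > \lambda_1$ (the largest eigenvalue of $A$), so that the series below converges

$C_K(i) = [Z \cdot \mathbf{1}]_i$ where $Z = \sum_{n=1}^{\infty} \left( \frac{1}{\alpha} A \right)^n = \left( I - \frac{1}{\alpha} A \right)^{-1} - I$

so the values of $C_K(i)$ depend on the choice of $\alpha$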

PageRank centrality

• reflects the probabilities that a random walk through the network arrives at any particular node

• intuitively, if there are many links out of node $v$, one of these links to node $u$ represents an average recommendation of $u$ by $v$; if the number of links out of $v$ is reduced, the recommendation of $u$ by $v$ increases

• define the modified adjacency matrix: $[H]_{ij} = 1/k_{out}(i)$ if $[v_i, v_j] \in E$, and $[H]_{ij} = 0$ otherwise

• PageRank vector $C_{PR} = [C_{PR}(1), \ldots, C_{PR}(N)]^T$ at step $k$ is updated as

$(C_{PR}^{k+1})^T := (C_{PR}^{k})^T H$

note that node 4 traps a random walker, and also, the search is often randomly reset (with probability $1 - \alpha$), so this modified $H$ should be used:

$H' = \alpha H + \frac{\alpha}{N} (a \mathbf{1}^T) + \frac{1 - \alpha}{N} \mathbf{1}\mathbf{1}^T$ where $[a]_i = 1$ if $k_{out}(i) = 0$, and $[a]_i = 0$ otherwise
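A minimal sketch of the PageRank iteration with the modified matrix $H'$ follows. It assumes the rank is kept as a probability vector that is propagated along out-links, that dangling nodes (no out-links) spread their mass uniformly, and that $\alpha = 0.85$; the default value, the function name and the example edge lists are illustrative assumptions.

```python
def pagerank(edges, n, alpha=0.85, iterations=100):
    """Iterate rank^T <- rank^T H' for a directed graph with nodes 0..n-1."""
    out_links = [[] for _ in range(n)]
    for u, v in edges:
        out_links[u].append(v)
    rank = [1.0 / n] * n
    for _ in range(iterations):
        # Probability mass sitting on dangling nodes (k_out = 0).
        dangling = sum(rank[u] for u in range(n) if not out_links[u])
        # Random reset term plus uniformly redistributed dangling mass.
        new_rank = [(1.0 - alpha) / n + alpha * dangling / n] * n
        for u in range(n):
            for v in out_links[u]:
                new_rank[v] += alpha * rank[u] / len(out_links[u])
        rank = new_rank
    return rank

# Directed cycle: all nodes are equivalent, so the ranks are uniform.
cycle_rank = pagerank([(0, 1), (1, 2), (2, 0)], 3)
# Same cycle with an extra dangling node 3 (the fourth node traps the walker).
trap_rank = pagerank([(0, 1), (1, 2), (2, 0), (2, 3)], 4)
```

Without the dangling-node and reset terms, the walker would be absorbed by the node with no out-links and the ranks of all other nodes would decay to zero.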

Reciprocity ($r$)

• in directed networks, a link from $u$ to $v$ can be reciprocated by a link from $v$ to $u$; these are called co-links

$r = \frac{\sum_{i,j} [A]_{ij} [A]_{ji}}{|E|}$ (fraction of reciprocated links)

Rich-Club coefficient of degree $k$ ($R(k)$)

• hubs tend to be densely interconnected, which is quantified by $R(k)$

• let subgraph $(V'(k), E') \subseteq (V, E)$ where $V'(k) \subseteq V$ is the subset of nodes with degree at least $k$, and $E' \subseteq E$ are the corresponding edges among $V'(k)$

$R(k) = \frac{|E'|}{\binom{|V'(k)|}{2}}$

Matching index ($\mu_{ij}$)

• quantify similarity of connectivity of the two end-vertices of an edge

• small value of $\mu_{ij}$ indicates the edge between $v_i \in V$ and $v_j \in V$ is a bridge between two dissimilar regions of the network

$\mu_{ij} = \frac{\sum_{k \neq i,j} A_{ik} A_{kj}}{\sum_{k \neq i} A_{ik} + \sum_{k \neq j} A_{jk}}$

Weighted Networks

Graph     Network   System
vertex    node      component
edge      link      interaction

Weights mapping

• weights can be assigned to vertices as well as (more often) to edges; we assume the mapping

$W : (V, E) \mapsto (V, W)$

so the weighted adjacency matrix and the original adjacency matrix are, respectively,

$[W]_{ij} = w_{ij} \in \mathbb{R}$, $[A]_{ij} = \begin{cases} 1 & |w_{ij}| \ge \Delta \\ 0 & |w_{ij}| < \Delta \end{cases}$

Vertex strength

• degree distribution is generalized to the strength distribution, which again has power-law-like tails in many real-world networks

$s(i) = \sum_{j} w_{ij}$

• it was observed that node strength and node degree are related as

$E[s|k] \propto k^{\beta}$, $\beta > 0$

for $\beta > 1$, high-degree vertices (hubs) tend to be high-strength vertices

Other generalizations

• the edge contributions can be normalized as $w_{ij} / \sum_j w_{ij} = w_{ij}/s(i)$, e.g. the average nearest (first order) neighbor degree

$k_{NN}(i) = \sum_{j} \frac{w_{ij}}{s(i)} [A]_{ij}\, k(j)$

• importantly, there are no generally agreed definitions of quantities (metrics) for weighted networks, e.g. the clustering coefficient ($[A]_{ij} \equiv a_{ij}$)

$C_3(i) = \frac{1}{s(i)(k(i) - 1)} \sum_{j,k} \frac{w_{ij} + w_{ik}}{2}\, a_{ij} a_{jk} a_{ik}$ [Barrat, Barthelemy, Vespignani]

$C_3(i) = \frac{1}{k(i)(k(i) - 1)} \sum_{j,k} (w_{ij} w_{ik} w_{jk})^{1/3}$ [Onnela, Saramaki, Kaski, Kertesz]

$C_3(i) = \frac{\sum_{j,k} w_{ij} w_{jk} w_{ik}}{\left( \sum_k w_{ik} \right)^2 - \sum_k w_{ik}^2}$ [Zhang]

$C_3(i) = \frac{\sum_{j,k} w_{ij} w_{jk} w_{ki}}{(\max_{ij} w_{ij}) \sum_{j,k} w_{ij} w_{ik}}$

where for an unweighted network we assume $w_{ij} = 1$ if $[v_i, v_j] \in E$, and 0 otherwise

Time-series as graphs

1. pre-processing: reduce measurement noise, reduce amount of data

2. calculate magnitude of correlations (possibly with thresholding)

$0 \le [W]_{ij} = \frac{\left| E[d_i d_j] - E[d_i]\, E[d_j] \right|}{\sqrt{\left( E[d_i^2] - E[d_i]^2 \right) \left( E[d_j^2] - E[d_j]^2 \right)}} \le 1$

3. construct a weighted graph assuming the weight matrix $W$

Spanning tree of a graph

• a tree topology containing all nodes of the graph

• possibly with the additional requirement to maximize or minimize the sum of edge weights

• it can be used to emphasize clusters in the graph, but . . . a lot of information is discarded, and the result is also sensitive to noise and thresholding

Example (NYSE stocks)

Community Structure

Network communities

• so far, we considered local and global structure and properties; here, we look at the spatial scale in-between individual nodes and the whole network - clusters

• clusters are obtained by network partitioning or clustering

• our objective is to partition the network using only its topology

Why clustering

• manage complex systems by creating hierarchy, for example, Big Data analysis and classification such as large databases, customer recommendations, website ranking, genomics, market evaluations etc.

• identify bridges and weak ties in networks

• find $P$ disjoint subsets $V_i$, so that $\cup_{i=1}^{P} V_i = V$ and $V_i \cap V_j$ is the empty set for $i \neq j$

Balanced partitioning

• given $P$, the sizes of the partitions are approximately equal, i.e. $|V_i| \approx |V|/P$

• possibly also, the cut (the links between subsets) size can be minimized

• there is a path through the community between every pair of nodes

• internal connection density significantly larger than the density of external connections

Cut size

• assume a weighted network, and the partition $V' \subset V$

• the internal and external weights of a node $v_i \in V'$ in the partition

$W_{int}(i) = \sum_{v_j \in V'} w_{ij}$, $W_{ext}(i) = \sum_{v_j \notin V'} w_{ij}$

• the cut size between $V'$ and the rest of the nodes $V \setminus V'$

$C_{cut}(V') = \sum_{v_i \in V'} W_{ext}(i)$

Reducing cut size

• moving node $v_i$ into or out of the partition $V'$ reduces the cut size by the gain

$g(i) = W_{ext}(i) - W_{int}(i)$

so the cut size is reduced if $W_{ext}(i) > W_{int}(i)$

• for partitions already balanced, consider swapping one node of the partition (i.e., move one node out and another node in); the cut size is reduced by

$g(i, j) = \begin{cases} g(i) + g(j) - 2w_{ij} & \text{if } v_i \text{ and } v_j \text{ are connected} \\ g(i) + g(j) & \text{otherwise} \end{cases}$

Centrality based partitioning

• links connecting nodes in different communities are likely to have large edge betweenness centrality (defined analogously to node betweenness)

Algorithm [Girvan, Newman, 2002]

1. calculate edge betweenness for all links, remove link with highest such value

2. recalculate edge betweenness for remaining links

3. repeat until all links have been removed

• need to compare different partitions to decide which one is the best

• intuitively, cohesion or link density within a community is likely to be significantly larger than if the community is formed at random

• for partitioning $\cup_{p=1}^{P} V_p = V$, with edges $E_p$ within the partition $V_p$, the modularity indicator ($c_i$ is the community assignment of vertex $v_i$)

$Q = \sum_{p=1}^{P} \left[ \frac{|E_p|}{|E|} - \left( \frac{\sum_{v_j \in V_p} k(v_j)}{2|E|} \right)^2 \right] = \ldots = \frac{1}{2|E|} \sum_{i,j} \left( [A]_{ij} - \frac{k(i)k(j)}{2|E|} \right) \delta(c_i, c_j)$

so it is the actual number of edges minus the expected number of edges inside the community for a random subgraph with the same node degree distribution (both normalized by $|E|$)

• $Q \ge -1$, and $\max Q = 1$ for strong community structure

• can be used as a stopping criterion ($Q \gg 0$) in the Girvan-Newman algorithm
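As an illustration of the modularity indicator, the sketch below evaluates the per-community form of $Q$ (fraction of internal edges minus the squared fraction of degree) for an undirected, unweighted edge list. The example graph, two triangles joined by a single link, and the function name are assumptions made for this example.

```python
def modularity(edges, community):
    """Q = sum over communities of |E_p|/|E| - (degree sum of V_p / (2|E|))^2."""
    m = len(edges)
    internal = {}      # number of edges with both ends in the community
    degree_sum = {}    # sum of node degrees per community
    for u, v in edges:
        for node in (u, v):
            c = community[node]
            degree_sum[c] = degree_sum.get(c, 0) + 1
        if community[u] == community[v]:
            c = community[u]
            internal[c] = internal.get(c, 0) + 1
    return sum(internal.get(c, 0) / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Two triangles (0,1,2) and (3,4,5) joined by the bridge 2-3 (illustrative data).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}
q_split = modularity(edges, two_groups)
q_single = modularity(edges, one_group)
```

Splitting along the bridge gives $Q = 2(3/7 - 1/4) = 5/14$, whereas putting all nodes into one community gives $Q = 0$, so the split is preferred.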

Modularity optimization

• find partitioning with maximum modularity (exact solution is NP-complete):

complexity $O\left( \binom{|V|}{|V|/2} \right) \sim \frac{2^{|V|}}{\sqrt{\pi |V|/2}}$ for large $|V|$

Resolution problem

• modularity based clustering may fail to identify obvious small clusters close to a large cluster

• modularity is deficient if clusters are circularly connected (pictured right)

• other similarity measures are also affected (minimum cuts, . . .)

Possible solution

• use multiple similarity metrics

• then choose the best partition by consensus (e.g. majority vote)

Hierarchical clustering

• complexity $O((|E| + |V|)|V|)$, many networks are sparse ($|V| \approx |E|$)

Algorithm

1. Initialize: $|V|$ communities of 1 vertex each

2. Calculate the modularity change $\Delta Q$ for all pairs of existing communities

3. Merge the community pair having the largest increase $\Delta Q$

4. Build the dendrogram and repeat steps 2 and 3 until only one community remains

Clustering based on Euclidean distance

Merging clusters

• similarity between clusters can be measured as single linkage: minimum between all pairs of nodes in two clusters

• complete linkage: maximum between all pairs of nodes in two clusters

• average linkage: average between all pairs of nodes in two clusters

Limitations of modularity

• appears to be strongly dependent on the density of links in the network

• thus, not a good measure to determine communities in sparse networks

Clustering techniques

1. Agglomerative (bottom-up) techniques: edges are added among nodes to create communities (e.g. dendrogram)

2. Divisive (top-down) techniques: edges are removed from the graph to create separate communities

3. Spectral techniques: graph splitting based on eigen-analysis

Similarity measures

• quantify (dis)similarity between nodes to decide on communities in all clustering algorithms

• selection is strongly application dependent (modularity, cosine similarity, Jaccard’s coefficient, . . .)

Louvain method (based on modularity optimization)

• more accurate and more efficient (much faster) than hierarchical clustering

• number of communities decreases quickly in only a few iterations

1. Initialize: every node is in its own community

2. For each node $i$, consider all its neighbors $j$, and evaluate the modularity change $\Delta Q$ of moving $i$ into $j$’s community

3. Move $i$ into the community for which $\Delta Q$ is maximum (and positive)

4. Repeat steps 2 and 3 until no further improvement is possible (i.e. $\Delta Q = 0$)

5. Collapse the communities into single nodes (merging multiple edges between these new nodes), and go back to step 2

K-means clustering

• number of clusters $K$ predefined

• minimize e.g. Euclidean distances:

$\{V_i\}_{i=1,\ldots,K} = \arg\min \sum_{i=1}^{K} \sum_{v \in V_i} \|v - \bar{v}_i\|^2$, $\bar{v}_i = \frac{1}{|V_i|} \sum_{v \in V_i} v$

1. Initialize: select $K$ vertices at random as initial clusters and assign the remaining vertices to the nearest clusters

2. Calculate new centroids $\bar{v}_i$ for each cluster

3. Re-assign all vertices to the nearest clusters

4. Go to step 2 until some stopping criterion is met
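The four steps above can be sketched as follows. This is an illustrative implementation under assumptions: vertices are represented by coordinate tuples, the stopping criterion is "assignments no longer change" (or a maximum number of iterations), and the seed and sample data are made up for the example.

```python
import random

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=100, seed=0):
    """Plain K-means on coordinate tuples; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # step 1: K random points as initial clusters
    labels = None
    for _ in range(iterations):
        # Steps 1/3: assign every point to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                      for p in points]
        if new_labels == labels:        # step 4: stop when nothing changes
            break
        labels = new_labels
        # Step 2: recompute the centroid of every non-empty cluster.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids

# Two well-separated groups of three points each (illustrative data).
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(points, 2)
```

For data this clearly separated the result does not depend on the random initialization; in general, K-means is sensitive to the initial choice, so it is usually run several times with different seeds.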

89% of data correctly classified

Limitations of K-means clustering

• sensitivity to initial conditions and outliers

• sensitivity to non-homogeneous structure, i.e. clusters that differ significantly in size or connection density, or that have non-spherical shape (for the Euclidean distance metric)


Community Structure

Gaussian mixture models

• assume there are K clusters, vertex $v_i$ has location $x_i$

• locations of vertices in cluster $V_i$ are normally distributed $\sim N(x|\mu_i, \lambda_i)$

• let $z_i$, i = 1, . . . , K be binary latent variables such that $z_i = 1$ if the vertex is drawn from cluster i $\sim N(x|\mu_i, \lambda_i)$ and $z_i = 0$ otherwise, so $\Pr(z) = \prod_{i=1}^{K} \pi_i^{z_i}$

• if $z_i$ are known, the data are labeled (parameters of their distribution are known), otherwise the data are un-labeled (unsupervised learning)

• $\pi_i = \Pr(z_i = 1)$ are mixing probabilities (weights), so that $\sum_{i=1}^{K} \pi_i = 1$ and the distribution of location x of a vertex is

$$ p(x) = \sum_{z} p(x|z)\, p(z) = \sum_{i=1}^{K} \pi_i N(x|\mu_i, \lambda_i), \quad \text{where } p(x|z) = \prod_{i=1}^{K} N(x|\mu_i, \lambda_i)^{z_i} $$

• unknown parameters: mixing coefficients ($\pi_i$), means ($\mu_i$), covariances ($\lambda_i$)

• using Bayes’ theorem, we can find posterior probabilities (responsibilities) $\Pr(z_i|x)$ that the i-th Gaussian component has in explaining the observed data

Algorithm [Expectation Maximization (EM)]

1. E-step: evaluate Pr(zi|x) given current parameters

2. M-step: re-estimate parameters using current Pr(zi|x)
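The E-step and M-step can be sketched for the simplest case of one-dimensional locations. This is a minimal illustration only: the function names `normal_pdf` and `em_gmm_1d`, the fixed number of iterations, the small variance floor and the toy data are assumptions made for this example, not part of the original slides.

```python
import math

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(xs, mus, variances, weights, iters=50):
    mus, variances, weights = list(mus), list(variances), list(weights)
    K = len(mus)
    for _ in range(iters):
        # E-step: responsibilities Pr(z_i | x) via Bayes' theorem
        resp = []
        for x in xs:
            joint = [weights[i] * normal_pdf(x, mus[i], variances[i]) for i in range(K)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate means, variances and mixing weights
        for i in range(K):
            n_i = sum(r[i] for r in resp)
            mus[i] = sum(r[i] * x for r, x in zip(resp, xs)) / n_i
            var = sum(r[i] * (x - mus[i]) ** 2 for r, x in zip(resp, xs)) / n_i
            variances[i] = max(var, 1e-6)      # floor avoids a collapsing component
            weights[i] = n_i / len(xs)
    return mus, variances, weights

# Hypothetical data: two groups of locations around 1 and 5.
xs = [0.8, 0.9, 1.0, 1.1, 1.2, 4.8, 4.9, 5.0, 5.1, 5.2]
mus, variances, weights = em_gmm_1d(xs, mus=[0.0, 6.0],
                                    variances=[1.0, 1.0], weights=[0.5, 0.5])
print(mus, variances, weights)
```

Unlike K-means, every vertex contributes to every component, weighted by its responsibility; hard cluster labels can be obtained afterwards by picking the component with the largest responsibility.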


Community Structure

Overlapping communities

• nodes may belong to more than one community (i.e. subsets Vi not disjoint)

Clique percolation method [Palla 2005]

• a K-clique is a set of K fully connected nodes

• two K-cliques are adjacent if they share K − 1 nodes

• a K-clique community is a set of nodes connected through adjacent K-cliques


1. identify maximal cliques in the network (a complex problem, but fortunately many real-world networks are relatively sparse)

2. consider cliques as single nodes; interconnect cliques if they share at least K − 1 nodes

3. identify connected components in the graph created in step 2
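The definition of K-clique communities can be illustrated with a brute-force sketch for small graphs. Note the simplification: instead of finding maximal cliques first (step 1 above), this sketch enumerates all K-cliques directly, which is only feasible for toy examples. The function name `k_clique_communities` and the example graph are assumptions made for this illustration.

```python
from itertools import combinations

def k_clique_communities(edges, K):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # all K-cliques (brute force over K-subsets of nodes)
    cliques = [frozenset(c) for c in combinations(sorted(adj), K)
               if all(b in adj[a] for a, b in combinations(c, 2))]
    # union-find over cliques: merge cliques sharing K - 1 nodes
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == K - 1:
            parent[find(i)] = find(j)
    # a community is the union of the nodes of all cliques in one component
    groups = {}
    for i, c in enumerate(cliques):
        groups.setdefault(find(i), set()).update(c)
    return sorted(sorted(g) for g in groups.values())

# Hypothetical graph: triangles 1-2-3 and 2-3-4 share an edge, triangle 4-5-6 shares only node 4.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5), (4, 6), (5, 6)]
communities = k_clique_communities(edges, K=3)
print(communities)
```

In this example node 4 belongs to both communities, which is exactly the overlap that partition-based methods cannot express.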


Community Structure

Spectral clustering

• K-means, Gaussian mixtures and hierarchical methods are good for compact clusters

• spectral clustering transforms the data into a new basis where standard algorithms work well


1. Construct similarity matrix,

$$ [S]_{ij} = \exp\left( -\frac{\|x_i - x_j\|^2}{2\sigma^2} \right) $$

2. Construct Laplacian L = D − S where D is the diagonal matrix of weights, $[D]_{ii} = \sum_j [S]_{ij}$

3. Construct matrix U of the k eigenvectors corresponding to the k smallest eigenvalues of L

4. Perform clustering on the transformed data $x' = U^T x$


Community Structure

Real-time clustering

• (dynamic) re-clustering for every new data arrival is expensive

• (dynamically) varying the number of clusters is confusing

Hierarchical Agglomerative Clustering [HAC, 2004]

1. Initialization: hierarchical clustering (e.g. using dendrogram)

2. new data are either assigned to one of the existing clusters, OR

3. new data form a new cluster, and two existing clusters are merged


Community Structure

Community analysis

• distribution of community sizes

• intra-community edge densities

• number of intra- and inter-community links

• average number of communities per node

• . . .

Community network

• communities → nodes

• edges weighted by number of links between communities


Take-Home Messages

Network structure analysis

• structure of un-weighted static networks, i.e. knowing only their topology

• subgraphs, graphlets, fragments and motifs are building blocks of large networks; the statistics of their occurrence is useful to compare network topology beyond their degree distribution

• network partitioning or clustering to identify (overlapping) communities

Measures of network structure

• centrality (degree, betweenness, closeness, eigenvector, Katz, PageRank, . . .)

• clustering coefficient, Rich-Club coefficient

• modifications of measures for weighted networks


Networks: Random Models


Statistical Modeling


• account for model or parameter uncertainty, measurement noise etc.

• make (short-to-medium term) predictions from the models

• generate artificial data for verifying models and predictions

• decide how much randomness influences properties; here we compare structural and functional properties of random and real-world networks

Milgram’s experiment [1967]

• famous “six degrees of separation”

• 300 people selected at random were asked to send a letter to a person in Boston

• repeated in 2003: 18 targets, 60k senders, communication via emails

• new findings in 2003: median 5−7 steps, network structure is not everything, high impact of incentives

• Facebook: 92% of users only 5 hops away, 99% at 6 hops away


Random Network Models

Erdos-Renyi (ER) random graph [1959]

• graph $G_{ER}(n, p)$ with n vertices and edges chosen independently with probability p, “a zero-order approximation” of real-world networks

• thus, vertex degree is random (binomial distribution, approximated by Poisson distribution for large n)

$$ \Pr(k) = \binom{n-1}{k} p^k (1-p)^{n-1-k} \approx e^{-\bar{k}} \frac{\bar{k}^k}{k!}, \quad \bar{k} = E[k] = (n-1)p $$

• most vertices have a degree close to the average $\bar{k}$

• diameter (d) and average distance (l) between two vertices are relatively small compared to the size of the graph

$$ d = \frac{\ln(n)}{\ln(p(n-1))} = \frac{\ln(n)}{\ln(\bar{k})} \approx l $$

• average number of edges $E[|E|] = p \binom{n}{2} = \frac{n \bar{k}}{2}$; the latter, since $\bar{k} = \frac{2 E[|E|]}{n}$

Connectivity of ER random graph

• average degree $\bar{k}$:

$\bar{k} < 1$: graph disconnected

$\bar{k} > 1$: a giant component appears

$\bar{k} \geq \ln(n)$: graph (almost) completely connected


Random Network Models

Average path length and diameter of ER (random) graph

• let l(i, j) be the length of the shortest path between vertices $v_i$ and $v_j$

• the shortest paths can be combined into a single metric as

$$ l = \frac{1}{\binom{n}{2}} \sum_{i<j} l(i, j) \quad \text{(average shortest path)} $$

$$ d = \max_{i,j} l(i, j) \quad \text{(maximum shortest path, i.e. diameter)} $$

• if n is large, the average path length $l \propto \ln n$ is relatively small and grows slowly with network size (this is typical for many large networks); for comparison, 1D lattice (chain): $l \propto n$, 2D lattice: $l \propto n^{1/2}$


Random Network Models

Clustering coefficient of ER graph

• ratio of the number of neighbors being friends to all possible friendships among neighbors

• probability that two neighbors are connected is p, so clustering coefficient

$$ C_{ER} = p \approx \frac{\bar{k}}{n} $$

which is much smaller than for real-world networks with the same density

• for large networks $\lim_{n\to\infty} C_{ER} = 0$, so large random networks resemble a tree (i.e. they have no clustering)

Components in ER graphs

• if p (and thus, also the average degree $\bar{k}$) is small, there are several disjoint components

• if p is increased, there is one giant component (of size ≈ n) with the rest of nodes being in isolated small components

• the giant component appears when p ≈ 1/n, e.g. for n = 10^3 (see figure)


Random Network Models

Percolation transition (when increasing p)

1. Subcritical: average degree $\bar{k} < 1$, many small simple components of size at most ln n

2. Critical: $\bar{k} \approx 1$, size of the largest component is $\sim n^{2/3}$, the giant component appears and starts growing

3. Supercritical: $\bar{k} > 1$, there is one giant component of size almost n, the second largest component has size about ln n
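The percolation transition can be observed numerically with a small experiment. The sketch below is only an illustration: it generates ER graphs for a few average degrees and measures the fraction of nodes in the largest component. The function names `er_graph` and `largest_component`, the graph size n = 1000 and the seed are assumptions made for this example.

```python
import random
from collections import deque

def er_graph(n, p, rng):
    """Adjacency lists of an ER graph: every pair is linked independently with probability p."""
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].append(j)
                adj[j].append(i)
    return adj

def largest_component(adj):
    """Size of the largest connected component (breadth-first search)."""
    seen = [False] * len(adj)
    best = 0
    for start in range(len(adj)):
        if seen[start]:
            continue
        seen[start] = True
        queue, size = deque([start]), 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
        best = max(best, size)
    return best

rng = random.Random(1)
n = 1000
fractions = {}
for k_mean in (0.5, 1.0, 2.0, 4.0):
    adj = er_graph(n, k_mean / (n - 1), rng)     # p chosen so that E[k] = k_mean
    fractions[k_mean] = largest_component(adj) / n
print(fractions)
```

Below the critical average degree the largest component contains only a small fraction of the nodes, whereas well above it most nodes belong to a single giant component.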

Summary of ER graph

• degree distribution is Poisson (most nodes have degree close to average) with no correlations of node degrees

• average path length is small and ∝ ln n

• connectivity depends on k with percolation transition


Random Network Models

Random geometric models [Penrose, 2003]

• main motivation: some networks can grow subject to geometric constraints

• e.g., place n nodes randomly in (2D) space; two nodes i and j are connected only if their distance $\|x_i - x_j\| \leq r$

• there exists a critical radius $r_c$ to form a connected giant component (if $r > r_c$):

$$ r_c = \sqrt{\frac{\ln n + O(1)}{\pi n}} $$

Random distance models [Avin, 2008]

• n nodes placed randomly in (2D) space

• links created randomly with the probability $\propto f(d_{ij})$ where $d_{ij} = \|x_i - x_j\|$

Small World Networks

More on Milgram’s experiment

• how accurate is the “six degrees of separation” claim, and how likely is a chain to be completed?

Findings from real-world social networks

• a sub-optimal choice of the next link in the chain is made about 1/2 of the time

• Facebook measurements: average distance is 4.74

• Twitter measurements: average distance is 4.67 (50% are at 4 steps, nearly everyone in 5 steps)

Small World Networks

Main features

high clustering: $C_{real-world} \gg C_{random}$

average path length: $l_{real-world} \approx l_{random}$

Watts-Strogatz (WS) small world network model [Nature, 1998]

• sparked the interest in complex networks (over 3.5k citations)

• a single control parameter generates networks ranging from regular to purely random

• the model: 1. generate a regular graph, 2. rewire links with probability p

• in all network generators: self-loops and duplicated links are not allowed

Small World Networks

WS original model

• select a fraction p of edges and rewire one of their end-points

WS model variant

• add a fraction p of edges (shortcuts) to the initial regular lattice
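The original WS rewiring model can be sketched as follows. This is a minimal illustration under stated assumptions: the lattice is a ring in which every node is linked to its K nearest neighbors on each side, every lattice edge is rewired with probability p by moving one end-point to a uniformly chosen node, and self-loops and duplicated links are avoided. The function names and the parameters n = 200, K = 2 are made up for this example.

```python
import random

def ring_lattice(n, K):
    """Ring where every node is linked to its K nearest neighbours on each side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for d in range(1, K + 1):
            j = (i + d) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def watts_strogatz(n, K, p, rng):
    adj = ring_lattice(n, K)
    for d in range(1, K + 1):
        for i in range(n):
            j = (i + d) % n
            if rng.random() < p and j in adj[i]:
                # rewire end-point j of edge (i, j); no self-loops, no duplicated links
                candidates = [w for w in range(n) if w != i and w not in adj[i]]
                if candidates:
                    w = rng.choice(candidates)
                    adj[i].remove(j)
                    adj[j].remove(i)
                    adj[i].add(w)
                    adj[w].add(i)
    return adj

def average_clustering(adj):
    total = 0.0
    for u, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        total += links / (k * (k - 1) / 2)
    return total / len(adj)

n, K = 200, 2
c_lattice = average_clustering(watts_strogatz(n, K, 0.0, random.Random(0)))
rewired = watts_strogatz(n, K, 1.0, random.Random(0))
c_random = average_clustering(rewired)
n_edges = sum(len(nbrs) for nbrs in rewired.values()) // 2
print(c_lattice, c_random, n_edges)
```

For p = 0 the clustering coefficient of this lattice is (3K − 3)/(4K − 2), i.e. 1/2 for K = 2, and it drops sharply once all edges are rewired, while the number of edges stays nK.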

Small World Networks

Properties of WS model:

Degree distribution

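$$ \Pr(k) = \sum_{i=0}^{\min(k-K,K)} \binom{K}{i} (1-p)^i p^{K-i} \frac{(pK)^{k-K-i}}{(k-K-i)!} e^{-pK} $$

∼ Poisson-like distribution

Clustering coefficient

• if node i has K neighbors,

$$ C = \frac{\#\text{edges among K neighbors}}{K(K-1)/2} $$

• probability that a connected triple is still connected after rewiring is $(1-p)^3$

• $C(p = 0) = \frac{3k-3}{4k-2} \approx \frac{3}{4}$, $C(p = 1) \approx \frac{2k}{n}$

• $C(p) \approx C(p = 0) \cdot (1-p)^3$, i.e., $C(p)/C(0) \approx (1-p)^3$

Average path length: $l \approx \frac{(n-1)(n+2k-1)}{4kn}$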

Small World Networks

Kleinberg’s geographical small world model [Nature, 2000]

• connectivity derived from geographical distances

• the model: 1. link nearest neighbors 2. add links with the probability

$$ \Pr(\text{link between } u \text{ and } v) \sim \text{const} \times \|u - v\|^{-r} $$

where r is navigability exponent (e.g., links are purely random for r = 0)

Hierarchical small world model [Science, 2001]

• hierarchically nested groups, link probability $p_{ij} \sim \exp(-\alpha x_{ij})$

Other strategies for generative models of small world networks

• add/rewire links based on chosen properties of current links and edges

• add/rewire links to optimize particular property of the network

Small World Networks

Topology trade-offs

(a) commuter rail network

(b) star network

(c) minimum spanning tree network

Small World Networks


• in social networks, close friends know what you know, and they also know others who know what you know

• bridge between A and B (if removed, these two nodes become disconnected)

• local bridge between A and B (if removed, distance A-B increased to > 2)

Small World Networks

Strong and weak ties [Granovetter, 1974]

• in social networks, links are strong or weak ties (friends vs. acquaintances)

• strong triadic closure: if A-B and A-C are strong ties, then at least a weak tie between B-C exists

• if there are enough strong ties in network, local bridges must be weak ties

Almost local bridges

• neighbor overlap of nodes A and B:

$$ \frac{\#\text{neighbors of both A and B}}{\#\text{neighbors of at least A or B}} = \frac{N(i,j)}{(k(i)-1) + (k(j)-1) - N(i,j)} $$

• local bridges are links whose end-nodes have no common neighbors (i.e., the overlap of their neighbors is 0); almost local bridges are links whose end-nodes have a very small neighbor overlap
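The neighbor overlap of a link is easy to compute from adjacency sets. The following sketch is only an illustration; the function name `neighbor_overlap` and the small example network are assumptions made for this example.

```python
def neighbor_overlap(adj, a, b):
    """Overlap of link a-b: common neighbours / neighbours of at least one end-node."""
    common = adj[a] & adj[b]
    union = (adj[a] | adj[b]) - {a, b}     # the end-nodes themselves are not counted
    return len(common) / len(union) if union else 0.0

# Hypothetical network: A-B-C form a triangle, D hangs off A, E hangs off B.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A", "C", "E"},
    "C": {"A", "B"},
    "D": {"A"},
    "E": {"B"},
}
overlap_ab = neighbor_overlap(adj, "A", "B")   # one common neighbour (C) out of three
overlap_ad = neighbor_overlap(adj, "A", "D")   # no common neighbour
print(overlap_ab, overlap_ad)
```

Link A-B is embedded in a triangle and has a non-zero overlap, whereas link A-D has overlap 0, i.e. its end-nodes share no neighbors.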

Small World Networks

Removing ties from social networks (percolation analysis)

• removing weak ties breaks down the network

• removing strong ties degrades the network more smoothly

• however, this is specific only to social networks

Removing ties from other networks

• e.g. removing an important road (strong tie) is more damaging

• central veins are more important than peripheral veins

Small World Networks

Illustration of weak vs. strong ties removal

(a) original network

(b) 80% of strongest links removed, 20% of weak ties remain

(c) 80% of weakest links removed, 20% of strong ties remain

• no evidence of degradation for (b), network clearly fragmented in case (c)

• strong links are within dense neighborhoods (triangles, cliques etc.)

• weak links (and bridges) interconnect these dense neighborhoods


Scale Free Networks

Power-law distribution

$\Pr(k) \sim \text{const} \times k^{-\gamma}$, typically 2 < γ < 3

• i.e., a straight line in log-log domain:

$$ \log \Pr(k) \sim -\gamma \log k + \log \text{const} $$

• always a few highly connected hubs


Scale Free Networks

Power-law distribution

• some other distributions look like power-laws

• estimating γ may not be so easy

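• 1st and 2nd moments:

$$ E[k] \propto \int_1^{\infty} k \times k^{-\gamma} \, dk = \lim_{k \to \infty} \frac{1}{2-\gamma} \left( \frac{1}{k^{\gamma-2}} - 1 \right) \quad \begin{cases} = \infty & \text{if } \gamma \leq 2 \\ < \infty & \text{if } \gamma > 2 \end{cases} $$

$$ E[k^2] \propto \int_1^{\infty} k^2 \times k^{-\gamma} \, dk \quad \begin{cases} = \infty & \text{if } \gamma \leq 3 \\ < \infty & \text{if } \gamma > 3 \end{cases} $$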

Preferential attachment

• now, the concern is how to generate scale-free networks

• we use the rich-get-richer effect and add new nodes sequentially:

1. with probability p, choose any existing node and link to it

2. with probability 1 − p, link to an existing node chosen with probability proportional to its current degree


Scale Free Networks

Barabasi-Albert (BA) scale-free model

1. Growth: start from a seed network of $m_0$ isolated nodes

2. Preferential attachment: add a new node with $m \leq m_0$ edges to existing nodes that are chosen with the probability $\Pi(i) = k(i) / \sum_j k(j)$

3. after t steps, the network has $n = m_0 + t$ nodes and $mt$ edges, and

$$ \Pi(i) = \frac{k(i)}{\sum_j k(j)} = \frac{k(i)}{2mt - m} \approx \frac{k(i)}{2mt} $$

• this procedure generates degree distribution

$$ \Pr(k) = \frac{2m(m+1)}{k(k+1)(k+2)} \approx \frac{2m(m+1)}{k^3} \propto k^{-3} $$

so γ = 3 and average degree $\bar{k} = 2m$

• average shortest path length: $l \propto \frac{\ln n}{\ln(\ln n)}$

• clustering coefficient: $C \propto \frac{(\ln n)^2}{n}$ ( . . . too small for real-world networks)
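The growth and preferential attachment steps can be sketched with a simple trick: keeping a list in which every node appears once per incident edge makes a uniform draw from that list equivalent to a draw proportional to degree. This is a minimal illustration, assuming a seed of $m_0 = m$ isolated nodes to which the first new node connects; the function name `barabasi_albert`, the seed and the parameters are made up for this example.

```python
import random
from collections import Counter

def barabasi_albert(n, m, rng):
    edges = []
    targets = list(range(m))      # the first new node links to all m seed nodes
    endpoints = []                # node i appears k(i) times in this list
    for new in range(m, n):
        edges.extend((new, t) for t in targets)
        endpoints.extend(targets)
        endpoints.extend([new] * m)
        # choose m distinct targets for the next node, proportionally to degree
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(endpoints))
        targets = sorted(chosen)
    return edges

rng = random.Random(42)
n, m = 2000, 2
edges = barabasi_albert(n, m, rng)
degree = Counter(node for edge in edges for node in edge)
mean_degree = 2 * len(edges) / n
max_degree = max(degree.values())
print(mean_degree, max_degree)
```

The average degree is close to 2m, while a few early nodes become hubs with degrees far above the average, which is the signature of the heavy-tailed degree distribution.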


Scale Free Networks

Question

• In scale-free networks, how predictable is “popularity”?


• if we restart the process, different popular nodes will emerge

Other scale-free network models

• motivation: improve clustering coefficient and allow changing the exponent γ

[Holme, Kim]

• after the preferential attachment step, with probability p, add one more edge to a randomly selected neighbor

• resulting clustering coefficient $C \propto 1/k$ ( . . . much more realistic)

[Vazquez et al.]

• random walk instead of preferential attachment (“get to know important people through people you already know”)

[Kleinberg, Kumar]

• copy a vertex and rewire its edges with certain probability


Scale Free Networks

Network generator with given γ

1. Initialize: seed network with m0 (isolated) nodes

2. Add one node and m links (not necessarily stemming from the new node) at each time; after t time-steps, a link is added to node i with the probability

$$ \Pi(i) = \alpha \frac{k(i)}{\sum_j k(j)} + (1-\alpha) \frac{1}{t + m_0}, \quad \sum_j k(j) = 2mt $$

3. thus, α = 1 leads to preferential attachment, α = 0 to uniform attachment, and the degree distribution is

$$ \Pr(k) \propto k^{-(1+1/\alpha)} $$

Models with non-power-law distribution

• power-law distribution is a good fit for large networks (averaging effect)

• on smaller scales, in more “specialized” sub-networks, power-law may not be such a good fit

• log-normal distribution has been observed in such cases


Scale Free Networks

Configuration network model

• degrees are pre-assigned to n nodes assuming a degree distribution Pr(k)

• edges are added by randomly selecting pairs of these n nodes

• a family of graphs generated this way will have the same degree distribution

• excess degree is the number of possible outward links of a node that has been reached during a walk (i.e., one less than the node degree in undirected graphs)

$$ \Pr(k_{excess} = k) = \frac{(k+1) \Pr(k+1)}{\sum_{k'} k' \Pr(k')} $$

Other models

• many other stochastic models of networks can be devised and then analyzed

• hence, it is important to define quality of such models, e.g. (generally):

– flexibility (design for specific parameter settings)

– mathematical tractability

– accuracy (to fit experimental data, make predictions)


Take-Home Messages

Random networks

• Erdos & Renyi studied a simple model in 1959

• it has Poisson degree distribution with small average path length, but clustering goes to zero with network size

Small world networks

• the world is small, six degrees of separation (Milgram’s experiment)

• short average path length, but clustering still smaller than in real networks

• real-world networks contain weak and strong ties

• Watts & Strogatz proposed simple model of small world networks (in 1998)

Scale free networks

• main focus is to produce power-law distribution

• Barabasi & Albert proposed a model based on preferential attachment (in 1999); many modifications of this model can be (and were) devised

Network models

• mostly stochastic; the main motivation is to emulate real-world networks

• find structural properties to explain specific (global) properties of networks

• useful to define quality of these models


Networks: Robustness




• monitor network metrics while nodes or edges are being removed

Dual problem

• monitor network metrics while nodes or edges are being added

1. What strategy to remove/add nodes or edges?

• no knowledge: nodes and edges removed (uniformly) at random

• knowledge of structure: removing nodes and edges with high centrality

• adding nodes and edges: cf. random network generators

2. Which metrics most relevant and should be monitored?

• rate of decay/growth of: network diameter, average degree, average distance, size of giant component etc.

3. Which (class of) networks to consider?

• any network, networks with specific degree distribution etc.

4. Why consider robustness?

• in general, network resilience to attacks is a growing concern

• want to design networks that are robust to damage



Pragmatic definition

• the network is robust if it can withstand accidental damage, random topology changes as well as intentional attacks and remain operational

• this requires the remaining nodes and links to be able to carry flows and perform other tasks without excessive congestion, dead-locks etc.

• observing average decay (e.g. size of giant component) may not be that useful (e.g. it cannot identify local congestion further impairing the network)

• note also that we are still considering only networks with static topology


• 50 nodes, removing 40 out of 116 edges decreases k from 4.6 to 3.0


Robustness as Stability

Global stability

• system is stable if it returns to equilibrium after any perturbation


• ability of a community to resist change in face of potentially perturbing force


• ability of a community to recover to normal functioning after disturbance


• variations in community density over time (measured e.g. as changes in mean/variance) due to external disturbances



Percolation threshold

• if the average degree k decreases by removing edges, the network suddenly becomes disconnected

• if k increases by adding edges, a giant component suddenly emerges


p is the probability of filling a square; at the critical p, a giant connected component appears



Experiment [Barabasi et al., 2000]

strategy: random failures versus targeted attacks removing nodes

metrics: average or maximum (network diameter) shortest path

networks: exponential versus scale-free (the same |V | and |E|)



Experiment (cont.)

effect on size of giant component s and its average 〈s〉


Robustness

Experiment (cont.)

(Internet and WWW)

effect on size of giant component s and its average 〈s〉


Robustness of Scale-Free Networks

Random failures vs targeted attack

(a) original network of 574 nodes

(b) removing 20% (115) of nodes randomly leaves 427 nodes in giant component

(c) removing only 2.8% (22) most connected hubs leaves 301 nodes in giant component

Bottom line

• scale-free networks are robust against random failures

• they are very vulnerable against targeted attacks


Robustness of Scale-Free Networks

Impact of power-law exponent on robustness (∼ k−γ)

• γ = 2.5: graceful degradation

• γ = 3.5: giant component disappears at about f = 40%

• assume e.g. case of γ = 2.7 (square markers)

• kmax is the maximum degree among remaining nodes

• removing only 1% of nodes destroys the giant component (top figure)

• kmax has to be very small to destroy the giant component (bottom figure)



Percolation threshold for random failures

• in general, the minimum fraction of nodes required (i.e., that cannot be randomly removed) for the giant component to exist is

$$ f_c = \frac{E[k]}{E[k^2] - E[k]} $$

• specialized for random networks:

$$ f_c = \frac{1}{\bar{k}} $$

thus, if $\bar{k} = E[k]$ is large, a random network can withstand large losses; e.g. if $\bar{k} = 4$, then 1/4 of nodes is enough for the giant component to exist (i.e., 3/4 of nodes have to be removed to destroy the giant component)

• specialized for scale-free networks:

$$ f_c \longrightarrow 0 $$

as $E[k^2]$ tends to be very large (even infinite), which makes these networks very robust against random failures (though not against targeted attacks on hubs)
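The threshold $f_c = E[k] / (E[k^2] - E[k])$ can be evaluated directly from the first two moments of the degree distribution. The sketch below is only an illustration; the function names `critical_fraction` and `moments` and the heavy-tailed degree list are assumptions made for this example.

```python
def critical_fraction(mean_k, mean_k2):
    """Minimum fraction of nodes that must remain for a giant component to exist."""
    return mean_k / (mean_k2 - mean_k)

def moments(degrees):
    n = len(degrees)
    return sum(degrees) / n, sum(k * k for k in degrees) / n

# Random (Poisson) network with average degree 4: E[k^2] = E[k]^2 + E[k].
fc_poisson = critical_fraction(4.0, 4.0 ** 2 + 4.0)

# Hypothetical heavy-tailed degree sequence with a few hubs.
degrees = [1] * 900 + [10] * 90 + [100] * 10
fc_hubs = critical_fraction(*moments(degrees))
print(fc_poisson, fc_hubs)
```

For the Poisson case the result reduces to 1 over the average degree, whereas the few hubs in the second degree sequence inflate $E[k^2]$ and push the threshold towards zero.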


Take-Home Messages

Scale-free networks

• very robust against random failures (some suggest that this is the reason why these networks are found so often in real world)

• but very vulnerable to attacks on highly connected hubs

• since hubs are also responsible (and effective) for spreading messages, diseases etc. through the network

• in social networks, it is not hubs but rather weak ties and bridges that make these networks vulnerable

Small world networks

• have not been considered here

• one extreme is a ring with one hop neighboring connections without any shortcut links; such a network is not robust at all

• another extreme is a fully connected network which is unbreakable

• small world networks are in-between these two extremes; their robustness is likely derived from the density of shortcut links

Making networks more robust

• an obvious strategy is to guarantee some minimum degree for every node (i.e., to achieve connection redundancy)


Networks: Processes


Epidemic Spreading

Network processes

• strongly influenced by network structure; e.g. shortcuts significantly speed up spreading (of information, diseases) and synchronization of processes

• hence, understanding of such network(ed) (distributed) processes requires understanding of the underlying network structures

• e.g. neurons integrate signals from neighbors, if above threshold, the excitation fires and then fades away; this leads to oscillating cascades

• here, we consider diffusion of diseases characterized by contagion (lack of choice), unlike information spreading where nodes make decisions to maximize their pay-offs


Epidemic Spreading

Simple spreading models

• ring topology with shortcuts

• all nodes susceptible

• nodes infected with probability p

• spreading of disease, computer viruses, . . .

• tree topology of spreading in waves of k nodes, all nodes susceptible

• nodes infected with probability p

a) p is large, disease spreads out

b) p is small, disease dies out

Reproductive number: R0 = kp

a) if R0 < 1, disease dies out in a finite number of waves

b) if R0 > 1, disease persists with positive probability, infecting at least 1 person in each wave


Epidemic Spreading

Limitations of simple models

• small changes in k and p can move R0 above or below threshold (R0 ≷ 1)

• network topology not realistic (e.g. no triangles)

• nodes get infected only once and never recover

More realistic: SI model

• two classes of nodes: S (susceptible) and I (infected)

• once infected, the node cannot recover

$|V| = |S| + |I|$: total number of nodes ($V = S \cup I$)

$\beta = \lambda k$: infection rate per node ($0 \leq \lambda \leq 1$)

$\beta |S| / |V|$: susceptible contacts per unit of time

$\frac{d|I|}{dt} = \beta |S| |I| / |V|$: overall rate of infection

• let $i = |I|/|V|$ be the fraction of nodes infected, then $\frac{di}{dt} = \beta i (1-i)$ which yields a logistic curve:

$$ i(t) = \frac{i(0) e^{\beta t}}{1 - (1 - e^{\beta t}) i(0)} $$
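The logistic solution can be checked against a direct numerical integration of di/dt = β i (1 − i). The sketch below is only an illustration; the function names, the forward-Euler step size and the parameter values (β = 0.5, i(0) = 0.01) are assumptions made for this example.

```python
import math

def si_closed_form(i0, beta, t):
    """Logistic curve i(t) of the SI model."""
    e = math.exp(beta * t)
    return i0 * e / (1 - (1 - e) * i0)

def si_euler(i0, beta, t_end, dt=1e-3):
    """Forward-Euler integration of di/dt = beta * i * (1 - i)."""
    i = i0
    for _ in range(int(round(t_end / dt))):
        i += dt * beta * i * (1 - i)
    return i

i0, beta = 0.01, 0.5
i_exact = si_closed_form(i0, beta, 10.0)
i_numeric = si_euler(i0, beta, 10.0)
i_late = si_closed_form(i0, beta, 100.0)    # eventually (almost) everyone is infected
print(i_exact, i_numeric, i_late)
```

Both approaches agree closely, and for large t the infected fraction approaches 1 because nodes in the SI model never recover.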


Epidemic Spreading

More realistic: SIR model

• improve SI model by assuming infected nodes recover at rate υ, i.e., nodes stay infected only for (average) time τ = 1/υ

• a recovered node becomes resistant (i.e. cannot be infected again)

• define fractions s = |S|/|V|, i = |I|/|V|, r = |R|/|V|, so s + i + r = 1; rate of change of these fractions over time:

$$ \frac{ds}{dt} = -\beta s i, \quad \frac{di}{dt} = \beta s i - \upsilon i, \quad \frac{dr}{dt} = \upsilon i $$

solution again requires initial conditions s(0), i(0) and r(0)
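The SIR equations have no simple closed-form solution, but they are easy to integrate numerically. The sketch below is only an illustration using forward-Euler steps; the function name `sir`, the step size and the two parameter sets are assumptions made for this example.

```python
def sir(beta, upsilon, s0, i0, r0=0.0, t_end=100.0, dt=0.01):
    """Forward-Euler integration of the SIR equations; returns final s, i, r and the peak of i."""
    s, i, r = s0, i0, r0
    peak = i
    for _ in range(int(round(t_end / dt))):
        ds = -beta * s * i
        di = beta * s * i - upsilon * i
        dr = upsilon * i
        s, i, r = s + dt * ds, i + dt * di, r + dt * dr
        peak = max(peak, i)
    return s, i, r, peak

# Fast infection relative to recovery: large outbreak.
s_hi, i_hi, r_hi, peak_hi = sir(beta=0.5, upsilon=0.1, s0=0.99, i0=0.01)
# Slow infection relative to recovery: the disease dies out.
s_lo, i_lo, r_lo, peak_lo = sir(beta=0.05, upsilon=0.1, s0=0.99, i0=0.01)
print(r_hi, peak_hi, r_lo, peak_lo)
```

In the first case most of the population ends up in the recovered class after a pronounced peak of infections; in the second case the infected fraction never exceeds its initial value.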

Possible outcomes

a) disease may die out

b) disease may spread to whole network

c) disease becomes endemic (does not spread, nor die out)


Epidemic Spreading

More realistic: SIS model

• no (permanent) recovery, but infected node may again become susceptible

• infected to susceptible rate υ:

a) if β > υ, logistic growth (as in SI model), but never infects whole population

b) if β→ υ, then i→ 0 (infection will slowly die out)

c) if β < υ, then infection dies out exponentially

• mathematical model (assumes r = 0):

$$ \frac{ds}{dt} = \upsilon i - \beta s i, \quad \frac{di}{dt} = \beta s i - \upsilon i, \quad s + i = 1 $$
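The three regimes listed above follow from the steady state of the SIS equations. As a short worked step (substituting s = 1 − i and setting the derivative to zero):

```latex
\frac{di}{dt} = \beta (1 - i)\, i - \upsilon i = 0
\quad\Longrightarrow\quad
i^{*} = 0 \quad\text{or}\quad i^{*} = 1 - \frac{\upsilon}{\beta}
```

The non-zero steady state is positive only if β > υ and it is always smaller than 1, so the infection never reaches the whole population; it shrinks to 0 as β → υ, and for β < υ only i* = 0 remains.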


Epidemic Spreading

Prognosis of epidemic

• reproductive number: R0 = β/υ

a) if R0 > 1, infection survives

b) if R0 < 1, infection dies out

• in SI model, υ→ 0, so R0 ≫ 1

(a) SI model

(b) SIR model

(c) SIS model

Extensions of SIR model

• rather than assuming recovery after τ time units, let recovery be possible at each time with some (fixed) probability

• infected state further subdivided (e.g. early, middle and final disease stages)

• non-homogeneous mixing: restrictions how the nodes meet (e.g. travel to geographical locations, quarantining, . . . )

• other random network models (note that Erdos-Renyi model with homogeneous mixing was implicitly assumed in SI, SIR and SIS models)


Epidemic Spreading

SIS model in scale-free networks

• it was experimentally observed that computer viruses survive significantly longer than predicted from the SIS model over random networks

• it was found that there is no epidemic threshold in scale-free networks, so infections proliferate independently of the spreading rate

• however, there is a critical fraction of shortcuts in scale-free networks; if there are enough shortcuts, the disease suddenly becomes epidemic

• the critical fraction of shortcuts is a function of rates β and υ


Epidemic Spreading

Network immunization

• random networks: uniformly random immunization is helpful

• scale-free networks: targeted degree-based immunization required as random immunization does not help

• targeted local immunization: immunize one immediate neighbor for every node in a randomly selected group (i.e. nodes with higher degree are more likely to be immunized)

red-circles: random immunization of scale-free network

red-squares: targeted immunization of scale-free network

black-squares: random & targeted immunization of random network


Take-Home Messages

Epidemic spreading

• practical modeling requires extracting model parameters from real data

• knowledge of node mobility is key to accurate modeling of spreading

• epidemic spreading is strongly influenced by information diffusion (i.e. knowing what is happening and what to do)

• predictive modeling (desirably in real time if the epidemic spreading is on-going) is routinely used in practice for prevention

SARS prediction and comparison with real outbreak data


Network Dynamics

Spatial-temporal scales

(a) short: link activation and deactivation
– topology is a snapshot
– connected components must respect time sequences of links

(b) longer: topology change from one structure to another
– communities formation, merging, splitting
– large communities persist in time if there is exchange of their members
– small communities persist if their core is highly connected with strong ties

(c) long: network evolution (birth, growth and decline)
– in scale-free networks, in spite of changes (nodes and links appear and disappear), degree, weight and strength distributions remain stationary


Information Cascades


• understand how behaviors, ideas, technology usage etc. are adopted, influenced and spread through networks

Diffusion model

• two nodes v and w

• two behaviors A and B

• two pay-offs a > 0 and b > 0

(i.e., the larger, the better)

Network implications

• let 0 ≤ p ≤ 1, and there are d neighbors of v

• pd neighbors of v choose A

• (1 − p)d neighbors of v choose B

• A is the better strategy if:

$$ pd \cdot a \geq (1-p) d \cdot b \;\Rightarrow\; p \geq \frac{b}{a+b} $$


Information Cascades

Example diffusion in a network

• let a = 3, b = 2, so $\frac{b}{a+b} = 2/5$

• A: dark circles, B: light circles

(b) only v and w adopt A

(c) nodes r and t switch to A (i.e. 2/3 of their neighbors chose A); u does not switch (only 1/3 of its neighbors chose A); note also that: 1/3 < 2/5 < 2/3

(d) also nodes s and u switch to A
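The threshold rule p ≥ b/(a + b) can be turned into a small cascade simulation. The sketch below is only an illustration: the figure of the original slide is not available here, so the six-node network is a hypothetical one chosen to reproduce the sequence described above (r and t switch first, then s and u); the function name `cascade` is also an assumption.

```python
def cascade(adj, initial, a, b):
    """Waves of nodes switching to behaviour A, starting from the initial adopters."""
    adopters = set(initial)
    waves = []
    while True:
        # v switches if (fraction of its neighbours using A) >= b / (a + b);
        # the comparison is done with integers to avoid rounding issues
        new = {v for v in adj if v not in adopters
               and (a + b) * sum(u in adopters for u in adj[v]) >= b * len(adj[v])}
        if not new:
            return waves
        adopters |= new
        waves.append(new)

# Hypothetical network (every node has at least one neighbour).
edges = [("v", "w"), ("v", "r"), ("w", "r"), ("r", "s"), ("v", "t"),
         ("w", "t"), ("t", "u"), ("u", "w"), ("s", "u")]
adj = {}
for x, y in edges:
    adj.setdefault(x, set()).add(y)
    adj.setdefault(y, set()).add(x)

waves = cascade(adj, initial={"v", "w"}, a=3, b=2)
print(waves)
```

With pay-offs a = 3 and b = 2 the initial adoption by v and w triggers a complete cascade in two waves; raising the threshold (e.g. by lowering a) can stop the cascade earlier.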


Take-Home Messages


• initial adoption by few nodes may generate complete cascade

• it is dependent on network structure

• it is also crucially dependent on threshold b/(a+b), so changing pay-offs can make a big difference (e.g. making the product more attractive)

• OR, directly influence key nodes (initial adopters)

• densely inter-connected clusters are difficult to penetrate

• key parameters: clusters connection density and pay-off threshold

Role of weak ties

• very useful in spreading information

• poor in transferring behaviors that are risky and/or costly

Influencing nodes

• in networks with many clusters, users are more easily influenced

• reinforcement is very important in influencing users

• node centrality is crucial for (information, behavior) diffusion


Networks: Algorithms


Max Flow and Min Cut


• single source node s and single sink node t (for simplicity)

• directed edges between nodes represent flows (information, material, . . . )

• every edge assigned a weight representing max possible flow ≡ capacity


Max Flow and Min Cut

Dual problems (of combinatorial optimization)

1. find minimum cut of a graph G = (V, E) where V is the set of nodes and E are weighted edges (max flows)

2. find maximum possible total flow from s ∈ V to t ∈ V over E while flows at every other node are equalized (in-flow = out-flow)

Cut (S, T)

• node partitioning V = S ∪ T such that S ∩ T = ∅ and s ∈ S and t ∈ T

Capacity of cut (S, T)

• sum of weights (capacities) leaving set S and entering set T
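
The definition of cut capacity can be written down directly. In this small Python sketch the five-edge network `capacity` and the helper name `cut_capacity` are assumptions made for illustration; only edges leaving S and entering T are counted.

```python
def cut_capacity(capacity, S):
    """Sum of capacities of edges (u, v) with u in S and v in T = V \\ S."""
    return sum(c for (u, v), c in capacity.items() if u in S and v not in S)


# Hypothetical directed network: (tail, head) -> capacity.
capacity = {("s", "a"): 10, ("s", "b"): 5, ("a", "b"): 15,
            ("a", "t"): 5, ("b", "t"): 10}

for S in ({"s"}, {"s", "a"}, {"s", "a", "b"}):
    print(sorted(S), cut_capacity(capacity, S))
```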


Max Flow and Min Cut

Minimum cut problem

• find the cut with the minimum capacity

Maximum flow problem

• assign flows to edges not larger than their capacity, so that total flow from s to t is maximized and flows in all other nodes (V\{s, t}) are equalized


Max Flow and Min Cut

Observation 1

• flow from S to T is equal to the total flow reaching sink t


Max Flow and Min Cut

Observation 2

• flow from S to T is at most equal to capacity of the cut

• if flow from S to T is equal to capacity of the cut, then we have maximum possible flow from S to T and (S, T) is minimum cut


Max Flow and Min Cut

Greedy algorithm

1. select a path from s to t and set its flow to be equal to the minimum capacity among its edges (≡ bottleneck)

2. for every edge (v, w) on the path, obtain the residual capacity ≡ capacity − flow, and add a reverse edge (w, v) whose residual capacity equals the flow sent, so that this flow can later be “undone”

3. repeat, augmenting the flow along any path from s to t whose edges all have strictly positive residual capacity


Max Flow and Min Cut

Ford-Fulkerson algorithm

• greedy algorithm to find a maximum flow

• find augmenting path with strictly positive residual capacities

• if path can no longer be augmented, the flow is maximum

Max-flow min-cut theorem

• The value of maximum flow is equal to the capacity of the minimum cut.

Complexity of Ford-Fulkerson algorithm

• assume capacities are integers 1, . . . ,U

• Theorem 1: the algorithm terminates in at most |V | · U iterations.

• Theorem 2: if all edge capacities are integers, then there is a maximum flow with integer values of flows on every edge.
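
A minimal Python sketch of the augmenting-path idea follows. It assumes integer capacities and picks each augmenting path by breadth first search (the Edmonds-Karp variant of Ford-Fulkerson); the function name `max_flow` and the example network are assumptions made for illustration.

```python
from collections import deque


def max_flow(capacity, s, t):
    """capacity: dict (u, v) -> integer capacity of the directed edge u -> v.
    Returns (value of maximum flow, source side S of a minimum cut)."""
    residual, adj = {}, {}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)          # reverse edge, used to undo flow
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    flow = 0
    while True:
        # breadth first search for a path with positive residual capacities
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break                               # no augmenting path is left

        # bottleneck = minimum residual capacity along the path
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]

        # push the bottleneck flow and update residual capacities
        v = t
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        flow += bottleneck

    # nodes still reachable from s form the source side of a minimum cut
    return flow, set(parent)


# Hypothetical network (assumption, for illustration only).
capacity = {("s", "a"): 10, ("s", "b"): 5, ("a", "b"): 15,
            ("a", "t"): 5, ("b", "t"): 10}
value, S = max_flow(capacity, "s", "t")
cut = sum(c for (u, v), c in capacity.items() if u in S and v not in S)
print(value, sorted(S), cut)
```

On this example the flow value and the capacity of the returned cut coincide, as the max-flow min-cut theorem states.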


Max Flow and Min Cut

Choosing initial augmenting path

• some choices lead to exponential time algorithm, clever choices lead to polynomial time algorithm (number of iterations):

1. choose path with fewest edges (shortest path, breadth first search)

2. choose path with maximum bottleneck capacity (fattest path, priority or depth first search)

Application: Bipartite matching

• find maximum matching of a bipartite graph G

• solve max-flow problem for extended graph G′

• by integer theorem (see above), there exists a maximum flow with 0/1 values
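
For bipartite graphs, the max-flow computation on the extended graph G′ with unit capacities reduces to repeatedly searching for augmenting paths between unmatched nodes. The sketch below uses this specialised augmenting-path search instead of building G′ explicitly; the function name `max_bipartite_matching` and the example graph are assumptions made for illustration.

```python
def max_bipartite_matching(adj):
    """adj: dict mapping every left node to a list of adjacent right nodes.
    Returns a dict left node -> matched right node."""
    match_right = {}                      # right node -> matched left node

    def try_augment(u, visited):
        # look for an augmenting path starting at left node u
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            # v is free, or its current partner can be re-matched elsewhere
            if v not in match_right or try_augment(match_right[v], visited):
                match_right[v] = u
                return True
        return False

    for u in adj:
        try_augment(u, set())
    return {u: v for v, u in match_right.items()}


# Hypothetical bipartite graph (assumption, for illustration only).
adj = {"x1": ["y1", "y2"], "x2": ["y1"], "x3": ["y2", "y3"]}
matching = max_bipartite_matching(adj)
print(matching)
```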


Take-Home Messages

Applications of max-flow and min-cut theorem

• Network connectivity

• Bipartite matching

• Data mining

• Open-pit mining

• Airline scheduling

• Image processing

• Project selection

• Baseball elimination

• Network reliability

• Security of statistical data

• Distributed computing

• Egalitarian stable matching

• . . .

There are many efficient algorithms for solving the max-flow and min-cut problems.


Network Routing

Routing algorithms

• find the least cost path between any two nodes in the (telecommunication) network

• link cost: e.g. inverse of capacity, delay, or more simply, all links have a unit cost

• path cost: sum of link costs along the path

1. Link state routing algorithms

• assume every node has knowledge about network topology and all link costs

• thus, all nodes have the same (global) knowledge (how?)

• so-called centralized or link state algorithms

2. Distance vector routing algorithms

• only local knowledge of link costs to all neighbors

• iterative computations in collaboration with neighbors

• so-called decentralized or distance vector algorithms


Network Routing

Link state routing: Dijkstra algorithm

• every node computes the least cost path to all other nodes in the network

• the computed paths are stored in so-called forwarding table

• after K iterations, the least cost paths are known for K destination nodes


c(x, y) link cost between neighbors x and y (= ∞ if not neighbors)

D(v) current cost of path from source to destination node v

p(v) predecessor node along path from source to node v

V ′ set of nodes whose least cost paths already known


Network Routing

Dijkstra algorithm example

• the shortest path constructed by tracking predecessors

• if ties encountered, they can be broken arbitrarily


Network Routing

Dijkstra algorithm example

Complexity of Dijkstra algorithm

• at each iteration, need to check all nodes not in V′, i.e., N(N + 1)/2 comparisons in total ∼ O(N^2)

• more efficient implementations have been devised ∼ O(N log N)
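
The following Python sketch shows one of the more efficient implementations, using a binary heap from the standard library. It keeps the notation D(v), p(v) and V′ used above; the six-node graph and its link costs are assumptions made for illustration, and all link costs are assumed to be non-negative.

```python
import heapq


def dijkstra(cost, source):
    """cost: dict node -> dict neighbor -> link cost c(x, y) >= 0.
    Returns (D, p): least path costs and predecessors from the source."""
    D = {source: 0}
    p = {source: None}
    done = set()                          # V': least cost paths already known
    heap = [(0, source)]
    while heap:
        d, x = heapq.heappop(heap)
        if x in done:
            continue                      # stale heap entry
        done.add(x)
        for y, c in cost[x].items():
            if y not in done and d + c < D.get(y, float("inf")):
                D[y] = d + c
                p[y] = x
                heapq.heappush(heap, (D[y], y))
    return D, p


def path_to(p, v):
    """Reconstruct the path to v by tracking predecessors."""
    path = []
    while v is not None:
        path.append(v)
        v = p[v]
    return path[::-1]


# Hypothetical undirected network with link costs (assumption).
edges = [("u", "v", 2), ("u", "x", 1), ("u", "w", 5), ("v", "x", 2),
         ("v", "w", 3), ("x", "w", 3), ("x", "y", 1), ("w", "y", 1),
         ("w", "z", 5), ("y", "z", 2)]
cost = {}
for a, b, c in edges:
    cost.setdefault(a, {})[b] = c
    cost.setdefault(b, {})[a] = c

D, p = dijkstra(cost, "u")
print(D)
print(path_to(p, "z"))
```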


Network Routing

Distance vector algorithm

• fully distributed generation of forwarding tables

• based on Bellman-Ford equation (dynamic programming)

d_x(y) = min_{v ∈ N(x)} (c(x, v) + d_v(y))

v* = argmin_{v ∈ N(x)} (c(x, v) + d_v(y))

N(x) neighbors of node x

c(x, v) link cost from x to v

d_v(y) cost from neighbor v to destination y

v* next hop in least cost path from x to y

Example: d_v(z) = 5, d_x(z) = 3, d_w(z) = 3

d_u(z) = min {c(u, v) + d_v(z), c(u, x) + d_x(z), c(u, w) + d_w(z)} = min {2 + 5, 1 + 3, 5 + 3} = 4
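
The numerical example above can be reproduced in a few lines of Python; the variable names are assumptions, the numbers are those of the example.

```python
link_cost = {"v": 2, "x": 1, "w": 5}        # c(u, v), c(u, x), c(u, w)
neighbor_cost = {"v": 5, "x": 3, "w": 3}    # d_v(z), d_x(z), d_w(z)

# Bellman-Ford equation: minimise c(u, v) + d_v(z) over the neighbors v of u
totals = {v: link_cost[v] + neighbor_cost[v] for v in link_cost}
next_hop = min(totals, key=totals.get)      # v*, the argmin
d_u_z = totals[next_hop]                    # d_u(z), the min
print(next_hop, d_u_z)
```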


Network Routing

Distance vector algorithm

• D_x(y) is the least cost from x to y, and it is estimated iteratively

• every node x maintains distance vectors for itself and all its neighbors; recall that V is the set of all nodes and N(x) is the set of neighbors of x

D_x = [D_x(y) : y ∈ V]

D_v = [D_v(y) : y ∈ V], v ∈ N(x)

and x also knows the costs c(x, v) to all its neighbors v ∈ N(x)

• key idea is to periodically exchange distance vectors D_x among neighbors; the vectors are then updated using the B-F equation as:

D_x(y) ← min_{v ∈ N(x)} (c(x, v) + D_v(y)), for all y ∈ V

so (under some minor conditions) the estimate D_x(y) → true value d_x(y)

Distance vector updates (at each node)

1. asynchronous: triggered by change of local link cost, or by update message from the neighbor

2. synchronous: notify all neighbors if own distance vector changes
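
The iterative estimation can be simulated in Python. This is only a sketch under simplifying assumptions: all nodes exchange their vectors in lock-step rounds (real protocols are asynchronous), link costs are positive and fixed, and the graph and the function name `distance_vector` are hypothetical.

```python
INF = float("inf")


def distance_vector(cost):
    """cost: dict node -> dict neighbor -> link cost c(x, v) > 0.
    Returns (D, rounds) where D[x][y] is node x's estimate of d_x(y)."""
    nodes = list(cost)
    # initially every node only knows the costs to its direct neighbors
    D = {x: {y: 0 if x == y else cost[x].get(y, INF) for y in nodes}
         for x in nodes}
    rounds, changed = 0, True
    while changed:
        changed = False
        rounds += 1
        # vectors as received from the neighbors at the start of this round
        received = {x: dict(D[x]) for x in nodes}
        for x in nodes:
            for y in nodes:
                if x == y:
                    continue
                # B-F update: D_x(y) <- min over neighbors v of c(x, v) + D_v(y)
                best = min(cost[x][v] + received[v][y] for v in cost[x])
                if best < D[x][y]:
                    D[x][y] = best
                    changed = True
    return D, rounds


# Hypothetical undirected network with link costs (assumption).
edges = [("u", "v", 2), ("u", "x", 1), ("u", "w", 5), ("v", "x", 2),
         ("v", "w", 3), ("x", "w", 3), ("x", "y", 1), ("w", "y", 1),
         ("w", "z", 5), ("y", "z", 2)]
cost = {}
for a, b, c in edges:
    cost.setdefault(a, {})[b] = c
    cost.setdefault(b, {})[a] = c

D, rounds = distance_vector(cost)
print(D["u"], rounds)
```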


Network Routing

Example updates


Network Routing

“Good news travels fast”

“Bad news travels slowly”

              Link state                          Distance vector

Messages      O(|V| · |E|) messages sent          local exchange only

Convergence   O(|V|^2), may have oscillations     time varies, possibly loops, count-to-infinity problem

Robustness    may advertise incorrect link cost;  may advertise incorrect path cost;
              each node computes its own table    each node's table is used by others (errors propagate)


Search on Networks

• the aim is to find some source-destination path in a reasonable amount of time

• the path cost is not an issue unlike in routing

Surprising observations (from real-world networks)

1. short paths exist between pairs of nodes (six degrees of separation)

2. these short paths can be discovered (and used)


• both observations closely interrelated

• it is not so clear how to discover (or even create) these short paths

• typical situation is that nodes have only local rather than global information; flooding to discover the destination is known to be very inefficient

Decentralized search

• Kleinberg’s small world network model: n × n grid of nodes with local connections, plus every node v has a random long-range link to node w

Pr(v links to w) ∼ d(v, w)^(−α), α ≥ 0

and distance d(v, w) ≡ number of grid steps

• the value of α controls how random the long-range connections are


Search on Networks

Comparing search strategies

• efficiency of a search strategy is the expected delivery time (over random long-range contacts, i.e. topology, and random source-destination pairs)

• delivery time ∼ number of hops in the graph (unit-weight links)

Trading-off value of α

• α = 0: long-range links are uniformly distributed (∼ WS model), difficult to navigate having only local knowledge (and knowing the location of the destination)

• for α = 0, the actual chosen path to the destination is likely to be significantly longer than the corresponding shortest path

• α > 0 higher clustering, long-range links less random, more realistic scenario

• lower bounds on the expected delivery time T_D [Kleinberg 2000]

T_D ≥ const × n^((2−α)/3) for 0 ≤ α < 2

T_D ≥ const × (log n)^2 for α = 2

T_D ≥ const × n^(α−2) for 2 < α < 3

thus, for α = 2 the delivery time is polynomial in log n, while in the other cases it is polynomial in n
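
Decentralized (greedy) search on such a grid can be sketched in Python as follows. The sketch assumes one long-range link per node, a node that knows only its own contacts and the grid position of the target, and forwarding to the contact closest to the target; the function names `kleinberg_links` and `greedy_route` are assumptions made for illustration.

```python
import random


def dist(v, w):
    """Number of grid steps between nodes v and w."""
    return abs(v[0] - w[0]) + abs(v[1] - w[1])


def kleinberg_links(n, alpha, rng):
    """One long-range link per node, with Pr(v -> w) ~ d(v, w)^(-alpha)."""
    nodes = [(i, j) for i in range(n) for j in range(n)]
    links = {}
    for v in nodes:
        others = [w for w in nodes if w != v]
        weights = [dist(v, w) ** (-alpha) for w in others]
        links[v] = rng.choices(others, weights=weights, k=1)[0]
    return links


def greedy_route(n, links, source, target):
    """Forward to the contact closest to the target; return number of hops."""
    hops, v = 0, source
    while v != target:
        i, j = v
        contacts = [(i + di, j + dj)
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
                    if 0 <= i + di < n and 0 <= j + dj < n]
        contacts.append(links[v])           # the long-range contact
        v = min(contacts, key=lambda w: dist(w, target))
        hops += 1
    return hops


rng = random.Random(0)                      # fixed seed for repeatability
n = 10
for alpha in (0, 2, 4):
    links = kleinberg_links(n, alpha, rng)
    print(alpha, greedy_route(n, links, (0, 0), (n - 1, n - 1)))
```

Since some grid neighbor always lies one step closer to the target, the greedy search never needs more hops than the grid distance; the long-range links can only shorten the route.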


Search on Networks

Web search

• information retrieval since the 1960s using “textual analysis”

• more recently, information ranked by its score (e.g., #links to it)

Scoring a webpage

• #webpages pointing to it (unit-weight links)

• sum of the scores of neighboring webpages pointing to it


Search on Networks


Authorities

• nodes pointed to by highly ranked nodes

• they offer prominent, highly endorsed answers to queries

Hubs

• nodes that point to highly ranked nodes

Assessing authorities and hubs

• compute weights h(i) (for hubs) and b(i) (for authorities), where [A]_ij = 1 if node i points to node j

h(i) = Σ_j [A]_ij b(j)

b(i) = Σ_j [A]_ji h(j)

• the weights are computed iteratively as (in matrix form)

h_{t+1} = (A A^T) h_t

b_{t+1} = (A^T A) b_t

• main drawback: it requires global knowledge (of A), so it is query-dependent
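
The iteration for hub and authority weights can be sketched in plain Python. The adjacency matrix is a hypothetical example with [A]_ij = 1 if node i points to node j, and the normalization of the weights after every iteration is an added assumption (it keeps the numbers bounded and does not change the ranking).

```python
def hits(A, iterations=50):
    """Return (h, b): hub and authority weights for adjacency matrix A."""
    n = len(A)
    h = [1.0] * n
    b = [1.0] * n
    for _ in range(iterations):
        # authority weight: sum of hub weights of the nodes pointing to i
        b = [sum(A[j][i] * h[j] for j in range(n)) for i in range(n)]
        # hub weight: sum of authority weights of the nodes i points to
        h = [sum(A[i][j] * b[j] for j in range(n)) for i in range(n)]
        b_sum = sum(b) or 1.0
        h_sum = sum(h) or 1.0
        b = [x / b_sum for x in b]
        h = [x / h_sum for x in h]
    return h, b


# Hypothetical network: nodes 0 and 1 point to node 2, node 0 also to node 3.
A = [[0, 0, 1, 1],
     [0, 0, 1, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]
h, b = hits(A)
print([round(x, 3) for x in h])
print([round(x, 3) for x in b])
```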


Search on Networks

PageRank (named after Google co-founder Larry Page)

• ranking pages independently of queries

• main idea: page is important if it is linked by other important pages

• every page is assigned a weight

w(j) = Σ_i [A]_ij · w(i) / d_out(i)

w(i) weights of in-bound neighboring pages

d_out(i) out-degree of node i, used to dilute its importance if it links to many other nodes

• the weights w(i) are probabilities that from any starting page, the page i is reached via a random walk

• however, if some page does not have out-bound links, the random walker gets trapped; so with probability s choose random walk, and with probability (1 − s) jump randomly to any other node
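
A minimal PageRank sketch in Python is given below. The link structure is hypothetical, the value s = 0.85 is a commonly used choice rather than one given in these slides, and for simplicity the random jump goes to any node chosen uniformly (including the current one); a page without out-bound links spreads its weight uniformly.

```python
def pagerank(links, s=0.85, iterations=100):
    """links: dict page -> list of pages it links to. Returns dict of weights."""
    nodes = sorted(set(links) | {j for outs in links.values() for j in outs})
    n = len(nodes)
    w = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # with probability (1 - s) the walker jumps to a random page
        new = {v: (1.0 - s) / n for v in nodes}
        for i in nodes:
            outs = links.get(i, [])
            if outs:
                # with probability s follow one of the d_out(i) links
                share = s * w[i] / len(outs)
                for j in outs:
                    new[j] += share
            else:
                # no out-bound links: spread the weight uniformly
                for j in nodes:
                    new[j] += s * w[i] / n
        w = new
    return w


# Hypothetical web graph (assumption, for illustration only).
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
w = pagerank(links)
print({page: round(weight, 3) for page, weight in w.items()})
```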


Search on Networks


• many strategies may be devised, some are more efficient than others

• decentralized search is a practical requirement in large networks

• in social networks, weak (social) ties and hierarchy play significant role

• visiting the same nodes while searching is inefficient, yet there is a tendency to visit hubs often

To aid the search

• nodes as sources of information are scored (e.g. by level of trust)

• exploiting network structure of (distributed) information helps significantly

• challenge: real-time updates of contents

• ranking (i.e. scoring) algorithms are kept secret and changed (updated) continuously


Take-Home Messages


Network routing

• the aim is not only to find a source-destination path, but the one having the least cost

• it is implicitly assumed that each node has an address (identification)

• routing in the Internet evolved over time (i.e., it has not been designed from the beginning)

• it is still unclear why the Internet routing works so well at such large scales

• main issues with the Internet routing are robustness, security and congestion

Search on small world and scale free networks

• small world networks have short path lengths and a high clustering coefficient; however, the Watts-Strogatz (WS) model does not capture the navigability of real-world networks

• search is fast and scales well in scale-free networks


Networks: Software


Software Requirements for Graph Data


• input data in common format (e.g. Excel, CSV, . . . )

• convert (output) data into the desired format (GraphML, Pajek, . . . )

• Social Network Analysis (SNA) of data

• dynamic (temporal) analysis

• data visualization


• gentle learning curve (easy to grasp)

• flexibility (use different formats for input and output)

• scalability (Big Data, application dependent)

• speed (if Big Data or real-time)

• parallel and distributed computing capability (MapReduce)

• functionality as modules or add-ins

• . . .


Networks in Matlab



Networks with Python


Networks in C, R, Python


Networks Visualization and Analysis


Networks Community Analysis


Social Network Analysis


Popular in Bioinformatics


Networks Online Demos


Networks Data