small world graphs: characterization and alternative ...rama/papers/smallworld.pdf · small world...

Small world graphs:

characterization and alternative constructions

Rama CONT1 and Emily TANIMURA2

(1) IEOR Dept., Columbia University, New York.(2) CAMS – Ecole des Hautes Etudes en Sciences Sociales.

Advances in Applied Probability, Volume 40, no 4 (December 2008).

Abstract

Small world graphs are examples of random graphs which mimicempirically observed features of social networks. We propose an in-trinsic definition of small world graphs, based on a probabilistic for-mulation of scaling properties of the graph, which does not rely onany particular construction. Our definition is shown to encompass ex-isting models of small world graphs, proposed by Watts and studiedby Barbour & Reinert, which are based on random perturbations ofa regular lattice. We also propose alternative constructions of smallworld graphs which are not based on lattices and study their scalingproperties.

1

Contents

1 Introduction 31.1 Complex networks and small world graphs . . . . . . . . . . . 31.2 Outline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

2 Definitions and notations 52.1 Graphs and graph properties . . . . . . . . . . . . . . . . . . 52.2 Random graphs . . . . . . . . . . . . . . . . . . . . . . . . . . 7

3 Scaling behavior of graph properties 8

4 Small World graphs: an intrinsic definition 10

5 Lattice-based constructions of Small Worlds 12

6 A lattice-free Small World model 176.1 Scaling behavior of the average degree . . . . . . . . . . . . . 206.2 Clustering behavior . . . . . . . . . . . . . . . . . . . . . . . . 206.3 Scaling behavior of the diameter . . . . . . . . . . . . . . . . 21

7 A randomized community-based small world 317.1 Behavior of the diameter . . . . . . . . . . . . . . . . . . . . . 327.2 Behavior of the average degree . . . . . . . . . . . . . . . . . 327.3 Behavior of the clustering coefficient . . . . . . . . . . . . . . 33

2

1 Introduction

1.1 Complex networks and small world graphs

Empirical studies of a wide variety of social and biological networks [30,3, 31, 27, 29] have revealed that many of them share several interestingproperties:

• links among network nodes are globally sparse i.e. the network is farfrom saturating the total number of possible links

• high local clustering of links: the link structure displays a high lo-cal density, as measured for instance by the clustering coefficient (seebelow).

• although the network may contain a large number of nodes, a pairof nodes in the network is typically linked by a path whose length isorders of magnitude smaller than the network size and grows slowlywith the number of nodes.

These properties distinguish real networks from simple models such as regu-lar lattices or the Erdos Renyi random graph model [18], which possess oneor two of these properties but not all of them, and have inspired the devel-opment of a new class of random graph models, called small world networks[31, 32, 27] which have generated, in turn, a host of applications and newmathematical problems [14, 31, 3].

Intuitively, a small world graph is a random graph which possesses thethree properties above. The prototype of the small world graph, given byWatts [31] is a crossover between a regular lattice and an Erdos Renyi ran-dom graph. The properties of the lattice based model in [31] has beenextensively studied using Monte Carlo simulation and via a mean-field ap-proximation [28].

In the economics literature, Jackson [24] has studied graphs that resultfrom link formation in a game theoretic setting. The Nash equilibrium inthis network formation game is a deterministic graph which exhibits highclustering and short distances between nodes for a wide range of param-eters in the game. These properties are shown to hold at least in a highconnectivity range, where the average degree grows rapidly with networksize.

These small world graphs are complex graph models whose mathematicalstudy can be challenging. Barbour and Reinert [6] have proposed a rigorousanalysis of a slightly modified version of the original construction of Watts.

3

Aldous [1] proposed a growth model in which new nodes are added one at atime and form links to previous nodes according to given probabilistic rules:local weak convergence methods are used to show that parameters in thismodel can be chosen so that the resulting graph is sparse and exhibits highclustering. Bollobas et al. [11] propose a framework for studying a generalclass of large inhomogeneous random graphs. For a recent review on themathematical study of complex networks see [14].

While these studies focus on properties of specific graph models, a com-mon mathematical framework encompassing all these models and enablingto compare them seems to be lacking in the literature: such a frameworkcould emphasize the common features of different constructions and shedlight on the mechanisms leading to the emergence of the small world prop-erty in real networks. As noted by Watts [31], the small world propertyshould be stated in terms of scaling of the graph properties with the size ofthe graph. A reasonable definition of the small world property should applyto the lattice based construction [31] but could potentially include qualita-tively different constructions. In particular, applications in social sciencesare not naturally based on lattice models so lattice-free constructions aredesirable.

1.2 Outline

We propose here an intrinsic definition of small world graphs which doesnot rely on an underlying lattice nor on any particular construction. Ourdefinition is based on a probabilistic formulation of scaling properties ofgraph-theoretical quantities and encompasses existing models of small worldgraphs, proposed by Watts and studied by Barbour & Reinert [6], which arebased on random perturbations of a regular lattice.

Such lattice-based constructions are not natural for applications to socialnetworks, where regular lattices are nowhere to be found. In the secondpart of the paper, we propose two examples of small world graphs which arebased on a community structure to which random links are then added. Asin Watts’ model, the proposed models lead to large inhomogeneous randomgraphs, which match various empirical properties of social networks.

The article is structured as follows. Section 2 recalls some basic notionson (random) graphs and defines a mathematical setting suitable for ourpurpose. In section 3 we define the notion of scaling behavior in a waywhich is meaningful for random graphs.

Based on these definitions, we propose in Section 4 a mathematical def-inition of small world graphs which is intrinsic in the sense that it does not

4

rely on a particular construction. In Section 5 we show that this definitionapplies to Watts randomly perturbed lattice model, in the setting of [6].

In Sections 6 and 7 we propose two examples of random graph modelswhich satisfy the definition of a small world but whose construction is notbased on an underlying lattice.

2 Definitions and notations

Let us start by defining a mathematical framework, which will allow us tomake precise statements about networks, their statistical properties, and thescaling properties of various graph-theoretical quantities with the size of thenetwork. In particular our framework should allow for

• varying the number of nodes of the graph, in order to study large sizeasymptotics and scaling

• randomness in the structure of the network: many network models,including Erdos–Renyi graphs and the Small world construction ofWatts [31], are instances of random graphs. Allowing for randomnessis especially relevant in applications since properties of large networkscan only be described in statistical terms.

2.1 Graphs and graph properties

A graph with nodes labeled i = 1, ..., N is defined by the set of its links,which can be viewed as a subset Γ of 1, ..., N×1, ..., N. Since we have inmind models for social networks, we exclude links between a node and itself:(i, i) /∈ Γ. Such a graph is conveniently represented by a N × N adjacencymatrix M defined by

M(i, j) = 1 if (i, j) ∈ Γ (1)M(i, j) = 0 otherwise. (2)

The set of nodes of a graph Γ shall be denoted [Γ] or, by abuse of notationwhen the context is clear, by Γ. In the sequel we consider undirected graphsi.e. such that (i, j) ∈ Γ ⇐⇒ (j, i) ∈ Γ, although the definitions in Section3 also apply to directed graphs. The set GN of undirected graphs with Nnodes is in one-to-one correspondence with symmetric adjacency matrices:

M ∈ MN×N |M(i, j) = M(j, i) ∈ 0, 1, M(i, i) = 0. (3)

5

We shall denote by

G∞ = ∪N≥1

GN (4)

the set of all (undirected) graphs, endowed with the cylindrical Borel sigma-field B: a function Q : G∞ → R is B-measurable if for all N ≥ 1, itsrestriction QN : GN → R is measurable. Examples of interest are thedegree of a node, the average degree, the (average) clustering coefficient,the typical interpoint distance and the diameter of a graph.

The degree of a node i is defined as the number of nodes it is linked to:

deg(i) =N∑

n=1

1M(i,n)=1. (5)

The average degree of a graph Γ ∈ GN is defined as

deg(Γ) =1N

N∑i=1

deg(i). (6)

The neighborhood V (i) of a node i is defined as

V (i) = j ∈ [1, N ] |M(i, j) = 1 (7)

and the set V k(i) as

V k(i) = j ∈ [1, N ] |Mk(i, j) > 0. (8)

Note that V 1(i) = V (i).Empirical studies on social networks indicate that they are characterized

by high local density, as measured by the notion of clustering coefficient. Thelocal clustering coefficient for a node i is defined as

γi =card(j, k) ∈ Γ|j ∈ V (i) and k ∈ V (i)

card(V (i))(card(V (i)) − 1)/2. (9)

Expressed as a function of the adjacency matrix M ,

γi =

∑j =k|M(i,j)=M(i,k)=1 M(j, k)

(∑N

l=1 M(i, l))(∑N

l=1 M(i, l) − 1)/2). (10)

The average clustering coefficient γ is defined as

γ =1N

N∑i=1

γi.

6

Another quantity of interest is the ‘size’ of the graph as measured by thedistance between the nodes. The distance between two nodes i and j isdefined as the length of the shortest path linking them in the graph:

dΓ(i, j) = infk ≥ 1, Mk(i, j) > 0. (11)

When the context is clear we shall simply denote the distance by d(i, j).The average interpoint distance is given by

d(Γ) =1

N(N − 1)

∑i=j

dΓ(i, j). (12)

Another measure of distance in the graph is the diameter defined as

D(Γ) = maxi=j

dΓ(i, j). (13)

If the graph is not connected D(Γ) = ∞.

2.2 Random graphs

A random graph of size N is a random variable Γ defined on a probabilityspace (Ω,F ,P) with values in GN . The graph theoretical quantities definedabove (diameter, clustering coefficient, average interpoint distance,...) arethen measurable functions of Γ and define random variables on (Ω,F , P ).

Many common random graph models result in sample graphs which mayhave more than one connected component with nonzero probability. In thiscontext measures of graph size such as the diameter are not finite-valued.Also, computation of the diameter requires exhaustive knowledge of all thelinks in the graph so it is not an observable quantity for an observer who hasaccess to a ’representative sample’ of the whole network, such as a socialscientist conducting a survey. The typical interpoint distance, defined asthe distance of a pair of nodes chosen at random is a more flexible notion of’size’ in this case. The typical interpoint distance of a graph Γ is the randomvariable defined as

T (Γ) = dΓ(U), (14)

where U is uniformly drawn among the links [(i, j)|i = j, i, j ∈ N2] and inde-pendent from Γ. It is readily observed that the law of the typical interpointdistance only depends on the graph Γ itself.

Note that the typical interpoint distance is a random variable, even ifthe underlying graph is deterministic. In a graph that consists of several

7

large connected components or a single one with a few isolated nodes, thetypical interpoint distance provides more relevant information than othermeasures of distance :

Example 1. Consider a sequence Sn of graphs with diameters bounded by aconstant C and consider the graph ΓN obtained by adding an isolated pointto Sn. For any N ≥ 2, d(ΓN ) = ∞, while the typical interpoint distanceverifies

P(T (ΓN ) ≤ C) = 1 − 2N

so, as N → ∞, we have

∀ε > 0, lim supP(T (ΓN ) ≥ C + ε) → 0

i.e. T (ΓN ) may converge in probability to a finite valued random variablewhereas the average distance verifies P(lim sup d(ΓN ) = ∞) = 1.

3 Scaling behavior of graph properties

A property often discussed in the literature on social networks is the scal-ing of various graph-theoretical quantities with the graph size [3, 29, 31].Consider a sequence (ΓN )N≥1 of graphs with ΓN ∈ GN and Q, a graphtheoretical quantity, defined as a measurable function on Q : G∞ → R.One can define in the following way the concept of scaling behavior for thequantity Q(ΓN )N≥1.

Definition 1 (Upper scaling bound). Let (ΓN )N≥1 be a sequence ofgraphs with ΓN ∈ GN . A deterministic sequence (g(N))N≥1 is called anupper scaling bound for (Q(ΓN ))N≥1 if

lim supN→∞

Q(ΓN )g(N)

≤ 1.

The notion of lower scaling bound can be defined analogously:

Definition 2 (Lower scaling bound). Let (ΓN )N≥1 be a sequence ofgraphs with ΓN ∈ GN . A deterministic sequence (g(N))N≥1 is called alower scaling bound for (Q(ΓN ))N≥1 if

lim supN→∞

Q(ΓN )g(N)

≥ 1.

8

In some cases upper and lower scaling bounds may coincide up to amultiplicative constant. This is the case if there is an f(N) that belongs tothe upper scaling bounds, and a g(N) belonging to the lower scaling boundsand a C > 0 such that limN→∞

f(N)g(N) = C. In this case we will say that

“Q(ΓN ) scales like f(N)”.

Example 2 (Scaling behavior in nearest-neighbor lattices). Considerthe d-dimensional lattice −n, .., 0, ..nd with nearest neighbor links. Thecorresponding graph ΓN has N = (2n + 1)d nodes and it is readily observedthat, as n → ∞:

• the average degree converges to 2d.

• the diameter (which, in this case, is the length of the diagonal) scalesas cN1/d.

• the clustering coefficient is zero: γi = 0 for any node i.

In this case, these quantities give both the upper and lower scaling bounds,which coincide.

When (ΓN )N≥1 is a sequence of random graphs the above definitionsmust be interpreted probabilistically. There are several possibilities, de-pending on the mode of convergence considered.

Definition 3 (Scaling). Let (ΓN )N≥1 be a sequence of random graphs,defined on the probability space (Ω,F ,P). Consider a measurable functionQ : G∞ →]0,∞[. A deterministic sequence (g(N))N≥1 is said to be

1. an upper scaling bound in expectation of Q(ΓN ),if

lim supN→∞

E[Q(ΓN )]g(N)

≤ 1 (15)

2. an upper scaling bound in probability if

lim supN→∞

P(Q(ΓN )g(N)

≤ 1) = 1 (16)

3. an almost-sure upper scaling bound if

P(lim supN→∞

Q(ΓN )g(N)

≤ 1) = 1. (17)

9

The above possibilities are not equally restrictive. It is thus possible, forexample, that the smallest almost sure upper bound is orders of magnitudegreater than the upper bound in probability or in expectation.

Example 3 (Erdos–Renyi random graph). The simplest random graphmodel, studied by Erdos and Renyi [18] is one where, in a graph with Nnodes, each link is established independently from others, with probabilityp(N). A graph configuration G ∈ GN is thus drawn with probability

P(ΓN = G) = p(N)l(G)(1 − p(N))N(N−1)/2 −l(G) (18)

where l(G) is the total number of links in G. One usually considers thecase where p(N) → 0. Erdos-Renyi graphs exhibit the following scalingproperties [18, 7]:

• the expected degree of a given node scales as Np(N) when N → ∞.

• (from theorem 7.3 in [7])When the expected degree of a link Np(N)verifies

Np(N) = lnN + f(N) with limN→∞

f(N) = −∞,

then P(∃N0 ≥ 1,∀N ≥ N0, ΓN is disconnected) = 1. If limN→∞ f(N)= +∞,then lnN/(ln(lnN)) is an almost–sure upper scaling bound ofthe diameter. In an even higher connectivity range, p(N)N = N1/d,the constant d + 1 defines an almost sure upper-scaling bound.

• the expected average clustering coefficient is p(N). Since the degreeof any node has expectation p(N)N , the expected average clusteringcoefficient cannot have a lower scaling bound greater than 0 unless theexpected degree of a node tends to infinity as N → ∞, not a veryrealistic situation for social networks.

4 Small World graphs: an intrinsic definition

Having defined scaling, we will now attempt to cast in our framework thecharacterization of Small World graphs as formulated by Watts [31].

Watts starting point is a set of empirical observations on the structureof social networks: these are complex networks which are characterized bya number of common features:

• a large number of nodes,

10

• sparsity of links : the number of existing links is far from saturatingthe total number of possible links

• a high degree of clustering, as measured for instance by the clusteringcoefficient and

• length scales, such as typical interpoint distance, which are orders ofmagnitude smaller than the number of nodes.

Comparing these properties with those of two well-known classes of graphs–regular lattices and Erdos–Renyi random graphs— Watts noted that whileneither of these possessed all of the desired properties, the Erdos–Renyigraph can exhibit realistic distance scaling properties, while regular latticescan exhibit high local clustering as network size grows. This led to thefollowing conjecture:

Conjecture 1 (Watts [31]). There exists a class of graphs that are highlyclustered yet whose characteristic length and diameter scales similarly toErdos-Renyi random graphs. These graphs are called small world graphs.

As we have seen from previous examples, the scaling of quantities in thegraph depends on the total number of links. Comparing different graphsin terms of the scaling of quantities such as clustering or distance is onlymeaningful when the average degree is similar. It is thus necessary to imposea condition on the scaling of the average degree. By doing so we also excludetrivial examples: indeed, in the highest connectivity range, when the averagedegree scales as cN , even a lattice would have a diameter bounded by aconstant. Thus, the cases of interest are those where the average degree isorders of magnitude smaller than N .

When this is the case, obtaining small distances between nodes is nottrivial. As an example, a one dimensional k-th-nearest-neighbor lattice withan average degree as high as 2k = N1/d, would still have a large diameterscaling as N1−1/d. Thus, even in this high connectivity range, efficient or-ganization of links is required for the distances in the graph to be short.Bearing in mind the relevance to social networks, we will limit ourselves tolower connectivity ranges although the existence of clustered graphs thathave short diameter when the average degree scales as N1/d is mathemat-ically non trivial. Most models in the literature, including Watts’ model,assume the average degree to be bounded by a constant. We think it isreasonable to impose a slightly more lenient condition, allowing also for anaverage degree scaling as C ln(N). As we have seen, this is the case for

11

the connected Erdos-Renyi graph. In practice a degree scaling as C lnN ,although unbounded, remains small even for large N .

Based on this discussion, we propose the following definition, which al-lows to capture these features:

Definition 4 (Small worlds). A network model (ΓN )N≥1 is said to be aSmall World if the following three conditions are verified:

• There is a constant C1 ≥ 0 such that C1 lnN is an upper scaling boundof the average degree deg(ΓN ).

• There is a C2 ≥ 0 such that C2 ln(N) is an upper scaling bound forthe typical interpoint distance.

• The clustering coefficient is bounded away from zero.

Each of these properties may hold almost surely, in probability or in expec-tation.

Remark 1. A related class of complex networks which has received much at-tention is that of scale-free networks [4, 10] in which the degree distributionshave Pareto tails. The Barabasi-Albert model [4] which is the prototypicalexample of a scale-free graph, has been shown by Bollobas et al. [12] toexhibit small distances between nodes [8] but not to exhibit clustering: theclustering coefficient decreases to zero with the size of the graph [10]. Thus,scale-free graphs need not be small worlds in the sense of the definition 4.It is also obvious that Watts’ Small World graph is not a scale-free graph.

5 Lattice-based constructions of Small Worlds

Let us now recall the original Small World construction of Watts [31, 32]and show that it verifies Definition 4.

The Watts model starts from an initial graph L, chosen to be a regularlattice of low dimension. In this graph, each of the existing links is randomlyrewired with small probability, resulting in a random perturbation of theoriginal lattice. This model typically leads to a disconnected graph withnon-zero probability but this problem can be avoided by adding additionallinks to the initial graph instead of rewiring, as in Barbour & Reinert [6].

We now describe the variation of the Watts model proposed by Barbour& Reinert [6]. Barbour and Reinert construct a graph by superposing a ring

12

lattice L with N nodes and an Erdos–Renyi random graph E. In the ringlattice, each node is connected to the 2k neighbors within distance k in thelattice. In the Erdos–Renyi graph each link exists with probability p

N . Theresulting graph ΓN has links defined by (i, j) ∈ ΓN if and only if (i, j) ∈ Lor (i, j) ∈ E.

Proposition 1. In the perturbed lattice model described above, with 13 ≤

p < 1, and k ≥ 2 we have

• 2 lnN is an upper scaling bound of the typical interpoint distance.

• the average degree is almost surely bounded by 2k + 1 where 2k is thedegree of the lattice, and

• the average clustering coefficient is almost surely bounded below by2k−12k+1 γL, where γL is the average clustering coefficient in the lattice.

Let us recall some results shown in [6]: the method used in [6] approxi-mates the typical interpoint distance between two nodes i and j, by startingfrom each one independently a branching process with branching rate 2k,corresponding to nearest neighbors in the lattice, to which new nodes areadded at a random rate determined by links in the Erdos–Renyi graph. Thetypical interpoint distance is evaluated by estimating the probability thattwo independent branching processes of this type do not intersect beforea given time. While the actual process is complicated due to the possibleoverlap of intervals, it can be approximated using a pure growth processwhose behavior is easier to characterize.

This growth process must take into account two types of intervals withdifferent branching rates, those that have just been created and consist ofonly one point which can be an endpoint for a random link, and those ofmore than one point where 2k new deterministic neighbors at each time arepotential endpoints of random links. Denote by M1(n) the number of onepoint intervals at time n and by M2(n) the number of intervals containing

more than one point at time n. Then the process M(n) =(

M1(n)M2(n)

)is

defined recursively by

M1(n) = Bin((M1(n − 1) + 2kM2(n − 1))N,p

N) (19)

M2(n) = M1(n − 1) + M2(n − 1)M1(0) = 1, M2(0) = 0.

13

The expected evolution of the process verifies

E[M(n)|Fn−1] = AM(n − 1) (20)

where Fn−1 = σ(M(0), ..M(n − 1)) and M =(

p 2kp1 1

). The matrix

A has two eigenvalues λ and λ′, where |λ| > |λ′| and λ > 1. Thus theeigenvalue λ = 1

2 [p + 1 +√

(p + 1)2 + 4p(2k − 1)] dominates in the long runbehavior of the process. If we denote by f1 the standardized eigenvectorassociated with the eigenvalue λ, we have

E[(f1)T M(n)|Fn−1] = (f1)T AM(n − 1) = λ(f1)T M(n − 1). (21)

Thus (W (n))n≥0 where W (n) = λ−n(f1)T M(n) defines a martingale whoselimit Wk,p := limn→∞ W (n) appears in the characterization of the typicalinterpoint distance.

If the processes started from i and j respectively have not intersectedat time n, the distance between the nodes i and j is at least 2m. In [6],the time of intersection is expressed in relation to the typical time at whichthat the first intersection is likely to occur. This time, nd, is defined by therelation λnd ≤ √

Np < λnd+1, implying λnd = φ√

Np, for 1λ < φ ≤ 1.

The main result in [6] states

Proposition 2. [6, Thm 5.9, p. 1271] There is a random variable V veri-fying for any x ∈ Z

P(V ≥ 2nd + x) = E exp− λ2

λ − λ′ (λ − p)φ2λxWk,pWk,p (22)

where λ, nd, φ and Wk,p have been defined above and Wk,p is an independentcopy of Wk,p.

The total variation distance between the distributions of T (ΓN ), the typ-ical interpoint distance and the random variable V verifies

DTV (L(T (ΓN )), L(V )) = O(ln(Np)(Np)γ/(4−γ)) (23)

where γ verifies 0 < γ ≤ 12 for any fixed kp. Hence DTV (L(T (ΓN )), L(V )) →

0 if Np → ∞ and kp is fixed.

The righthand side of (22) can be used to obtain a detailed character-ization of the typical interpoint distance. Here we are interested in resultsthat bound its upper tail. For our purposes, it was convenient to denoteby f(p, k) any bounded strictly positive quantity (not always the same one)that does not depend on N , when recalling the following bound from [6]:

14

Proposition 3. [6, p. 1280] The random variable V verifies

P(λ − 1

2(V − 2nd) ≥ z) ≤ f(k, p)e−f(k,p)zlog(f(k, p) + f(k, p)ef(k,p)z),

forz ∈ λ − 12

Z

whenever pk < 1.

We derive from the above that limN→∞ P(V ≥ 2 lnN) = 0: We haveλ = 1

2 [p+1+√

(p + 1)2 + 4p(2k − 1)]. If we choose p such that 23 ≤ pk < 1,

we have λ > 74 , and thus 2nd ≤ 2 ln((Np)

12

lnλ ≤ ln(Np)

ln( 74)≤ α ln(N), where α < 2.

Define the sequence (xN )N≥N0 by xN = maxx∈λ−12

Z|x ≤ λ−12 (2 lnN −

2nd) so that limN→∞ XN = ∞. Then

limN→∞

P(V ≥ 2 lnN) ≤ limN→∞

P(λ − 1

2(V − 2nd) ≥ xN ) ≤ (24)

limN→∞

f(k, p)e−f(k,p)xN log(f(k, p) + f(k, p)ef(k,p)xN ) = 0.

Remark 2. Although (3) assumes that pk < 1, it is obvious that the upperbound we obtained for the typical interpoint distance would be valid also forpk > 1, since the probability of every link in ΓN is increasing in p and k.

By (23) we conclude that

limN→∞

P(T (ΓN ) ≥ 2 lnN) ≤ limN→∞

[P(V ≥ 2 lnN)+O(ln(Np)(Np)γ/(4−γ))] = 0.

Degree: Clearly, when the probability of random links is small, thedegrees are close to that in the lattice. Label the elements of 1, ..., N ×1, ..., N in an arbitrary way i = 1, ..., N(N − 1)/2 and define Yi = 1 if thelink i exists in the Erdos-Renyi subgraph E and 0 otherwise: the Yi are theni.i.d. Bernoulli variables and we can apply

Lemma 1 (Bernstein’s inequality). Let Sk =∑i=m

i=1 Xi where (Xi)i=mi=1 is

a sequence of i.i.d. variables such that E[Xi] = 0, E[X2i ] = σ2 and |Xi| ≤ L,

then

P(|Sm| ≥ tσ√

m) ≤ 2 exp(− t2

1 + α/3), (25)

where α = Ltσ√

m.

15

Define Xi = Yi − E[Yi] and note that σ = pN . With m = N(N − 1)/2

and summing over N we obtain for any ε > 0:

∑N

P(1N

i=N(N−1)/2∑i=1

Yi ≥ p + ε) < ∞. (26)

so by the Borel-Cantelli theorem

lim supN

1N

i=N(N−1)/2∑i=1

Yi < p + ε P − a.s.

The average degree of a node is given by

2k +1N

i=N(N−1)/2∑i=1

Yi

where 2k is the degree in the underlying lattice. Since p < 1, we can chooseε > 0 so as to ensure that almost surely the average degree is bounded by2k + 1.

Clustering:When the number of random links added is almost surely bounded, we

can infer a bound on the average clustering coefficient. We define ∆i =card(j, l) ∈ Γ|j ∈ VL(i) and l ∈ VL(i), where VL(i) refers to i’s neighborsin the lattice.(In a regular lattice, ∆i = ∆ is the same for all i.) The averageclustering coefficient in the lattice γL is then given by

γL =∆

2k(2k − 1)/2. (27)

If 2k is the degree of each node in the lattice and Xi the degree of i inthe Erdos–Renyi graph, then the total degree is 2k + Xi in the perturbedlattice, and the average clustering coefficient in the Small world graph γverifies

γ ≥ 1N

i=N∑i=1

∆(2k + Xi)(2k − 1 + Xi)/2

. (28)

By (26) the average clustering coefficient is almost surely bounded belowby the solution to

min∑N

i=1 Xi=N1N

i=N∑i=1

∆(2k + Xi)(2k − 1 + Xi)/2

.

16

The problem above has a unique solution X1 = X2 = ... = XN = 1.Indeed if we define f(Xi) = ∆

(2k+Xi)(2k−1+Xi)/2 , then for any S > 0 theunique solution to the problem

minX1+X2=S

f(X1) + f(X2)

is X1 = X2 = S2 . Suppose that for N > 2, (xi)i=N

i=1 is a solution to (29),then we cannot have xl = xm for some m and l, since terms in the sum weminimize could be made smaller by putting Xl = Xm = xl+xm

2 . Thus wehave

γ ≥ ∆(2k + 1)(2k − 1 + 1)/2

(29)

which implies γ ≥ ∆(2k+1)(2k−1+1)/2 = 2k−1

2k+1 γL.

6 A lattice-free Small World model

The construction discussed in the previous section is based on random per-turbations of a regular lattice. Although this construction satisfies the re-quired scaling properties of small worlds, it is not a plausible model for thegenesis of small world phenomena in the context of social or biological net-works, since regular lattices are not natural underlying structures in suchcontexts. We propose now a lattice-free small world model, verifying Defi-nition 4, which is more natural for such applications. The idea, which canbe seen as a variation on Granovetter’s notion of weak links between socialcommunities [20], is to start from an initial set of disjoint fully connectedgraphs, representing tightly-knit communities, and introduce random links

We partition the nodes 1, ..., N into M disjoint sets of nodes G1, ...,GM :these clusters can be seen as ’communities of origin’ of the nodes. For ease ofexposition we will consider the case where Gi contains δ nodes with N = Mδ,but this can be easily relaxed. We then associate to every node i = 1..N acluster Xi uniformly drawn among G1, ...,GM , and link i to all nodes in thecluster Xi. Xi can be viewed as the ’secondary community’ of the node i.Finally, we link all pairs of nodes i and j which share a common secondarycommunity i.e. Xi = Xj .

The construction can be summarized as follows: define for m = 1..Mthe set of nodes

Am = Gm

⋃i ∈ 1, .., N|Xi = Gm, (30)

17

ΓN can then be characterized by

(i, j) ∈ ΓN ⇐⇒ i = j and (i, j) ∈m=M⋃m=1

(Am × Am).

The resulting graph contains fully connected subgraphs Am × Am, linkedtogether in a random manner. Typically, each node belongs to exactly twoclusters. In modeling terms we may think of the graph as a social networkwhere each individual belongs to two communities, such as, for example, agroup of colleagues and a group of family members, thus connecting twootherwise disjoint groups. We shall see that these multiple community af-filiations that give rise to the Small World property. Communities are alsonatural in other contexts. In the context of the World Wide Web, theseclusters would correspond to web domains or closely interlinked online com-munities. The presence of community structures in social networks is welldocumented in empirical studies [15] and algorithms have been proposed fordetecting underlying community structures in networks [25, 21, 23] but suchstructures have rarely been integrated in network models.

We will now study the scaling properties of the model defined above.

Proposition 4. The above construction with δ ≥ 6 and d = 1, satisfiesDefinition (4). More precisely:

• 6 lnN is an almost-sure upper scaling bound for the diameter of thegraph.

• The expected average degree is bounded from above by 4δ − 1.

• The local clustering coefficient at each node is almost-surely boundedfrom below by 1

4 .

Remark 3. The choice δ ≥ 6 is for analytical convenience and the boundis not tight. It is likely that δ could be chosen smaller and/or that onecould obtain a smaller scaling bound for the chosen value of δ. However, thecluster size δ must be greater or equal to 3 to ensure almost surely that nocluster is isolated.

18

A_k A_l

node i

Figure 1: Local view: the node i is the only common member of the com-munities Ak and Al.

Figure 2: Full lines link members of the same initial cluster. Dashed linesrepresent links created after the realization of the random variables.

19

6.1 Scaling behavior of the average degree

Defining Sm := Cardi|Xi = Gm we have card(Am) ≤ δ +Sm and the totalnumber of links in the graph is then bounded by

M∑m=1

card(Am)(card(Am) − 1)2

≤M∑

j=1

(Sj + δ)(Sj + δ − 1)/2. (31)

Since the N variables (Xi)i=Ni=1 are independently and uniformly drawn among

G1, ...,GM , the variables (Sj)j=1..M follow a multinomial law

P(S1 = s1, ..SM = sM ) =

N !

s1!...sM !(1M )N if

∑i=Mi=1 si = N

0 otherwise

Thus E[Sj ] = NM = δ and V ar(Sj) = N

M (1 − 1M ) = δ(1 − 1

M ) so that

E[S2j ] = V ar(Sj) + (E[Sj ])2 = δ(1 − 1

M) + δ2.

We have

E[M∑

j=1

(Sj + δ)(Sj + δ − 1)/2] = (32)

12

M∑j=1

E[S2j ] + (2δ − 1)E[Sj ]) + (δ(δ − 1)) ≤ M(4δ2 − δ)/2.

The average degree in a graph of N nodes equals the total number of linksdivided by N/2. Thus (4δ − 1) is an upper scaling bound for the expectedaverage degree.

6.2 Clustering behavior

We need to prove that the scaling of the clustering coefficient is boundedbelow by a constant. We first prove a lemma on local clustering in graphsthat we then apply to our construction.

Lemma 2. Consider a graph Γ and the subgraph Γ(i) = Γ⋂

[V (i) × V (i)]formed by the neighbors V (i) of i and links between them. If Γ(i) can bepartitioned into K disjoint complete subgraphs then the local clustering coef-ficient γi verifies γi ≥ 1

K − 1h where h is the number of nodes of the smallest

subgraph.

20

Proof. Consider such a partition of Γ(i) and let C1, C2, ..Cl be the number ofnodes in each complete subgraph. Then there are at least

∑lj=1 Cj(Cj−1)/2

links between i’s neighbors. The maximal number of possible links betweeni’s neighbors is(∑l

j=1 Cj)(∑l

j=1 Cj − 1)/2 links. Thus we have

γi ≥∑l

j=1 Cj(Cj − 1)

(∑l

j=1 Cj)2=≥

∑lj=1 C2

j

(∑l

j=1 Cj)2−

∑lj=1 Cj∑lj=1 C2

j

. (33)

Using 2CiCj ≤ C2i + C2

j we obtain

(l∑

j=1

Cj)2 =l∑

j=1

C2j +

∑j =k

CjCk ≤l∑

j=1

C2j + (l − 1)

l∑j=1

C2j = l

l∑j=1

C2j .

Noting that ∑lj=1 Cj∑lj=1 C2

j

≤ maxj

1Cj

,

the result follows.

Now we apply the proposition to our model. Since all nodes in thesame Am, m = 1, ..M are connected any subset of Am is a complete sub-graph. Each i belongs to at most two such sets Aj and Ak. These setsmay have common elements but we can always obtain a disjoint unionV (i) = (Aj − i − [Gk])

⋃(Ak − i − [Gj ]). The sets in the union con-

tain at least card([Gj ]) − 1 and card([Gk]) − 1 elements respectively. Thush ≥ δ − 1 ≥ 5. Applying the proposition, we obtain as lower bound for theclustering coefficient 1

2 − 1δ−1 ≥ 1

4 , as stated in proposition 4.

6.3 Scaling behavior of the diameter

In this section we will show that

P(lim supN

D(GN )6 lnN + 5

≤ 1) = 0. (34)

Our proof consists in showing that for all K ≥ K0

P(D(GK)

6 ln K + 5≥ 1) ≤ 1

K1+α, α > 0. (35)

21

Using this bound, we have from the Borel-Cantelli theorem

P(lim supN

D(GN )6 lnN

≥ 1) = P(lim supN

D(GN )6 lnN + 5

≥ 1) (36)

≤ limN→∞

∑K≥N

P(D(Gk)

6 lnK + 5≥ 1) ≤ lim

N→∞

∑K≥N

1K1+α

= 0.

In order to show that

P(D(GN )

6 ln N + 5≥ 1) ≤ 1

N1+α(37)

for sufficiently large N , we will proceed in several steps. We start from anarbitrary subgraph G. First we will estimate the probability that there isa set S within distance lnN from G that contains at least M

2 clusters orequivalently N

2 nodes. Then we estimate the probability that all nodes inSC are close to S.

We construct the set S by considering the clusters that we can reach froma cluster G1 in a given number of steps. In what follows, we will assume thatthe number δ of nodes in each cluster verifies δ ≥ 6. We start from the setE0 = G1 and then define Ek and Fk recursively in the following way:

Fk = i ∈ [1, ..., N ]|∃G ∈ Ek, i ∈ G (38)

Ek = G /∈k−1⋃l=1

El|∃j ∈ Fk−1, Xj = G. (39)

We then denote

Fk = cardF k = cardi ∈ [1, ..., N ]|∃G ∈ Ek, i ∈ G. (40)

Thus Ek is the set of clusters that can be reached after exactly k stepsfrom G1, and Fk is the number of nodes that belongs to the clusters in Ek.Define the events

Ak = card(Ek) ≥ e card(Ek−1).

We note that Ak occurs if∑

j∈Fk−11Aj ≥ e card(Ek−1). Aj denotes the

event [Xj /∈ ⋃m<k−1 Em and Xj = Xa for a ∈ Fk−1 such that a ≤ j]. The

event Ak implies that not too many of the successors of the nodes in Fk−1

belong to the sets that have been found previously.

22

Define, for u ≥ 0, the stopping times

τu = minl ≥ 1, card(l⋃

j=0

Ej) ≥ u. (41)

For a given T , we want to estimate the probability of the event

P(T⋂

j=1

Aj) =T∏

j=1

P(Aj |m=j−1⋂m=1

Am). (42)

We will now establish the lower bound P(Aj |⋂m=j−1m=1 Am) ≥ 1 − c

M3 . Weobtain this bound in a different manner depending on whether j = 1, 2 ≤j ≤ τ√M and τ√M ≤ j < τM

2.

j = 1: We are on A1 if E1 contains at least 3 > e different elements thatare not in E0. The probability of drawing an element among the previoussets is bounded by C

M thus whenever at least 3 of the δ ≥ 6 elements in N0

have new successors A1 is verified. The probability that at least 3 nodeshave successors found previously has a probability smaller than C

M3 .Now we consider 2 ≤ j ≤ τ√M . In fact, the probability of the event Aj

only depends on Fj−1 and∑m=j−1

m=0 card(Em). While j ≤ τ√M , the latterset necessarily contains less than

√M elements. Thus the probability of

drawing a set that is found previously is smaller than 1√M

. Conditionallyon the events,

⋂m<j Am, we also know that Fj−1 ≥ 3j−1F0. Thus, for every

2 ≤ j ≤ τ√M we have 18 ≤ F1 ≤ Fj−1 ≤ √M . Since Fj = δ card(Ej) with

δ ≥ 6 we see that we are on Aj if at most 12 Fj−1 of the variables (Xl)l∈F j−1

are among the previously found sets. Since the probability of intersectingprevious sets is bounded by 1√

Mwe have:

P(Ak|k−1⋂j=1

Aj) ≥ 1 −m=Fk−1∑

m= 12Fk−1

(Fk1

m

)(

1√M

)m. (43)

The binomial coefficient(

Fk1

m

)= (Fk−1−m+1)..(Fk−1)

m! . The numerator con-

tains m terms each smaller than Fk−1. Since k − 1 ≤ τ√M , Fk−1 ≤ √M.

We distinguish between two cases. First case 2 ln M < Fk−1 ≤ √M.

23

Then we have

m=Fk−1∑m= 1

2Fk−1

(Fk1

m

)(

1√M

)m ≤m=Fk−1∑

m= 12Fk−1

(Fk−1)m

m!(

1√M

)m ≤ (44)

m=Fk−1∑m= 1

2Fk−1

(√

M)m

m!(

1√M

)m ≤m=Fk−1∑

m= 12Fk−1

1(ln M)!

≤

Fk−11

(lnM)!≤

√M

(ln M)!≤ 1

M3.

Second case: conditionally on A1, we have 18 ≤ F1 ≤ Fk−1 ≤ 2 lnM .We have the bound

m=Fk−1∑m= 1

2Fk−1

(Fk1

m

)(

1√M

)m ≤m=Fk−1∑

m= 12Fk−1

(Fk−1)m

m!(

1√M

)m ≤ (45)

m=Fk−1∑m= 1

2Fk−1

(2 lnM)m(1√M

)m ≤m=Fk−1∑

m= 12Fk−1

(2 ln M√

M)9 ≤

Fk−1(2 ln M√

M)9 ≤ (2 lnM)10

1M4

≤ 1M3

.

We used the fact that there is an M0 such that for M ≥ M0 the term(2 ln M√

M)m is decreasing in m.

Now we establish a lower bound for P(Aj |⋂m=j−1m=1 Am), when τ√M <

j ≤ τM2

. For such a j, conditionally on⋂m=j−1

m=1 Am there is a 0 ≤ β < 1

such that Fj ≥ Mβ .Indeed, conditionally on⋂m=j

m=1 Am we have Fj ≥ ejF0 ≥eτ√M F0. Since for every T ≥ 0,

∑l≤T card(El) ≤

∑l=Tl=0 δl ≤ δT+1, we have

τ√M = minT ≥ 1|T∑

j=0

card(Ej) ≥√

M ≥ minT ≥ 1|δT+1 ≥√

M. (46)

It follows that there must be a β > 0 such that τ√M ≥ ln(Mβ) which impliesFj ≥ eτ√M F0 ≥ MβF0 ≥ Mβ .

24

For τ√M < k ≤ τM/2, we obtain a lower bound for the probability of(Ak|⋂j<k Aj) using Bernstein’s inequality (1). Conditionally on

⋂k−1j=1 Aj ,

Ak occurs if at least e card(Ek−1) out of the Fk−1 = δ card(Ek−1)≥ 6 card(Ek−1) independent variables reach new successors. By the defi-nition of τM/2, for k ≤ τM/2 the probability that the uniformly distributedvariables (Xi)i∈Fk−1

reach one of the previously attained elements is smallerthan 1

2 , and thus the probability of finding a new successor is greater thanP(Yi = 1) where Yi follows a Ber(1

2). We fix an ε ≥ 0 such that 3 − ε ≥ e.If we let (Yi) denote such a sequence of i.i.d. Ber(1

2), we have

P(Ak|⋂j<k

Aj) ≥ 1 − P(|i=Fk−1∑

i=1

(Yi − 12)| ≥ εFk−1). (47)

We will bound the right hand side using again the Bernstein inequality (1).For a sequence of i.i.d. Ber(1

2) we apply the (1) to (Xi− 12)i=N

i=1 with σ =12 , L = 1 and t =

√ε√

Nσ which gives α =

√ε/σ ≤ 3 and t2

1+(α)/3 ≥ εN2σ2 ≥ εN.

We have

P(|i=Fk−1∑

i=1

(Xi − 12)| ≥ εFk−1) ≤ 2e−εFk−1 . (48)

As we saw before,conditionally on⋂

j≤k−1 P j , we have F k−1 ≥ Mβ , withβ > 0.

Thus by the Bernstein inequality, we obtain for τ√M ≤ k ≤ τM/2

P(Ak|⋂j<k

Aj) ≥ 1 − 2e−εMβ ≥ 1 − 1M3

, (49)

where the last inequality holds for M ≥ M0.

We denote L =: min[l|el ≥ M2 ] < lnM . If we are on Aj until L,

card(EL) ≥ eLcard(E0) ≥ M2 . Thus if the events Aj hold until τM

2, neces-

sarily τM2≤ L. We define the bounded stopping time T =: min(L, τM

2). We

note that we have

(T⋂

l=1

Al) ⊂ (τM2≤ L). (50)

25

Since we have shown that for 1 ≤ j ≤ τM2

, P(Aj |⋂m=j−1m=1 Am) ≥ 1− 1

M3

and since the stopping time T is bounded by lnM we have

P(τM2≤ L) ≥ P(

T⋂l=1

Al) =T∏

j=1

P(Aj |j−1⋂l=1

Al) ≤ (51)

(1 − 1M3

)T ≤ (1 − 1M3

)ln M .

To conclude we use the following limit results:

Lemma 3. Let f(N) be a function verifying limN→∞ f(N) = ∞ and g(N)a function verifying limN→∞ g(N) = ∞ and such that h(N) =: g(N)

f(N) verifieslimN→∞ h(N) = 0 then there is an N0 such that we have for all N ≥ N0,

(1 − 1f(N)

)g(N) ≥ 1 − g(N)f(N)

. (52)

Proof. The assumptions imply that we have

limN→∞

(1 − 1f(N)

)f(N) = e−1 (53)

and for any Φ(n) with limn→∞ Φ(n) = 0, we have

limN→∞

eΦ(N) − 1Φ(N)

= 1. (54)

Fix ε ≥ 0 There is an N0 such that for N ≥ N0, we have (1 − 1f(N))

f(N) ≥e−(1+ε). and eΦ(N)−1 ≥ (1− ε)Φ(N) ⇔ e(Φ(N) ≥ 1+(1− ε)Φ(N). Applying(54) with Φ = −(1 + ε)h(N) we obtain

(1 − 1f(N)

)g(N) ≥ e−(1+ε)h(N) ≥ 1 − (1 − ε)(1 + ε)h(N) ≥ 1 − g(N)f(N)

.

Applying the lemma to f(M) = M3 and g(M) = lnM we have

P(τM2≥ lnM) ≤ 1 − (1 − 1

M3)ln M ≤ ln M

M3. (55)

26

We define the set S =:⋃τ M

2j=0 Ej .

We note that when τM2≤ lnM , for all G,G′ ∈ S

dΓ(G,G′) ≤ dΓ(G, S) + dΓ(G′, S) ≤ 2 lnM. (56)

Thus

P(dΓ(G,G′) ≤ 2 lnM) ≥ P(τM2≤ lnM) ≥ 1 − lnM

M3. (57)

Also, card(S) ≥ M2 by definition of τM/2.

At this point, it is very easy to show that, almost surely, the typicalinterpoint distance scales logarithmically. With some additional work, weshow the same to be true for the diameter.

We use a construction that divides the clusters in SC into disjoint con-nected sets containing at most ln M elements and satisfying either property

(H1) : [I ⊂ SC |∃G ∈ I such that d(Gm, S) = 1 or lnM − 1 ≤ card(I)]

or property

(H2) : [I ⊂ SC |∃G ∈ Iand ∃I ′ satisfying (H1) such that d(G, I ′) = 1].

These properties will later be used to show that the constructed sets arelikely to be close to S.

We construct disjoint sets (Ij)kmaxj=1 in the following way. We define I0 =

∅. For k ≥ 1 we then take an arbitrary element G ∈ SC −⋃l≤k I l and define

G =: Ek0 . Then we define the following sets recursively

F kl = i ∈ [1, ..., N ]|∃G ∈ Ek

l , i ∈ G (58)

and then denote by F kl the card(F k

l )/2 nodes of smallest index in F kl

and define

Ekl = G /∈

l−1⋃j=1

Ekj |∃a ∈ F k

l−1, Xa = G. (59)

This is essentially similar to what was done previously, except that nowwe only use the successors of half of the nodes that we find at each step.For the process (Ek

j )j ,we define the following stopping times:

27

τkln M = inft|

t∑j=0

card(Ekj ≥ lnM) (60)

τkS = inft|Ek

t

⋂S = ∅) (61)

τkH1

= inft|Ekt

⋂Is = ∅, s ≤ t, IsverifiesH1 (62)

τkH2

= mint|Et

⋂Is = ∅, s ≤ t, IsverifiesH2 (63)

τk = minτkln M , τk

S , τkH1

, τkH2

. (64)

We will show that τk is almost surely bounded. We construct (Ekj ) and

(F kj ) for j ≤ τk. Then we define Ik in the following way: if τk = τk

S , we

define Ik =:⋃τk

j=0 Ekj ; if τk = τk

ln M , we define Ik =:⋃τk

j=0 Ekj − A, where

A ⊂ Eτln M is chosen so that lnM − 1 ≤ card(⋃τk

j=0 Ekj −A) ≤ lnM . The Ik

defined this way verifies property H1.If τk = τk

H1, we also define Ik =:

⋃τk

j=0 Ekj and Ik verifies H2. If τk = τk

H2,

there is an s < k such that Is⋂

Eτk = ∅. In this case, we consider the set

Is⋃

(⋃

j≤τk

Ekj ). (65)

If it has less than lnM elements, we call it Ik and it satisfies property H2.If it has more than lnM elements we split it into a set satisfying propertyH1 and one or several sets satisfying H2 in the following way:

The set Is⋃

(⋃

j≤τk Ekj ) is connected by construction. Take an arbitrary

cluster G in the set. Consider the nodes at a maximal distance d from G andremove them one by one as long as more than lnM nodes remain. Thendo the same at distance d − 1 and so on. Nodes are removed one by oneuntil the remaining connected set contains no more than lnM clusters andthus verifies H1. We call this set Ik

1 . We partition the removed nodes intoconnected components (Ik

l )l=2..L. (Some components may consist of oneelement). Since the division into connected components is a partition ofthe set of removed nodes, each element is in exactly one component. Eachof these connected components contains an element linked to Ik

1 and thusverifies H2. Indeed, a removed node at distance d from G is linked to a nodeat distance d− 1 from G.Either this node is in Ik

1 or it is linked to a node atdistance d− 2 from G. Since there is a C such that all elements at distance≤ C from G belong to Ik

1 , each removed node belongs to a chain, and thus to

28

a connected component including an element at distance 1 from Ik1 . When

Ik is constructed, we repeat the procedure for k + 1. We continue this untila kmax such that SC − ⋃

l≤kmaxIl = ∅. Necessarily kmax ≤ card(SC) ≤ M .

We have constructed disjoint sets (Ik)k=kmaxk=1 such that every element in SC

belongs to a exactly one set Ik.

Now we will show that τk is almost surely bounded by lnM for k =1, 2..kmax. In this section, we say that the event Ak occurs if card(Ek) ≥ 1.This guarantees that at least one new cluster is reached at each step. Wedefine the bounded stopping times T k =: min(lnM, τk). Then (

⋂T k

j=1 Qj) ⊂(τk ≤ lnM)). The event Aj occurs when at least one new cluster is foundat step j. Before T k, fewer than lnM groups have already been reached.Since we use half of the successors of the nodes in a cluster, that is δ

2 ≥ 3successors, P (Aj |⋂j−1

l=1 Al) ≥ (1 − ( ln MM )3).We have

P(τk ≤ ln M) ≥ P(T k⋂l=1

Al) =T k∏j=1

P(Aj |j−1⋂l=1

Al) ≥ (66)

(1 − (ln M

M)3)T k ≥ (1 − (

(ln M)M

)3)ln M .

Moreover, the constructions at steps k = 1, 2..kmax are independent.They involve different nodes since constructions stop when intersecting pre-vious constructions. Thus we have

P(k=kmax⋂

k=1

τk ≤ lnM) =k=kmax∏

k=1

P(τk ≤ lnM) ≥ (1 − (lnM

M)3)M ln M . (67)

We can then apply (3) to obtain

P(k=kmax⋂

k=1

τk ≤ lnM)

≥ (1 − (ln M

M)3)M ln M > 1 − M lnM ∗ (lnM)3

M3> 1 − 1

M1+ε(68)

for ε > 0. Now we will give a lower bound of the probability that d(Ij , S) =1 when Ij is a set verifying H1. If Ij verifies H1 it either contains anelement linked to S, or contains lnM subgraphs. In the second case, since

29

we only used half of the successors of the nodes in Ij to construct the set,there are still δ

2 lnM ≥ 3 lnM nodes whose successors are unexplored. Wecondition on the event τM/2 ≤ lnM, so that the set S contains at leastM/2 subgraphs and on ⋂k=kmax

k=1 τk ≤ lnM, which guarantees that eachset (Ik)k=kmax

k=1 verifies H1 orH2. The probability that a successor is in S isat least 1

2 . Since there are at most kmax ≤ M sets verifying H1, and theseare disjoint the conditional probability that they are all at distance 1 fromS is bounded by:

P(⋂

j|IjverifiesH1[d(Ij , S) = 1]|

k=kmax⋂k=1

τk ≤ lnM, τM/2 ≤ lnM) = (69)

∏j|IjverifiesH1

P(d(Ij , S) = 1|k=kmax⋂

k=1

τk ≤ lnM, τM/2 ≤ lnM) ≥

(1 − (12)3 ln M )M ≥ (1 − 1

M2+α)M ≥ 1 − 1

M1+α

where α > 0. The last inequality by applying (3). If all sets verifying H1

contain an element at distance 1 from S, then no element in a set verifyingH1 is further than lnM + 1 from S, since sets verifying H1 are connectedand contain at most lnM elements. It follows that all sets verifying H2 mustbe at distance at most 2 lnM + 2 from S, since they contain an element atdistance 1 from a set verifying H1, are connected and have no more thanlnM elements. Thus the previous estimate implies

P(maxdΓ(x, S), x ∈ SC ≤ lnM + 2|k=kmax⋂

k=1

τk ≤ lnM, τM/2 ≤ lnM)

≥ 1 − 1M1+α

. (70)

To conclude, we consider any two clusters G and G′. When maxdΓ(x, S), x ∈SC ≤ ln M + 2, there are G ∈ S and G ∈ S such that d(G, G) ≤ 2 lnM + 2and d(G′, G) ≤ 2 lnM + 2. Also we saw previously that if G and G are inS,and τM/2 ≤ lnM , then d(G, G) ≤ 2 lnM . Thus the probability that

d(G,G′) ≤ d(G, G) + d(G′, C) + d(G, G) ≤ 6 lnM + 4 (71)

30

is bounded by the product of the estimations (55),(68) and (70):

P(dΓ(G,G′) ≤ 6 lnM + 4 ) ≥

P(maxdΓ(x, S), x ∈ SC ≤ ln M + 2|kmax⋂k=1

τk ≤ lnM, τM/2 ≤ lnM)

P(kmax⋂k=1

τk ≤ lnM) P (τM/2 ≤ lnM) ≥ (1 − 1M1+ε

)(1 − 1M1+ε

)(1 − lnM

M3)

≥ 1 − C

N1+ε′ .(72)

Now, for any two nodes x, y, if x ∈ G and y ∈ G′, we have dΓ(x, y) ≤d(G,G′) + 1 since all the clusters are complete subgraphs. Thus we haveshown that

P(D(GN )

6 ln(N) + 5≥ 1) ≤ 1

N1+ε′ . (73)

By the arguments in the beginning of this section, it follows that

P(lim supN

D(GN )6 ln(N)

≥ 1) = 0 (74)

or equivalently that 6 ln(N) is almost surely an upper scaling bound ofD(GN ).

7 A randomized community-based small world

In this section we propose another random graph model with small worldproperties, which can be seen as a randomized version of the precedingconstruction: we start from a collection of complete graphs representingcommunities but allow each node to link to a random number of secondarycommunities. Allowing the number of communities to be random is bothnatural from a modeling perspective and simplifies the study of scaling prop-erties, as detailed below.

We will show that this construction verifies Definition 4, with a logarith-mic scaling of the expected average degree, whereas the expected averagedegree was bounded independently of N in the previous construction.

We consider a set of N nodes partitioned into M “communities” (clus-ters) G1, ...,GM of size ∼ lnN . To simplify the exposition we will assume

ln N − 1 ≤ card(Gm) ≤ lnN + 1 for m = 1, .M.

31

The graph ΓN is constructed as follows. First, we link all nodes in each ofthe clusters G1, ...,GM . We then introduce links among communities by rep-resenting these clusters as nodes of an auxiliary Erdos–Renyi (hyper)graphE with M nodes and link probability p = r ln(M)

M , with r > 1. If two clustersGm and Gn are linked in E, we randomly pick a node in either Gm or Gn andlink it to all points in the other cluster. Finally, we link all nodes i, j whichbelong to a common cluster. The graph ΓN thus obtained differs from theone described in Section 6 in that the number of clusters to which a node isconnected is now random.

Proposition 5. The above construction is a Small World in the sense ofDefinition 4:

• 2 lnN is an almost-sure upper scaling bound for the diameter,

• 5 lnN is an upper scaling bound of the expected average degree

• the expected clustering coefficient is bounded below by 13 .

7.1 Behavior of the diameter

Since p = r ln(M)M , with r = 1 + δ, Theorem 7.3 from [7, Theorem 7.3] guar-

antees that lnM/(ln(lnM)) is an almost–sure upper scaling bound of thediameter of E. Since the nodes in (Gm)m=1,...,M are completely connected, itfollows that 2 lnM/(ln(lnM))+1 is an upper scaling bound for the diameterof ΓN .

7.2 Behavior of the average degree

To determine the expected average degree, we bound the expected numberof links in the graph. Recall that Gm = (i, j) ∈ Gm × Gm, i = j is acomplete subgraph of ΓN . Define Cm as the maximal complete subgraphcontaining Gm. From the above construction we observe that

ΓN = ∪Mm=1Cm

The number of links in ΓN is therefore bounded by

M∑m=1

card([Cm])(card[Cm] − 1)2

.

The set [Cm] contains the elements in the cluster Gm (at most ln N + 1)plus a number of nodes that is bounded by the degree of m in the auxiliary

32

Erdos–Renyi graph, degE(m). Thus the expected number of total links isbounded by

E[m=M∑m=1

(degE(m) + lnN + 1)(degE(m) + lnN)/2]

=M∑

m=1

(E[(degE(m))2] + (2 lnN + 1)E[degE(m)] + (lnN + 1) lnN)/2.

Since the degree of every node m in the Erdos–Renyi graph follows a Bino-mial law Bin(r ln M

M , M−1), we obtain E[degE(m)] ≤ r lnM and E[(degE(m))2] =V ar[degE(m)] + (E[degE(m)])2 ≤ r lnM + (r lnM)2. Thus, the expectedtotal number of links in the system is less than

M(r lnM + (r lnM)2 + (2 lnN + 1)r lnM + (ln N + 1) ln N)/2≤ M [(r2 + 2r + 1)(lnN)2 + (2r + 1) ln N ]/2.

Dividing by N/2 ≥ (M lnN)/2, the expected average degree is bounded by(2r + r2 + 1) lnN + 2r + 1 ≤ 5 lnN .

7.3 Behavior of the clustering coefficient

We will now show that the total number of links in the auxiliary Erdos–Renyi graph converges in probability to a bounded random variable. Wewill use this fact together with proposition (2) to determine a lower boundfor the clustering coefficient.

We apply theorem (1)to the possible links in the Erdos Renyi graph.There are K = M(M − 1)/2 such links. Since each link exists with proba-bility r ln M

M , we can apply the theorem to (Xi − r ln MM )i=K

i=1 where (Xi)i=Ki=1 is

an i.i.d. sequence of Ber( r ln MM ). We have r ln M

2M ≥ σ2 = r ln MM − ( r ln M

M )2 ≤r ln M

M and t =√

εσ which gives α = Lt

σ√

K= 2

√ε

σ2√

K≤ 4M

√ε

r ln M√

M(M−1)/2≤ 1

and 1 + α/3 ≤ 2 for M sufficiently large and ε small. Since t2

1+α/3 ≥ ε2σ2 ≥

εM2r ln M ≥ √

M we obtain

P(|i=K∑i=1

(Xi − r lnM

M)| ≥

√εK) ≤ 2e

−( t2

1+α/3) ≤ 2e−

√M . (75)

This estimates the probability that the total number of links in the Erdos-Renyi graph is smaller than M(M−1)/2 r ln M

M +√

ε√

M(M − 1)/2 ≤ (M−1)r ln M2 +

33

√εM√2

≤ N for ε chosen sufficiently small and r close to 1. When this is thecase, P(lim sup

∑k,l 1(k,l)∈E ≤ N) = 1. Now, we would like to divide the

neighbors of each node i into a disjoint, fully connected sets, in order toapply proposition (2). Denote Si := cardm|i ∈ [Cm]. Each subgraph Cm

is fully connected. Thus we have V (i) =⋃

m|i∈[Cm][Cm] − i. This isa union of Si fully connected sets. By removing elements that belong tomore than one set in the union from all but one of the [Cm], we obtain apartition (Ak)k=1..Si of V (i) such that Gk − i ⊂ Ak ⊂ [Ck] − i. Thuscard(Ak) ≥ card(Gk − i) ≥ ln N − 2 and by proposition 2, i’s cluster-ing coefficient verifies γi ≥ 1

Si− 1

ln N−2 . We have the following bound on∑i=Ni=1 Si:

i=N∑i=1

Si =i=Si∑i=1

m=M∑m=1

1i∈Am ≤

m=M∑m=1

i=N∑i=1

1i∈Gm +∑l,k

1(k,l)∈E ≤ N +∑l,k

1(k,l)∈E.

Then

P(i=N∑i=1

Si > 2N) ≤ P(∑l,k

1(k,l)∈E > N) ≤ 2e−√

M . (76)

We consider the solution to the problem

min(∑N

i=1 Si=2N1N

i=N∑i=1

1Si

.

The same type of argument as the one used on (29) shows that the uniquesolution is S1 = S2 = .. = SN = 2. Thus, with a probability bounded in(76), the average clustering coefficient is bounded below by 1

2 − 1ln N−2 ≥ 1

3 .Consequentially by Borel Cantelli

P(lim supN

1N

i=N∑i=1

γi ≤ 13) ≥

P(lim supN

1N

i=N∑i=1

γi ≤ 13|i=N∑i=1

Si ≤ 2N)P(lim supi=N∑i=1

Si ≤ 2N) = 1.

34

References

[1] Aldous, D. (2004). A tractable complex network model based on the stochastic mean-field model of distance, Complex Networks, Lecture Notes in Physics, 650 ,Springer.

[2] Amaral, L.A.N., Scala, A., Barthelemy, M., Stanley, H.E,(2000). Classes of SmallWorld networks, Proceedings of the National Academy of Science, vol. 97 (21),11149-11152

[3] Barabasi A., Newman, M.E.J. and Watts, D. (2005). The structure and dynamics ofnetworks, Princeton University Press.

[4] Barabasi A.L., Albert, R. (1999). Emergence of scaling in random networks, Science286,509–512.

[5] Barbour, A.D and Reinert G. (2001). Small Worlds, Random structures and algo-rithms, 19, 54–74.

[6] Barbour, A.D and Reinert G. (2006). Discrete small world networks, ElectronicJournal of Probability, 11, 1234–1283.

[7] Bollobas, B. (2001). Random Graphs, New York: Academic Press.

[8] Bollobas B. and Riordan O. (2004). The diameter of a scale-free random graph,Combinatorica, 24 (1).

[9] Bonato, A. (2004). In: Proceedings of Combinatorial and Algorithmic Aspects ofNetworking, 159–172.

[10] Bollobs,B., Janson, S., Riordan, O. (2002) Mathematical Results on Scale-free Ran-dom Graphs, in: Bornholdt, S. and Schuster, H. (2002). Handbook of Graphs andNetworks-from the Genome to the Internet, Berlin: Wiley-VCH.

[11] Bollobs B. , Janson, S., Riordan, O. (2007). The phase transition in inhomogeneousrandom graphs. Random Structures and Algorithms 31(1), 3-122

[12] Bollobs B., Riordan, O., Spencer, J., Tusndy, G. (2001). The degree sequence of ascale-free random graph process, Random Structures and Algorithms, 18(3), 279–290

[13] Cont, R. and Loewe, M. (2001). Social distance and social interactions: a discretechoice model, Journal of Mathematical Economics, forthcoming.

[14] Chung, F. and Lu, L. (2006). Complex Graphs and Networks, AMS.

[15] Degenne, A., Fors, M. (1999). Introducing Social Networks, Sage Publications, Lon-don

[16] Dorogovtsev S.N. and Mendes J.F.F. (2003). Evolution of Networks: from BiologicalNets to the Internet and WWW, Oxford University Press.

[17] Durrett, R. (2006). Random graph dynamics, Cambridge University Press.

35

[18] Erdos P. and Renyi, A. (1960). On the evolution of random graphs. Magyar Tud.Akad. Mat. Kutato Int. Kozl. 5, 17–61.

[19] Klemm, K. and Eguiluz, V. (2002). Growing scale-free networks with small-worldbehavior, Physical Review E, 65, 057102.

[20] Granovetter, M. (1973). The strength of weak ties, American Journal of Sociology,78, 1287-1303

[21] Girvan, M. and M. E. J. Newman, (1999). Community structure in social and bio-logical networks, Proceedings of the National Academy of Science, USA.

[22] Fernholz, D. and Ramachandran, V. (2007). The diameter of sparse random graphs.Random Structures and Algorithms 31(4), 482-516.

[23] Hofman, J. and Wiggins, C. (2008). A Bayesian Approach to Network Modularity,Physical Review Letters, 100, 258701 (2008).

[24] Jackson, M. and Rogers, B. (2005). The Economics of Small Worlds, forthcoming:Journal of the European Economic Association, Papers and Proceedings

[25] Johnson, S. (1967). Hierarchical Clustering Schemes, Psychometrika, 2, 241-254

[26] Newman, M., Barabasi,A.,Watts, D. (2006). The structure and Dynamics of Net-works Princeton University Press.

[27] Newman, M. (2003)The structure and function of complex networks, SIAM Review,45, 167–256

[28] Newman, M.E.J., Moore C., Watts D. (2000). Mean-field solution of Small worldnetworks, Physical Review Letters, 84, 3201–3204

[29] Newman, M.E.J., Watts D. and Strogatz, S.H. (2002). Random graph models ofsocial networks, Proc. Natl. Acad. Sci. USA, 99, 2566–2572.

[30] Travers, J. and Milgram, S. (1969). An experimental study of the small world prob-lem, Sociometry, 32, 425–443.

[31] Watts, D. J. (1999). Small Worlds, Princeton University Press.

[32] Watts, D. J. and Strogatz, S.H. (1998). Collective dynamics of small world networks,Nature, 393, 440–442.

36

small world graphs: characterization and alternative ...rama/papers/smallworld.pdf · small world...

Documents