
Social Networks 24 (2002) 385–394

Using centrality modeling in network surveys

Ove Frank∗

Department of Statistics, Stockholm University, S-10691 Stockholm, Sweden

Abstract

In a well-known paper [Social Networks 1 (1979) 215] Linton Freeman clarified the importance of the centrality concept in network analysis. There are a variety of centrality measures available, and they are mainly used as descriptive statistics in various network studies. For instance, actor centrality measured by vertex degree captures those aspects of centrality that have an impact on contacts given or received by the actor. The approach taken here is to consider actor centrality as a latent property that manifests itself in generating a particular network structure, and in order to measure centrality we are bound to rely on observable features of this network. By borrowing ideas from recent link-tracing survey methodology, we illustrate how probabilistic network models with centrality parameters can be used to improve on estimators and predictors of various actor attributes related to centrality. © 2002 Elsevier Science B.V. All rights reserved.

Keywords: Network centrality; Random graphs; Bayesian modeling; Actor centrality predictors; Network centrality estimators; Centrality co-variates

1. Introduction

The concept of centrality in networks is used to reflect different actors’ varying importance for the structural properties of the network. Freeman (1979) gives a clarifying discussion of actor centrality and network centrality. Wasserman and Faust (1994) give a presentation of various centrality measures and further references to the literature on centrality. Centrality measures are generally conceived as descriptive statistics of specific structural properties of actors or networks. Betweenness centrality of an actor measures the proportion of indirect contacts between other actors that pass through this actor. Such a measure might be an important explanatory variable in studies of actor attributes like actor influence or actor control. Degree centrality measures the actor’s number of direct contacts to others and might be of concern in studies of popularity and activity of actors. Closeness centrality and information centrality are measures based on distances and paths in the network, and they focus on such structural properties of the network that might be related to availability, safety, and security. The common idea is that centrality tries to capture some structural property that can explain other actor attributes or performance properties of the network.

∗ Tel.: +46-8-162986; fax: +46-8-167511. E-mail address: [email protected] (O. Frank).

0378-8733/02/$ – see front matter © 2002 Elsevier Science B.V. All rights reserved. PII: S0378-8733(02)00014-X

Guided by this idea, centrality is here considered to be a latent attribute of actors, that is an unobservable actor variable, which is supposed to have an impact on the network structure. Using a Bayesian approach the network is modeled as a random graph with the probability distributions of observable features governed by the latent centralities of the actors. Thus, one basic statistical problem is to predict these latent actor centralities by using appropriate observable statistics. Other relevant statistical problems involve estimation and testing of the model and various properties of it related to centrality.

In Section 2, the Bayesian approach to centrality is described in more detail using a simple illustrative centrality model. Survey methodology in relation to networks is discussed in Section 3. There are many ways of utilizing network information in surveys, and various examples related to centrality are discussed. Section 4 takes up the choice of sampling units, Section 5 the choice of sample selection probabilities, and Section 6 the use of snowball sampling.

2. Modeling actor centrality

A network with $N$ actors is given by a graph on $N$ vertices together with various attributes of the vertices and edges. The vertices are labeled by integers, and the vertex set is denoted by $V = \{1, \ldots, N\}$. The graph is specified by its adjacency matrix $x$ with elements $x_{uv}$ that are 1 or 0 according to whether or not vertex $u$ has an edge to vertex $v$. The row and column sums in the matrix $x$ correspond to the out- and in-degrees of the vertices. For undirected graphs the adjacency matrix is symmetric, and the row and column sums are both equal to the degrees of the vertices. The discussion in the sequel is restricted to directed graphs with no loops at any vertex or with loops at every vertex. The relative out-degree of vertex $u$ is denoted by $x_u = \sum_{v \in V} x_{uv}/(N - 1)$ provided $x_{uu} = 0$ for $u \in V$.

A random graph model has a random adjacency matrix $x$ with elements that are Bernoulli variables. Thus, there are $2^{N(N-1)}$ possible outcomes of $x$, and their probabilities are supposed to be governed by a latent vector $z$ of elements $z_u$ specifying properties of $u \in V$ that have an influence on centrality. To be more specific, consider a model with $x_{uv} = a_{uv} b_{uv}$ where $a_{uv}$ for fixed $u$ are independent identically distributed (iid) Bernoulli variables for all $v$, and $b_{uv}$ for fixed $v$ are iid Bernoulli variables for all $u$. It follows that for fixed $u$, the pairs $(a_{uv}, b_{vu})$ are iid for all $v$, and they are independent for different $u$. The assumptions imply that the pairs $(a_{uv}, b_{vu})$ for different $v$ represent out- and in-effects at vertex $u$. Let $z_u$ specify the joint probability distribution of $(a_{uv}, b_{vu})$, say $z_u = (E a_{uv}, E b_{vu}, E a_{uv} b_{vu})$. This model has $3N$ degrees of freedom like the general dyad independence model. With a Bayesian approach, the joint probability distributions given by $z_u$ for different vertices $u \in V$ can be considered as independently sampled from a Dirichlet distribution of dimension 3 (with four parameters). Conditional on $z$ the dyads $(x_{uv}, x_{vu})$ are independent and their probabilities are functions of $z$. For instance, a mutual dyad has probability

$$E x_{uv} x_{vu} = E a_{uv} b_{vu} \, E a_{vu} b_{uv}$$


and a null dyad has probability

$$E (1 - x_{uv})(1 - x_{vu}) = 1 - E x_{uv} - E x_{vu} + E x_{uv} x_{vu},$$

where $E x_{uv} = E a_{uv} \, E b_{uv}$. Unconditionally the dependencies between the dyads are governed by the Dirichlet parameters. It is interesting to note that this Bayesian approach to a conditional dyad independence model offers a conjugate Dirichlet prior. For the Holland–Leinhardt model and other dyad independence models with log-linear interaction parameters, it is not easy to see what a natural prior might be. The possibility to model dyad dependence via priors on $z$ can be demonstrated with a simplified version of the present model. Let $b_{uv} = 1$ so that $x_{uv} = a_{uv}$ and $z_u = E a_{uv}$. The Dirichlet distribution reduces to a beta distribution, say $z_u$ is beta$(\alpha, \beta)$-distributed. By choosing $\alpha$ and $\beta$ equal or unequal and by choosing them less than or larger than 1, very different network structures can be achieved. Thus, even if only out-effects are controlled by this conditional model, the unconditional model can still express very varied structures with respect to transitivity, centrality, and other properties. In fact, it should be clear that if the beta distribution is chosen so that most actors have very low or very high centralities, this implies quite a different structure from what would be obtained if most actors had a centrality very close to the average. For this reason it should be of some interest to analyze centrality for this simplified model, and consider the following as an illustration of the potential of the more general Bayesian approach to network modeling.

The simple illustrative example is obtained by assuming that $z_u$ is beta$(\alpha, \beta)$-distributed, and conditional on $z_u$ the edge indicator $x_{uv}$ is Bernoulli$(z_u)$ for $v$ different from $u$. The beta distribution provides actor centralities which determine the occurrence of edges and thereby have an impact on the structure of the graph. Taking $x_{vv} = 0$, the out-degree $(N - 1)x_u$ is conditionally bin$(N - 1, z_u)$-distributed. According to well-known properties of the beta distribution and the binomial distribution, the relative out-degree has expected value and variance given by

$$E x_u = \mu_x = \mu_z \quad\text{and}\quad \operatorname{Var} x_u = \sigma_x^2 = \frac{\mu_z(1 - \mu_z) + (N - 2)\sigma_z^2}{N - 1},$$

where

$$\mu_z = \frac{\alpha}{\alpha + \beta} \quad\text{and}\quad \sigma_z^2 = \frac{\alpha\beta}{(\alpha + \beta)^2(\alpha + \beta + 1)}.$$

Denoting the mean and variance of the $N$ relative out-degrees by

$$m_x = \frac{1}{N} \sum_{u \in V} x_u \quad\text{and}\quad s_x^2 = \frac{1}{N} \sum_{u \in V} (x_u - m_x)^2,$$

it follows that unbiased estimators of $\mu_z$ and $\sigma_z^2$ are given by

$$m_x \quad\text{and}\quad \frac{[N(N - 1) - 1]\, s_x^2 - (N - 1)\, m_x(1 - m_x)}{(N - 1)(N - 2)}.$$

Thus for large values of $N$, the mean and variance of the relative out-degrees are asymptotically unbiased estimators of the mean and variance of the latent centrality.
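As a rough illustration of these moment estimators, the following Python sketch simulates relative out-degrees under the simplified beta model and then evaluates the two estimators. The function names, the parameter values ($N = 200$, $\alpha = 2$, $\beta = 5$) and the random seed are assumptions chosen for the example and do not come from the original analysis.

```python
import random


def relative_out_degrees(n_actors, alpha, beta, rng):
    """Simulate relative out-degrees x_u: z_u ~ beta(alpha, beta), edges ~ Bernoulli(z_u)."""
    degrees = []
    for _ in range(n_actors):
        z = rng.betavariate(alpha, beta)  # latent centrality of this actor
        out_edges = sum(rng.random() < z for _ in range(n_actors - 1))
        degrees.append(out_edges / (n_actors - 1))
    return degrees


def estimate_latent_moments(x):
    """Return the estimates of mu_z and sigma_z^2 based on relative out-degrees x."""
    n = len(x)  # requires n > 2
    m = sum(x) / n
    s2 = sum((xu - m) ** 2 for xu in x) / n  # divisor N, as in the definition of s_x^2
    var_z = ((n * (n - 1) - 1) * s2 - (n - 1) * m * (1 - m)) / ((n - 1) * (n - 2))
    return m, var_z


rng = random.Random(0)  # hypothetical seed, for reproducibility only
x = relative_out_degrees(200, 2.0, 5.0, rng)
mu_hat, var_hat = estimate_latent_moments(x)
print(mu_hat, var_hat)
```

Under these assumed parameters the targets are $\mu_z = 2/7$ and $\sigma_z^2 = 10/392$, so the printed values can be compared with them informally.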


So far, the assumption of a beta distribution has not been essential. This assumption is convenient in order to specify the conditional distribution of $z_u$ given $x_u$, which now is a beta$[\alpha + (N - 1)x_u, \beta + (N - 1)(1 - x_u)]$-distribution. Therefore, the actor centrality $z_u$ can be predicted by an estimate of the conditional expected value

$$E(z_u \mid x_u) = \frac{\alpha + (N - 1)x_u}{\alpha + \beta + N - 1}.$$

Solving the equations for $\mu_z$ and $\sigma_z^2$ in terms of $\alpha$ and $\beta$ leads to

$$E(z_u \mid x_u) = w x_u + (1 - w)\mu_x,$$

where

$$w = \frac{\sigma_z^2}{\sigma_x^2} = \frac{N - 1}{\alpha + \beta + N - 1}.$$

It follows that the centrality predictor for actor $u$ is given by a weighted mean of the relative out-degree $x_u$ of actor $u$ and the mean of the relative out-degrees of all the actors, $m_x$, using the weight

$$1 + \frac{(N - 1)\,[s_x^2 - m_x(1 - m_x)]}{N(N - 2)\, s_x^2}.$$

This weight converges to 1 for increasing $N$, which means that the relative out-degree is the dominating term of the actor centrality predictor for large $N$. For large $N$ the results are in agreement with the common practice of measuring actor centrality by relative out-degree. For moderate and small values of $N$, the Bayesian approach suggests appropriate adjustments based on the variability of actor centralities.
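The weighted-mean predictor can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the clamping of the estimated weight to the interval [0, 1] is an added assumption, since the text does not say how a negative estimate should be handled.

```python
def predict_centralities(x):
    """Shrink each relative out-degree x_u toward the mean m_x with the estimated weight."""
    n = len(x)  # requires n > 2
    m = sum(x) / n
    s2 = sum((xu - m) ** 2 for xu in x) / n
    if s2 == 0:
        # All out-degrees are equal: the weight is undefined and every prediction is m_x.
        return 0.0, [m] * n
    w = 1 + (n - 1) * (s2 - m * (1 - m)) / (n * (n - 2) * s2)
    w = min(1.0, max(0.0, w))  # assumption: clamp to [0, 1]; not discussed in the text
    return w, [w * xu + (1 - w) * m for xu in x]


weight, predicted = predict_centralities([0.0, 1.0, 0.5, 0.5])
print(weight, predicted)
```

With only four actors the weight is well below 1, so extreme out-degrees are pulled noticeably toward the mean; for larger networks the same function returns a weight closer to 1.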

3. Network surveys: inference on centrality co-variates

Consider the finite population $V$ of $N$ actors and an actor attribute with values $y_v$ for $v \in V$. The Horvitz–Thompson estimator of the mean $m_y = (1/N) \sum_{v \in V} y_v$ based on observations $y_v$ for $v \in S$, where $S$ is a random sample from $V$, is given by (see, for instance, Särndal et al., 1992)

$$\sum_{v \in S} \frac{y_v}{N \pi_v} \quad\text{or}\quad \frac{\sum_{v \in S} y_v/\pi_v}{\sum_{v \in S} 1/\pi_v}$$

depending on whether or not $N$ is known. Here $\pi_v$ is the probability that $v$ is selected for the sample.
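A minimal Python sketch of the two forms of the estimator is given below. The function name and the toy numbers are assumptions for illustration; the inclusion probabilities are taken as given.

```python
def horvitz_thompson_mean(sample_y, sample_pi, population_size=None):
    """Estimate the population mean from sampled values and their inclusion probabilities.

    If population_size (N) is known, divide the weighted total by N;
    otherwise divide by the estimated population size, the sum of 1/pi_v.
    """
    weighted_total = sum(y / p for y, p in zip(sample_y, sample_pi))
    if population_size is not None:
        return weighted_total / population_size
    return weighted_total / sum(1 / p for p in sample_pi)


known_n = horvitz_thompson_mean([2.0, 4.0], [0.5, 0.25], population_size=10)
unknown_n = horvitz_thompson_mean([2.0, 4.0], [0.5, 0.25])
print(known_n, unknown_n)
```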

If the values $y_v$ are modeled as the outcomes of independent identically distributed random variables depending on centrality, that is as co-variates of centrality, then the information about centrality conveyed by the observed graph might be useful for choosing the sample selection probabilities. In this case

$$\pi_v = P(v \in S \mid x, y)$$

is the conditional selection probability where $x$ is the adjacency matrix of the graph and $y$ is the vector of the $N$ co-variate values $y_v$ for $v \in V$. Conditional on the outcomes of $x$ and $y$, the Horvitz–Thompson estimator $\sum_{v \in S} y_v/(N \pi_v)$ has the expected value $m_y$ and the variance

$$\operatorname{Var}\Big[\sum_{v \in S} \frac{y_v}{N \pi_v} \,\Big|\, x, y\Big] = \sum_{u \in V} \sum_{v \in V} \frac{y_u y_v (\pi_{uv} - \pi_u \pi_v)}{\pi_u \pi_v N^2},$$

where

$$\pi_{uv} = P(u \in S, v \in S \mid x, y).$$

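Unconditionally, the estimator has the expected value $E m_y = E y_v = \mu_y$ and the variance

$$\operatorname{Var} \sum_{v \in S} \frac{y_v}{N \pi_v} = \frac{1}{N}\sigma_y^2 + \sum_{u \in V} \sum_{v \in V} E\Big[\frac{y_u y_v (\pi_{uv} - \pi_u \pi_v)}{\pi_u \pi_v N^2}\Big].$$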

This is the variance of the conditional expected value plus the expected value of the conditional variance. If the sample $S$ is a Bernoulli sample, that is, conditional on $x$ and $y$, the population units are independently selected to be included or not in the sample, then $\pi_{uv} = \pi_u \pi_v$ for $u$ and $v$ different and $\pi_{vv} = \pi_v$. This implies that the conditional variance of the estimator is equal to

$$\operatorname{Var}\Big[\sum_{v \in S} \frac{y_v}{N \pi_v} \,\Big|\, x, y\Big] = \sum_{v \in V} \frac{y_v^2 (1 - \pi_v)}{\pi_v N^2},$$

and it follows that this variance is minimized subject to a fixed conditional expected sample size $\sum_{v \in V} \pi_v = n$ if $\pi_v$ is chosen to be proportional to $y_v$, that is, an optimal sample selection design is obtained by choosing $\pi_v = n y_v/(N m_y)$. This sample selection design cannot be implemented, however, since it would require all the co-variate values to be known (and there would then be no need for estimating their mean value). In practice we could look for a sample selection design based on observable centrality co-variates. The next three sections present examples of this idea.
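The effect of choosing selection probabilities proportional to the co-variate can be checked numerically. The sketch below evaluates the conditional variance formula for Bernoulli sampling under two designs with the same expected sample size; the function name and the four toy values of $y_v$ (all positive) are assumptions for illustration.

```python
def bernoulli_ht_variance(y, pi):
    """Conditional variance of the Horvitz-Thompson mean under Bernoulli sampling."""
    n_pop = len(y)
    return sum(yv ** 2 * (1 - pv) / pv for yv, pv in zip(y, pi)) / n_pop ** 2


y = [1.0, 2.0, 3.0, 4.0]  # toy co-variate values
n = 2                     # expected sample size

equal_pi = [n / len(y)] * len(y)                  # constant selection probability n/N
proportional_pi = [n * yv / sum(y) for yv in y]   # pi_v = n * y_v / (N * m_y)

var_equal = bernoulli_ht_variance(y, equal_pi)
var_proportional = bernoulli_ht_variance(y, proportional_pi)
print(var_equal, var_proportional)
```

Both designs have selection probabilities summing to 2, and the proportional design gives the smaller variance, in line with the optimality statement. The proportional design is only usable here because the toy values are known in advance.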

4. Vertex sampling versus edge sampling

A Bernoulli sampling of vertices with a common selection probability $p$ yields that the Horvitz–Thompson estimator of $m_y$ is equal to $\sum_{v \in S} y_v/(Np)$, which has a conditional variance $\sum_{v \in V} y_v^2 (1 - p)/(N^2 p)$.

Consider now the following alternative sampling design. Edges are sampled according to Bernoulli$(p)$-selection from the graph on $V$ given by the adjacency matrix $x$. Let $S$ be defined as the set of vertices that have at least one sampled out-edge. Here $x_{vv} = 1$ for $v \in V$ should be assumed so that every vertex has a chance of being included in $S$. Let the relative out-degree $x_u$ of vertex $u$ be defined as the proportion of other vertices that have an edge from $u$ so that

$$\sum_{v \in V} x_{uv} = 1 + (N - 1)x_u.$$


The graph on $V$ consisting of the sampled edges is a Bernoulli$(p)$-selected subgraph of the graph given by adjacency matrix $x$. Unless $x$ represents a complete graph, the sample graph is not an ordinary Bernoulli$(p)$ graph on $V$. Although Bernoulli$(p)$ graphs are common in the graph theory literature, Bernoulli$(p)$-selected subgraphs of non-complete graphs are not very well known. Frank (1999) discusses such Bernoulli$(p)$-selected subgraphs in the context of measuring social capital. The Bernoulli$(p)$ edge sampling design implies that the sample inclusion probability of $v$ is equal to

$$\pi_v = 1 - (1 - p)^{1 + (N - 1)x_v}$$

and the Horvitz–Thompson estimator of $m_y$ is given by

$$\sum_{v \in S} \frac{y_v}{N\,[1 - (1 - p)^{1 + (N - 1)x_v}]}.$$

For small $p$, the Horvitz–Thompson estimator is approximately given by the ratio estimator

$$\sum_{v \in S} \frac{y_v}{N\,[1 + (N - 1)x_v]\,p}.$$

The conditional expected vertex sample size is close to $n$ if $p$ is chosen according to

$$p = \frac{n}{N\,[1 + (N - 1)m_x]},$$

and for such $p$, the variance of the ratio estimator is smaller than the variance of the estimator based on a Bernoulli$(p)$ vertex sample with $p = n/N$ if and only if

$$\Big[m_x + \frac{1}{N - 1}\Big] \sum_{v \in V} \frac{y_v^2}{x_v + 1/(N - 1)} < \sum_{v \in V} y_v^2.$$

For large $N$ this condition expresses that there should be a positive covariance between the two sequences $x_v$ and $y_v^2/x_v$ for $v \in V$. Thus, for co-variates with this property, edge sampling is preferred to vertex sampling.
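The inclusion probabilities under edge sampling and the comparison condition can be evaluated directly from the out-degrees. The following Python sketch does this for a toy population of four vertices; the function names and the data are assumptions, and out-degrees are counted without the loop, which is added back inside the functions.

```python
def edge_sampling_inclusion(out_degrees, p):
    """pi_v = 1 - (1 - p)^(1 + out-degree of v); the extra 1 is the loop x_vv = 1."""
    return [1 - (1 - p) ** (1 + d) for d in out_degrees]


def edge_sampling_preferred(y, out_degrees):
    """Check the if-and-only-if condition under which the ratio estimator based on
    edge sampling has smaller variance than the vertex-sampling estimator."""
    n_pop = len(y)
    rel = [d / (n_pop - 1) for d in out_degrees]  # relative out-degrees x_v
    m_x = sum(rel) / n_pop
    shift = 1 / (n_pop - 1)
    lhs = (m_x + shift) * sum(yv ** 2 / (xv + shift) for yv, xv in zip(y, rel))
    rhs = sum(yv ** 2 for yv in y)
    return lhs < rhs


degrees = [0, 1, 2, 3]
print(edge_sampling_inclusion(degrees, 0.5))
print(edge_sampling_preferred([1.0, 2.0, 3.0, 4.0], degrees))  # y grows with out-degree
print(edge_sampling_preferred([4.0, 3.0, 2.0, 1.0], degrees))  # y shrinks with out-degree
```

In this toy case edge sampling is preferred when the co-variate grows with out-degree and not when it shrinks, which matches the covariance interpretation of the condition.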

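So far the arguments have been conditional on the specific outcomes of $x$ and $y$. Unconditionally, $\sum_{v \in S} y_v/(N \pi_v)$ is an unbiased estimator of $\mu_y$ with variance given by

$$\operatorname{Var} \sum_{v \in S} \frac{y_v}{N \pi_v} = \frac{\sigma_y^2}{N} + \sum_{v \in V} E\Big[\frac{y_v^2 (1 - \pi_v)}{\pi_v N^2}\Big],$$

which is equal to

$$\frac{\sigma_y^2 + \mu_y^2}{n} - \frac{\mu_y^2}{N} \quad\text{if}\quad \pi_v = \frac{n}{N},$$

and equal to

$$[1 + (N - 1)\mu_x] \sum_{v \in V} E\Big[\frac{y_v^2}{nN\,(1 + (N - 1)x_v)}\Big] - \frac{\mu_y^2}{N} \quad\text{if}\quad \pi_v = \frac{n\,[1 + (N - 1)x_v]}{N\,[1 + (N - 1)\mu_x]}.$$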


Hence, the last design yields a smaller variance if and only if

$$E\Big[\frac{y_v^2}{1 + (N - 1)x_v}\Big] < \frac{E y_v^2}{1 + (N - 1)\mu_x},$$

that is if and only if

$$\operatorname{Cov}\Big[1 + (N - 1)x_v,\ \frac{y_v^2}{1 + (N - 1)x_v}\Big] > 0.$$

To be more specific, consider a co-variate $y_v$ which is a duration time of some activity of actor $v$ which is expected to be proportional to centrality. Assume that $y_v$ conditional on $z_v$ is exponentially distributed with intensity $\lambda/z_v$. Moreover, conditional on $z_v$, $(N - 1)x_v$ is bin$(N - 1, z_v)$, $x_v$ and $y_v$ are conditionally independent, and $z_v$ is beta$(\alpha, \beta)$-distributed. It follows by straightforward calculations that

$$E\Big[\frac{y_v^2}{1 + (N - 1)x_v}\Big] = E\Big[\frac{(2 z_v^2/\lambda^2)\,[1 - (1 - z_v)^{N}]}{N z_v}\Big] < \frac{2\mu_z}{\lambda^2 N},$$

which is less than

$$\frac{E y_v^2}{1 + (N - 1)\mu_x} = \frac{2 E z_v^2}{\lambda^2\,[1 + (N - 1)\mu_z]}$$

if $\alpha + \beta < N - 1$. Consequently, in this case edge sampling is generally preferred for large populations.
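The last comparison can be spelled out as a short worked step. It uses only $\mu_x = \mu_z$, the identity $E z_v^2 = \sigma_z^2 + \mu_z^2$, and the beta moments from Section 2, which give $\sigma_z^2 = \mu_z(1 - \mu_z)/(\alpha + \beta + 1)$:

$$
\begin{aligned}
\frac{2\mu_z}{\lambda^2 N} < \frac{2(\sigma_z^2 + \mu_z^2)}{\lambda^2\,[1 + (N - 1)\mu_z]}
&\iff \mu_z + (N - 1)\mu_z^2 < N\sigma_z^2 + N\mu_z^2 \\
&\iff \mu_z(1 - \mu_z) < N\sigma_z^2 = \frac{N\mu_z(1 - \mu_z)}{\alpha + \beta + 1} \\
&\iff \alpha + \beta + 1 < N.
\end{aligned}
$$

The last line is the stated condition $\alpha + \beta < N - 1$.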

5. Sample selection probabilities depending on centrality

If we want to estimate the mean of a co-variate of centrality, then it should be of interest to know whether a sample selection with $\pi_v$ depending on centrality is preferred to a sample selection disregarding centrality. A simple illustration of this problem is obtained by comparing Bernoulli sampling with $\pi_v = p$ and Bernoulli sampling with $\pi_v = \exp(-c x_v)$ where $c$ is a positive constant chosen so that the two sampling designs have about the same sample size. Both designs have a random sample size, and we could require that their expected sample sizes should be equal. Since the expected sample size is equal to $\sum_{v \in V} \pi_v$, which satisfies

$$\sum_{v \in V} \pi_v \ge \frac{N^2}{\sum_{v \in V} (1/\pi_v)},$$

and the arithmetic and harmonic means of the selection probabilities are alternative measures of the average sample fraction, we choose $c$ so that

$$\sum_{v \in V} \frac{1}{\pi_v}$$

is constant, which turns out to be convenient for the following.


For Bernoulli sampling, the Horvitz–Thompson estimator of $m_y$ has a variance

$$\sum_{v \in V} \frac{y_v^2 (1 - \pi_v)}{\pi_v N^2}.$$

It follows that the estimator with selection probabilities based on centrality has a smaller variance than the estimator with a constant selection probability if and only if

$$\sum_{v \in V} \frac{y_v^2}{\pi_v} < \sum_{v \in V} \frac{y_v^2}{p}.$$

By substituting

$$p = \frac{N}{\sum_{v \in V} (1/\pi_v)},$$

the condition is equivalent to

$$\sum_{v \in V} \frac{y_v^2}{\pi_v} < \Big(\sum_{v \in V} \frac{y_v^2}{N}\Big) \sum_{v \in V} \frac{1}{\pi_v},$$

which means a negative covariance between the two sequences $y_v^2$ and $1/\pi_v$ for $v \in V$. Here $1/\pi_v = \exp(c x_v)$, and the condition is consistent with the idea that co-variates that are negatively correlated with centrality should preferably be sampled at non-central actors.
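The comparison of the two Bernoulli designs can be tried out numerically. In the Python sketch below, the constant probability $p$ is matched to the harmonic mean of the centrality-based probabilities, as in the text; the function name, the value $c = 1$ and the three toy actors are assumptions for illustration.

```python
import math


def bernoulli_variances(y, x, c):
    """Return (variance with pi_v = exp(-c * x_v), variance with constant p),
    where p is chosen so that both designs have the same sum of 1/pi_v."""
    n_pop = len(y)
    pi = [math.exp(-c * xv) for xv in x]
    p = n_pop / sum(1 / pv for pv in pi)  # harmonic-mean matching
    var_centrality = sum(yv ** 2 * (1 - pv) / pv for yv, pv in zip(y, pi)) / n_pop ** 2
    var_constant = sum(yv ** 2 * (1 - p) / p for yv in y) / n_pop ** 2
    return var_centrality, var_constant


x = [0.1, 0.5, 0.9]  # relative out-degrees of three toy actors
negative = bernoulli_variances([3.0, 2.0, 1.0], x, c=1.0)  # y decreases with centrality
positive = bernoulli_variances([1.0, 2.0, 3.0], x, c=1.0)  # y increases with centrality
print(negative, positive)
```

In the first case the centrality-based design has the smaller variance and in the second case the larger, which agrees with the sign of the covariance between $y_v^2$ and $1/\pi_v$.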

The investigator usually controls the selection of the sample $S$ from $V$ so that the selection probabilities $\pi_v$ and $\pi_{uv}$ depend on a known specified sampling design. For instance, $S$ can be a Bernoulli sample with $\pi_v$ depending on the average out-degree of $v$ as in the previous discussion. Sometimes a process that is not controlled by the investigator performs the selection. For instance, $S$ can be a Bernoulli sample with $\pi_v$ governed by some action of $v$ or by some latent attribute of $v$ like the latent centrality $z_v$. An example is the sampling of drug addicts that voluntarily show up at some treatment center. For such cases, the description of the selection procedure could be part of the modeling, and $\pi_v$ should reflect what mechanisms are likely to affect the probability of selecting $v$ for observation.

If $m_y$ should be estimated by using observations $y_v$ for $v \in S$ where $S$ is a Bernoulli sample obtained with selection probabilities $\pi_v = \exp(-c z_v)$ depending on the latent centrality $z_v$ of $v$, we need to estimate $\pi_v$ before we can use the Horvitz–Thompson estimator. For instance, a predictor based on $x$ could replace $z_v$. Perhaps there are better estimation procedures than the ones provided by the Horvitz–Thompson approach. Further research is needed for proper handling of sample selection procedures that are not controlled by the investigator.

6. Snowball sampling

Snowball sampling is a way of successively expanding a sample of vertices by adding adjacent vertices. Goodman (1961) introduced this kind of sampling, and Frank (1979), Frank and Snijders (1994), and Thompson and Frank (2000) have developed survey theory for snowball sampling.


Consider an initial sample $S_0$ of actors selected by a Bernoulli$(p)$-design. The actors in $S_0$ together with those not in $S_0$ that have an edge from at least one of the actors in $S_0$ are the members of a so-called one-wave snowball sample $S$. Conditional on the adjacency matrix $x$ the probability that the snowball sample $S$ includes actor $v$ is given by

$$\pi_v = 1 - \prod_{u \in V} (1 - p x_{uv}) = 1 - \prod_{u \in V} (1 - p)^{x_{uv}} = 1 - (1 - p)^{x_v},$$

where for convenience $x_{vv} = 1$ for $v \in V$ and $x_v$ is the in-degree of $v$. This selection probability is approximately equal to $p x_v$ for small $p$. Unconditionally the inclusion probability is equal to the expected value

$$E \pi_v = 1 - (1 - p)(1 - p\mu_z)^{N - 1}.$$

By choosing $p = n_0/N$ and letting $N$ tend to infinity, it follows that

$$E \pi_v \to 1 - \exp(-n_0 \mu_z).$$

Thus, an expected initial sample size of $n_0$ actors leads to an expected snowball sample fraction of $1 - \exp(-n_0 \mu_z)$ for large populations.
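The inclusion probabilities of a one-wave snowball sample can be computed from the in-degrees, and the limiting sample fraction can be compared with the finite-population expression. The Python sketch below does both; the function names, the three-vertex adjacency matrix and the numerical values are assumptions for illustration.

```python
import math


def snowball_inclusion(adj, p):
    """pi_v = 1 - (1 - p)^(in-degree of v), where the in-degree counts the loop x_vv = 1."""
    n = len(adj)
    probs = []
    for v in range(n):
        in_degree = 1 + sum(adj[u][v] for u in range(n) if u != v)
        probs.append(1 - (1 - p) ** in_degree)
    return probs


def expected_snowball_fraction(n_pop, n0, mu_z):
    """Return (finite-N expected inclusion probability, large-N limit) for p = n0 / N."""
    p = n0 / n_pop
    exact = 1 - (1 - p) * (1 - p * mu_z) ** (n_pop - 1)
    limit = 1 - math.exp(-n0 * mu_z)
    return exact, limit


adj = [[0, 1, 1],
       [0, 0, 1],
       [0, 0, 0]]  # adj[u][v] = 1 if u has an edge to v; loops are added inside the function
print(snowball_inclusion(adj, 0.5))
print(expected_snowball_fraction(10 ** 6, 10, 0.2))
```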

The probability that two different actors $u$ and $v$ are included in the snowball $S$ is equal to

$$
\begin{aligned}
\pi_{uv} &= 1 - \prod_{w \in V} (1 - p x_{wu}) - \prod_{w \in V} (1 - p x_{wv}) + \prod_{w \in V} \big[1 - p\,[1 - (1 - x_{wu})(1 - x_{wv})]\big] \\
&= 1 - (1 - p)^{x_u} - (1 - p)^{x_v} + (1 - p)^{x_u + x_v - (x'x)_{uv}},
\end{aligned}
$$

which for small $p$ can be approximated by

$$1 - (1 - p x_u) - (1 - p x_v) + [1 - p(x_u + x_v - (x'x)_{uv})] = p\,(x'x)_{uv},$$

where

$$(x'x)_{uv} = \sum_{w \in V} x_{wu} x_{wv}.$$

It also follows from the first expression for $\pi_{uv}$ that

$$E \pi_{uv} = 1 - 2(1 - p)(1 - p\mu_z)^{N - 1} + (1 - p)^2\,[1 - p(2\mu_z - \sigma_z^2 - \mu_z^2)]^{N - 2},$$

and for increasing $N$

$$E \pi_{uv} \to 1 - 2\exp(-n_0 \mu_z) + \exp[-n_0(2\mu_z - \sigma_z^2 - \mu_z^2)].$$

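Using the approximations, the Horvitz–Thompson estimator of $m_y$ is given by

$$\sum_{v \in S} \frac{y_v}{N p x_v},$$

and it has a variance equal to

$$
\begin{aligned}
\operatorname{Var}\Big[\sum_{v \in S} \frac{y_v}{N p x_v} \,\Big|\, x, y\Big]
&= \sum_{u \in V} \sum_{v \in V} \frac{y_u y_v (\pi_{uv} - \pi_u \pi_v)}{\pi_u \pi_v N^2} \\
&= \sum_{v \in V} \frac{y_v^2}{p x_v N^2} + \sum_{u \in V,\, u \ne v} \sum_{v \in V} \frac{y_u y_v (x'x)_{uv}}{p x_u x_v N^2} - m_y^2.
\end{aligned}
$$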

We note that the statistics needed are the in-degrees and the pair-wise intersection numbers corresponding to these in-degrees. An upper bound to the variance that needs only the in-degrees is easily shown to be

$$\sum_{u \in V} \sum_{v \in V} \frac{y_u y_v}{p\,(x_u x_v)^{1/2} N^2} - m_y^2.$$

If this bound is further approximated by replacing the in-degrees by their expected values $1 + (N - 1)\mu_z$, it follows that the variance of the Horvitz–Thompson estimator is bounded by

$$m_y^2\, \frac{1 - n_0 \mu_z}{n_0 \mu_z},$$

and by choosing the initial sample size larger than $1/[(1 + \varepsilon^2)\mu_z]$ the relative accuracy of the estimator is within $\pm\varepsilon$.
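The sample size rule can be illustrated with a short Python sketch. The function names and the values $\mu_z = 0.01$ and $\varepsilon = 0.1$ are assumptions for the example; the sketch only restates the approximate bound and inherits its small-$p$ approximations.

```python
def relative_variance_bound(n0, mu_z):
    """Approximate bound on the variance of the estimator divided by m_y^2."""
    return (1 - n0 * mu_z) / (n0 * mu_z)


def min_initial_sample_size(mu_z, eps):
    """Initial sample size at which the relative variance bound equals eps^2."""
    return 1 / ((1 + eps ** 2) * mu_z)


n0 = min_initial_sample_size(0.01, 0.1)
bound = relative_variance_bound(n0, 0.01)
print(n0, bound)
```

At the returned initial sample size the relative variance bound equals $\varepsilon^2$, so any larger initial sample gives a relative standard error below $\varepsilon$ under these approximations.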

References

Frank, O., 1979. Estimation of population totals by use of snowball samples. In: Holland, P.W., Leinhardt, S. (Eds.), Perspectives on Social Network Research. Academic Press, New York.

Frank, O., 1999. Measuring social capital by network capacity indices. In: Proceedings of the Conference on Creation and Returns of Social Capital. University of Amsterdam, Amsterdam, in press.

Frank, O., Snijders, T., 1994. Estimating the size of hidden populations using snowball sampling. Journal of Official Statistics 10, 53–67.

Freeman, L.C., 1979. Centrality in social networks: conceptual clarification. Social Networks 1, 215–239.

Goodman, L.A., 1961. Snowball sampling. Annals of Mathematical Statistics 32, 148–170.

Särndal, C.-E., Swensson, B., Wretman, J., 1992. Model Assisted Survey Sampling. Springer, New York.

Thompson, S.K., Frank, O., 2000. Model-based estimation with link-tracing sampling designs. Survey Methodology 26, 87–98.

Wasserman, S., Faust, K., 1994. Social Network Analysis. Cambridge University Press, Cambridge.
