Graph Limits or Graphons · 2019-05-25
Graph Limits or Graphons
Christof Seiler
Stanford University, Spring 2016, Stats 205
Overview
- We consider very large graphs (number of nodes n → ∞)
- Modeling graphs using functions (graph limits)
- Statistical estimation of graph limits, also called graphons
- Approximation with stochastic blockmodels
Random Graphs
- Simplest random graph model, due to Erdős and Rényi:
  - Given a positive integer n and a real number 0 ≤ p ≤ 1
  - Generate a random graph by taking pairs of nodes independently
  - and connecting each pair with probability p
- Random graph models on a fixed set of nodes fail to reproduce important properties of real-life networks
- Degree of a node: number of incoming and outgoing edges
- For example, the degrees of Erdős-Rényi random graphs follow a binomial distribution, and so they are asymptotically normal if the edge probability p is a constant
- The degrees are highly concentrated around the mean, while the degrees of real-life networks tend to obey the “Zipf phenomenon”, which means that the tail of the distribution decreases according to a power law
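The concentration of degrees described above can be checked numerically. The following is a minimal sketch, assuming an undirected graph without self-loops; the function name `erdos_renyi` and the values of `n` and `p` are illustrative choices, not taken from the slides.

```python
import numpy as np

def erdos_renyi(n, p, rng):
    """Sample the adjacency matrix of an undirected Erdős-Rényi graph."""
    # Flip one independent coin per unordered pair (i, j) with i < j.
    upper = np.triu(rng.random((n, n)) < p, k=1)
    # Mirror the upper triangle so the matrix is symmetric, with a zero diagonal.
    return (upper | upper.T).astype(int)

rng = np.random.default_rng(0)
n, p = 1000, 0.1
adj = erdos_renyi(n, p, rng)
degrees = adj.sum(axis=1)

# Each degree is Binomial(n - 1, p): compare empirical and theoretical moments.
print("mean degree:", degrees.mean(), "expected:", (n - 1) * p)
print("std of degrees:", degrees.std(), "expected:", np.sqrt((n - 1) * p * (1 - p)))
```

The standard deviation is small relative to the mean, so a histogram of `degrees` looks like a narrow bell curve rather than a heavy-tailed power law.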
Random Graphs
- To address this, new random graph models have been created by adding not just new edges but also nodes as the graph grows
- These graphs reproduce the “heavy tail” behavior of the degree sequences of real-life graphs
- This leads us to the idea of graph limits: by making graphs grow we reach a limit
- This limit can be described by a well-defined limit object: the graphon
Graphs in Pictures
- Petersen graph
Source: Lovasz (2012)
Graphs in Pictures
- Half graph
Source: Lovasz (2012)
Graphs in Pictures
- Random graph
- Random graph limit
- Chessboard
- Rearranged chessboard
Source: Lovasz (2012)
Graphs in Pictures
- Half graph limit
Source: Lovasz (2012)
Graphs in Pictures
- Starting with a single node (top left)
- Create a new node
- Connect every pair of nonadjacent nodes with probability 1/n, where n is the current number of nodes
Source: Lovasz (2012)
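The growth procedure listed above can be sketched as follows. This is an illustrative reading of the three steps, assuming that n counts the nodes including the newly created one; the function name `grow_graph` is hypothetical.

```python
import numpy as np

def grow_graph(steps, rng):
    """Grow a random graph one node at a time, starting from a single node."""
    adj = np.zeros((1, 1), dtype=bool)
    for _ in range(steps):
        n = adj.shape[0] + 1  # assumption: n includes the newly created node
        grown = np.zeros((n, n), dtype=bool)
        grown[:-1, :-1] = adj  # keep all existing edges
        # One coin per unordered pair, each with success probability 1/n.
        coin = np.triu(rng.random((n, n)) < 1.0 / n, k=1)
        # Only pairs that are not yet adjacent can gain an edge.
        new_edges = coin & ~grown
        adj = grown | new_edges | new_edges.T
    return adj

rng = np.random.default_rng(1)
adj = grow_graph(steps=50, rng=rng)
print("nodes:", adj.shape[0], "edges:", int(adj.sum()) // 2)
```

Plotting `adj` as a black-and-white image after different numbers of steps gives pixel pictures in the spirit of the ones referred to on this slide.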
Graphs in Pictures
Source: Caron (2015)
Generative Model
- A graph limit or graphon is a measurable function $f$ from $[0, 1]^2$ into $[0, 1]$ that satisfies $f(x, y) = f(y, x)$ for all $x$ and $y$
- The ordering of a given graphon $f(x, y)$ along the $x$ and $y$ axes has no inherent meaning
- $f(x, y)$ has a built-in invariance to “rearrangements” of the $x$ and $y$ axes
- This is similar to statistical shape analysis, where we seek to describe objects in a manner that is invariant to their orientation in Euclidean space
- Thus $f(x, y)$ represents an equivalence class of all symmetric functions that can be obtained from one another through measure-preserving transformations of $[0, 1]$
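One way to write this invariance down, as a sketch: the symbol $\varphi$ for a measure-preserving map of $[0, 1]$ is notation introduced here and does not appear on the slides. The rearranged function $f^{\varphi}$ is treated as the same graphon as $f$.

```latex
f^{\varphi}(x, y) = f\bigl(\varphi(x), \varphi(y)\bigr),
\qquad
f \sim f^{\varphi}
\quad \text{for every measure-preserving } \varphi : [0, 1] \to [0, 1].
```

If $U$ is uniform on $[0, 1]$ then $\varphi(U)$ is also uniform, so $f$ and $f^{\varphi}$ induce the same distribution on random graphs.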
Generative Model
- We can use graphons to generate random graphs
  - Consider $U_1, \dots, U_n$ iid uniform random variables between 0 and 1
  - Put an edge between vertices $i$ and $j$ with probability $f(U_i, U_j)$
  - Do this independently for all $1 \le i < j \le n$
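The sampling steps above translate almost line by line into code. This is a minimal sketch; `sample_graph` and `example_graphon` are hypothetical names, and the graphon $(x + y)/2$ is only an illustrative choice. The function `f` is assumed to accept NumPy arrays and broadcast over them.

```python
import numpy as np

def sample_graph(f, n, rng):
    """Sample an n-node undirected graph from the graphon f."""
    u = rng.random(n)                      # U_1, ..., U_n iid uniform on [0, 1)
    probs = f(u[:, None], u[None, :])      # matrix of f(U_i, U_j)
    # Independent coin flips for all pairs i < j.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(int)    # symmetric, no self-loops
    return adj, u

def example_graphon(x, y):
    """Illustrative symmetric graphon with values in [0, 1]."""
    return (x + y) / 2

rng = np.random.default_rng(2)
adj, u = sample_graph(example_graphon, n=300, rng=rng)
print("edge density:", adj.sum() / (300 * 299))
```

For this particular graphon the expected edge density is the integral of $f$ over the unit square, which is 1/2.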
Statistical Estimation
- Given such a model, the statistical question that we can ask is:
  - Is it possible to estimate the graphon from a single realization of the graph?
  - In other words, is it possible to estimate the numbers $f(U_i, U_j)$, $1 \le i < j \le n$?
- Written as a matrix $M$ whose $(i, j)$th element is given by $f(U_i, U_j)$
Statistical Estimation
- Use matrix estimation by Universal Singular Value Thresholding (USVT) by Chatterjee (2012)

1. Subtract $(a + b)/2$ from each entry of $X$ and divide by $(b - a)/2$ so that the entries lie in $[-1, 1]$
2. Define the matrix $Y = (y_{ij})$ by

$$
y_{ij} =
\begin{cases}
x_{ij} & \text{if } x_{ij} \text{ is observed} \\
0 & \text{if } x_{ij} \text{ is unobserved}
\end{cases}
$$

3. Compute the singular value decomposition of $Y$

$$
Y = \sum_{i=1}^{m} s_i u_i v_i^T
$$

4. Define $\hat{p}$ as the proportion of observed values of $X$
5. Choose a small positive number $\eta \in (0, 1)$ and let $S$ be the set of “thresholded singular values”, defined as

$$
S := \{\, i : s_i \ge (2 + \eta) \sqrt{n \hat{p}} \,\}
$$

   where $\eta$ is predefined; 0.01 seems to work
6. Define

$$
W := \frac{1}{\hat{p}} \sum_{i \in S} s_i u_i v_i^T
$$

7. Define

$$
\hat{m}_{ij} =
\begin{cases}
w_{ij} & \text{if } -1 \le w_{ij} \le 1 \\
1 & \text{if } w_{ij} > 1 \\
-1 & \text{if } w_{ij} < -1
\end{cases}
$$

8. Map the entries back from $[-1, 1]$ to $[0, 1]$
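The eight steps can be sketched with NumPy as below. This is an illustration under stated assumptions, not a reference implementation. The function name `usvt`, the optional `observed` mask, and taking $n$ as the larger matrix dimension are choices made here. The defaults `a=0`, `b=1` correspond to a binary adjacency matrix.

```python
import numpy as np

def usvt(x, observed=None, eta=0.01, a=0.0, b=1.0):
    """Sketch of Universal Singular Value Thresholding for entries in [a, b]."""
    x = np.asarray(x, dtype=float)
    if observed is None:
        observed = np.ones(x.shape, dtype=bool)  # assume a fully observed matrix
    # Step 1: rescale entries from [a, b] to [-1, 1].
    scaled = (x - (a + b) / 2) / ((b - a) / 2)
    # Step 2: set unobserved entries to zero.
    y = np.where(observed, scaled, 0.0)
    # Step 3: singular value decomposition of Y.
    u, s, vt = np.linalg.svd(y, full_matrices=False)
    # Step 4: proportion of observed entries.
    p_hat = observed.mean()
    # Step 5: keep singular values above the universal threshold.
    n = max(x.shape)  # assumption: n is the larger dimension
    keep = s >= (2 + eta) * np.sqrt(n * p_hat)
    # Step 6: low-rank reconstruction from the kept components, scaled by 1 / p_hat.
    w = (u[:, keep] * s[keep]) @ vt[keep, :] / p_hat
    # Step 7: clip entries to [-1, 1].
    m_hat = np.clip(w, -1.0, 1.0)
    # Step 8: map back from [-1, 1] to [a, b].
    return m_hat * (b - a) / 2 + (a + b) / 2
```

Applied to the adjacency matrix of a graph sampled from a graphon, the result is an estimate of the probability matrix $M$, entry by entry.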
Alternatives to the USVT Estimator
- The USVT estimator estimates the graphon at the observed pairs
  - Or in other words, the estimator is conditioned on the observed pairs
- USVT tries to find a low-rank approximation of the probability matrix $M$
- The USVT estimator is a consistent estimator (as the number of nodes goes to infinity) of the probability matrix $M$ (Chatterjee 2012)
- An alternative estimator is the histogram estimator proposed by Chan and Airoldi (2014)
Histogram Estimator
- Chan and Airoldi recently introduced the sorting-and-smoothing algorithm
Source: Chan and Airoldi (2014)
1. Sorting: Rearrange the graph so that the empirical degrees are monotonically increasing
2. Smoothing: Fit a piecewise constant function
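A simplified sketch of the two steps follows. It sorts nodes by empirical degree and then fits a piecewise constant function by averaging over square blocks. Any further refinement in the published algorithm is not reproduced here, and the names `sort_and_smooth` and `block_size` are hypothetical.

```python
import numpy as np

def sort_and_smooth(adj, block_size):
    """Degree-sort a graph, then average its adjacency matrix over square blocks."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    if not 1 <= block_size <= n:
        raise ValueError("block_size must be between 1 and the number of nodes")
    # Sorting: reorder nodes so that empirical degrees are increasing.
    order = np.argsort(adj.sum(axis=1), kind="stable")
    sorted_adj = adj[np.ix_(order, order)]
    # Smoothing: drop leftover nodes so that n is a multiple of block_size,
    # then take the mean of each block_size x block_size block.
    k = n // block_size
    trimmed = sorted_adj[: k * block_size, : k * block_size]
    return trimmed.reshape(k, block_size, k, block_size).mean(axis=(1, 3))
```

The returned `k x k` array is a histogram-style estimate: each entry is the edge density between two groups of nodes with similar degree.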
Histogram Estimator (Example)
Source: Chan and Airoldi (2014)
- Collaboration network of arXiv astrophysics (ca-AstroPh)
- Symmetric binary graph consisting of $1.8 \times 10^4$ nodes and $3.9 \times 10^5$ edges
- The graphon indicates close collaborations among a group of people concentrated around the top left corner of the graphon
- It also shows a number of small communities along the diagonal
Histogram Estimator (Example)
Source: Chan and Airoldi (2014)
- Who-trusts-whom online social network of the general consumer review site Epinions.com
- Members of the site can decide whether to “trust” each other
- Asymmetric binary graph consisting of $7.5 \times 10^4$ nodes and $5.1 \times 10^5$ edges
- The graphon indicates that there are some influential nodes which consistently interact among themselves
- These can be seen from the repeated patterns of the graphon
Other Important Topics
- The regularity lemma says:
  Regularity Lemma: The nodes of every graph can be partitioned into a “small” number of “almost equal” parts in such a way that for “almost all” pairs of partition classes, the bipartite graph between them is “quasirandom”.
Source: Lovasz (2012)
Other Important Topics
- In other words, every graphon can be approximated by a stochastic blockmodel
- The stochastic blockmodel (with blocks $q = 1, \dots, Q$)
- The generative model for binary graphs:
  - Draw the block membership of each node from a multinomial distribution

$$
Z_i \sim \mathrm{Multi}(1, \alpha = (\alpha_1, \dots, \alpha_Q))
$$

    such that $Z_{iq} = 1$ if node $i$ belongs to block $q$
  - Draw edges from a Bernoulli distribution

$$
X_{ij} \mid \{Z_{iq} Z_{jl} = 1\} \sim \mathrm{Bern}(\pi_{ql})
$$

    with $\pi$ being a $Q \times Q$ matrix of connection probabilities
- According to this model, the latent variables $Z_1, \dots, Z_N$ are iid
- Given this latent structure, all the edges are independent
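The generative model above can be sketched as follows. This is an illustration that assumes an undirected graph and therefore a symmetric $\pi$. The function name `sample_sbm` is hypothetical, and block memberships are stored as integer labels rather than as the one-hot vectors $Z_i$.

```python
import numpy as np

def sample_sbm(n, alpha, pi, rng):
    """Sample an undirected binary graph from a stochastic blockmodel."""
    alpha = np.asarray(alpha, dtype=float)
    pi = np.asarray(pi, dtype=float)
    # Latent block of each node: Z_i ~ Multi(1, alpha), stored as a label in 0..Q-1.
    blocks = rng.choice(len(alpha), size=n, p=alpha)
    # Edge probability for each pair: pi[q, l] for nodes in blocks q and l.
    probs = pi[np.ix_(blocks, blocks)]
    # Given the blocks, edges are independent Bernoulli draws for pairs i < j.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    adj = (upper | upper.T).astype(int)
    return adj, blocks

rng = np.random.default_rng(3)
alpha = [0.5, 0.3, 0.2]
pi = [[0.8, 0.1, 0.1],
      [0.1, 0.7, 0.1],
      [0.1, 0.1, 0.6]]
adj, blocks = sample_sbm(200, alpha, pi, rng)
print("block sizes:", np.bincount(blocks, minlength=3))
```

Sorting the rows and columns of `adj` by `blocks` shows the piecewise constant structure that makes the blockmodel a step-function approximation of a graphon.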
Example: arXiv Citation Network
Source: Gopalan and Blei (2013). Efficient Discovery of Overlapping Communities in Massive Networks
References
- Lovasz (2012). Large Networks and Graph Limits
- Chatterjee (2012). Matrix Estimation by Universal Singular Value Thresholding
- Chan and Airoldi (2014). A Consistent Histogram Estimator for Exchangeable Graph Models
- Latouche, Birmelé, and Ambroise (2010). Variational Bayesian Inference and Complexity Control for Stochastic Block Models
- Rohe, Chatterjee, and Yu (2011). Spectral Clustering and the High-Dimensional Stochastic Blockmodel