
Algebraic statistics for network models
–Connecting statistics, combinatorics, and computational algebra–

–Part One–

Sonja Petrović

(Statistics Department, Pennsylvania State University) ↓

Applied Mathematics Department, Illinois Institute of Technology

Summer School on Network Science
Columbia, SC

Monday, 20 May 2013

S. Petrovic ([email protected]) Algebraic Statistics for Network Models Monday, 20 May 2013 1 / 22


General framework and motivation

Network analysis has advanced:

- Empirical work, simple models, probabilistic properties.
- But: surprising non-standard properties, new theoretical challenges in statistics.

Some approaches that have a statistical grounding do not necessarily scale well to large sparse network settings:

- (model/data fit)
- Recent: degenerate statistical behavior of network modeling tools.

Motivation: practical problems for network data structures where the number of variables and parameters is large (relative to the number of independent observations).

Algebraic statistics:

- Insights into a variety of categorical data problems.
- These insights aid model development and analysis processes.


What is Algebraic Statistics?

Algebraic geometry and related fields applied to statistics.

Fact (Guiding principle)

Many important statistical models correspond to algebraic or semi-algebraic sets of parameters.

The geometry of these parameter spaces determines the behavior of widely used statistical inference procedures.

Model geometry: the "shape" of a statistical model is an intuitive notion of fundamental importance to statistical inference; it is reflected in the model's abstract geometric properties.

Examples:

- Is the likelihood function multimodal?
- Does the model have singularities (is it non-regular)?
- What is the nature of the underlying singularities?

When a model is algebraic, use tools from algebraic geometry and computational algebra software packages.



Motivating problem. Tools: algebra and polyhedral geometry

Model Validation Problem

Problem

Given a candidate ERGM $P$ and one observed network $x$, decide (with a high degree of confidence) whether $x$ can be regarded as a draw from some distribution $P_{\theta_0} \in P$.

Maximum likelihood estimation problem:
Use the observed data $x$ to produce an optimal estimate for $\theta_0$.
(Faces of the model polytope.)

[Figure: JavaView rendering of a polytope (dodecahedron demo).]

Goodness-of-fit problem (and model selection):
Can the MLE be considered a satisfactory generative model for the data at hand?
(Markov bases for a random walk on a fiber.)

[Figure: Markov bases.]


Statistical network analysis: What is a model?

Statistical Network (Random Graph) Analysis

Let $G_n$ be the set of simple graphs on $n$ nodes: $|G_n| = 2^{\binom{n}{2}}$.
* Nodes = units of some (sub)population of interest.
* Edges = a set of static relationships among units.
- Representation: a 0/1 adjacency matrix ($n \times n$), i.e. a point in $\{0, 1\}^{n \times n}$; OR: a $\binom{n}{2}$-dimensional 0/1 vector indexed by node pairs.

Example

Graph with an edge $\{1, 2\}$ and a triple edge $\{1, 3\}$ (a multigraph, so the entries record edge multiplicities): $x = [1, 3, 0, \dots, 0]^T$.

- A statistical model is a collection of probability distributions over graphs indexed by a set of parameters.
- The form and properties of such statistical models are dictated by the goals of the analysis at hand.
- With a model in place, we can determine the probability of any network topology.
- We can also estimate the model parameters that best fit given data.
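The vector representation described on this slide can be sketched in a few lines. This is only an illustration: the helper names `pair_index` and `edge_vector`, the lexicographic ordering of node pairs, and the choice n = 4 are assumptions made here, not part of the slides.

```python
from itertools import combinations
from math import comb


def pair_index(n):
    """Node pairs (i, j) with i < j, nodes labeled 1..n, in lexicographic order."""
    return list(combinations(range(1, n + 1), 2))


def edge_vector(n, edges):
    """Return the comb(n, 2)-dimensional vector of edge multiplicities.

    For a simple graph every entry is 0 or 1; repeated edges (as in the
    slide's "triple edge" example) simply increase the entry.
    """
    pairs = pair_index(n)
    position = {pair: k for k, pair in enumerate(pairs)}
    x = [0] * len(pairs)
    for i, j in edges:
        x[position[(min(i, j), max(i, j))]] += 1
    return x


n = 4  # hypothetical size chosen for illustration
# One edge {1, 2} and a triple edge {1, 3}, as in the slide's example.
x = edge_vector(n, [(1, 2), (1, 3), (1, 3), (1, 3)])
num_simple_graphs = 2 ** comb(n, 2)  # |G_n| = 2^(n choose 2)
print(x, num_simple_graphs)
```

Indexing by node pairs keeps each potential edge exactly once, which is why the vector has $\binom{n}{2}$ coordinates rather than the $n^2$ entries of the adjacency matrix.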


Statistical network analysis: Why search for a ‘good’ model?

Statistical network analysis - what it can tell us

Why search for a well-fitting statistical model of an observed social network?

- It allows us to understand the uncertainty associated with observed outcomes.
- It allows inferences about whether network substructures are more commonly observed than expected by chance.
- It allows for simulation.
- It allows for the assessment of local effects (reciprocation, attractiveness, desire to expand, etc.).

Statistical models for networks:

- Classes of probability distributions for graphs.
- Interpretable, realistic models for large distributions based on edges (modeling the random occurrence of edges).


ERGMs: A well-known family of statistical models

Exponential Random Graph (ERG) Models

Specify a set of informative network statistics on $G_n$ (capturing key features of the network)

$$t : G_n \to \mathbb{R}^d, \quad x \mapsto t(x) = (t_1(x), \dots, t_d(x)) \in \mathbb{R}^d,$$

such that the probability of observing $x$ is a function of $t(x)$ only.

Standard examples:

- the number of edges $E(x)$ (Erdős-Rényi model);
- the number of triangles $T(x)$;
- the number of $k$-stars.

More elaborate examples (number of statistics grows with $n$):

- the degree sequence: β-model [RPF 2011; Chatterjee-Diaconis 2011]
- in- and out-degrees (directed): $p_1$ model [Holland-Leinhardt 1981]
- $T(x)$ and $k$-stars: Markov graph model [Frank-Strauss 1986]

(log-linear models over a set of edges)
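As a small illustration of the map $t$, the sketch below computes the three standard statistics for one graph. The function name `network_statistics` and the example graph are hypothetical, and the k-star count uses the common convention of summing $\binom{\deg(v)}{k}$ over nodes, which the slides do not spell out.

```python
from itertools import combinations
from math import comb


def network_statistics(n, edges, k=2):
    """Return (E, T, S_k): edge, triangle and k-star counts of a simple graph.

    Nodes are labeled 1..n. The k-star count sums comb(deg(v), k) over all
    nodes v (an assumed convention).
    """
    adj = {v: set() for v in range(1, n + 1)}
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    num_edges = sum(len(nb) for nb in adj.values()) // 2
    num_triangles = sum(
        1
        for a, b, c in combinations(range(1, n + 1), 3)
        if b in adj[a] and c in adj[a] and c in adj[b]
    )
    num_k_stars = sum(comb(len(nb), k) for nb in adj.values())
    return num_edges, num_triangles, num_k_stars


# Triangle on {1, 2, 3} with a pendant edge {3, 4}.
stats = network_statistics(4, [(1, 2), (1, 3), (2, 3), (3, 4)])
print(stats)
```

Any two graphs that produce the same tuple are assigned the same probability by the corresponding ERGM.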


ERGMs may have non-standard properties (e.g. increasing number of parameters)

Number of parameters increasing with n

In general, it is desirable to have an increasing number of parameters in order to provide more descriptive and flexible models.

However, the number of parameters can only grow at an o(n) rate; otherwise inference is not possible.

Examples of network models in which the number of parameters grows with the size n of the network are:

- the β-model;
- the Markov random graph models;
- models related to JDM (Aaron Dutle, Wed 5/22);
- the SBM with a growing number of hidden communities.


ERGMs may have non-standard properties. But: significant dimension reduction property (well-understood)

Sufficiency and graph equivalence classes

ERGMs are models over equivalence classes of $G_n$, where two graphs $x$ and $y$ are regarded as probabilistically equivalent whenever $t(x) = t(y)$.

ET Model: we consider $G_9$ with network statistics $(E(x), T(x))$. There are $2^{36}$ distinct graphs but only 444 distinct network statistics.

[Figure: scatter plot of the network statistics; x-axis: number of edges, y-axis: number of triangles.]

β-Model: for n = 6, 7, 8, 9, there are 6944, 111850, 2135740, 47003045 distinct (ordered) degree sequences (Stanley, 1991).
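The dimension reduction can be checked by brute force for small n. The sketch below enumerates every simple graph and counts distinct values of (E, T) and of the ordered degree sequence. The function name is hypothetical, and the approach only works for small n: the slide's $G_9$ has $2^{36}$ graphs and is out of reach for this naive enumeration.

```python
from itertools import combinations, product


def count_equivalence_classes(n):
    """Enumerate all simple graphs on n nodes and count distinct statistics.

    Returns (number of graphs, number of distinct (E, T) pairs,
    number of distinct ordered degree sequences).
    """
    pairs = list(combinations(range(n), 2))
    triples = list(combinations(range(n), 3))
    et_values, degree_sequences = set(), set()
    for bits in product((0, 1), repeat=len(pairs)):
        present = {pair for pair, b in zip(pairs, bits) if b}
        degrees = [0] * n
        for i, j in present:
            degrees[i] += 1
            degrees[j] += 1
        triangles = sum(
            1
            for a, b, c in triples
            if (a, b) in present and (a, c) in present and (b, c) in present
        )
        et_values.add((len(present), triangles))
        degree_sequences.add(tuple(degrees))
    return 2 ** len(pairs), len(et_values), len(degree_sequences)


counts_n4 = count_equivalence_classes(4)
print(counts_n4)
```

For n = 4 the 64 graphs collapse to 9 distinct (E, T) pairs and 54 ordered degree sequences; running the same function with n = 6 reproduces the 6944 quoted above.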


A simplest ERGM for random graphs - with sufficient statistics of interest to many communities!

The β-model for random graphs: degree sequences

The β-model is the ERGM on labeled networks with network statistics given by the (ordered) degree sequence:

$$x \in G_n \mapsto d(x) = d = (d_1, \dots, d_n) \in \mathbb{N}^n,$$

where $d_i$ is the degree of node $i$.

Definition (Set of all possible outcomes for generalized Beta)

$$S_n := \{x_{i,j} : i < j \text{ and } x_{i,j} \in \{0, 1, \dots, N_{i,j}\}\} \subset \mathbb{N}^{\binom{n}{2}}.$$

Definition (Parametrization of the beta model)

For $\beta \in \mathbb{R}^n$:

$$p_{i,j} = \frac{e^{\beta_i + \beta_j}}{1 + e^{\beta_i + \beta_j}} \quad \text{and} \quad p_{j,i} = 1 - p_{i,j} = \frac{1}{1 + e^{\beta_i + \beta_j}}, \quad \forall i \neq j.$$

The model $M_\beta$ for $n$ vertices consists of all $p_{i,j}$'s of this form.
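A minimal sketch of this parametrization follows. The function names and the example parameter vector are hypothetical. Following the slide's convention, the upper triangle holds the edge probabilities $p_{i,j}$ (for i < j) and the lower triangle holds their complements.

```python
import math


def beta_model_probabilities(beta):
    """Edge probabilities p[i][j] = e^(b_i + b_j) / (1 + e^(b_i + b_j)) for i < j.

    The lower triangle stores p[j][i] = 1 - p[i][j], as on the slide;
    the diagonal is left as None.
    """
    n = len(beta)
    p = [[None] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p[i][j] = 1.0 / (1.0 + math.exp(-(beta[i] + beta[j])))
            p[j][i] = 1.0 - p[i][j]
    return p


def expected_degrees(p):
    """Expected degree of each node: the sum of its edge probabilities."""
    n = len(p)
    return [
        sum(p[min(i, j)][max(i, j)] for j in range(n) if j != i)
        for i in range(n)
    ]


p = beta_model_probabilities([0.0, 0.0, 0.0])  # hypothetical parameters
print(p[0][1], expected_degrees(p))
```

With all parameters equal to zero every edge has probability 1/2, so each of the three nodes has expected degree 1.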


What parameters ‘best explain’ the given (network) data? MLE non-existence leads to degeneracy

Inference

Problem

Given one observation $x \in G_n$ (given $t = t(x)$), estimate the parameters.

In the ET model, we want to learn 2 parameters; in the β-model, $n$.

$$\mathrm{MLE}(p) := \operatorname*{argmax}_{p \in M_n} \prod_{i<j} p_{ij}^{x_{ij}}.$$

Fact: The MLE is nonexistent if $p$ is on the boundary of $M_A$.
Consequence: Some of the coordinates of $p$ are either 0 or 1, and are, therefore, non-estimable.

Key tasks (Fundamental for goodness-of-fit testing)

(1) decide whether the MLE exists
(2) identify non-estimable parameters.
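For the β-model the estimation problem can be made explicit with a short derivation. This is a sketch under stated assumptions, not a formula taken from the slides: it assumes a simple graph ($N_{i,j} = 1$) and writes out the full Bernoulli likelihood, including the factor for absent edges.

```latex
% Likelihood of a simple graph x with degree sequence d = (d_1, ..., d_n),
% where p_{ij} = e^{\beta_i + \beta_j} / (1 + e^{\beta_i + \beta_j}) for i < j:
\begin{align*}
P_\beta(x)
  &= \prod_{i<j} p_{ij}^{x_{ij}} (1 - p_{ij})^{1 - x_{ij}}
   = \prod_{i<j} \frac{e^{(\beta_i + \beta_j) x_{ij}}}{1 + e^{\beta_i + \beta_j}}
   = \frac{\exp\bigl(\sum_{i=1}^n \beta_i d_i\bigr)}{\prod_{i<j} \bigl(1 + e^{\beta_i + \beta_j}\bigr)}, \\
\ell(\beta)
  &= \sum_{i=1}^n \beta_i d_i - \sum_{i<j} \log\bigl(1 + e^{\beta_i + \beta_j}\bigr), \\
\frac{\partial \ell}{\partial \beta_i}
  &= d_i - \sum_{j \neq i} \frac{e^{\beta_i + \beta_j}}{1 + e^{\beta_i + \beta_j}}.
\end{align*}
% Setting the gradient to zero gives the likelihood equations
%   d_i = \sum_{j \neq i} e^{\beta_i + \beta_j} / (1 + e^{\beta_i + \beta_j}),  i = 1, ..., n,
% i.e. the observed degrees must equal the expected degrees.
```

The likelihood depends on $x$ only through the degree sequence $d$, and a finite solution $\hat\beta$ of the likelihood equations need not exist, which is the non-existence issue named in the key tasks.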


This crucial problem must not be overlooked. MLE non-existence leads to degeneracy

A small example of data leading to a nonexistent MLE

Pattern (a dot marks an entry that the pattern does not fix):

    ×         0         .         .
    N_{1,2}   ×         .         .
    .         .         ×         N_{3,4}
    .         .         0         ×

(The model: $p_{i,j} = \frac{e^{\beta_i + \beta_j}}{1 + e^{\beta_i + \beta_j}}$.)

Data (left):

    ×   0   1   2
    3   ×   2   1
    2   1   ×   3
    1   2   0   ×

Extended MLE (right):

    ×     0     0.5   0.5
    1     ×     0.5   0.5
    0.5   0.5   ×     1
    0.5   0.5   0     ×

Left: data exhibiting the above pattern, when $N_{i,j} = 3$ for all $i \neq j$. Right: table of the extended MLE of the estimated probabilities. Under the natural parametrization, the supremum of the log-likelihood is achieved in the limit for any sequence of natural parameters $\{\beta^{(k)}\}$ of the form $\beta^{(k)} = (-c_k, -c_k, c_k, c_k)$, where $c_k \to \infty$ as $k \to \infty$.
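The limiting behavior in this example can be checked numerically. The sketch below evaluates the log-likelihood along $\beta = (-c, -c, c, c)$ for increasing $c$. The helper names and the particular values of $c$ are assumptions made here, and the binomial coefficients are dropped because they do not depend on $\beta$.

```python
import math

N = 3  # number of trials per pair, as in the slide's example
# Upper-triangular counts x[(i, j)], i < j (0-based), read off the data table.
x = {(0, 1): 0, (0, 2): 1, (0, 3): 2, (1, 2): 2, (1, 3): 1, (2, 3): 3}


def log_sigmoid(s):
    """Numerically stable log(e^s / (1 + e^s))."""
    if s >= 0:
        return -math.log1p(math.exp(-s))
    return s - math.log1p(math.exp(s))


def log_likelihood(beta):
    """Log-likelihood up to an additive constant that does not depend on beta."""
    total = 0.0
    for (i, j), count in x.items():
        s = beta[i] + beta[j]
        # log p_ij = log_sigmoid(s) and log(1 - p_ij) = log_sigmoid(-s)
        total += count * log_sigmoid(s) + (N - count) * log_sigmoid(-s)
    return total


cs = (0.5, 1.0, 2.0, 4.0, 8.0)  # hypothetical values of c_k
values = [log_likelihood((-c, -c, c, c)) for c in cs]
limit_value = 4 * N * math.log(0.5)  # limit of the values as c grows

# Fitted probabilities at the largest c: close to the extended MLE table.
c = cs[-1]
beta = (-c, -c, c, c)
probs = {pair: math.exp(log_sigmoid(beta[pair[0]] + beta[pair[1]])) for pair in x}
print(values, limit_value, probs)
```

The values increase with $c$ and approach the limit without reaching it, while $p_{1,2}$ tends to 0, $p_{3,4}$ tends to 1, and the four remaining probabilities stay at 0.5. No finite $\beta$ attains the limit along this sequence, which is the sense in which the MLE fails to exist.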


Problem (!)

Current algorithms and software for fitting ERGMs have no simple mechanism for detecting non-existence and identifying non-estimable parameters and degeneracy.

S. Petrovic ([email protected]) Algebraic Statistics for Network Models Monday, 20 May 2013 11 / 22

Page 25: {Connecting statistics, combinatorics, and computational algebra{ …imi.cas.sc.edu/django/site_media/media/events/2013/... · 2013-05-22 · Markov bases Markov bases Sonja Petrovic«

Geometry of Discrete Exponential Families Polytopes and MLE existence

Basics of Discrete Exponential Families

The set P = convhull(t(x) : x ∈ Gn) is called the model polytope.

int(P) = {E_θ[t], θ ∈ Θ} is precisely the set of all possible expected values of t (mean value space; homeomorphic to parameter space).

Theorem

The MLE exists for x if and only if t(x) ∈ int(P).

In the ET example, the MLE exists for 415 of 444 cases.

The boundary of P specifies degenerate distributions for the nonexistent MLEs: extended exponential family.

[Figure: convex support. Horizontal axis: number of edges (0 to 40). Vertical axis: number of triangles (0 to 90).]
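The theorem can be tried by brute force on a tiny case. The sketch below assumes that the "ET example" is the edge-triangle model, with t(x) = (number of edges, number of triangles), as the figure axes suggest. It uses n = 4 nodes, a much smaller case than the 444 points on the slide. The helper names are illustrative.

```python
from itertools import combinations

def edge_triangle_stats(n):
    """All distinct (edges, triangles) pairs over simple graphs on n labeled nodes."""
    pairs = list(combinations(range(n), 2))
    triples = list(combinations(range(n), 3))
    stats = set()
    for mask in range(1 << len(pairs)):
        edges = {p for k, p in enumerate(pairs) if mask >> k & 1}
        triangles = sum(
            1 for a, b, c in triples
            if (a, b) in edges and (a, c) in edges and (b, c) in edges
        )
        stats.add((len(edges), triangles))
    return stats

def cross(o, a, b):
    """Signed area test: positive if o -> a -> b turns counter-clockwise."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(points)

    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half(pts) + half(reversed(pts))

def strictly_inside(p, hull):
    """True if p lies strictly to the left of every counter-clockwise hull edge."""
    m = len(hull)
    return all(cross(hull[k], hull[(k + 1) % m], p) > 0 for k in range(m))

stats = edge_triangle_stats(4)
hull = convex_hull(stats)
interior = sorted(p for p in stats if strictly_inside(p, hull))
print(len(stats), "distinct statistics; MLE exists for", interior)
```

For n = 4 there are 9 distinct statistics and only (3, 1) and (4, 1) fall in the interior of the triangle with vertices (0, 0), (4, 0), (6, 4), so under this reading the MLE exists in 2 of 9 cases.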


Do we understand the geometry of the β model? What is its polytope?

Model polytope for β

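It is parametrized by the vertex-edge incidence matrix of a complete graph:

\[
A_4 = \begin{pmatrix}
1 & 1 & 1 & 0 & 0 & 0 \\
1 & 0 & 0 & 1 & 1 & 0 \\
0 & 1 & 0 & 1 & 0 & 1 \\
0 & 0 & 1 & 0 & 1 & 1
\end{pmatrix}
\]

with rows indexed by the vertices and columns indexed by the pairs $(i,j)$ with $i < j$.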

Definition (The model polytope)

Sn := conv {Anx , x ∈ Sn}

Example

Represent the graph with an edge {1, 2} and a triple edge {1, 3} as

$x = [1, 3, 0, \dots, 0]^T \in S_n$.

The corresponding point in the model polytope is $A_n x = [4, 1, 3, 0, \dots]$.
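The example can be reproduced in a few lines. This is a minimal sketch that fixes n = 4 (the slide leaves n general) and orders the columns lexicographically as (1,2), (1,3), (1,4), (2,3), (2,4), (3,4). The function names are illustrative.

```python
from itertools import combinations

def incidence_matrix(n):
    """Vertex-edge incidence matrix of the complete graph K_n.

    Rows are vertices 1..n; columns are the pairs (i, j), i < j, in
    lexicographic order.
    """
    pairs = list(combinations(range(1, n + 1), 2))
    return [[1 if v in pair else 0 for pair in pairs] for v in range(1, n + 1)]

def degree_vector(A, x):
    """Matrix-vector product A x in plain Python."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A4 = incidence_matrix(4)
x = [1, 3, 0, 0, 0, 0]  # one edge {1,2} and a triple edge {1,3}

for row in A4:
    print(row)
print(degree_vector(A4, x))  # [4, 1, 3, 0]
```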


The polytope of degree sequences

Definition

The polytope of degree sequences is

Pn := convhull ({Ax , x ∈ Gn}) .

Facet-defining inequalities of Pn are known (Mahadev-Peled ’96).

Theorem (Rinaldo-P.-Fienberg)

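Let $x \in S_n$ be the observed vector of edge counts. The MLE exists if and only if the vector with coordinates

\[
\sum_{j<i} \frac{x_{j,i}}{N_{i,j}} + \sum_{j>i} \frac{x_{i,j}}{N_{i,j}}, \qquad i = 1, \dots, n,
\]

lies in $\operatorname{int}(P_n)$.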
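For small n the condition can be checked by brute force. The sketch below assumes that P_n is described by the inequalities sum_{i in S} d_i - sum_{i in T} d_i <= |S|(n - 1 - |T|) over disjoint S, T, not both empty. This description is not stated on the slides and should be treated as an assumption. The first data set is the 4-node table from the earlier small example, and the second is a made-up perturbation of it.

```python
from fractions import Fraction
from itertools import product

def rescaled_degrees(x, N, n):
    """d_i = sum of x_ij / N_ij over all pairs containing i (keys are (i, j), i < j)."""
    d = [Fraction(0)] * n
    for (i, j), count in x.items():
        share = Fraction(count, N[(i, j)])
        d[i - 1] += share
        d[j - 1] += share
    return d

def in_interior(d):
    """True if every inequality of the assumed description of P_n is strict.

    Assumed description: sum_S d - sum_T d <= |S| (n - 1 - |T|) for all
    disjoint S, T that are not both empty. Enumeration costs 3^n.
    """
    n = len(d)
    # Label each vertex 0 (unused), 1 (in S) or 2 (in T).
    for labels in product((0, 1, 2), repeat=n):
        S = [i for i in range(n) if labels[i] == 1]
        T = [i for i in range(n) if labels[i] == 2]
        if not S and not T:
            continue
        lhs = sum(d[i] for i in S) - sum(d[i] for i in T)
        if lhs >= len(S) * (n - 1 - len(T)):
            return False
    return True

pairs = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
N = {p: 3 for p in pairs}

x_slide = dict(zip(pairs, [0, 1, 2, 2, 1, 3]))  # data from the small example
x_other = dict(zip(pairs, [1, 1, 2, 2, 1, 2]))  # hypothetical data, no cell at 0 or N

d_slide = rescaled_degrees(x_slide, N, 4)
d_other = rescaled_degrees(x_other, N, 4)
print(d_slide, in_interior(d_slide))
print(d_other, in_interior(d_other))
```

For the slide data the rescaled degrees are (1, 1, 2, 2), and the inequality with S = {3, 4}, T = {1, 2} holds with equality, so the point is on the boundary. The perturbed data give an interior point.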

Example (Stanley)

f (P8) = (334982, 1726648, 3529344, 3679872, 2074660, 610288, 81144, 3322, 1).
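As a quick sanity check on the numbers above, the Euler-Poincaré relation for a polytope of dimension d says that the alternating sum of f_0, ..., f_{d-1} equals 1 - (-1)^d. A small sketch, assuming P_8 is 8-dimensional and that the final entry 1 counts the polytope itself:

```python
# f_0, ..., f_7 of P_8 as listed on the slide (the trailing 1 is dropped).
f_vector = [334982, 1726648, 3529344, 3679872, 2074660, 610288, 81144, 3322]

dimension = len(f_vector)  # assumed: faces of dimension 0..7 of an 8-dimensional polytope
alternating = sum((-1) ** i * f for i, f in enumerate(f_vector))
expected = 1 - (-1) ** dimension

print(alternating, expected)  # 0 0
```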

We used polymake for the computations on small polytopes.

Model polytope for the β model? Combinatorics and facial sets

Facial sets of the model polytope

Proposition (Rinaldo-P.-Fienberg)

A point y belongs to the interior of some face F of P_n if and only if there exists a set F ⊂ {(i, j), i < j} such that for any p = {p_{i,j} : i < j, p_{i,j} ∈ [0, 1]} satisfying

y = A_n p,

p_{i,j} ∈ {0, 1} if (i, j) ∉ F and p_{i,j} ∈ (0, 1) if (i, j) ∈ F.

F is called a facial set of Sn, and Fc a co-facial set.

The MLE does not exist for the graph x if and only if the set {(i, j) : i < j, x_{i,j} = 0 or N_{i,j}} contains a co-facial set.

Facial sets specify which probability parameters are estimable: only the probabilities {p_{i,j}, (i, j) ∈ F}.


Example: Co-facial sets for nonexistent MLEs

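×        0        ·        ·
N_{1,2}  ×        ·        ·
·        ·        ×        N_{3,4}
·        ·        0        ×

×        0        ·        ·
N_{1,2}  ×        0        0
·        N_{3,2}  ×        ·
·        N_{4,2}  ·        ×

×        0        0        0
N_{1,2}  ×        ·        ·
N_{1,3}  ·        ×        ·
N_{4,1}  ·        ·        ×

×        0        0        ·
N_{1,2}  ×        0        ·
N_{1,3}  N_{2,3}  ×        ·
·        ·        ·        ×

×        N_{1,2}  ·        ·
0        ×        0        0
·        N_{2,3}  ×        ·
·        N_{2,4}  ·        ×

Table: Co-facial sets for P_4 (cells marked · may take any entry value).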


Geometry of network models Polytopes and general algorithms

MLE existence
Rinaldo-Petrovic-Fienberg, Beta model, Annals of Statistics 2013

Non-existence of MLEs occurs commonly in large sparse network models, but most often goes undetected.

Methodologies for estimation and model validation under a non-existent MLE with proven statistical performance have yet to be developed.

Techniques from (polyhedral) geometry offer:

- The only way to detect non-existence
- An exact handle on what parameters of the model are actually estimable
- An algorithmic approach to how to estimate such parameters.

Algorithms

Related algorithms for finding the facial sets of the model cone are described in Fienberg-Rinaldo ’12.


Overview of Algorithms

Two tasks:

- Decide if t(x) ∈ int(P_n): existence of the MLE (easy).
- Decide for which face F of P_n, t(x) ∈ relint(F): identification of estimable parameters (hard).

For network models, we propose a 2-step procedure:

1 Cayley trick (or lifting): replace P_n with a larger set which is simpler to analyze: C_n, a polyhedral cone. (Details: extra slide on the Cayley trick.)

2 Find F using the boundary of C_n. (Details: extra slide on the faces of C_n.)


This is a general framework that applies to any toric model

Two related toric models where the main theorem applies

Definition (Random graphs with fixed degree sequence)

In the special case when $N_{i,j} = 1$, the support $S_n$ reduces to

$G_n := \{0, 1\}^{\binom{n}{2}}$, undirected simple graphs on n nodes.

Corollary (RPF)

A conjecture in Chatterjee-Diaconis-Sly (’10) is true: for the random graph model, the MLE exists if and only if d(x) ∈ int P_n.

Definition (The Rasch model)

A random bipartite graph model, the support being G_{k,l}, the set of bipartite graphs on k and l vertices.

Theorem (RPF)

The MLE of the Rasch model parameters exists if and only if d(x) ∈ int P_{p,q}, the polytope of bipartite degree sequences.


Extensions

(1) Removing the sampling constraint: let the quantities N_{i,j} be random!

Theorem (Thanks to Haase and Yu)

The model polytope has 3n facets, and is obtained from the product of simplices by removing the vertices {e_i × e'_i}, i = 1, . . . , n.

(2) Specialize (1) to directed graphs without multiple edges.

This is the Bradley-Terry model for pairwise comparisons.

Theorem (Zermelo ’29, Ford ’57)

If the graph is strongly connected, then the MLE exists.
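The sufficient condition in the theorem is easy to test. The sketch below assumes the comparison data are encoded as a directed graph with an arc i → j whenever i was preferred to j at least once. The encoding and the function name are illustrative choices, not taken from the slides.

```python
def is_strongly_connected(n, wins):
    """Check strong connectivity of a comparison digraph on nodes 0..n-1.

    wins is an iterable of (i, j) arcs, meaning i beat j at least once.
    """
    forward = {v: set() for v in range(n)}
    backward = {v: set() for v in range(n)}
    for i, j in wins:
        forward[i].add(j)
        backward[j].add(i)

    def reaches_all(adjacency):
        # Depth-first search from node 0.
        seen, stack = {0}, [0]
        while stack:
            for w in adjacency[stack.pop()]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        return len(seen) == n

    # Strongly connected iff node 0 reaches every node and every node reaches node 0.
    return reaches_all(forward) and reaches_all(backward)

cycle = [(0, 1), (1, 2), (2, 0)]      # each player wins and loses at least once
dominated = [(0, 1), (0, 2), (1, 2)]  # player 0 never loses

print(is_strongly_connected(3, cycle))      # True
print(is_strongly_connected(3, dominated))  # False
```

In the second data set player 0 is never beaten, so the graph is not strongly connected and the theorem gives no guarantee.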

Algorithms for detecting co-facial sets still apply. The matrix of the model polytope has dimension $\left(\binom{n}{2} + n\right) \times n(n-1)$.


Extensions

(3) A directed random graph model used in social networking: the $p_1$ model (Holland-Leinhardt ’81).

The model polytope is the Minkowski sum of $\binom{n}{2}$ polytopes.

Example (n=4)

$4^{10}$ = 1,048,576 different graphs x. Three cases of the $p_1$ model:

1 There are 225,025 points $A_4 x$, and the MLE exists for 7,983.

2 There are 349,500 points, and the MLE exists in 12,684 cases.

3 There are 583,346 points, and the MLE never exists.

Theorem (Rinaldo-P.-Fienberg)

Sufficient conditions for MLE existence, with large probability, as n grows.

In the case of fixed degree sequence graphs, our asymptotic results improve those of Chatterjee-Diaconis ’11.


How a network is generated is crucial to properly calculate statistical network properties!

HOMEWORK

1 Read section 4.2.1 of Social and Economic Networks by Matthew O. Jackson.

2 Think about lattice points of the model polytope for the β-model in terms of graphical degree sequences.


Extra stuff

See the introduction of the beta model paper (Annals of Statistics): why the beta model, and why the generalization.


Extra slide

Characterization of the sets of edges and non-edges in a given graph leading to nonexistence of the MLE.

Examples of a nonexistent MLE (the degrees are non-trivial) for 4, 5, 6 nodes:

[Figure: example graphs on 4, 5, and 6 nodes.]


Overview of Algorithms in more detail: Cayley trick

The combinatorial complexity of P_n makes it computationally intractable. For instance, the f-vector of P_8 is

(334982, 1726648, 3529344, 3679872, 2074660, 610288, 81144, 3322).

Cayley Trick. Instead of dealing directly with P_n (a Minkowski sum of line segments), we construct C_n, a larger full-dimensional polyhedral cone with $\binom{n}{2}$ more facets than P_n but with far fewer vertices. Despite its larger dimension, C_n is amenable to computational analysis.

Theorem

Every facial set of Pn corresponds to one facial set of Cn.


Overview of Algorithms in more detail: The faces of C_n

Determining which face F of C_n contains t(x) is a non-linear optimization problem.

We propose two solutions:

1 Repeated linear programming.

2 Non-linear optimization.
