
Page 1: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Mikhail Belkin
Dept. of Computer Science and Engineering, Dept. of Statistics
Ohio State University / ISTA

Joint work with Kaushik Sinha

Page 2: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

From crabs to Gaussians

Mixtures were first considered by Pearson, 1894, who analyzed 1000 crabs from Naples and concluded (erroneously?) that there were two distinct populations.

Page 3: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

The Problem

Learning Gaussian Mixture Model

• Classical problem in statistics – goes back to the work of Pearson (1894).

• Widely used model for scientific/engineering tasks

• Application areas include:
– Speech Recognition
– Computer Vision
– Bioinformatics
– Astronomy
– Medicine
– …

Page 4: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Gaussian Mixtures

Problem: identifying the parameters of a Gaussian mixture distribution from a finite sample.

Page 5: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

The Problem

• Mixture of Gaussians in $\mathbb{R}^n$.

• What does learning such a mixture mean?
– estimating the parameters of the mixture within a pre-specified accuracy from a sample.
– the parameters are the means, covariance matrices, and mixing weights of the component Gaussian distributions.
– number of parameters: $(k-1) + kn + k\,n(n+1)/2$ for a mixture of $k$ components in $\mathbb{R}^n$ (mixing weights, means, and covariance matrices, respectively).

Page 6: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Most popular method: Expectation Maximization

EM is by far the most popular method for mixture fitting: an iterative procedure for finding the parameters, similar in spirit to k-means clustering.

Simple to implement. Guaranteed to converge (the likelihood increases monotonically). Converges to the true values if initialized close to the true values.

However:

Sensitive to initialization. Numerous local maxima. Does not detect the number of components.

A minimal sketch of the iteration follows below.
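To make the E-step and M-step concrete, here is a minimal EM sketch for a two-component mixture in one dimension. The synthetic data, initialization, and fixed iteration count are illustrative choices made for this transcript, not part of the talk.

```python
# Minimal EM sketch for a two-component 1-D Gaussian mixture.
import numpy as np

def em_gmm_1d(x, iters=200):
    # Crude initialization; EM's sensitivity to this step is the point above.
    w = np.array([0.5, 0.5])              # mixing weights
    mu = np.array([x.min(), x.max()])     # component means
    var = np.array([x.var(), x.var()])    # component variances
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point.
        dens = np.exp(-(x[:, None] - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
        resp = w * dens
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate the parameters from the weighted sample.
        nk = resp.sum(axis=0)
        w = nk / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = (resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return w, mu, var

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 500), rng.normal(2, 1, 500)])
print(em_gmm_1d(x))  # weights near (0.5, 0.5), means near (-2, 2)
```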

Page 7: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

How EM fails: a simple example

From Tao, Belkin, Yu, Annals of Statistics, 2010

Page 8: Algebraic-Geometric Methods for Learning Gaussian Mixture Models


Some Recent Progress

• Understanding the computational aspects of Gaussian mixture learning:
– is it possible to learn a Gaussian mixture in time, and from a sample of size, polynomial in the dimension?

• Dasgupta (1999) showed that learning a mixture of Gaussians in $\mathbb{R}^n$ using a sample size polynomial in $n$ is possible.
– the result was surprising because the complexity of many problems scales exponentially with dimension (curse of dimensionality). Something as simple as the volume of a convex body cannot be estimated using a number of samples polynomial in the dimension (Bárány and Füredi, 1988).

Page 9: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Learning in high dimension

Dasgupta’s Result, 1999

• It is possible to learn a mixture of Gaussians in $\mathbb{R}^n$ using a sample size polynomial in $n$, if the component separation is of the order $\sigma\sqrt{n}$.
– the component separation is the minimum distance between the component means.

Pages 10-13: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Partial Summary of Results on Gaussian Mixture Learning

• Minimum separation an increasing function of the dimension and/or the number of components:
– [Dasgupta], 1999: Gaussian mixtures, mild assumptions
– [Dasgupta-Schulman], 2000: spherical Gaussian mixtures
– [Arora-Kannan], 2001: Gaussian mixtures
– [Vempala-Wang], 2002: spherical Gaussian mixtures
– [Kannan-Salmasian-Vempala], 2005: Gaussian mixtures, logconcave distributions
– [Achlioptas-McSherry], 2005: Gaussian mixtures

• Minimum separation independent of the dimension and the number of components:
– [Feldman-O'Donnell-Servedio], 2006: axis-aligned Gaussians, no parameter estimation
– [Belkin-Sinha], 2009: identical spherical Gaussian mixtures
– [Kalai-Moitra-Valiant], 2010: Gaussian mixtures with 2 components
– [Belkin-Sinha], 2010: Gaussian mixtures
– [Moitra-Valiant], 2010: Gaussian mixtures

• Our result, [Belkin-Sinha] 2010, solves the general problem.

Pages 14-16: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Obstacle in Learning: Identifiability

• Different values of the parameters can give rise to the same distribution.

• We need a quantification of how hard it is to statistically learn the parameters from data.

If two parameter values are close to a pair of parameters that give identical probability distributions, then it is hard to distinguish them from sampled data, even when the sample size is large.

Example: the parameters of the following distribution family cannot be learned from sampled data:

$$p_1(x) = 0.1\left(\frac{e^{-x^2/2}}{\sqrt{2\pi}}\right) + 0.9\left(\frac{e^{-x^2/2}}{\sqrt{2\pi}}\right)$$

$$p_2(x) = 0.5\left(\frac{e^{-x^2/2}}{\sqrt{2\pi}}\right) + 0.5\left(\frac{e^{-(x-0.0001)^2/2}}{\sqrt{2\pi}}\right)$$
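A quick numerical check (mine, not from the slides) confirms how close the two densities above are: their $L^1$ distance is on the order of $10^{-5}$, so no realistic sample distinguishes them.

```python
# Numerically integrate |p1 - p2| on a grid; the grid and step are assumptions.
import numpy as np

x = np.linspace(-10, 10, 200001)
phi = lambda t: np.exp(-t ** 2 / 2) / np.sqrt(2 * np.pi)
p1 = 0.1 * phi(x) + 0.9 * phi(x)           # a standard Gaussian in disguise
p2 = 0.5 * phi(x) + 0.5 * phi(x - 0.0001)  # component means differ by 1e-4
l1 = np.abs(p1 - p2).sum() * (x[1] - x[0]) # Riemann-sum estimate of the L1 distance
print(l1)                                  # ~4e-5
```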

Pages 17-18: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Radius of Identifiability

• We introduce the radius of identifiability:
– it is the radius of the largest open ball around a parameter value such that any two different parameters from this ball give rise to different probability density functions.
– if no such ball exists, i.e., the radius is zero, then the parameters cannot be identified uniquely, given any amount of data.
– the complexity of learning scales with this radius.

We give an explicit formula for the radius of identifiability of a Gaussian mixture.

Page 19: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Main Result

• We show that the parameters of a Gaussian mixture in $\mathbb{R}^n$ with a positive radius of identifiability can be learned (up to permutation of the components) within a pre-specified precision and with a pre-specified confidence, using a sample of polynomial size; the bound involves the radius of the bounding ball for the parameters.
– the minimum separation can even be zero, i.e., two Gaussian components can have the same means but different covariance matrices.
– polynomial dependence on the radius of identifiability is necessary.

Pages 20-22: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Overview of Our Proof

1. Reduction to fixed dimension
– we show that learning a Gaussian mixture in $n$ dimensions can be reduced to a polynomial number of parameter estimation problems in a fixed, low number of dimensions (more on this later).

2. Learning in fixed dimension
– we introduce the general notion of a "polynomial family" (more on this soon).
– we show that the parameters of polynomial families can be learned within a given accuracy, with a given confidence, using a sample of size polynomial in the inverse accuracy and the inverse confidence.
– in addition to the Gaussian distribution, almost all standard parametric probability distributions, as well as their mixtures and products, form polynomial families (more on this soon).

Page 23: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Learning in Fixed Dimension: Polynomial Family

• Definition
– a family of probability distributions parameterized by $\theta = (\theta_1, \dots, \theta_d)$ forms a polynomial family if each (raw) moment of the distribution exists and can be represented as a polynomial in the parameters $\theta_1, \dots, \theta_d$.
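As an illustration (a symbolic sketch; the variable names and the use of sympy are choices of this transcript), the raw moments of a univariate Gaussian come out as polynomials in the mean and the standard deviation:

```python
# Raw moments of N(mu, sigma^2), computed symbolically.
import sympy as sp
from sympy.stats import Normal, E

mu = sp.Symbol('mu', real=True)
sigma = sp.Symbol('sigma', positive=True)
X = Normal('X', mu, sigma)
for k in range(1, 5):
    print(k, sp.expand(E(X ** k)))
# 1: mu
# 2: mu**2 + sigma**2
# 3: mu**3 + 3*mu*sigma**2
# 4: mu**4 + 6*mu**2*sigma**2 + 3*sigma**4
```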

Pages 24-25: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Examples of Polynomial Families

• Gaussian (its moments are given by Hermite polynomials)
• Gamma
• Binomial
• Exponential

Examples include almost all standard parametric families, as well as their mixtures and products. Hence Gaussian mixtures also form a polynomial family.

Page 26: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Proof Sketch for Learning in Fixed Dimension

• Main result for polynomial families
– there is an algorithm which, for an identifiable polynomial family whose parameters lie within a ball of known radius, outputs an estimate within a prescribed distance of the true parameter with a prescribed probability, using a number of sample points polynomial in the inverse accuracy and the inverse confidence.

• Proof Sketch
1. given a polynomial family, find a finite set of moments that completely characterizes a distribution (identifiability).
2. reformulate the problem of learning the parameters in terms of this set of moments, using algebraic inequalities.
3. reduce the problem of learning the parameters to one dimension, using techniques from algebraic geometry, specifically the Tarski-Seidenberg theorem (elimination of quantifiers).

Page 27: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Identifiability and a Finite Set of Moments

• Identifiability
– a family is identifiable if distinct parameter values give rise to distinct distributions.

• We will prove that when the family is identifiable, a finite number of moments is sufficient to uniquely identify the parameters (next slide).

• This requires an application of the Hilbert basis theorem.

Hilbert Basis Theorem:
Every ideal in a ring of polynomials is finitely generated.

Pages 28-33: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Finite Set of Moments Fully Characterizes a Polynomial Family

• For a polynomial family, each moment is a polynomial in the parameters.

• For each moment, take the difference of its values at two parameter points; this is a polynomial in the two sets of parameter variables.

• Let $I_N$ be the ideal, in the ring of polynomials in these variables, generated by the first $N$ such difference polynomials.

• The ideals $I_1 \subseteq I_2 \subseteq \cdots$ form an increasing sequence.

• The Hilbert basis theorem ensures that the union of this sequence is finitely generated; hence there exists some large enough $N$ such that $I_M = I_N$ for any $M > N$.

• Identifiability then implies that the first $N$ moments define the distribution uniquely. A toy symbolic instance of this ideal chain follows below.
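Here is a toy instance of the ideal chain, for the family of zero-mean Gaussians with parameter $s = \sigma^2$. The family, the moment formula $\mathbb{E}[X^{2k}] = (2k-1)!!\,s^k$, and the Groebner-basis computation are illustrative choices of this transcript, not the talk's construction.

```python
# Ideal chain I_N generated by the moment-difference polynomials
# f_k(s, t) = E_s[X^{2k}] - E_t[X^{2k}] for zero-mean Gaussians.
import sympy as sp

s, t = sp.symbols('s t')
dfact = lambda k: sp.factorial2(2 * k - 1)              # (2k-1)!!
f = [dfact(k) * (s ** k - t ** k) for k in range(1, 5)]
for N in range(1, 5):
    # Groebner basis of I_N = <f_1, ..., f_N>.
    print(N, sp.groebner(f[:N], s, t, order='lex').exprs)
# The basis is [s - t] for every N: here the chain stabilizes immediately,
# so a single moment already pins down the parameter.
```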

Page 34: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

What Next?

• If the first $N$ moments are known precisely, then the problem of learning the parameters is almost solved:
– the only remaining task is to solve a finite set of polynomial equations.
– this can be done algorithmically.

• However, the moments need to be estimated from sample data:
– uncertainty in moment estimation introduces uncertainty in parameter estimation.
– how do we deal with it?

• A powerful result from mathematics, the Tarski-Seidenberg theorem, helps us prove that:
– the required moment estimation accuracy depends only polynomially on the target parameter estimation accuracy.
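A small sketch of the "solve the polynomial moment equations" step, for the simplest case of a single univariate Gaussian; the sample, seed, and symbol names are illustrative assumptions.

```python
# Method of moments: solve m1 = mu, m2 = mu^2 + sigma^2 from estimated moments.
import numpy as np
import sympy as sp

rng = np.random.default_rng(1)
x = rng.normal(2.0, 1.5, 10_000)
m1, m2 = float(x.mean()), float((x ** 2).mean())  # empirical raw moments

mu, v = sp.symbols('mu v', real=True)             # v stands for sigma^2
sols = sp.solve([mu - m1, mu ** 2 + v - m2], [mu, v], dict=True)
print(sols)  # mu close to 2.0, v close to 2.25; the noise in m1, m2 perturbs
             # the roots, which is exactly the uncertainty discussed above.
```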

Page 35: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Tarski-Seidenberg Theorem

• Semi-algebraic Set
– a semi-algebraic set in $\mathbb{R}^n$ is a finite union of sets, each defined by a finite number of polynomial equations and inequalities.

• Tarski-Seidenberg Theorem
– let $\pi : \mathbb{R}^{n+1} \to \mathbb{R}^n$ be a projection map. If $A$ is a semi-algebraic set in $\mathbb{R}^{n+1}$, then $\pi(A)$ is a semi-algebraic set in $\mathbb{R}^n$.
– this is equivalent to elimination of (existential) quantifiers for semi-algebraic sets.

Pages 36-38: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Characterization of Uncertainty

• Suppose the first $N$ moments completely characterize the distribution:
– for any two parameter values, consider the largest difference among their first $N$ moments.
– this difference is zero iff the two parameter values give rise to the same distribution.

• Fix a target parameter accuracy and consider the set of parameter pairs that are at least that far apart yet whose first $N$ moments all agree within some tolerance:
– this set can be viewed as a neighborhood of a parameter value that takes the probability distribution into account.
– since such logical statements can be expressed as algebraic conditions, by the Tarski-Seidenberg theorem the set is a semi-algebraic subset of the parameter space.
– eliminating the quantifiers reduces the problem to one dimension, where it can be shown that the moment tolerance depends polynomially on the parameter accuracy.

Page 39: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

A Simple Example

• Consider a univariate Gaussian with zero mean:
– the second moment uniquely defines this distribution.

• For any two values of the standard deviation, assume they differ by at least some fixed amount and consider the set of such pairs whose second moments agree within a tolerance:
– by the Tarski-Seidenberg theorem this set is a semi-algebraic subset of the plane, and in this case we can see what it is exactly (geometric interpretation on the next slide).

• For a fixed moment estimation error, the supremum over this set represents the allowable parameter estimation error:
– eliminating the quantified variables leads to a direct relation between the two errors. A numerical sketch follows below.
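A numerical sketch of this example (the sample sizes and seed are mine): the error in the recovered standard deviation tracks the error in the estimated second moment up to a bounded factor.

```python
# Zero-mean Gaussian: recover sigma from the second moment, compare errors.
import numpy as np

rng = np.random.default_rng(2)
sigma = 1.7
for n in [100, 10_000, 1_000_000]:
    x = rng.normal(0.0, sigma, n)
    m2 = (x ** 2).mean()          # estimated second moment
    sigma_hat = np.sqrt(m2)       # solve sigma^2 = m2
    # |sigma_hat - sigma| = |m2 - sigma^2| / (sigma_hat + sigma), so the
    # parameter error is polynomially (here linearly) controlled by the
    # moment error.
    print(n, abs(m2 - sigma ** 2), abs(sigma_hat - sigma))
```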

Page 40: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Reduction to Fixed Dimension

• Results of learning polynomial families for a fixed dimension can not be applied directly to mixture of high-dimensional Gaussians– number of parameters to be estimated increases with dimension.– how do we deal with it?

• We show that it is possible to estimate the parameters in high dimension by solving parameter estimation problems in appropriate low dimensions – why does it work?

Reduction

Pages 41-43: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Low-dimensional Projection of Gaussian Mixture

• Some Good News
– projecting a Gaussian mixture onto a lower-dimensional coordinate plane yields a low-dimensional Gaussian mixture where:
• the mixing coefficients remain the same.
• the new component means are the projections of the original component means.
• the new component covariance matrices are the restrictions of the original component covariance matrices.
– the results for polynomial families can be used to learn the parameters of this low-dimensional Gaussian mixture (a code sketch of the projection follows below).
– hopefully, the parameters of the high-dimensional Gaussian mixture can be learned by learning the parameters of several low-dimensional Gaussian mixtures.

• Some Difficulties
– the radius of identifiability of the projected Gaussian mixture may become zero (not learnable!).
– the parameters of the high-dimensional Gaussian mixture may not be uniquely determined by the parameters of several low-dimensional Gaussian mixtures.
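The projection rule in the "good news" bullets is easy to state in code. This is a sketch on a made-up 3-dimensional mixture; the weights, means, covariances, and the chosen coordinate plane are all illustrative.

```python
# Project a Gaussian mixture onto a coordinate plane: weights unchanged,
# means projected, covariances restricted to the chosen coordinates.
import numpy as np

weights = np.array([0.3, 0.7])
means = [np.array([0.0, 1.0, 2.0]), np.array([3.0, 0.0, 1.0])]
covs = [np.eye(3), np.diag([1.0, 2.0, 3.0])]

axes = [0, 2]                                        # plane spanned by e1, e3
proj_weights = weights                               # mixing coefficients unchanged
proj_means = [m[axes] for m in means]                # projected means
proj_covs = [c[np.ix_(axes, axes)] for c in covs]    # principal submatrices
print(proj_weights, proj_means, proj_covs)
```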

Pages 44-48: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Low-dimensional Projection of Gaussian Mixture

• Some Difficulties (illustrated on these slides)
– the radius of identifiability of the projected Gaussian mixture may become zero (not learnable!).
– the parameters of the high-dimensional Gaussian mixture may not be uniquely determined by the parameters of several low-dimensional Gaussian mixtures.

[Figure: component means µ1 and µ2 with coordinate axes e1 and e2, illustrating projections that coincide and so fail to determine the high-dimensional parameters uniquely.]

Page 49: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Sketch of the Algorithm

• Step 1
– identify a low-dimensional (fixed) coordinate plane where the radius of identifiability decreases only by a fixed amount.
• this can be done deterministically by checking at most polynomially many coordinate planes.
– project the high-dimensional Gaussian mixture onto this coordinate plane and learn the parameters of the resulting low-dimensional Gaussian mixture,
• using the results on learning polynomial families in a fixed dimension.

• Step 2
– the parameters along each remaining coordinate can be estimated separately, by adding one coordinate at a time and aligning the estimates obtained from two parameter estimation problems in overlapping fixed dimensions (a sketch of the alignment step follows below).
• the total number of such low-dimensional parameter estimation problems is at most polynomial in the dimension.
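A sketch of the alignment idea in Step 2 (entirely illustrative; the talk gives no code): match component labels across two overlapping low-dimensional estimates using the coordinates the two problems share. Brute force over permutations is fine here because the number of components is small and fixed.

```python
# Align component labels of two overlapping low-dimensional mean estimates.
import numpy as np
from itertools import permutations

def align(est_a, est_b, shared_a, shared_b):
    """Permutation of est_b's components best matching est_a on shared coords."""
    k = len(est_a)
    return min(permutations(range(k)),
               key=lambda p: sum(np.linalg.norm(est_a[i][shared_a] - est_b[p[i]][shared_b])
                                 for i in range(k)))

# Components estimated on coords (x1, x2) and on coords (x2, x3); x2 is shared
# (index 1 in the first problem, index 0 in the second).
est_a = [np.array([0.0, 1.0]), np.array([3.0, -1.0])]
est_b = [np.array([-1.0, 5.0]), np.array([1.0, 2.0])]   # same components, swapped order
print(align(est_a, est_b, shared_a=[1], shared_b=[0]))  # -> (1, 0)
```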

Pages 50-58: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Reduction

Sketch of the idea: two components in three dimensions. [Figure sequence illustrating the reduction step by step.]

Page 59: Algebraic-Geometric Methods for Learning Gaussian Mixture Models

Conclusion

• We resolve the general problem of polynomial learning of Gaussian mixture distributions.
– this completes an active line of research in theoretical computer science.

• The proof brings together techniques of algebraic geometry and the classical method of moments.

• A step toward understanding the algorithmic issues of Gaussian mixture modelling.