
University of Joensuu, Dept. of Computer Science
P.O. Box 111, FIN-80101 Joensuu
Tel. +358 13 251 7959, fax +358 13 251 7955
www.cs.joensuu.fi

Gaussian Mixture Models

Speech and Image Processing Unit, Department of Computer Science

University of Joensuu, FINLAND

Ville Hautamäki

Clustering Methods: Part 8

Preliminaries

• We assume that the dataset X has been generated by a parametric distribution p(X).

• Estimation of the parameters of p is known as density estimation.

• We consider the Gaussian distribution.

Figures taken from: http://research.microsoft.com/~cmbishop/PRML/

Typical parameters (1)

• Mean (μ): the average value of p(X), also called the expectation.

• Variance (σ²): a measure of variability in p(X) around the mean.

Typical parameters (2)

• Covariance: measures how much two variables vary together.

• Covariance matrix: collection of covariances between all pairs of dimensions.

– The diagonal of the covariance matrix contains the variances of the individual attributes.

One-dimensional Gaussian

• Parameters to be estimated are the mean (μ) and the variance (σ²):

Normal(x | μ, σ²) = 1/√(2πσ²) · exp( −(x − μ)² / (2σ²) )
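As a small numeric sketch (not part of the original slides; the function name normal_pdf is our own), the one-dimensional density can be evaluated directly:

```python
import math

def normal_pdf(x, mu, var):
    """Density of Normal(x | mu, sigma^2)."""
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

# At the mean, the density peaks at 1/sqrt(2*pi*var).
print(normal_pdf(0.0, 0.0, 1.0))  # ≈ 0.3989
```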

Multivariate Gaussian (1)

• In the multivariate case we have a covariance matrix Σ instead of a variance:

Normal(x | μ, Σ) = 1/( (2π)^{D/2} · det(Σ)^{1/2} ) · exp( −(1/2) (x − μ)^T Σ⁻¹ (x − μ) )
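A NumPy sketch of the multivariate density above (illustrative; mvn_pdf is our own name):

```python
import numpy as np

def mvn_pdf(x, mu, cov):
    """Density of a D-dimensional Normal(x | mu, Sigma)."""
    d = len(mu)
    diff = x - mu
    norm = (2 * np.pi) ** (d / 2) * np.sqrt(np.linalg.det(cov))
    expo = -0.5 * diff @ np.linalg.solve(cov, diff)  # (x-mu)^T Sigma^-1 (x-mu)
    return np.exp(expo) / norm

# At the mean with Sigma = I in 2-D, the density is 1/(2*pi).
print(mvn_pdf(np.zeros(2), np.zeros(2), np.eye(2)))  # ≈ 0.1592
```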

Multivariate Gaussian (2)

Three common covariance structures (2-D example):

Single:   Σ = σ² I
Diagonal: Σ = [ σ₁²  0  ;  0  σ₂² ]
Full:     Σ = [ σ₁₁²  σ₁₂²  ;  σ₁₂²  σ₂₂² ]

Complete data log likelihood:

ln p(X) = sum_{n=1}^{N} ln Normal(x_n | μ, Σ)
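The covariance structures and the complete data log likelihood can be sketched in NumPy (our own helper, with illustrative 2-D values):

```python
import numpy as np

# Single (spherical), diagonal, and full 2-D covariance matrices.
single = 1.5 * np.eye(2)
diagonal = np.diag([1.0, 2.5])
full = np.array([[1.0, 0.4],
                 [0.4, 2.5]])

def log_likelihood(X, mu, cov):
    """Complete data log likelihood of one Gaussian: sum of log densities."""
    d = X.shape[1]
    diff = X - mu
    inv = np.linalg.inv(cov)
    quad = np.einsum('ni,ij,nj->n', diff, inv, diff)   # Mahalanobis terms
    logdet = np.log(np.linalg.det(cov))
    return np.sum(-0.5 * (d * np.log(2 * np.pi) + logdet + quad))

X = np.array([[0.5, -0.2], [1.0, 0.3]])
for cov in (single, diagonal, full):
    print(log_likelihood(X, X.mean(axis=0), cov))
```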

Maximum Likelihood (ML) parameter estimation

• Maximize the log likelihood formulation.

• Setting the gradient of the complete data log likelihood to zero, we can find the closed-form solution.

– In the case of the mean, this is the sample average.
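A quick numeric check of the closed-form ML estimates (synthetic data, our own variable names):

```python
import numpy as np

# Draw a sample from a known Gaussian, then recover its parameters
# with the closed-form ML estimates.
rng = np.random.default_rng(1)
X = rng.normal(loc=3.0, scale=2.0, size=10_000)

mu_ml = X.mean()                     # ML mean = sample average
var_ml = ((X - mu_ml) ** 2).mean()   # ML variance (divides by N, not N-1)
print(mu_ml, var_ml)                 # close to the true 3.0 and 4.0
```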

When one Gaussian is not enough

• Real-world datasets are rarely unimodal!

Mixtures of Gaussians

p(x) = sum_{k=1}^{M} π_k · Normal(x | μ_k, Σ_k)

Mixtures of Gaussians (2)

• In addition to mean and covariance parameters (now M times), we have mixing coefficients πk.

The following properties hold for the mixing coefficients:

sum_{k=1}^{M} π_k = 1,   0 ≤ π_k ≤ 1

π_k can be seen as the prior probability of component k.
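A 1-D sketch of the mixture density above (illustrative function name, standard library only):

```python
import math

def gmm_pdf(x, weights, means, variances):
    """1-D Gaussian mixture density: sum_k pi_k * Normal(x | mu_k, var_k)."""
    total = 0.0
    for pi_k, mu_k, var_k in zip(weights, means, variances):
        total += (pi_k * math.exp(-(x - mu_k) ** 2 / (2 * var_k))
                  / math.sqrt(2 * math.pi * var_k))
    return total

# Mixing coefficients sum to 1 and each lies in [0, 1].
weights = [0.3, 0.7]
print(gmm_pdf(0.0, weights, [-1.0, 1.0], [1.0, 1.0]))
```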

Responsibilities (1)

• Component labels (red, green and blue) cannot be observed.

• We have to calculate approximations (responsibilities).

[Figure: complete data, incomplete data, and the computed responsibilities]

Responsibilities (2)

• Responsibility describes how probable it is that observation vector x comes from component k.

• In hard clustering, responsibilities take only the values 0 and 1, thus defining a hard partitioning.

Responsibilities (3)

We can express the marginal density p(x) as:

p(x) = sum_{k=1}^{M} p(k) · p(x | k)

From this, we can find the responsibility of the kth component for x using Bayes' theorem:

γ_k(x) = p(k | x) = p(k) p(x | k) / sum_{l} p(l) p(x | l)
                  = π_k Normal(x | μ_k, Σ_k) / sum_{l} π_l Normal(x | μ_l, Σ_l)
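The Bayes-theorem computation of the responsibilities can be sketched in 1-D (illustrative function, standard library only):

```python
import math

def responsibilities(x, weights, means, variances):
    """gamma_k(x): posterior probability that x came from component k."""
    # Joint terms pi_k * Normal(x | mu_k, var_k)
    joint = [pi * math.exp(-(x - mu) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
             for pi, mu, v in zip(weights, means, variances)]
    total = sum(joint)              # marginal density p(x)
    return [j / total for j in joint]

# A point midway between two symmetric components is split 50/50.
print(responsibilities(0.0, [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]))
```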

Expectation Maximization (EM)

• Goal: Maximize the log likelihood of the whole data

• When responsibilities are calculated, we can maximize individually for the means, covariances and the mixing coefficients!

ln p(X | π, μ, Σ) = sum_{n=1}^{N} ln [ sum_{k=1}^{M} π_k Normal(x_n | μ_k, Σ_k) ]

Exact update equations

New mean estimates:

μ_k = (1/N_k) sum_{n=1}^{N} γ_k(x_n) x_n,   where N_k = sum_{n=1}^{N} γ_k(x_n)

Covariance estimates:

Σ_k = (1/N_k) sum_{n=1}^{N} γ_k(x_n) (x_n − μ_k)(x_n − μ_k)^T

Mixing coefficient estimates:

π_k = N_k / N
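A NumPy sketch of these M-step updates (our own function name; gamma is the N × M responsibility matrix):

```python
import numpy as np

def m_step(X, gamma):
    """Re-estimate means, covariances and mixing coefficients
    from responsibilities gamma (shape N x M)."""
    N, _ = X.shape
    Nk = gamma.sum(axis=0)                  # effective count per component
    mu = (gamma.T @ X) / Nk[:, None]        # new means
    cov = []
    for k in range(gamma.shape[1]):
        diff = X - mu[k]
        cov.append((gamma[:, k, None] * diff).T @ diff / Nk[k])
    pi = Nk / N                             # new mixing coefficients
    return mu, np.array(cov), pi
```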

EM Algorithm

• Initialize parameters.

• while not converged:

– E step: calculate responsibilities.

– M step: estimate new parameters.

– Calculate the log likelihood of the new parameters.
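The loop above can be sketched as a minimal 1-D EM implementation (assumptions: quantile-based initialization and a simple log-likelihood convergence test, both our own choices, not from the slides):

```python
import numpy as np

def _dens(x, mu, var, pi):
    """Per-component weighted densities pi_k * Normal(x_n | mu_k, var_k), shape N x M."""
    return (pi * np.exp(-(x[:, None] - mu) ** 2 / (2 * var))
            / np.sqrt(2 * np.pi * var))

def em_gmm_1d(x, M=2, iters=100, tol=1e-6):
    """Minimal EM for a 1-D Gaussian mixture."""
    # Initialize parameters (means from evenly spaced quantiles)
    mu = np.quantile(x, (np.arange(M) + 0.5) / M)
    var = np.full(M, x.var())
    pi = np.full(M, 1.0 / M)
    prev_ll = -np.inf
    for _ in range(iters):
        # E step: responsibilities
        dens = _dens(x, mu, var, pi)
        gamma = dens / dens.sum(axis=1, keepdims=True)
        # M step: re-estimate means, variances, mixing coefficients
        Nk = gamma.sum(axis=0)
        mu = (gamma * x[:, None]).sum(axis=0) / Nk
        var = (gamma * (x[:, None] - mu) ** 2).sum(axis=0) / Nk
        pi = Nk / len(x)
        # Log likelihood of the new parameters; stop when converged
        ll = np.log(_dens(x, mu, var, pi).sum(axis=1)).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return mu, var, pi

# Usage on synthetic bimodal data:
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-5, 1, 500), rng.normal(5, 1, 500)])
mu, var, pi = em_gmm_1d(data)
print(np.sort(mu))  # close to [-5, 5]
```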

Example of EM

Computational complexity

• Hard clustering with MSE criterion is NP-complete.

• Can we find optimal GMM in polynomial time?

• Finding the optimal GMM is in class NP.

Some insights

• In GMM we need to estimate the parameters, which are all real numbers.

– Number of parameters: M + M·D + M·D(D+1)/2 (mixing coefficients, means, and free entries of the symmetric covariance matrices).

• Hard clustering has no parameters, just a set partitioning (remember the optimality criteria!)
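The parameter count can be computed directly (our own helper; a symmetric D × D covariance has D(D+1)/2 free entries):

```python
def gmm_param_count(M, D):
    """Free parameters of an M-component, D-dimensional GMM with full
    covariances: M mixing weights + M*D means + M symmetric covariances."""
    return M + M * D + M * (D * (D + 1) // 2)

print(gmm_param_count(3, 2))  # 3 + 6 + 9 = 18
```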

Some further insights (2)

• Both optimization functions are mathematically rigorous!

• Solutions minimizing MSE are always meaningful.

• Maximization of the log likelihood might lead to a singularity!

Example of singularity
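The original slide's figure is not preserved, but the singularity is easy to show numerically: if one component's mean sits exactly on a single data point, shrinking its variance makes the likelihood grow without bound (illustrative values, standard library only):

```python
import math

# Density of Normal(x | mu=x, var) evaluated at its own mean:
# as var -> 0, the peak density (and the log likelihood) diverges.
densities = []
for var in (1.0, 1e-2, 1e-4, 1e-8):
    peak = 1.0 / math.sqrt(2 * math.pi * var)
    densities.append(peak)
    print(f"var={var:g}  peak density={peak:.4g}")
```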
