l12 — em-algorithm€¦ · david bolin - [email protected] mc based statistical methods, l12...

6
Testing EM Examples HA 3 Monte-Carlo and Empirical Methods for Statistical Inference, FMS091/MASM11 L12 — EM-algorithm David Bolin - [email protected] MC Based Statistical Methods, L12 1/24 Testing EM Examples HA 3 The EM-algorithm Hypothesis testing The EM-algorithm Examples Censored data Truncated data Mixture models Project 3 David Bolin - [email protected] MC Based Statistical Methods, L12 2/24 Testing EM Examples HA 3 Composite Bootstrap Hypothesises testing The basis of a hypothesis test consists of A null hypothesis that we wish to test. A test statistic t(y ), i.e. a function of the data. A rejection or critical region R. If the test statistic falls in the rejection region, then we reject the null hypothesis. We divide hypothesises into Simple Specifies a single distribution for the data, e.g. y N ( θ, σ 2 ) with H 0 : θ = 0 and σ 2 known. Composite Specifies more than one distribution for the data, e.g. y N ( θ, σ 2 ) with H 0 : θ = 0 and σ 2 unknown; or y N ( θ, σ 2 ) with H 0 : θ 0 and σ 2 known. David Bolin - [email protected] MC Based Statistical Methods, L12 3/24 Testing EM Examples HA 3 Composite Bootstrap Monte Carlo test of a simple hypothesis Algorithm 1. Draw N samples, y (1) ,..., y (N) , from the distribution specified by H 0 . 2. Calculate the test statistic, t (i) = t(y (i) ) for each sample. 3. Estimate the p-value using Monte Carlo integration as p(y ) = 1 N N i=1 1 t (i) t(y ) . 4. If p(y ) α reject H 0 . David Bolin - [email protected] MC Based Statistical Methods, L12 4/24

Upload: others

Post on 11-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: L12 — EM-algorithm€¦ · David Bolin - bolin@maths.lth.se MC Based Statistical Methods, L12 11/24 Testing EM Examples HA 3 Censoreddata Truncated data GMM Examples — Censored

Testing EM Examples HA 3

Monte-Carlo and Empirical Methods for Statistical Inference,FMS091/MASM11

L12 — EM-algorithm

David Bolin - [email protected] MC Based Statistical Methods, L12 1/24

Testing EM Examples HA 3

The EM-algorithm

! Hypothesis testing

! The EM-algorithm! Examples

! Censored data! Truncated data! Mixture models

! Project 3

David Bolin - [email protected] MC Based Statistical Methods, L12 2/24

Testing EM Examples HA 3 Composite Bootstrap

Hypothesises testing

The basis of a hypothesis test consists of

! A null hypothesis that we wish to test.

! A test statistic t(y), i.e. a function of the data.

! A rejection or critical region R.

If the test statistic falls in the rejection region, then wereject the null hypothesis. We divide hypothesises into

Simple Specifies a single distribution for the data, e.g.y ! N

!!,!2

"with H0 : ! = 0 and !2 known.

Composite Specifies more than one distribution for thedata, e.g.y ! N

!!,!2

"with H0 : ! = 0 and !2 unknown; or

y ! N!!,!2

"with H0 : ! " 0 and !2 known.

David Bolin - [email protected] MC Based Statistical Methods, L12 3/24

Testing EM Examples HA 3 Composite Bootstrap

Monte Carlo test of a simple hypothesis

Algorithm

1. Draw N samples, y(1), . . . , y(N), from the distributionspecified by H0.

2. Calculate the test statistic, t(i) = t(y(i)) for each sample.

3. Estimate the p-value using Monte Carlo integration as

#p(y) =1

N

N$

i=1

1

%t(i) # t(y)

&.

4. If #p(y) " " reject H0.

David Bolin - [email protected] MC Based Statistical Methods, L12 4/24

Page 2: L12 — EM-algorithm€¦ · David Bolin - bolin@maths.lth.se MC Based Statistical Methods, L12 11/24 Testing EM Examples HA 3 Censoreddata Truncated data GMM Examples — Censored

Testing EM Examples HA 3 Composite Bootstrap

Testing composite hypothesis

For a composite hypothesis H0 defines a set of distributions.Thus, we can’t simply simulate from H0. Three alternativesexist:

Pivotal tests Find a pivotal test statistic such that t(Y) hasthe same distribution for all distributions in H0.(Ex: Rank based tests.)

Conditional tests Convert the composite hypothesis to asimple by conditioning on a sufficient statistic.(Ex: Permutation based test.)

Bootstrap tests Replacing the composite hypothesis by asimple based on an estimate from data.

David Bolin - [email protected] MC Based Statistical Methods, L12 5/24

Testing EM Examples HA 3 Composite Bootstrap

Bootstrap tests

! Remember the relation between hypothesis testing andconfidence intervals.

! Knowing how to calculate Bootstrap confidenceintervals, we directly have a tool for doing tests on theform H0 : ! = !0 or H0 : ! > !0.

! In general, the idea in a Bootstrap test is to estimate P0

from data with #P .

! Given #P we revert to a simple Monte Carlo test,simulating from #P .

! Note that this is similar to ordinary Bootstrap with theexception that we require #P to fulfill thenull-hypothesis.

! As an example we compare:! y ! N

!!,!2

"with H0 : ! = 0 and !2 unknown.

David Bolin - [email protected] MC Based Statistical Methods, L12 6/24

Testing EM Examples HA 3 Algorithm Properties

The EM-algorithm

! In some cases we might have incomplete observations.! Censored data! Missing data! Latent variabel models

! Difficult to handle using ordinary estimationtechniques.

! Bayesian statistics, with the model estimated usingMCMC provides one alternative.

! Another alternative is the EM-algorithm (Dempster,Laird & Rubin; 1977).

! “proposed many times in special circumstances”.

David Bolin - [email protected] MC Based Statistical Methods, L12 7/24

Testing EM Examples HA 3 Algorithm Properties

The EM-algorithm

Basic setup:

! We have observed some data y.

! Additional data z is “missing”.

! The estimation problem would be “easy” if z wasknown.

In principle we could write out the likelihood for theobserved data as

L(!; y) =

'f(y, z|!) dz.

However the integral over the unknown data is often hardto compute.Instead we want to construct an approximation of thelog-likelihood based on the observed data.

David Bolin - [email protected] MC Based Statistical Methods, L12 8/24

Page 3: L12 — EM-algorithm€¦ · David Bolin - bolin@maths.lth.se MC Based Statistical Methods, L12 11/24 Testing EM Examples HA 3 Censoreddata Truncated data GMM Examples — Censored

Testing EM Examples HA 3 Algorithm Properties

The EM-algorithm

1. Write out the log-likelihood assuming that all the datais known, log L(!; y, z).

2. Compute the expected value of the log-likelihood overthe unknown observations Ez(log L(!; y, z)|y, !guess).

3. Take Q(!, !guess) = Ez(log L|y, !guess). Q can now be seenas the average possible value of the log-likelihoodgiven known observations and guessed parameters.

4. Update our guess of the parameters by maximasingQ(!, !guess).

5. Repeat from 2.

The result is the Expectation-Maximization algorithm.

David Bolin - [email protected] MC Based Statistical Methods, L12 9/24

Testing EM Examples HA 3 Algorithm Properties

The EM-algorithm

Algorithm:Choose a starting value !(0) and repeat for i = 1,2, . . . untilconvergence.

E-step Compute the expectation of the log-likelihoodwith respect to the unknown data, conditionalon the parameter guess and known data.

Q(!, !(i!1)) = Ez

%log L(!; y, z)|y, !(i!1)

&.

M-step Compute the maximum with respect to theunknown parameter

!(i) = argmax!

Q(!, !(i!1)).

David Bolin - [email protected] MC Based Statistical Methods, L12 10/24

Testing EM Examples HA 3 Algorithm Properties

The EM-algorithm

It can be shown that:

! If !(i) = argmax!

Q(!, !(i!1)) then L(!(i); y) # L(!(i!1); y).

! If Q(!, !(i!1)) is continuous then the algorithm willconverge to a stationary point !" = argmax

!

Q(!, !").

! Under additional weak conditions, the stationary pointwill be a local maxima.

David Bolin - [email protected] MC Based Statistical Methods, L12 11/24

Testing EM Examples HA 3 Censored data Truncated data GMM

Examples — Censored data

! The lifetime for a type of light bulbs is assumed tofollow an exponential distribution with expectation !.

! To estimate !, n independent light bulbs are observedand their lifetimes yi are recorded.

! The experiment is run for a pre-determined duration, h.

! At the end of the experiment m light bulbs are stillworking.

! This gives n $ m observations of lifetimes, yi, and munknown lifetimes zi that are larger than h.

Derive an EM-algorithm to estimate !.

David Bolin - [email protected] MC Based Statistical Methods, L12 12/24

Page 4: L12 — EM-algorithm€¦ · David Bolin - bolin@maths.lth.se MC Based Statistical Methods, L12 11/24 Testing EM Examples HA 3 Censoreddata Truncated data GMM Examples — Censored

Testing EM Examples HA 3 Censored data Truncated data GMM

Censored data (cont.)

1. Write down the log-likelihood as if all observationswhere known.

2. Separate the log-likelihood into parts with known, y,and unknown, z, data.

3. Take the expectation of the unknown parts, withrespect to the unknown data z, conditional on a guess,!old, of the parameters, and the known data y.

4. Maximise the resulting expression with respect to ! toobtain a new guess !new.

David Bolin - [email protected] MC Based Statistical Methods, L12 13/24

Testing EM Examples HA 3 Censored data Truncated data GMM

Censored v.s. Truncated data

Censored data Data above/below a certain value has not beenobserved; the number of exceedances havebeen registered.

Truncated data Data above/below a certain value has notbeen observed; the number of exceedanceshave not been registered.

! Both can be handled using the EM-algorithm buttruncated data is harder.

! For censored data we had a term

Ez(m$

i=1

zi|y, !guess).

! For truncated data the corresponding term is

Ez,M(M$

i=1

zi|y, !guess).

David Bolin - [email protected] MC Based Statistical Methods, L12 14/24

Testing EM Examples HA 3 Censored data Truncated data GMM

(Gaussian) Mixture Models

! Assume that we have observations from one of several(Normal) distributions called classes.

! The probability of data coming from class k is #k.

! The distribution of each class is[y|from class k] ! N ($k,%k).

! This generates a Gaussian mixture model with density

p(y|#, $,%) =K$

k=1

#kp(y|from class k, $k,%k).

! Possible usages! Modeling heavy tailed distributions.! Classification/clustering of data.

David Bolin - [email protected] MC Based Statistical Methods, L12 15/24

Testing EM Examples HA 3 Censored data Truncated data GMM

(Gaussian) Mixture Models (cont.)

! If we knew the class belonging, zi, of each observationyi the problem would be trivial.

! Thus the problem consists of two parts:1. Determine the class belongings z.2. Estimating the parameters " = {#, $,%}.

! Augmenting the model with the unknown classbelongings zi the log-likelihood becomes

log L(& ; y, z) = logn(

i=1

#zip(yi|zi, $k,%k)

=

n$

i=1

K$

k=1

1(zi = k) log

)#kp(yi|zi = k, $k,%k)

*.

David Bolin - [email protected] MC Based Statistical Methods, L12 16/24

Page 5: L12 — EM-algorithm€¦ · David Bolin - bolin@maths.lth.se MC Based Statistical Methods, L12 11/24 Testing EM Examples HA 3 Censoreddata Truncated data GMM Examples — Censored

Testing EM Examples HA 3 Censored data Truncated data GMM

(Gaussian) Mixture Models — E-step

! The expectations of the log-likelihood is

Ez

%log L(& ; y, z)|y,&guess

&

=

n$

i=1

K$

k=1

Ez

%1(zi = k)|y,&guess

&log

)#kp(yi|zi = k, $k,%k)

*.

! We note that

Ez(1(zi = k)|y,&guess) = P(zi = k|y,&guess).

David Bolin - [email protected] MC Based Statistical Methods, L12 17/24

Testing EM Examples HA 3 Censored data Truncated data GMM

(Gaussian) Mixture Models — E-step

The probability of belonging to class k is

P(zi = k|y,&guess) =p(zi = k, y|&guess)

p(y|&guess)

=p(y|zi = k,&guess) p(zi = k|&guess)+

k p(zi = k, y|&guess)

=p(y|zi = k,&guess) #k+k p(y|zi = k,&guess) #k

= wik

David Bolin - [email protected] MC Based Statistical Methods, L12 18/24

Testing EM Examples HA 3 Censored data Truncated data GMM

Gaussian Mixture Models — M-step

Having calculated the expectation of the log-likelihood wehave that

Q(& ,&guess) = Ez

%log L(& ; y, z)|y,&guess

&

=

n$

i=1

K$

k=1

wik

)log(#k)+ log

%p(yi|zi = k, $k,%k)

&*

=

n$

i=1

K$

k=1

wik

)log(#k)$

1

2log |%k|$

d

2log(2#)$

(yi $ $k)#%!1

k (yi $ $k)

2

*.

David Bolin - [email protected] MC Based Statistical Methods, L12 19/24

Testing EM Examples HA 3 Censored data Truncated data GMM

Gaussian Mixture Models — M-step

The new estimates of {#, $,%} are obtained by maximizingthe Q-function.

#k =1

n

n$

i=1

P(zi = k|yi,&guess) =1

n

n$

i=1

wik

For Gaussian mixture densities, yi|zi = k ! N ($k,%k), the

new estimates of $k and %k are given by

$k =1

n#k

n$

i=1

wikyi

and

%k =1

n#k

n$

i=1

wik(yi $ $k)#(yi $ $k).

David Bolin - [email protected] MC Based Statistical Methods, L12 20/24

Page 6: L12 — EM-algorithm€¦ · David Bolin - bolin@maths.lth.se MC Based Statistical Methods, L12 11/24 Testing EM Examples HA 3 Censoreddata Truncated data GMM Examples — Censored

Testing EM Examples HA 3 Censored data Truncated data GMM

Gaussian Mixture Models — Old Faithful

David Bolin - [email protected] MC Based Statistical Methods, L12 21/24

Testing EM Examples HA 3 Censored data Truncated data GMM

Gaussian Mixture Models — Old Faithful

Time before eruption and duration of eruption.

3.540

70

100

42

0 20 400

0.5

1

!

0 20 400

5

µ1

0 20 400

10

20

"11

David Bolin - [email protected] MC Based Statistical Methods, L12 22/24

Testing EM Examples HA 3 Oral Exam

Home Assignment 3

! Hidden Markov models (HMM) can be considered ageneralization of mixture models where the hiddenvariables are related through a Markov process ratherthan independent of each other.

! Are used in many different applied subjects, we look attwo examples in signal processing and genetics.

! The model is estimated using the EM algorithm.! The E-step can not be calculated analytically so we use

Gibbs sampling to estimate it.! The resulting algorithm is sometimes called the MCEM

algorithm.! We use Bootstrap tests to test some model properties.! This project thus combines Bootstrap, MC-integration,

and the EM-algorithm and is perhaps the first exampleof a truly computationally intensive problem so far inthe course.

David Bolin - [email protected] MC Based Statistical Methods, L12 23/24

Testing EM Examples HA 3 Oral Exam

Home Assignment 3 — Oral exam

! The oral exam will be in groups of 2 to 4.

! Should take around 30 minutes.

! General discussion about the course in general andprojects, what you did, how and why.

! People from the same project groups in different oralexam groups

! Grading is based primarily on the projects.

! Possible times are Friday in the exam week, week 1, andweek 2 of the next study period.

David Bolin - [email protected] MC Based Statistical Methods, L12 24/24