Biostatistics 602 - Statistical Inference
Lecture 24: E-M Algorithm & Practice Examples

Hyun Min Kang
April 16th, 2013



Last Lecture

• What is an interval estimator?
• What are the coverage probability, confidence coefficient, and confidence interval?
• How can a 1 − α confidence interval typically be constructed?
• To obtain a lower-bounded (upper-tail) CI, the acceptance region of which test should be inverted?
  (a) H0 : θ = θ0 vs H1 : θ > θ0
  (b) H0 : θ = θ0 vs H1 : θ < θ0


Interval Estimation

θ̂(X) is usually represented as a point estimator.

Interval Estimator
Let [L(X), U(X)], where L(X) and U(X) are functions of the sample X and L(X) ≤ U(X). Based on the observed sample x, we can make the inference that

  θ ∈ [L(X), U(X)]

Then we call [L(X), U(X)] an interval estimator of θ.

Three types of intervals
• Two-sided interval [L(X), U(X)]
• One-sided (with lower bound) interval [L(X), ∞)
• One-sided (with upper bound) interval (−∞, U(X)]


Definitions

Definition: Coverage Probability
Given an interval estimator [L(X), U(X)] of θ, its coverage probability is defined as

  Pr(θ ∈ [L(X), U(X)])

In other words, it is the probability that the random interval [L(X), U(X)] covers the parameter θ.

Definition: Confidence Coefficient
The confidence coefficient is defined as

  inf_{θ∈Ω} Pr(θ ∈ [L(X), U(X)])
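As a concrete illustration of these two definitions (an addition, not part of the original slides), the sketch below estimates by simulation the coverage probability of the standard known-variance interval x̄ ± z_{α/2} σ/√n; numpy and scipy are assumed to be available.

    import numpy as np
    from scipy.stats import norm

    def coverage_probability(theta, sigma, n, alpha=0.05, n_rep=10000, seed=0):
        """Monte Carlo estimate of Pr(theta in [L(X), U(X)]) for the interval
        xbar +/- z_{alpha/2} * sigma / sqrt(n), with sigma assumed known."""
        rng = np.random.default_rng(seed)
        half = norm.ppf(1 - alpha / 2) * sigma / np.sqrt(n)
        covered = 0
        for _ in range(n_rep):
            xbar = rng.normal(theta, sigma, size=n).mean()
            covered += (xbar - half <= theta <= xbar + half)
        return covered / n_rep

    # Because this interval is based on a pivot, its coverage is 1 - alpha for
    # every theta, so the confidence coefficient (the infimum over theta) is
    # also 1 - alpha.
    print(coverage_probability(theta=2.0, sigma=1.0, n=20))   # roughly 0.95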


Definitions

Definition: Confidence Interval
Given an interval estimator [L(X), U(X)] of θ, if its confidence coefficient is 1 − α, we call it a (1 − α) confidence interval.

Definition: Expected Length
Given an interval estimator [L(X), U(X)] of θ, its expected length is defined as

  E[U(X) − L(X)]

where X is a random sample from f_X(x|θ). In other words, it is the average length of the interval estimator.
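For the usual unknown-variance t interval, the expected length E[U(X) − L(X)] does not depend on the mean and can be approximated by simulation. The sketch below is an added illustration (numpy and scipy assumed), not part of the slides.

    import numpy as np
    from scipy.stats import t

    def expected_length(sigma, n, alpha=0.05, n_rep=10000, seed=0):
        """Monte Carlo estimate of E[U(X) - L(X)] for the t interval
        xbar +/- t_{n-1, alpha/2} * s / sqrt(n), with sigma unknown."""
        rng = np.random.default_rng(seed)
        tcrit = t.ppf(1 - alpha / 2, df=n - 1)
        s = rng.normal(0.0, sigma, size=(n_rep, n)).std(axis=1, ddof=1)
        return np.mean(2 * tcrit * s / np.sqrt(n))

    print(expected_length(sigma=1.0, n=20))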


Confidence set and confidence interval

There is no guarantee that the confidence set obtained from Theorem 9.2.2 is an interval, but quite often it is.

1. To obtain a (1 − α) two-sided CI [L(X), U(X)], we invert the acceptance region of a level α test for H0 : θ = θ0 vs. H1 : θ ≠ θ0.
2. To obtain a lower-bounded CI [L(X), ∞), we invert the acceptance region of a test for H0 : θ = θ0 vs. H1 : θ > θ0, where Ω = {θ : θ ≥ θ0}.
3. To obtain an upper-bounded CI (−∞, U(X)], we invert the acceptance region of a test for H0 : θ = θ0 vs. H1 : θ < θ0, where Ω = {θ : θ ≤ θ0}.
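As a hedged sketch of this inversion (added here; the tests above do not fix a model), consider a normal mean with known σ. Inverting the level-α z test against each alternative reproduces the three interval types; the function name and arguments below are illustrative only.

    import numpy as np
    from scipy.stats import norm

    def invert_z_test(x, sigma, alpha=0.05, alternative="two-sided"):
        """Return {theta0 : H0 is accepted}, i.e. the CI obtained by inverting
        the acceptance region of the z test of H0: theta = theta0."""
        xbar, n = np.mean(x), len(x)
        se = sigma / np.sqrt(n)
        if alternative == "two-sided":                 # H1: theta != theta0
            z = norm.ppf(1 - alpha / 2)
            return (xbar - z * se, xbar + z * se)
        z = norm.ppf(1 - alpha)
        if alternative == "greater":                   # H1: theta > theta0
            return (xbar - z * se, np.inf)             # lower-bounded CI
        return (-np.inf, xbar + z * se)                # H1: theta < theta0, upper-bounded CI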


Typical strategies for finding MLEs

1. Write the joint (log-)likelihood function, L(θ|x) = f_X(x|θ).
2. Find candidates that make the first-order derivative zero.
3. Check the second-order derivative to verify a local maximum.
   • For a one-dimensional parameter, a negative second-order derivative implies a local maximum.
4. Check boundary points to see whether a boundary gives the global maximum.
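As a small worked instance of this recipe (an added example, not from the slides): for X1, ..., Xn iid Exponential with rate λ, log L(λ|x) = n log λ − λ Σ x_i, the first-order condition n/λ − Σ x_i = 0 gives λ̂ = 1/x̄, the second derivative −n/λ² < 0 confirms a local maximum, and the boundaries λ → 0 and λ → ∞ send the log-likelihood to −∞. A numerical check, assuming numpy and scipy:

    import numpy as np
    from scipy.optimize import minimize_scalar

    def exp_neg_loglik(lam, x):
        """Negative log-likelihood of an Exponential(rate=lam) sample."""
        return -(len(x) * np.log(lam) - lam * np.sum(x))

    rng = np.random.default_rng(1)
    x = rng.exponential(scale=1 / 2.5, size=500)          # true rate 2.5
    fit = minimize_scalar(exp_neg_loglik, args=(x,), bounds=(1e-6, 100.0), method="bounded")
    print(fit.x, 1 / x.mean())                            # numerical MLE vs closed form 1/xbar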


Example: A mixture distribution

[This slide shows a figure of an example mixture density; the image is not reproduced here.]

A general mixture distribution

  f(x | π, φ, η) = Σ_{i=1}^{k} π_i f(x | φ_i, η)

where
  x  observed data
  π  mixture proportion of each component
  f  the probability density function
  φ  parameters specific to each component
  η  parameters shared among components
  k  number of mixture components
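A minimal sketch of evaluating such a mixture density with normal components (added for illustration; the function name is hypothetical and numpy/scipy are assumed). Here φ_i = (μ_i, σ_i) and no shared parameter η is used.

    import numpy as np
    from scipy.stats import norm

    def mixture_pdf(x, pi, mu, sigma):
        """Density of a k-component normal mixture: sum_i pi_i * N(x | mu_i, sigma_i^2)."""
        x, pi, mu, sigma = map(np.asarray, (x, pi, mu, sigma))
        return np.sum(pi * norm.pdf(x[..., None], loc=mu, scale=sigma), axis=-1)

    print(mixture_pdf([0.0, 1.0], pi=[0.3, 0.7], mu=[0.0, 2.0], sigma=[1.0, 0.5]))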


MLE Problem for mixture of normals

Problem

  f(x | θ = (π, μ, σ²)) = Σ_{i=1}^{k} π_i f_i(x | μ_i, σ_i²)

  f_i(x | μ_i, σ_i²) = 1/√(2π σ_i²) · exp[ −(x − μ_i)² / (2σ_i²) ]

  Σ_{i=1}^{k} π_i = 1

Find the MLEs for θ = (π, μ, σ²).


Solution when k = 1

  f(x | θ) = Σ_{i=1}^{k} π_i f_i(x | μ_i, σ_i²)

• π̂ = π̂_1 = 1
• μ̂ = μ̂_1 = x̄
• σ̂² = σ̂²_1 = Σ_{i=1}^{n} (x_i − x̄)² / n
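A quick numeric check of the k = 1 solution (an added illustration; numpy assumed):

    import numpy as np

    x = np.array([1.2, 0.7, 2.1, 1.5, 0.9])
    pi_hat = 1.0                              # pi_1 = 1
    mu_hat = x.mean()                         # mu_1 = xbar
    sigma2_hat = np.mean((x - mu_hat) ** 2)   # MLE divides by n, not n - 1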


Incomplete data problem when k > 1

  f(x | θ) = Π_{i=1}^{n} Σ_{j=1}^{k} π_j f_j(x_i | μ_j, σ_j²)

The MLE solution is not analytically tractable, because the likelihood involves products of sums of exponential functions.
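The observed-data log-likelihood can still be evaluated numerically, which is what makes iterative methods such as E-M (below) possible. A sketch of this evaluation (added here; numpy/scipy assumed):

    import numpy as np
    from scipy.special import logsumexp
    from scipy.stats import norm

    def incomplete_loglik(x, pi, mu, sigma):
        """Observed-data log-likelihood sum_i log( sum_j pi_j N(x_i | mu_j, sigma_j^2) ).
        The log of a sum does not separate over components, so no closed-form
        maximizer exists for k > 1."""
        x = np.asarray(x, dtype=float)[:, None]
        log_comp = np.log(pi) + norm.logpdf(x, loc=mu, scale=sigma)   # n x k matrix
        return logsumexp(log_comp, axis=1).sum()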


Converting to a complete data problem

Let z_i ∈ {1, ..., k} denote the source component from which each x_i was sampled.

  f(x | z, θ) = Π_{i=1}^{n} Σ_{j=1}^{k} I(z_i = j) f_j(x_i | μ_j, σ_j²)
              = Π_{i=1}^{n} f_{z_i}(x_i | μ_{z_i}, σ²_{z_i})

  π̂_j = Σ_{i=1}^{n} I(z_i = j) / n

  μ̂_j = Σ_{i=1}^{n} I(z_i = j) x_i / Σ_{i=1}^{n} I(z_i = j)

  σ̂_j² = Σ_{i=1}^{n} I(z_i = j)(x_i − μ̂_j)² / Σ_{i=1}^{n} I(z_i = j)

The MLE solution is analytically tractable if z is known.
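A minimal sketch of these closed-form complete-data MLEs (added; numpy assumed, with labels coded 0, ..., k−1):

    import numpy as np

    def complete_data_mle(x, z, k):
        """Closed-form MLEs of (pi, mu, sigma^2) when the component label z_i
        of every observation is observed."""
        x, z = np.asarray(x, dtype=float), np.asarray(z)
        pi = np.array([np.mean(z == j) for j in range(k)])
        mu = np.array([x[z == j].mean() for j in range(k)])
        sigma2 = np.array([np.mean((x[z == j] - mu[j]) ** 2) for j in range(k)])
        return pi, mu, sigma2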


E-M Algorithm

The E-M (Expectation-Maximization) algorithm is
• a procedure typically used to solve for the MLE,
• guaranteed to converge to the MLE (!),
• particularly suited to "missing data" problems where an analytic solution of the MLE is not tractable.

The algorithm was derived and used in various special cases by a number of authors, but it was not identified as a general algorithm until the seminal paper by Dempster, Laird, and Rubin in the Journal of the Royal Statistical Society, Series B (1977).


Overview of E-M Algorithm

Basic Structure
• y is the observed (or incomplete) data
• z is the missing (or augmented) data
• x = (y, z) is the complete data

Complete and incomplete data likelihood
• Complete data likelihood: f(x | θ) = f(y, z | θ)
• Incomplete data likelihood: g(y | θ) = ∫ f(y, z | θ) dz

We are interested in the MLE for L(θ | y) = g(y | θ).


Maximizing incomplete data likelihood

  L(θ | y, z) = f(y, z | θ)
  L(θ | y) = g(y | θ)
  k(z | θ, y) = f(y, z | θ) / g(y | θ)
  log L(θ | y) = log L(θ | y, z) − log k(z | θ, y)

Because z is missing data, we replace the right-hand side with its expectation under k(z | θ′, y), creating the new identity

  log L(θ | y) = E[ log L(θ | y, Z) | θ′, y ] − E[ log k(Z | θ, y) | θ′, y ]

Iteratively maximizing the first term on the right-hand side results in the E-M algorithm.
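To make the iteration concrete, here is a minimal E-M sketch for the k-component normal mixture introduced earlier (an added illustration, not code from the course; numpy/scipy assumed). The E-step computes the responsibilities under the current θ′, which is the expectation taken with respect to k(z | θ′, y); the M-step maximizes the resulting expected complete-data log-likelihood.

    import numpy as np
    from scipy.stats import norm

    def em_normal_mixture(x, k, n_iter=100, seed=0):
        """Minimal E-M for a k-component normal mixture.
        E-step: responsibilities r_ij = Pr(Z_i = j | x_i, theta').
        M-step: weighted versions of the complete-data MLEs, which maximize
        E[log L(theta | y, Z) | theta', y]."""
        x = np.asarray(x, dtype=float)
        rng = np.random.default_rng(seed)
        pi = np.full(k, 1.0 / k)
        mu = rng.choice(x, size=k, replace=False)      # crude initialization
        sigma2 = np.full(k, x.var())
        for _ in range(n_iter):
            # E-step: posterior probability of each component for each observation
            dens = pi * norm.pdf(x[:, None], loc=mu, scale=np.sqrt(sigma2))   # n x k
            r = dens / dens.sum(axis=1, keepdims=True)
            # M-step: update (pi, mu, sigma^2) using the responsibilities as weights
            nk = r.sum(axis=0)
            pi = nk / len(x)
            mu = (r * x[:, None]).sum(axis=0) / nk
            sigma2 = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
        return pi, mu, sigma2

Each iteration cannot decrease the observed-data log-likelihood, which can be monitored with the incomplete_loglik sketch shown earlier.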


..........

.....

......

.....

.....

.....

......

.....

.....

.....

......

.....

.....

.....

......

.....

......

.....

.....

.

. . . . .Recap

. . . . . . . . . . . . . . . .E-M

. . .P1

. . . .P2

. . .P3

.Summary

Maximizing incomplete data likelihood

L(θ|y, z) = f(y, z|θ)L(θ|y) = g(y|θ)

k(z|θ, y) =f(y, z|θ)g(y|θ)

log L(θ|y) = log L(θ|y, z)− log k(z|θ, y)

Because z is missing data, we replace the right side with its expectationunder k(z|θ′, y), creating the new identity

log L(θ|y) = E[log L(θ|y,Z)|θ′, y

]− E

[log k(Z|θ, y)|θ′, y

]Iteratively maximizing the first term in the right-hand side results in E-Malgorithm.

Hyun Min Kang Biostatistics 602 - Lecture 24 April 16th, 2013 16 / 33

Page 71: Biostatistics 602 - Statistical Inference Lecture 24 E …...Recap. . . . . . . . . . . . . . . . E-M. . . P1. . . . P2. . . P3. Summary.. Biostatistics 602 - Statistical Inference

..........

.....

......

.....

.....

.....

......

.....

.....

.....

......

.....

.....

.....

......

.....

......

.....

.....

.

. . . . .Recap

. . . . . . . . . . . . . . . .E-M

. . .P1

. . . .P2

. . .P3

.Summary

Overview of E-M Algorithm (cont’d)

Objective
• Maximize L(θ|y) or l(θ|y).
• Let f(y, z|θ) denote the pdf of the complete data. In the E-M algorithm, rather than working with l(θ|y) directly, we work with the surrogate function
  Q(θ|θ^(r)) = E[log f(y, Z|θ) | y, θ^(r)],
  where θ^(r) is the estimate of θ in the r-th iteration.
• Q(θ|θ^(r)) is the expected log-likelihood of the complete data, conditioning on the observed data and θ^(r).

Key Steps of E-M algorithm

Expectation Step
• Compute Q(θ|θ^(r)).
• This typically involves estimating the conditional distribution Z|Y, assuming θ = θ^(r).
• After computing Q(θ|θ^(r)), move to the M-step.

Maximization Step
• Maximize Q(θ|θ^(r)) with respect to θ.
• θ^(r+1) = arg max_θ Q(θ|θ^(r)) is then fed into the next E-step.
• Repeat the E- and M-steps until convergence. (A generic code skeleton is sketched below.)
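The two steps translate directly into a generic iteration skeleton. The sketch below is my own illustration, not course code; e_step and m_step are hypothetical callbacks that a specific model must supply, and θ is represented as a tuple of numpy arrays.

import numpy as np

def em(y, theta0, e_step, m_step, max_iter=1000, tol=1e-8):
    """Generic E-M loop: alternate E- and M-steps until the parameters stabilize."""
    theta = theta0
    for _ in range(max_iter):
        stats = e_step(y, theta)        # E-step: summaries needed to form Q(. | theta^(r))
        theta_new = m_step(y, stats)    # M-step: maximizer of Q(. | theta^(r))
        change = max(np.max(np.abs(np.asarray(a) - np.asarray(b)))
                     for a, b in zip(theta_new, theta))
        theta = theta_new
        if change < tol:                # stop when successive estimates barely move
            break
    return theta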

E-M algorithm for mixture of normals

E-step

Q(θ|θ^(r)) = E[log f(y, Z|θ) | y, θ^(r)]
           = Σ_z k(z|θ^(r), y) log f(y, z|θ)
           = Σ_{i=1}^n Σ_{z_i=1}^k k(z_i|θ^(r), y_i) log f(y_i, z_i|θ)
           = Σ_{i=1}^n Σ_{z_i=1}^k [ f(y_i, z_i|θ^(r)) / g(y_i|θ^(r)) ] log f(y_i, z_i|θ)

where f(y_i, z_i|θ) is the N(μ_{z_i}, σ²_{z_i}) component density and

g(y_i|θ) = Σ_{j=1}^k π_j f(y_i, z_i = j|θ).

(A numerical sketch of this step follows.)
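A minimal numerical sketch of this E-step (my own illustration; the helper name e_step_mixture and the use of scipy.stats.norm are assumptions, not course code). It returns the n × k matrix of responsibilities k(z_i = j | y_i, θ^(r)), i.e. the ratios f(y_i, z_i = j|θ^(r)) / g(y_i|θ^(r)) appearing above.

import numpy as np
from scipy.stats import norm

def e_step_mixture(y, pi, mu, sigma2):
    """Responsibilities k(z_i = j | y_i, theta) for a k-component normal mixture."""
    y, pi, mu, sigma2 = map(np.asarray, (y, pi, mu, sigma2))
    # joint f(y_i, z_i = j | theta) = pi_j * N(y_i; mu_j, sigma2_j)
    joint = pi * norm.pdf(y[:, None], loc=mu, scale=np.sqrt(sigma2))
    g = joint.sum(axis=1, keepdims=True)     # incomplete-data likelihood g(y_i | theta)
    return joint / g                         # n x k matrix of responsibilities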

E-M algorithm for mixture of normals (cont'd)

M-step

Q(θ|θ^(r)) = Σ_{i=1}^n Σ_{z_i=1}^k [ f(y_i, z_i|θ^(r)) / g(y_i|θ^(r)) ] log f(y_i, z_i|θ)

Maximizing Q over (π, μ, σ²) gives

π_j^(r+1) = (1/n) Σ_{i=1}^n k(z_i = j | y_i, θ^(r)) = (1/n) Σ_{i=1}^n f(y_i, z_i = j|θ^(r)) / g(y_i|θ^(r))

μ_j^(r+1) = [ Σ_{i=1}^n y_i k(z_i = j | y_i, θ^(r)) ] / [ Σ_{i=1}^n k(z_i = j | y_i, θ^(r)) ]
          = [ Σ_{i=1}^n y_i k(z_i = j | y_i, θ^(r)) ] / ( n π_j^(r+1) )

σ²_j^(r+1) = [ Σ_{i=1}^n (y_i − μ_j^(r+1))² k(z_i = j | y_i, θ^(r)) ] / [ Σ_{i=1}^n k(z_i = j | y_i, θ^(r)) ]
           = [ Σ_{i=1}^n (y_i − μ_j^(r+1))² k(z_i = j | y_i, θ^(r)) ] / ( n π_j^(r+1) )
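Continuing the numerical sketch (again my own illustration, meant to compose with the e_step_mixture helper above rather than anything from the lecture), the M-step turns the n × k responsibility matrix into the updated weights, means, and variances:

import numpy as np

def m_step_mixture(y, resp):
    """Update (pi, mu, sigma2) from responsibilities k(z_i = j | y_i, theta^(r))."""
    y = np.asarray(y)
    n_j = resp.sum(axis=0)                                    # n * pi_j^(r+1)
    pi = n_j / len(y)
    mu = (resp * y[:, None]).sum(axis=0) / n_j
    sigma2 = (resp * (y[:, None] - mu) ** 2).sum(axis=0) / n_j
    return pi, mu, sigma2

These two helpers could be passed as the e_step and m_step callbacks of the generic loop sketched earlier.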

Does E-M iteration converge to the MLE?

Theorem 7.2.20 - Monotonic EM sequence
The sequence θ^(r) defined by the E-M procedure satisfies

L(θ^(r+1)|y) ≥ L(θ^(r)|y),

with equality holding if and only if successive iterations yield the same value of the maximized expected complete-data log-likelihood, that is,

E[log L(θ^(r+1)|y, Z) | θ^(r), y] = E[log L(θ^(r)|y, Z) | θ^(r), y].

Theorem 7.5.2 further guarantees that L(θ^(r)|y) converges monotonically to L(θ̂|y) for some stationary point θ̂.
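The proof is not reproduced on the slide, but the standard one-line argument, written in LaTeX with H(θ|θ^(r)) = E[log k(Z|θ, y) | θ^(r), y], is:

\[
\log L(\theta^{(r+1)}|y) - \log L(\theta^{(r)}|y)
= \underbrace{\Big[Q(\theta^{(r+1)}|\theta^{(r)}) - Q(\theta^{(r)}|\theta^{(r)})\Big]}_{\ge 0 \text{ by the M-step}}
+ \underbrace{\Big[H(\theta^{(r)}|\theta^{(r)}) - H(\theta^{(r+1)}|\theta^{(r)})\Big]}_{\ge 0 \text{ by Jensen's inequality}},
\]

so the observed-data likelihood can never decrease across iterations.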

A working example (from BIOSTAT615/815 Fall 2012)

Example Data (n=1,500)
(figure with the example data not reproduced here)

Running example of implemented software
user@host~/> ./mixEM ./mix.dat
Maximum log-likelihood = 3043.46, at pi = (0.667842,0.332158)
between N(-0.0299457,1.00791) and N(5.0128,0.913825)
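For comparison, the short self-contained Python driver below (my own illustration, not the course's mixEM program) fits a two-component normal mixture by inlining the E- and M-step sketches above; the simulated data are a guess chosen to resemble the reported fit, and the assertion illustrates the monotone-likelihood property of Theorem 7.2.20 above.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
# hypothetical data loosely matching the example: about 2/3 from N(0,1), 1/3 from N(5,1)
comp = rng.random(1500) < 2.0 / 3.0
y = np.where(comp, rng.normal(0.0, 1.0, 1500), rng.normal(5.0, 1.0, 1500))

pi = np.array([0.5, 0.5])                      # crude starting values
mu = np.quantile(y, [0.25, 0.75])
sigma2 = np.array([y.var(), y.var()])

prev_ll = -np.inf
for _ in range(500):
    joint = pi * norm.pdf(y[:, None], mu, np.sqrt(sigma2))    # E-step
    g = joint.sum(axis=1, keepdims=True)
    resp = joint / g
    ll = np.log(g).sum()
    assert ll >= prev_ll - 1e-9                # monotone likelihood (Theorem 7.2.20)
    if ll - prev_ll < 1e-10:
        break
    prev_ll = ll
    n_j = resp.sum(axis=0)                                    # M-step
    pi = n_j / len(y)
    mu = (resp * y[:, None]).sum(axis=0) / n_j
    sigma2 = (resp * (y[:, None] - mu) ** 2).sum(axis=0) / n_j

print(f"log-likelihood = {ll:.2f}, pi = {pi}, mu = {mu}, sigma2 = {sigma2}")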

Practice Problem 1

Problem
Let X1, · · · , Xn be a random sample from a population with pdf

f(x|θ) = 1/(2θ), −θ < x < θ, θ > 0.

Find, if one exists, a best unbiased estimator of θ.

Strategy to solve the problem
• Can we use the Cramer-Rao bound? No, because the interchangeability (differentiation under the integral sign) condition does not hold: the support depends on θ.
• Then, can we use complete sufficient statistics?
  1. Find a complete sufficient statistic T.
  2. Take a trivial unbiased estimator W of θ and compute ϕ(T) = E[W|T],
  3. or construct a function ϕ(T) such that E[ϕ(T)] = θ.

Solution

First, we need to find a complete sufficient statistic.

f_X(x|θ) = 1/(2θ) · I(|x| < θ)   (single observation)
f_X(x|θ) = 1/(2θ)^n · I(max_i |x_i| < θ)   (joint pdf of the sample)

Let T(X) = max_i |X_i|; then T is sufficient by the factorization theorem, and

f_T(t|θ) = n t^(n−1) / θ^n · I(0 ≤ t < θ).

Suppose E[g(T)] = 0 for all θ > 0. Then

E[g(T)] = ∫_0^θ n t^(n−1) g(t) / θ^n dt = 0
⇒ ∫_0^θ t^(n−1) g(t) dt = 0 for all θ > 0
⇒ θ^(n−1) g(θ) = 0   (differentiating both sides with respect to θ)
⇒ g(θ) = 0 for all θ > 0.

Therefore the family of distributions of T is complete, and T is a complete sufficient statistic.

Solution (cont'd)

We need to construct a ϕ(T) such that E[ϕ(T)] = θ. First, let's see what the expectation of T is:

E[T] = ∫_0^θ t · n t^(n−1) / θ^n dt = ∫_0^θ n t^n / θ^n dt = n/(n+1) · θ.

So ϕ(T) = (n+1)/n · T is an unbiased estimator of θ and a function of the complete sufficient statistic T. Therefore, ϕ(T) is the best unbiased estimator by Theorem 7.3.23.
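A quick simulation sketch (my own check; θ = 2 and n = 10 are arbitrary choices) confirming that ϕ(T) = (n+1)/n · max_i |X_i| is unbiased for θ:

import numpy as np

rng = np.random.default_rng(1)
theta, n, reps = 2.0, 10, 200_000
x = rng.uniform(-theta, theta, size=(reps, n))   # samples from f(x|theta) = 1/(2*theta)
t = np.abs(x).max(axis=1)                        # T = max_i |X_i|
phi = (n + 1) / n * t                            # candidate best unbiased estimator
print(phi.mean())                                # close to 2.0, i.e. E[phi(T)] ≈ theta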

Practice Problem 2

Problem
Let X1, · · · , Xn+1 be iid Bernoulli(p), and define the function h(p) by

h(p) = Pr( Σ_{i=1}^n Xi > Xn+1 | p ),

the probability that the sum of the first n observations exceeds the (n + 1)-st.

1. Show that W(X1, · · · , Xn+1) = I( Σ_{i=1}^n Xi > Xn+1 ) is an unbiased estimator of h(p).
2. Find the best unbiased estimator of h(p).

Solution for (a)

E[W] = Σ_x W(x) Pr(X = x | p)
     = Σ_x I( Σ_{i=1}^n x_i > x_{n+1} ) Pr(X = x | p)
     = Σ_{x : Σ_{i=1}^n x_i > x_{n+1}} Pr(X = x | p)
     = Pr( Σ_{i=1}^n Xi > Xn+1 | p ) = h(p)

Therefore W is an unbiased estimator of h(p).

Solution for (b)

T = Σ_{i=1}^{n+1} Xi is a complete sufficient statistic for p.

ϕ(T) = E[W|T] = Pr(W = 1 | T) = Pr( Σ_{i=1}^n Xi > Xn+1 | T )

• If T = 0, then Σ_{i=1}^n Xi = Xn+1 = 0, so ϕ(0) = 0.
• If T = 1, then
  • Pr( Σ_{i=1}^n Xi = 1 > Xn+1 = 0 | T = 1 ) = n/(n + 1)
  • Pr( Σ_{i=1}^n Xi = 0 < Xn+1 = 1 | T = 1 ) = 1/(n + 1)
• If T = 2, then
  • Pr( Σ_{i=1}^n Xi = 2 > Xn+1 = 0 | T = 2 ) = (n choose 2) / (n+1 choose 2) = (n − 1)/(n + 1)
  • Pr( Σ_{i=1}^n Xi = 1 = Xn+1 = 1 | T = 2 ) = 2/(n + 1)
• If T > 2, then Σ_{i=1}^n Xi ≥ 2 > 1 ≥ Xn+1, so ϕ(T) = 1.

(A simulation check of these conditional probabilities follows.)
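As a sanity check (my own illustration; n = 5 and p = 0.3 are arbitrary), a short simulation confirms the T = 1 and T = 2 conditional probabilities above, which do not depend on p:

import numpy as np

rng = np.random.default_rng(2)
n, p, reps = 5, 0.3, 400_000
x = rng.random((reps, n + 1)) < p                 # iid Bernoulli(p) draws
s, last = x[:, :n].sum(axis=1), x[:, n]           # sum of first n, and X_{n+1}
t = s + last                                      # T = sum of all n+1 observations
w = s > last                                      # W = I(sum of first n > X_{n+1})
for tval, formula in [(1, n / (n + 1)), (2, (n - 1) / (n + 1))]:
    print(f"T={tval}: simulated {w[t == tval].mean():.3f}, formula {formula:.3f}")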

Solution for (b) (cont’d)

Therefore, the best unbiased estimator is

ϕ(T) = Pr(∑_{i=1}^{n} X_i > X_{n+1} | T)
     = 0              if T = 0
     = n/(n+1)        if T = 1
     = (n−1)/(n+1)    if T = 2
     = 1              if T ≥ 3
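As a quick sanity check (not from the original slides), ϕ(T) should have the same expectation as W = I(∑_{i=1}^{n} X_i > X_{n+1}). A short Monte Carlo sketch, with n, p, and the replicate count chosen arbitrarily for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, reps = 5, 0.3, 200_000

def phi(t, n):
    # Best unbiased estimator derived above, as a function of T = sum of all n+1 draws.
    if t == 0:
        return 0.0
    if t == 1:
        return n / (n + 1)
    if t == 2:
        return (n - 1) / (n + 1)
    return 1.0

x = rng.binomial(1, p, size=(reps, n + 1))            # iid Bernoulli(p) samples
t = x.sum(axis=1)                                      # T for each replicate
estimates = np.array([phi(ti, n) for ti in t])
truth = (x[:, :n].sum(axis=1) > x[:, n]).mean()        # Monte Carlo Pr(sum X_i > X_{n+1})
print(estimates.mean(), truth)                         # should agree up to Monte Carlo error
```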


Practice Problem 3

Problem
Suppose X_1, …, X_n are iid samples from f(x|θ) = θ exp(−θx), and suppose the prior distribution of θ is

π(θ) = 1/(Γ(α) β^α) · θ^{α−1} e^{−θ/β},

where α and β are known.
(a) Derive the posterior distribution of θ.
(b) If we use the loss function L(θ, a) = (a − θ)^2, what is the Bayes rule estimator for θ?


(a) Posterior distribution of θ

f(x, θ) = f(x|θ) π(θ)
        = 1/(Γ(α) β^α) · θ^{α−1} e^{−θ/β} · ∏_{i=1}^{n} [θ exp(−θ x_i)]
        = 1/(Γ(α) β^α) · θ^{α−1} e^{−θ/β} · θ^n exp(−θ ∑_{i=1}^{n} x_i)
        = 1/(Γ(α) β^α) · θ^{α+n−1} exp[−θ (1/β + ∑_{i=1}^{n} x_i)]

As a function of θ, this is the kernel of a Gamma density with shape α + n (since the exponent of θ is α + n − 1) and scale 1/(β^{−1} + ∑_{i=1}^{n} x_i). Therefore

π(θ|x) = Gamma(α + n, 1/(β^{−1} + ∑_{i=1}^{n} x_i))
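The conjugacy result can be verified numerically: normalize prior × likelihood on a grid and compare it with the Gamma(α + n, 1/(β^{−1} + ∑x_i)) density. A minimal sketch, not from the original slides; the hyperparameters, sample size, and true θ are arbitrary illustrative choices:

```python
import numpy as np
from scipy.stats import gamma, expon

rng = np.random.default_rng(1)
alpha, beta, n, theta_true = 2.0, 1.5, 40, 0.8
x = expon(scale=1 / theta_true).rvs(size=n, random_state=rng)  # iid Exponential(rate theta) data

# Unnormalized posterior on a grid: prior Gamma(alpha, scale=beta) times likelihood.
theta = np.linspace(1e-4, 5, 20_000)
unnorm = gamma(alpha, scale=beta).pdf(theta) * np.exp(n * np.log(theta) - theta * x.sum())
grid_post = unnorm / (unnorm.sum() * (theta[1] - theta[0]))     # Riemann-sum normalization

# Analytic posterior: shape alpha + n, scale 1 / (1/beta + sum(x)).
analytic = gamma(alpha + n, scale=1 / (1 / beta + x.sum())).pdf(theta)

print(np.max(np.abs(grid_post - analytic)))  # should be ~0 up to grid error
```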


(b) Bayes’ rule estimator with squared error loss

Under squared error loss, the Bayes rule estimator is the posterior mean. Note that the mean of a Gamma(α, β) distribution (shape α, scale β) is αβ. Since

π(θ|x) = Gamma(α + n, 1/(β^{−1} + ∑_{i=1}^{n} x_i)),

the Bayes rule estimator is the mean of this posterior:

E[θ|x] = (α + n) / (β^{−1} + ∑_{i=1}^{n} x_i)
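For example (an illustrative sketch, not from the slides, with arbitrary hyperparameters and true rate), the Bayes estimator can be computed directly and compared with the MLE n/∑x_i; as n grows, the prior terms α and β^{−1} matter less and the two estimates converge:

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, beta, theta_true = 2.0, 1.5, 0.8

for n in (10, 100, 10_000):
    x = rng.exponential(scale=1 / theta_true, size=n)   # iid Exponential(rate theta_true)
    bayes = (alpha + n) / (1 / beta + x.sum())           # posterior mean derived above
    mle = n / x.sum()                                    # maximum likelihood estimate, for comparison
    print(f"n={n:6d}  Bayes={bayes:.4f}  MLE={mle:.4f}")
```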


Summary

Today
• E-M Algorithm
• Practice Problems for the Final Exam

Next Lectures
• Bayesian Tests
• Bayesian Intervals
• More practice problems
