
Post on 15-May-2020


Page 1: MATHEMATICAL STATISTICS Homework assignment Instructionsvaljhun.fmf.uni-lj.si/~mihael/ef/ps/pdfdn/assgn.pdf · 2019-01-16 · MATHEMATICAL STATISTICS Homework assignment Instructions

MATHEMATICAL STATISTICS

Homework assignment

Instructions

Please turn in the homework with this cover page. You do not need to edit the solutions. Just make sure the handwriting is legible. You may discuss the problems with your peers but the final solutions should be your work. There is no specific deadline but you need to complete everything to get the grade.

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature: .


Random variables and random vectors

1. Suppose $Z_1, Z_2, \ldots, Z_n$ are independent, identically distributed random variables having the Beta(1, q) distribution.

a. Define

$$ X_1 = Z_1, \qquad X_2 = Z_2(1 - Z_1), \qquad X_3 = Z_3(1 - Z_2)(1 - Z_1) . $$

Find the joint distribution of X1, X2 and X3. Consider taking logarithms.

b. More generally, define

$$ X_i = Z_i \prod_{j=1}^{i-1} (1 - Z_j) $$

for $i = 1, 2, \ldots, n$. Find the joint distribution of $(X_1, X_2, \ldots, X_n)$.

2. Let $X_1, X_2, \ldots, X_n, X_{n+1}$ be random variables such that $E(X_k) = 0$ for $k = 1, \ldots, n+1$, with covariance matrix $\Sigma$ (an $(n+1) \times (n+1)$ matrix). We would like to find the best linear predictor for $X_{n+1}$ based on the variables $X_1, X_2, \ldots, X_n$. This means that we are looking for the linear combination $\hat{X}_{n+1} = b_0 + b_1 X_1 + \cdots + b_n X_n$ for which the expected square error

$$ E\big[ (X_{n+1} - \hat{X}_{n+1})^2 \big] $$

will be as small as possible. Find the coefficients $b_0, b_1, \ldots, b_n$.

Hint: Write the square error as a function of $b_0, b_1, \ldots, b_n$ and use partial derivatives.

3. Let X and Y be random variables with density

$$ f_{X,Y}(x, y) = x e^{-x(y+1)} $$

for x, y ≥ 0.

a. Find the conditional densities fX|Y=y(x) and fY |X=x(y).

b. Find E(X|Y ) and E(Y |X) and check that

E(Xg(Y )) = E(E(X|Y )g(Y )) and E(Y g(X)) = E(E(Y |X)g(X))

for an arbitrary bounded function g.
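
Before computing the conditional densities in Problem 3 it may help to confirm that the given function is a probability density on $x, y \geq 0$. A short check, integrating over $y$ first (this is only a sanity check and not part of the assignment):

```latex
\int_0^\infty \int_0^\infty x e^{-x(y+1)} \, dy \, dx
  = \int_0^\infty x e^{-x} \left( \int_0^\infty e^{-xy} \, dy \right) dx
  = \int_0^\infty x e^{-x} \cdot \frac{1}{x} \, dx
  = \int_0^\infty e^{-x} \, dx
  = 1 .
```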


Multivariate normal distribution

4. Suppose $Z$ is a random vector whose components are independent standard normal random variables and let $A$ be a rectangular matrix such that $AA^T$ is invertible. Prove that the density of $X = AZ + \mu$ is still given by the formula

$$ f_X(x) = \frac{1}{(2\pi)^{n/2} \sqrt{\det(AA^T)}} \exp\left( -\frac{1}{2} (x - \mu)^T (AA^T)^{-1} (x - \mu) \right) . $$

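5. Suppose $X$ is a multivariate normal vector with expectation $0$ and variance $\Sigma$. Write

$$ X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} . $$

Assume $\Sigma$ is invertible. Compute the conditional density of $X_2$ given $X_1 = x_1$ by using the usual formula

$$ f_{X_2 | X_1 = x_1}(x_2) = \frac{f_X(x)}{f_{X_1}(x_1)} . $$

Hint: Use the inversion lemma

$$ \Sigma^{-1} = \begin{pmatrix} (\Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21})^{-1} & -(\Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21})^{-1} \Sigma_{12} \Sigma_{22}^{-1} \\ -\Sigma_{22}^{-1} \Sigma_{21} (\Sigma_{11} - \Sigma_{12} \Sigma_{22}^{-1} \Sigma_{21})^{-1} & (\Sigma_{22} - \Sigma_{21} \Sigma_{11}^{-1} \Sigma_{12})^{-1} \end{pmatrix} . $$

Compare this proof to the slicker one using independence of linear transformations of multivariate normal vectors. Comment.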

6. Suppose $X \sim N_p(\mu, \Sigma)$ and that the matrix $Q \Sigma Q^T$ is an invertible $q \times q$ matrix. Show that the conditional distribution of $X$ given $QX = \mathbf{q}$ is normal with conditional mean

$$ \mu + \Sigma Q^T (Q \Sigma Q^T)^{-1} (\mathbf{q} - Q\mu) $$

and conditional (singular) covariance matrix

$$ \Sigma - \Sigma Q^T (Q \Sigma Q^T)^{-1} Q \Sigma . $$

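7. Suppose $X$ and $Y$ are $p$-dimensional random vectors such that

$$ \begin{pmatrix} X \\ Y \end{pmatrix} \sim N_{2p}(0, \Sigma) $$

where the covariance matrix is of the form

$$ \Sigma = \begin{pmatrix} I & \rho \mathbf{1}\mathbf{1}^T \\ \rho \mathbf{1}\mathbf{1}^T & I \end{pmatrix} . $$

The matrix $I$ represents the $p \times p$ identity matrix, $\mathbf{1} = (1, 1, \ldots, 1)^T$ and $\rho$ is a scalar constant such that $|\rho| \leq 1/\sqrt{p(p-1)}$.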


a. Compute $E(X^T X)$.

b. Compute $E(X^T X \mid Y)$.

8. Let $A$ and $B$ be $p \times p$ symmetric idempotent matrices of rank $r$ and $s$ with $AB = 0$, and let $X \sim N_p(0, \sigma^2 I)$.

a. Show that

$$ \frac{X^T A X / r}{X^T B X / s} \sim F_{r,s} . $$

b. Show that

$$ \frac{X^T A X}{X^T (A + B) X} \sim B(r/2, s/2) . $$

c. Show that

$$ \frac{(p - r) X^T A X}{r X^T (I - A) X} \sim F_{r,p-r} . $$

Parameter estimation

9. The log-normal distribution has the density

$$ f_X(x) = \frac{1}{\sqrt{2\pi}\, \sigma x} e^{-(\log x - \mu)^2 / (2\sigma^2)} $$

for x > 0.

a. Assume that σ is known and you have i.i.d. observations X1, X2, . . . , Xn. Find the maximum likelihood estimate for µ.

b. Find the approximate standard error of your estimator.

10. The Pareto distribution with parameters α and λ has density

$$ f(x, \alpha, \lambda) = \frac{\alpha \lambda^{\alpha}}{(\lambda + x)^{\alpha + 1}} $$

for x > 0 where α, λ > 0.

a. Write down the equations for the MLE of the parameters given i.i.d. observations x1, x2, . . . , xn.

b. Compute the approximate standard error for the MLE of α.


11. Let $X_1, X_2, \ldots, X_n$ be an i.i.d. sample from the inverse Gaussian distribution $I(\mu, \tau)$ with density

$$ \sqrt{\frac{\tau}{2\pi x^3}} \exp\left\{ -\frac{\tau}{2 x \mu^2} (x - \mu)^2 \right\}, \quad x > 0, \ \tau > 0, \ \mu > 0 . $$

The expectation of the inverse Gaussian distribution is $E(X_1) = \mu$. Assume that all densities are smooth enough to apply the asymptotic theorems.

a. (10) Find the MLE for $(\mu, \tau)$ based on observations $x_1, \ldots, x_n$.

b. (10) Compute the Fisher information matrix $I(\mu, \tau)$.

c. (5) Give a formula for the approximate 95% confidence interval for $\mu$ based on $x_1, x_2, \ldots, x_n$.

12. Suppose $X = (X_1, X_2, \ldots, X_n) \sim N(\mu, \sigma^2 \Sigma)$ where $\sigma^2$ is an unknown parameter and $\Sigma$ is a known invertible matrix.

a. Suppose the expectation $\mu$ is known and you have one observation $X_1$. How would you estimate $\sigma^2$? Is your estimate unbiased? What is the variance of the estimate you found?

Hint: What is the distribution of $\Sigma^{-1/2} X$?

b. How would you go about the questions in a. if $\mu$ was not known but you knew that all components of $\mu$ were the same?

Hypothesis testing

13. We have observations $X_1, X_2, \ldots, X_n$ from the normal distribution $N(\mu, \sigma^2)$. We would like to test $H_0: \mu = 0$ versus $H_1: \mu \neq 0$.

a. One can test $H_0$ at confidence level $\alpha$ in two ways:

- $H_0$ is rejected if $|\bar{X}| > c$ for a suitable $c$.

- One estimates $\mu$ and $\sigma^2$ and sets up a confidence interval. If the confidence interval does not cover 0 the null hypothesis is rejected.

Are the above tests the same? Comment. What is the answer if we assume that the parameter $\sigma$ is known?

b. Find the likelihood ratio statistic for the above testing problem in both cases, when $\sigma$ is known and when $\sigma$ is unknown.


14. Bartlett's test is a commonly used test for equal variances. The testing problem assumes that the observations $\{x_{ij}\}$ for $i = 1, 2, \ldots, k$ and $j = 1, 2, \ldots, n_i$ are realizations of independent random variables $X_{ij} \sim N(\mu_i, \sigma_i^2)$. One tests

$$ H_0: \sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2 \quad \text{versus} \quad H_1: \text{the } \sigma_i^2 \text{ are not all equal.} $$

Assume we have samples of size $n_i$ from the $i$-th population, $i = 1, 2, \ldots, k$, and the usual variance estimates from each sample

$$ s_1^2, s_2^2, \ldots, s_k^2 $$

where

$$ s_i^2 = \frac{1}{n_i - 1} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 . $$

Introduce the following notation: $\nu_i = n_i - 1$,

$$ \nu = \sum_{i=1}^{k} \nu_i $$

and

$$ s^2 = \frac{1}{\nu} \sum_{i=1}^{k} \nu_i s_i^2 . $$

Bartlett's test statistic $M$ is defined by

$$ M = \nu \log s^2 - \sum_{i=1}^{k} \nu_i \log s_i^2 . $$

a. The approximate distribution of Bartlett's $M$ is $\chi^2(r)$. What is, in your opinion, $r$? Explain why.

b. Assume that the maximum likelihood estimates for the parameters $\mu_i$ and $\sigma_i^2$ are

$$ \hat{\mu}_i = \bar{x}_i = \frac{1}{n_i} \sum_{j=1}^{n_i} x_{ij} \quad \text{and} \quad \hat{\sigma}_i^2 = \frac{1}{n_i} \sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 $$

for $i = 1, 2, \ldots, k$. Write down the likelihood ratio statistic for the testing problem in question. What is its approximate distribution? Any similarity to Bartlett's test? Comment.

Hint: If you assume $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_k^2$, the maximum likelihood estimates for $\mu_i$ are still the means $\bar{x}_i$ for $i = 1, 2, \ldots, k$.
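
The definition of Bartlett's M in Problem 14 can be turned into a few lines of code, which may help when experimenting with the statistic. The following is a minimal sketch in Python, assuming the samples are given as lists of numbers; the function name `bartlett_m` and the data are hypothetical and only illustrate the formula.

```python
import math
from statistics import variance


def bartlett_m(samples):
    """Return M = nu * log(s^2) - sum_i nu_i * log(s_i^2) for a list of samples."""
    nus = [len(x) - 1 for x in samples]        # nu_i = n_i - 1
    s2s = [variance(x) for x in samples]       # s_i^2, computed with divisor n_i - 1
    nu = sum(nus)
    pooled = sum(n * s for n, s in zip(nus, s2s)) / nu   # pooled variance s^2
    return nu * math.log(pooled) - sum(n * math.log(s) for n, s in zip(nus, s2s))


# Hypothetical data: two groups with visibly different spread.
groups = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0]]
m_stat = bartlett_m(groups)
```

When all sample variances coincide the pooled variance equals each of them and M is zero; otherwise M is positive because the logarithm is concave.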


15. The one sample Wilcoxon test is used to test whether a continuous distribution is symmetric. On the basis of $n$ i.i.d. observations $X_1, X_2, \ldots, X_n$ from an unknown continuous distribution $F$ one tests the hypothesis

$$ H_0: F(x) = 1 - F(-x) \text{ for all } x \quad \text{versus} \quad H_1: F(x) < 1 - F(-x) \text{ for some } x . $$

Let $R_i$ be the rank of $|X_i|$ among $|X_1|, |X_2|, \ldots, |X_n|$. The test is based on the statistic

$$ W = \sum_{i=1}^{n} 1(X_i > 0) R_i , $$

i.e. the sum of ranks of the positive $X_i$'s.

a. Show that if $H_0$ is true, $W$ has a distribution that does not depend on $F$.

b. Show that

$$ W = \sum_{i,j=1}^{n} 1(|X_i| \leq X_j) . $$

c. Show that if $H_0$ is true then $E(W) = n(n+1)/4$.

d. Compute the variance of $W$.

e. How would you find critical values for testing $H_0$?
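
The two expressions for W in Problem 15 can be compared numerically before attempting the proof in part b. The sketch below is in Python and assumes a sample without ties and without zeros; the function names and the data are hypothetical.

```python
def signed_rank_w(xs):
    """W as the sum of the ranks of |x_i| taken over the positive observations."""
    order = sorted(range(len(xs)), key=lambda i: abs(xs[i]))
    rank = {i: r for r, i in enumerate(order, start=1)}   # rank of |x_i|, starting at 1
    return sum(rank[i] for i, x in enumerate(xs) if x > 0)


def double_sum_w(xs):
    """W as the number of ordered pairs (i, j) with |x_i| <= x_j, as in part b."""
    return sum(1 for xi in xs for xj in xs if abs(xi) <= xj)


sample = [1.5, -0.5, 2.0, -3.0, 0.7]   # hypothetical data, no ties and no zeros
w_ranks = signed_rank_w(sample)
w_pairs = double_sum_w(sample)
```

For this sample both computations give 9. Agreement on one data set is of course not a proof, only a check that the formula has been read correctly.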


On the following pages you will find the take-home finals from previous years. Do one first, one second, one third and one fourth problem from the exams. The problems can be chosen from different years.


MATHEMATICAL STATISTICS

Final take-home examination

April 8th-April 15th, 2013

Instructions

You do not need to edit the solutions. Just make sure the handwriting is legible. The final solutions should be your work. The deadline for completion is March 15th, 2013 by 4pm. Turn in your solutions to Petra Vranjes. For any questions contact me by e-mail.

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature: .


1. (25) Suppose a population of size $N$ is divided into $K = N/M$ groups of size $M$. We select a sample of size $km$ the following way:

• First we select $k$ groups out of $K$ groups by simple random sampling with replacement.

• We then select $m$ units in each group selected on the first step by simple random sampling with replacement.

• The estimate of the population mean is the average $\bar{Y}$ of the sample.

Let $\mu_i$ be the population average in the $i$-th group for $i = 1, 2, \ldots, K$. Let

$$ \sigma_u^2 = \frac{1}{K} \sum_{i=1}^{K} (\mu_i - \mu)^2 , $$

where $\mu = \sum_{i=1}^{K} \mu_i / K$. Let

$$ \sigma_w^2 = \frac{1}{N} \sum_{i=1}^{K} \sum_{j=1}^{M} (y_{ij} - \mu_i)^2 , $$

where $y_{ij}$ denotes the value of the variable for the $j$-th unit in the $i$-th group.

a. Let $k = 1$. Show that we can write the estimator as

$$ \bar{Y} = \sum_{i=1}^{K} I_i \bar{Y}_i , $$

where

$$ I_i = \begin{cases} 1 & \text{if the $i$-th group is selected,} \\ 0 & \text{otherwise,} \end{cases} $$

and $\mathrm{var}(\bar{Y}_i) = \sigma_i^2 / m$. Argue that it is reasonable to assume that $\bar{Y}_i$ and $I_i$ are all independent. Let $\sigma_i^2$ be the population variance for the $i$-th subgroup. Compute $\mathrm{var}(\bar{Y})$.

b. If we repeat the procedure we get independent estimators $\bar{Y}_1, \bar{Y}_2, \ldots, \bar{Y}_k$, and estimate the population average by

$$ \bar{Y} = \frac{1}{k} \sum_{i=1}^{k} \bar{Y}_i . $$

Show that

$$ \mathrm{var}(\bar{Y}) = \frac{\sigma_u^2}{k} + \frac{\sigma_w^2}{km} . $$

Argue that this expression is the variance of the estimator described in the introduction.


c. The assumption that we sample with replacement is unrealistic. Let $k = 1$ and assume that the sample of size $m$ is selected by simple random sampling without replacement. Argue that

$$ \bar{Y} = \sum_{i=1}^{K} I_i \bar{Y}_i , $$

where

$$ I_i = \begin{cases} 1 & \text{if we select the $i$-th subgroup,} \\ 0 & \text{otherwise.} \end{cases} $$

Compute the variance of the estimator in this case.

d. Assume that the $k$ groups are selected by simple random sampling without replacement. In this case the estimator is

$$ \bar{Y} = \frac{1}{k} \sum_{i=1}^{K} I_i \bar{Y}_i , $$

where

$$ I_i = \begin{cases} 1 & \text{if we select the $i$-th subgroup,} \\ 0 & \text{otherwise.} \end{cases} $$

Argue that it is reasonable to assume that $I_1, \ldots, I_K$ and $\bar{Y}_1, \ldots, \bar{Y}_K$ are independent. Compute the standard error of the estimator.

e. Explain why the sampling distribution in d. is approximately normal.


2. (25) Suppose $\{p(x, \theta), \theta \in \Theta \subset \mathbb{R}^k\}$ is a (regular) family of distributions. Define the vector valued score function $s$ as the column vector with components

$$ s(x, \theta) = \frac{\partial}{\partial \theta} \log(p(x, \theta)) = \mathrm{grad}(\log(p(x, \theta))) $$

and the Fisher information matrix as

$$ I(\theta) = \mathrm{var}(s) . $$

Remark: If $p(x, \theta) = 0$ define $\log(p(x, \theta)) = 0$.

a. Let $t(X)$ be an unbiased estimator of $\theta$ based on the likelihood function, i.e.

$$ E_\theta(t(X)) = \theta . $$

Prove that

$$ E(s) = 0 \quad \text{and} \quad E(s t^T) = I . $$

Deduce that $\mathrm{cov}(s, t) = I$.

Remark: Make liberal assumptions about interchanging integration and differentiation.

b. Let $a, c$ be two arbitrary $k$-dimensional vectors. Prove that

$$ \mathrm{corr}^2(a^T t, c^T s) = \frac{(a^T c)^2}{a^T \mathrm{var}(t) a \cdot c^T I(\theta) c} . $$

The squared correlation coefficient is always less than or equal to 1. Maximize the expression for the correlation coefficient over $c$ and deduce the Rao-Cramer inequality.


3. (25) Suppose $X_1, X_2, \ldots, X_n$ are i.i.d. observations from a multivariate normal distribution $N(\mu, \Sigma)$ where $\Sigma$ is known. Further assume that $R$ is a given matrix and $r$ a given vector. Use the likelihood ratio procedure to produce a test statistic for

$$ H_0: R\mu = r \quad \text{vs.} \quad H_1: R\mu \neq r . $$

Give explicit formulae for the test statistic and the critical values.


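4. (25) Let $Y = X\beta + \varepsilon$ be a linear model where we assume $E(\varepsilon) = 0$ and $\mathrm{var}(\varepsilon) = \sigma^2 \Sigma$ for a known invertible matrix $\Sigma$.

a. Show that the BLUE for $\beta$ is given by

$$ \hat{\beta} = (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} Y . $$

Assume that $X^T \Sigma^{-1} X$ is invertible and use the Gauss-Markov theorem.

b. Assume that the linear model is of the form

$$ Y_{kl} = \alpha + \beta x_{kl} + u_k + \varepsilon_{kl} , $$

$k = 1, 2, \ldots, K$ and $l = 1, 2, \ldots, L_k$, where the $\varepsilon_{kl}$ are $N(0, \sigma^2)$, the $u_k$ are $N(0, \tau^2)$ and all random quantities are independent. Assume that the ratio $\tau^2 / \sigma^2$ is known. Show that the BLUE is given by

$$ \begin{pmatrix} \hat{\alpha} \\ \hat{\beta} \end{pmatrix} = \begin{pmatrix} \sum w_k & \sum w_k \bar{x}_k \\ \sum w_k \bar{x}_k & S_{xx} + \sum w_k \bar{x}_k^2 \end{pmatrix}^{-1} \begin{pmatrix} \sum w_k \bar{y}_k \\ S_{xy} + \sum w_k \bar{x}_k \bar{y}_k \end{pmatrix} , $$

where

$$ w_k = L_k \sigma^2 / (\sigma^2 + L_k \tau^2) , $$

$$ S_{xx} = \sum_k \sum_l (x_{kl} - \bar{x}_k)^2 , $$

$$ S_{xy} = \sum_k \sum_l (x_{kl} - \bar{x}_k)(y_{kl} - \bar{y}_k) . $$

Hint: For $c \neq -1/n$ one has $(I + c \mathbf{1}\mathbf{1}^T)^{-1} = I - c(1 + nc)^{-1} \mathbf{1}\mathbf{1}^T$ where $\mathbf{1}^T = (1, 1, \ldots, 1)$.

c. What would you do if the ratio $\tau^2 / \sigma^2$ were unknown?

d. How would you test the hypothesis $H_0: \beta = 0$ versus $H_1: \beta \neq 0$? What is the distribution of the test statistic under the null hypothesis?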


MATHEMATICAL STATISTICS

Final take-home examination

May 7th-May 16th, 2014

Instructions

You do not need to edit the solutions. Just make sure the handwriting is legible. The final solutions should be your work. The deadline for completion is May 16th, 2014 by 4pm. Turn in your solutions to Petra Vranjes. For any questions contact me by e-mail.

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature: .


1. (25) Suppose a population of size $N$ is divided into $K = N/M$ groups of size $M$. We select a sample of size $n = km$ the following way:

• First we select $k$ groups out of $K$ groups by simple random sampling.

• We then select $m$ units in each group selected on the first step by simple random sampling.

• The estimate of the population mean is the average $\bar{Y}$ of the sample.

Let $\mu_i$ be the population average in the $i$-th group for $i = 1, 2, \ldots, K$, and let $\sigma_i^2$ be the population variance in the $i$-th group for $i = 1, 2, \ldots, K$.

a. (10) Show that we can write the estimator as

$$ \bar{Y} = \frac{1}{k} \sum_{i=1}^{K} \bar{Y}_i I_i , $$

where

$$ I_i = \begin{cases} 1 & \text{if the $i$-th group is selected,} \\ 0 & \text{otherwise,} \end{cases} $$

and $\bar{Y}_i$ is the sample average in the $i$-th group for $i = 1, 2, \ldots, K$. Argue that it is reasonable to assume that the random variables $\bar{Y}_1, \ldots, \bar{Y}_K$ are independent and independent from $I_1, \ldots, I_K$. Show that $\bar{Y}$ is an unbiased estimator of the population mean $\mu$ and show that the variance of $\bar{Y}$ is

$$ \mathrm{var}(\bar{Y}) = \frac{M - m}{k(M - 1)m} \cdot \frac{1}{K} \sum_{i=1}^{K} \sigma_i^2 + \frac{K - k}{k(K - 1)} \cdot \frac{1}{K} \sum_{i=1}^{K} (\mu_i - \mu)^2 . $$

b. (15) Suggest an estimate for the quantity

$$ \sigma_b^2 = \frac{1}{K} \sum_{i=1}^{K} (\mu_i - \mu)^2 = \frac{1}{K} \sum_{i=1}^{K} \mu_i^2 - \mu^2 . $$

Is your estimate unbiased? Can you modify it to be an unbiased estimate?


2. (25) Suppose $\Theta_1, \Theta_2, \ldots, \Theta_n$ are i.i.d. random variables with values in $[0, 2\pi)$ each having the von Mises density

$$ f(\theta; \mu, k) = \frac{1}{2\pi I_0(k)} \exp(k \cos(\theta - \mu)) $$

for $0 \leq \theta < 2\pi$ where $k \geq 0$ and $\mu \in [0, 2\pi]$ are the unknown parameters. $I_0$ is the modified Bessel function of the first kind and order 0.

Suppose you have an i.i.d. sample $\theta_1, \theta_2, \ldots, \theta_n$.

a. (10) Let $\nu = (\cos(\mu), \sin(\mu))$. Derive the MLE for $\nu$.

b. (5) Describe how you would find the MLE for $k$.

c. (10) Let $a = k \cos(\mu)$ and $b = k \sin(\mu)$ and let $\hat{a}_n$ and $\hat{b}_n$ be their respective MLEs based on $n$ i.i.d. observations. Show that the asymptotic distribution of $\sqrt{n}(\hat{a}_n - a, \hat{b}_n - b)$ is bivariate normal $N(0, \Sigma^{-1})$, where $\Sigma$ is the covariance matrix of the random vector $(\cos(\Theta_1), \sin(\Theta_1))$.


3. (25) Suppose $X_1, X_2, \ldots, X_n$ are i.i.d. observations from a multivariate normal distribution $N(\mu, \Sigma)$ where $\Sigma$ is known. Further assume that $a$ is a given vector. Use the likelihood ratio procedure to produce a test statistic for

$$ H_0: a^T \mu = 0 \quad \text{vs.} \quad H_1: a^T \mu \neq 0 . $$

a. (15) Give explicit formulae for the test statistic and the critical values.

b. (10) What changes if the covariance matrix of the $X_i$ is of the form $\sigma^2 \Sigma$ with unknown $\sigma^2$ and known $\Sigma$?


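4. (25) Let $Y = X\beta + \varepsilon$ be a linear model where we assume $E(\varepsilon) = 0$ and $\mathrm{var}(\varepsilon) = \sigma^2 \Sigma$ for a known invertible matrix $\Sigma$.

a. (10) Show that the BLUE for $\beta$ is given by

$$ \hat{\beta} = (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} Y . $$

Assume that $X^T \Sigma^{-1} X$ is invertible and use the Gauss-Markov theorem.

b. (15) Assume that the linear model is of the form

$$ Y_{kl} = \alpha + \beta x_{kl} + u_k + \varepsilon_{kl} , $$

$k = 1, 2, \ldots, K$ and $l = 1, 2, \ldots, L_k$, where the $\varepsilon_{kl}$ are $N(0, \sigma^2)$, the $u_k$ are $N(0, \tau^2)$ and all random quantities are independent. Assume that the ratio $\tau^2 / \sigma^2$ is known. Show that the BLUE is given by

$$ \begin{pmatrix} \hat{\alpha} \\ \hat{\beta} \end{pmatrix} = \begin{pmatrix} \sum w_k & \sum w_k \bar{x}_k \\ \sum w_k \bar{x}_k & S_{xx} + \sum w_k \bar{x}_k^2 \end{pmatrix}^{-1} \begin{pmatrix} \sum w_k \bar{y}_k \\ S_{xy} + \sum w_k \bar{x}_k \bar{y}_k \end{pmatrix} , $$

where

$$ w_k = L_k \sigma^2 / (\sigma^2 + L_k \tau^2) , $$

$$ S_{xx} = \sum_k \sum_l (x_{kl} - \bar{x}_k)^2 , $$

$$ S_{xy} = \sum_k \sum_l (x_{kl} - \bar{x}_k)(y_{kl} - \bar{y}_k) . $$

Hint: For $c \neq -1/n$ one has $(I + c \mathbf{1}\mathbf{1}^T)^{-1} = I - c(1 + nc)^{-1} \mathbf{1}\mathbf{1}^T$ where $\mathbf{1}^T = (1, 1, \ldots, 1)$.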
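
The matrix identity in the hint can be checked numerically before it is used in part b. The following is a minimal sketch in plain Python; the values of `n` and `c` are hypothetical, chosen only so that `c != -1/n`, and the helper names are not from the exam.

```python
def inverse_by_hint(n, c):
    """Return I - c/(1 + n*c) * 11^T, the inverse of I + c*11^T claimed in the hint."""
    f = c / (1.0 + n * c)
    return [[(1.0 if i == j else 0.0) - f for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


n, c = 4, 0.75   # hypothetical values with c != -1/n
m = [[(1.0 if i == j else 0.0) + c for j in range(n)] for i in range(n)]   # I + c*11^T
product = matmul(m, inverse_by_hint(n, c))
# Largest deviation of the product from the identity matrix.
max_error = max(abs(product[i][j] - (1.0 if i == j else 0.0))
                for i in range(n) for j in range(n))
```

In the model of part b. the covariance matrix of each group of observations has exactly the form of a multiple of $I + c \mathbf{1}\mathbf{1}^T$, which is why the hint applies block by block.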


MATHEMATICAL STATISTICS

Final take-home examination

May 4th-May 12th, 2015

Instructions

You do not need to edit the solutions. Just make sure the handwriting is legible. The final solutions should be your work. The deadline for completion is May 12th, 2015 by 4pm. Turn in your solutions to Petra Vranjes. For any questions contact me by e-mail or call me at 041 725 497.

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature: .


1. (25) Suppose a population of size $N$ is divided into $K = N/M$ groups of size $M$. We select a sample of size $km$ the following way:

• First we select $k$ groups out of $K$ groups by simple random sampling.

• We then select $m$ units in each group selected on the first step by simple random sampling. Samples in selected groups are assumed to be independent.

Denote by $\mu$ the population average and by $\mu_i$ the population average in the $i$-th group. Similarly denote by $\sigma_i^2$ the population variance in the $i$-th group.

a. Suggest an estimate for the population average $\mu$. Is the estimate unbiased?

b. Derive the formula for the standard error of the estimate from a.

c. How would you estimate the quantity

$$ \gamma = \sum_{i=1}^{K} (\mu_i - \mu)^2 \, ? $$

Is the estimate you suggest unbiased?

d. Give an estimate of the standard error based on the sample.


2. (25) Assume the data pairs $(y_1, z_1), \ldots, (y_n, z_n)$ are an i.i.d. sample from the distribution with density

$$ f(y, z, \theta, \sigma) = e^{-y} \cdot \frac{1}{\sqrt{2\pi y}\, \sigma} e^{-\frac{(z - \theta y)^2}{2 y \sigma^2}} $$

for $y > 0$ and $\sigma > 0$.

a. Find the maximum likelihood estimators of $\theta$ and $\sigma^2$. Are the estimators unbiased?

b. Find the exact standard errors of $\hat{\theta}$ and $\hat{\sigma}^2$.

c. Compute the Fisher information matrix.

d. Find the standard errors of the maximum likelihood estimators using the Fisher information matrix. Comment on your findings.


3. (25) Assume that the data $x_1, x_2, \ldots, x_n$ are an i.i.d. sample from the multivariate normal distribution of the form

$$ X_1 \sim N\left( \begin{pmatrix} \mu^{(1)} \\ \mu^{(2)} \end{pmatrix}, \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix} \right) . $$

Assume that the parameters $\mu$ and $\Sigma$ are unknown. Assume the following theorem:

If $A$ ($p \times p$) is a given symmetric positive definite matrix then the positive definite matrix $\Sigma$ that maximizes the expression

$$ \frac{1}{\det(\Sigma)^{n/2}} \cdot \exp\left( -\frac{1}{2} \mathrm{Tr}(\Sigma^{-1} A) \right) $$

is the matrix

$$ \hat{\Sigma} = \frac{1}{n} A . $$

The testing problem is

$$ H_0: \Sigma_{12} = 0 \quad \text{versus} \quad H_1: \Sigma_{12} \neq 0 . $$

a. Find the maximum likelihood estimates of $\mu$ and $\Sigma$ in the unconstrained case.

b. Find the maximum likelihood estimates of $\mu$ and $\Sigma$ in the constrained case.

c. Write the likelihood ratio statistic for the testing problem as explicitly as possible.

d. What can you say about the distribution of the likelihood ratio statistic?


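4. (25) Assume the regression model

$$ Y_{i1} = \alpha + \beta x_{i1} + \varepsilon_i , \qquad Y_{i2} = \alpha + \beta x_{i2} + \eta_i $$

for $i = 1, 2, \ldots, n$. In other words the observations come in pairs. Assume that $E(\varepsilon_i) = E(\eta_i) = 0$, $\mathrm{var}(\varepsilon_i) = \mathrm{var}(\eta_i) = \sigma^2$ and $\mathrm{corr}(\varepsilon_i, \eta_i) = \rho \in (-1, 1)$. Assume that the pairs $(\varepsilon_1, \eta_1), \ldots, (\varepsilon_n, \eta_n)$ are uncorrelated. Furthermore assume that

$$ \sum_{i=1}^{n} x_{i1} x_{i2} = 0 . $$

a. Assume that $\rho$ is known. Find the best linear unbiased estimate of the regression parameters $\alpha$ and $\beta$. Find an unbiased estimator of $\sigma^2$.

b. Assume that $\rho$ is unknown and let $\hat{\alpha}$ and $\hat{\beta}$ be the ordinary least squares estimators of the regression parameters. Compute the standard errors of the two estimators.

c. Let $\hat{\varepsilon}_i$ and $\hat{\eta}_i$ be the residuals from ordinary least squares. Express

$$ E\left[ \sum_{i=1}^{n} (\hat{\varepsilon}_i^2 + \hat{\eta}_i^2) \right] \quad \text{and} \quad E\left[ \sum_{i=1}^{n} \hat{\varepsilon}_i \hat{\eta}_i \right] $$

in terms of the elements of the hat matrix $H$.

d. Give an estimate of $\mathrm{var}(\hat{\alpha})$ and $\mathrm{var}(\hat{\beta})$. Are the estimators unbiased?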


MATHEMATICAL STATISTICS

Take-home final examination

May 16th-May 23rd, 2016

Instructions

You do not need to edit the solutions. Just make sure the handwriting is legible. The final solutions should be your work. The deadline for completion is May 23rd, 2016 by 4pm. Turn in your solutions to Petra Vranjes or send me a scanned version of solutions. For any questions contact me by e-mail or call me at +386 41 725 497.

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature: .


1. (25) For purposes of sampling the population of size $N = N_1 + \cdots + N_K$ is divided into $K$ strata of sizes $N_1, N_2, \ldots, N_K$. The sampling procedure is as follows: first a simple random sample of size $k \leq K$ of strata is selected. The selection procedure is independent of the sizes of strata. The second step is then to select a simple random sample in each of the selected strata. If stratum $i$ is selected then we choose a simple random sample of size $n_i$ in this stratum for $i = 1, 2, \ldots, K$. Assume the selection process on the second step is independent of the selection process on the first step. Note that the sample size is not determined in advance in this case.

Let $\mu_i$ and $\sigma_i^2$ be the population means and the population variances in strata $i = 1, 2, \ldots, K$. Let $\mu$ be the population mean for the entire population and $\sigma^2$ the population variance for the entire population.

a. Suggest an unbiased estimator of the population mean $\mu$. Prove that it is unbiased.

b. Compute the standard error of your unbiased estimator.

c. Suggest an unbiased estimator of the population variance $\sigma^2$. Prove that it is unbiased.


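2. (25) Assume the data pairs $(x_1, y_1), \ldots, (x_n, y_n)$ are an i.i.d. sample from the distribution with density

$$ f(x, y) = \frac{\sqrt{v_{11} v_{22} - v_{12}^2}}{2\pi} \exp\left( -\frac{1}{2} \left( v_{11} x^2 + 2 v_{12} x y + v_{22} y^2 \right) \right) . $$

In other words the observed pairs are a sample from the bivariate normal distribution with expectation $\mu = 0$ and covariance matrix

$$ \Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix} = \begin{pmatrix} v_{11} & v_{12} \\ v_{12} & v_{22} \end{pmatrix}^{-1} . $$

Denote

$$ V = \begin{pmatrix} v_{11} & v_{12} \\ v_{12} & v_{22} \end{pmatrix} . $$

a. Find the maximum likelihood estimates of the parameters $v_{11}, v_{12}, v_{22}$.

b. Show that the Fisher information matrix is of the form

$$ F = \frac{1}{2 (v_{11} v_{22} - v_{12}^2)^2} \begin{pmatrix} v_{22}^2 & v_{12}^2 & -2 v_{22} v_{12} \\ v_{12}^2 & v_{11}^2 & -2 v_{11} v_{12} \\ -2 v_{22} v_{12} & -2 v_{11} v_{12} & 2 (v_{11} v_{22} + v_{12}^2) \end{pmatrix} . $$

Give approximate standard errors for the maximum likelihood estimators $\hat{v}_{11}$, $\hat{v}_{22}$ and $\hat{v}_{12}$.

c. Show that

$$ \det(F) = \frac{1}{4 [\det(V)]^3} . $$

Show that the Fisher information matrix can also be written as

$$ F = \frac{1}{2} \begin{pmatrix} \sigma_{22}^2 & \sigma_{12}^2 & -2 \sigma_{22} \sigma_{12} \\ \sigma_{12}^2 & \sigma_{11}^2 & -2 \sigma_{11} \sigma_{12} \\ -2 \sigma_{22} \sigma_{12} & -2 \sigma_{11} \sigma_{12} & 2 (\sigma_{11} \sigma_{22} + \sigma_{12}^2) \end{pmatrix} . $$

d. Give approximate standard errors for the maximum likelihood estimators $\hat{\sigma}_{11}$, $\hat{\sigma}_{22}$ and $\hat{\sigma}_{12}$ of the parameters $\sigma_{11}$, $\sigma_{22}$ and $\sigma_{12}$. You can either use the Fisher information or do it directly.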
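
The determinant formula in part c. can be checked on a concrete matrix before attempting the general proof. The sketch below uses exact rational arithmetic in Python; the entries of V are hypothetical, and the parameter order (v11, v22, v12) is an assumption read off from the layout of the matrix in part b.

```python
from fractions import Fraction


def fisher_matrix(v11, v12, v22):
    """F from part b., assuming the parameter order (v11, v22, v12)."""
    d = v11 * v22 - v12 ** 2                  # det(V)
    scale = Fraction(1, 2 * d ** 2)
    rows = [
        [v22 ** 2, v12 ** 2, -2 * v22 * v12],
        [v12 ** 2, v11 ** 2, -2 * v11 * v12],
        [-2 * v22 * v12, -2 * v11 * v12, 2 * (v11 * v22 + v12 ** 2)],
    ]
    return [[scale * entry for entry in row] for row in rows]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


v11, v12, v22 = 2, 1, 3                       # hypothetical positive definite V
det_v = v11 * v22 - v12 ** 2
lhs = det3(fisher_matrix(v11, v12, v22))      # det(F)
rhs = Fraction(1, 4 * det_v ** 3)             # 1 / (4 det(V)^3)
```

For these integer inputs both sides equal 1/500. Integer entries are used so that `Fraction` keeps the comparison exact.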


3. (25) Assume the data $x_1, \ldots, x_n$ are an i.i.d. sample from the multivariate normal distribution with parameters $\mu$ and $\Sigma$. The problem is to test the hypothesis

$$ H_0: \mu = k\mu_0 \quad \text{versus} \quad H_1: \mu \text{ and } \mu_0 \text{ are not collinear.} $$

a. Assume that $\Sigma$ is known and invertible. Find the likelihood ratio test statistic for the above testing problem. What is the approximate distribution of the test statistic if $H_0$ holds?

b. Find the exact distribution of the test statistic from a. if $H_0$ is true.

c. Assume $\Sigma$ is unknown but assumed to be invertible. Find the likelihood ratio statistic in this case. You may assume the following: if $A$ ($p \times p$) is a given symmetric positive definite matrix then the positive definite matrix $\Sigma$ that maximizes the expression

$$ \frac{1}{\det(\Sigma)^{n/2}} \cdot \exp\left( -\frac{1}{2} \mathrm{Tr}(\Sigma^{-1} A) \right) $$

is the matrix

$$ \hat{\Sigma} = \frac{1}{n} A . $$

d. What is the approximate distribution of the test statistic in this case?


4. (25) Assume the regression model

$$ Y_{i1} = \alpha + \beta x_{i1} + \varepsilon_i , \qquad Y_{i2} = \alpha + \beta x_{i2} + \eta_i $$

for $i = 1, 2, \ldots, n$. In other words the observations come in pairs. Assume that $E(\varepsilon_i) = E(\eta_i) = 0$, $\mathrm{var}(\varepsilon_i) = \mathrm{var}(\eta_i) = \sigma^2$ and $\mathrm{corr}(\varepsilon_i, \eta_i) = \rho \in (-1, 1)$. Assume that the pairs $(\varepsilon_1, \eta_1), \ldots, (\varepsilon_n, \eta_n)$ are independent with the bivariate normal distribution. Furthermore, assume that

$$ \sum_{i,j} x_{ij} = 0 \quad \text{and} \quad \sum_{i=1}^{n} x_{i1} x_{i2} = 0 . $$

Assume that the design matrix

$$ X = \begin{pmatrix} 1 & x_{11} \\ 1 & x_{12} \\ 1 & x_{21} \\ 1 & x_{22} \\ \vdots & \vdots \\ 1 & x_{n1} \\ 1 & x_{n2} \end{pmatrix} $$

is of full rank.

a. Assume the regression model

$$ Y = X\beta + \varepsilon $$

with $E(\varepsilon) = 0$ and $\mathrm{var}(\varepsilon) = \sigma^2 \Sigma$ for a known invertible matrix $\Sigma$ and unknown parameters $\beta$ and $\sigma^2$. Show that the best linear unbiased estimator of $\beta$ is given by

$$ \hat{\beta} = (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} Y . $$

b. Give explicitly the best linear unbiased estimates for $\alpha$ and $\beta$ in the model described in the introduction assuming $\rho$ to be known.

c. For the model described in the introduction ordinary least squares can be used to produce the estimator

$$ \hat{\beta} = (X^T X)^{-1} X^T Y . $$

Show that this estimator is unbiased and define residuals

$$ \hat{\varepsilon}_{ij} = Y_{ij} - \hat{\alpha} - \hat{\beta} x_{ij} . $$

Compute

$$ E\left( \sum_{i,j} \hat{\varepsilon}_{ij}^2 \right) $$

as explicitly as possible.

d. Give an estimate of $\mathrm{var}(\hat{\alpha})$ and $\mathrm{var}(\hat{\beta})$.


MATHEMATICAL STATISTICS

Take-home final examination

July 3rd-July 10th, 2017

Instructions

You do not need to edit the solutions. Just make sure the handwriting is legible. The final solutions should be your work. The deadline for completion is July 10th, 2017 by 4pm. Turn in your solutions to Petra Vranjes or send me a scanned version of solutions. For any questions contact me by e-mail or call me at +386 41 725 497.

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature: .


Mathematical Statistics Final examination

1. (25) Often it is difficult to obtain honest answers from sample subjects to questions like “Have you ever used heroin” or “Have you ever cheated on an exam”. To reduce bias, the method of randomized response is used. The sample subject is given one of the two statements below at random:

(1) “I have property A.”
(2) “I do not have property A.”

The subject responds YES or NO to the given statement. The pollster does not know to which of the two statements the subject is responding. We assume:

• The subjects are a simple random sample of size n from a larger population of size N.
• The statements are assigned to the chosen subjects independently.
• The assignment of statements is independent of the sampling procedure.
• The subjects respond honestly to the statements they are given.

Let

• p be the probability that a subject will be assigned the statement (1). This probability is known and is part of the design.
• q be the proportion of subjects in the population with property A.
• r be the probability that a randomly selected subject responds YES to the statement assigned.
• R be the proportion of subjects in the sample who respond YES.

a. Justify that the probability that a randomly selected subject in the population responds YES to the statement assigned is equal for all subjects. Express this probability with p and q. Show that R is an unbiased estimate of r. Take into account that the assignment of statements is independent of the selection procedure.

b. Suggest an unbiased estimator of q. When is this possible? Express the variance of the estimator with var(R).

c. Let $N_A$ be the random number of sample subjects with property A, and let $N_Y$ be the random number of sample subjects who respond YES. Compute $E(N_Y \mid N_A)$ and $\operatorname{var}(N_Y \mid N_A)$.

d. Compute var(R). Give the standard error for the unbiased estimate of q.
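The sampling design described in this problem can be simulated to see how R behaves over repeated surveys. The sketch below assumes NumPy; the values of N, n, p and q, the number of repetitions and the helper `one_survey` are hypothetical choices made only for illustration. The quantity `r` uses the law of total probability over the two statements, which is one way to approach part a., and the simulation is only a sanity check, not a derivation.

```python
import numpy as np

rng = np.random.default_rng(1)

N, n = 10_000, 500   # population size and sample size (hypothetical)
p = 0.7              # design probability of being assigned statement (1)
q = 0.2              # population proportion with property A (unknown in practice)

# Fixed population in which exactly round(q * N) subjects have property A.
has_A = np.zeros(N, dtype=bool)
has_A[: round(q * N)] = True

def one_survey():
    """Run one randomized-response survey and return R, the proportion of YES."""
    sample = rng.choice(N, size=n, replace=False)   # simple random sample
    gets_statement_1 = rng.random(n) < p            # independent assignment
    # Honest response: YES exactly when the assigned statement is true.
    says_yes = np.where(gets_statement_1, has_A[sample], ~has_A[sample])
    return says_yes.mean()

R_values = np.array([one_survey() for _ in range(2000)])

# P(YES) for a randomly selected subject, by conditioning on the statement.
r = p * q + (1 - p) * (1 - q)
print(R_values.mean(), r)
```

The average of the simulated values of R should be close to r, which is what unbiasedness predicts.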


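2. (25) Assume the data pairs $(x_1, y_1), \ldots, (x_n, y_n)$ are an i.i.d. sample from the distribution with density

$$f(x, y) = \frac{\sqrt{v_{11}v_{22} - v_{12}^2}}{2\pi} \exp\left(-\frac{1}{2}\left(v_{11}x^2 + 2v_{12}xy + v_{22}y^2\right)\right) .$$

In other words, the observed pairs are a sample from the bivariate normal distribution with expectation µ = 0 and covariance matrix

$$\Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{12} & \sigma_{22} \end{pmatrix} = \begin{pmatrix} v_{11} & v_{12} \\ v_{12} & v_{22} \end{pmatrix}^{-1} .$$

Denote

$$V = \begin{pmatrix} v_{11} & v_{12} \\ v_{12} & v_{22} \end{pmatrix} .$$

a. Find the maximum likelihood estimates of the parameters $v_{11}$, $v_{12}$, $v_{22}$.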

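b. Show that the Fisher information matrix is of the form

$$F = \frac{1}{2(v_{11}v_{22} - v_{12}^2)^2} \begin{pmatrix} v_{22}^2 & v_{12}^2 & -2v_{22}v_{12} \\ v_{12}^2 & v_{11}^2 & -2v_{11}v_{12} \\ -2v_{22}v_{12} & -2v_{11}v_{12} & 2(v_{11}v_{22} + v_{12}^2) \end{pmatrix} .$$

Give approximate standard errors for the maximum likelihood estimators $\hat v_{11}$, $\hat v_{22}$ and $\hat v_{12}$.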

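c. Show that

$$\det(F) = \frac{1}{4[\det(V)]^3} .$$

Show that the Fisher information matrix can also be written as

$$F = \frac{1}{2} \begin{pmatrix} \sigma_{22}^2 & \sigma_{12}^2 & -2\sigma_{22}\sigma_{12} \\ \sigma_{12}^2 & \sigma_{11}^2 & -2\sigma_{11}\sigma_{12} \\ -2\sigma_{22}\sigma_{12} & -2\sigma_{11}\sigma_{12} & 2(\sigma_{11}\sigma_{22} + \sigma_{12}^2) \end{pmatrix} .$$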

d. Give approximate standard errors for the maximum likelihood estimators $\hat\sigma_{11}$, $\hat\sigma_{22}$ and $\hat\sigma_{12}$ of the parameters $\sigma_{11}$, $\sigma_{22}$ and $\sigma_{12}$. You can either use the Fisher information or do it directly.
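The determinant identity for the Fisher information written in terms of $v_{11}$, $v_{22}$, $v_{12}$ can be checked numerically before attempting a proof. The sketch below assumes NumPy; the function name `fisher_information` and the numerical values are arbitrary illustrative choices. It builds F from the v-form stated in this problem, with the parameters ordered as $(v_{11}, v_{22}, v_{12})$, and compares both sides of the identity.

```python
import numpy as np

def fisher_information(v11, v22, v12):
    """Fisher information for (v11, v22, v12), using the v-form stated in the problem."""
    d = v11 * v22 - v12**2   # det(V)
    M = np.array([
        [v22**2,         v12**2,         -2 * v22 * v12],
        [v12**2,         v11**2,         -2 * v11 * v12],
        [-2 * v22 * v12, -2 * v11 * v12, 2 * (v11 * v22 + v12**2)],
    ])
    return M / (2 * d**2)

# Arbitrary values for which V is positive definite.
v11, v22, v12 = 2.0, 3.0, 0.5
F = fisher_information(v11, v22, v12)
det_V = v11 * v22 - v12**2

lhs = np.linalg.det(F)
rhs = 1 / (4 * det_V**3)
print(lhs, rhs)
```

Agreement at one point is not a proof, but it is a quick way to catch algebra slips.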


3. (25) Assume the data $x_1, \ldots, x_n$ are an i.i.d. sample from the multivariate normal distribution with parameters µ and Σ. The problem is to test the hypothesis

$$H_0 : \mu = k\mu_0 \quad\text{versus}\quad H_1 : \mu \text{ and } \mu_0 \text{ are not collinear.}$$

a. Assume that Σ is known and invertible. Find the likelihood ratio test statistic for the above testing problem. What is the approximate distribution of the test statistic if $H_0$ holds?

b. Find the exact distribution of the test statistic from a. if H0 is true.

c. Assume Σ is unknown but assumed to be invertible. Find the likelihood ratio statistic in this case. You may assume the following: if A (p × p) is a given symmetric positive definite matrix, then the positive definite matrix Σ that maximizes the expression

$$\frac{1}{\det(\Sigma)^{n/2}} \cdot \exp\left(-\frac{1}{2}\operatorname{Tr}\left(\Sigma^{-1}A\right)\right)$$

is the matrix

$$\Sigma = \frac{1}{n}A .$$
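The maximization fact quoted in part c. can be illustrated numerically. The following sketch assumes NumPy; the matrix A, the dimensions and the helper `log_objective` are hypothetical. It evaluates the logarithm of the expression at A/n and at randomly perturbed positive definite matrices, and none of the perturbed values should exceed the value at A/n.

```python
import numpy as np

def log_objective(Sigma, A, n):
    """Logarithm of det(Sigma)^(-n/2) * exp(-Tr(Sigma^{-1} A) / 2)."""
    _, logdet = np.linalg.slogdet(Sigma)
    return -0.5 * n * logdet - 0.5 * np.trace(np.linalg.solve(Sigma, A))

rng = np.random.default_rng(2)
p, n = 3, 20
B = rng.normal(size=(n, p))
A = B.T @ B   # symmetric and, almost surely, positive definite

best = log_objective(A / n, A, n)

# Candidates of the form A/n + E E^T remain positive definite.
others = []
for _ in range(200):
    E = 0.1 * rng.normal(size=(p, p))
    candidate = A / n + E @ E.T
    others.append(log_objective(candidate, A, n))

print(best, max(others))
```

Working on the log scale avoids overflow and underflow in the determinant and the exponential.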

d. What is the approximate distribution of the test statistic in this case?


4. (25) Assume the usual linear regression model with

$$Y = X\beta + \varepsilon$$

where X is fixed and known and $E(\varepsilon) = 0$ and $\operatorname{var}(\varepsilon) = \sigma^2 I$.

For a given n-dimensional vector $a \neq 0$ we would like to find the best linear unbiased estimator $\hat v = LY$ with the property $E(LY) = 0$ of the quantity $v = a^T\varepsilon$ in the sense that the mean squared error

$$E\left[(\hat v - v)^2\right]$$

is as small as possible.

a. Show that $LY = a^T\left(Y - X\hat\beta\right)$ is the best estimator with the required properties. Here $\hat\beta$ is the least squares estimator of the parameter β.

b. Let $\hat\beta$ be the least squares estimator of β. Compute

$$E\left[\left(a^T\hat\varepsilon - a^T\varepsilon\right)^2\right] ,$$

where $\hat\varepsilon = (I - H)Y$ and $H = X\left(X^TX\right)^{-1}X^T$.
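The objects in part b. are easy to compute for a small example, which can help when checking hand calculations. The sketch below assumes NumPy; the dimensions and the simulated X and Y are hypothetical. It forms the hat matrix H, the least squares estimator and the residual vector, so that the standard properties of H (symmetry, idempotence, trace equal to the number of columns of X) can be inspected.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 8, 3

# Design matrix with an intercept column; full rank almost surely.
X = np.column_stack([np.ones(n), rng.normal(size=(n, m - 1))])
Y = rng.normal(size=n)

# Hat matrix H = X (X^T X)^{-1} X^T, computed via a linear solve.
H = X @ np.linalg.solve(X.T @ X, X.T)

# Least squares estimator and residuals (I - H) Y.
beta_hat = np.linalg.lstsq(X, Y, rcond=None)[0]
residuals = (np.eye(n) - H) @ Y

print(np.trace(H))                              # should be close to m
print(np.max(np.abs(residuals - (Y - X @ beta_hat))))
```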

c. Let b be an m-dimensional fixed vector and define

$$v = a^T\varepsilon + b^T\beta .$$

We would like to find the linear estimator $\hat v = LY$ of v such that $E(\hat v) = b^T\beta$ and the mean squared error

$$E\left[(\hat v - v)^2\right]$$

is as small as possible. Show first that for linear estimators with the given properties

$$E\left[(LY - v)^2\right] = E\left[(LY - E(v \mid Y))^2\right] + E\left[(E(v \mid Y) - v)^2\right] .$$

d. Assume ε is multivariate normal, which means that conditional expectations are linear. Use c. to argue that $\hat v = a^T\hat\varepsilon + b^T\hat\beta$ is the best linear estimator of v in the above sense. Is the assumption about normality essential?


MATHEMATICAL STATISTICS

Take-home final examination

February 19th-February 26th, 2018

Instructions

You do not need to edit the solutions. Just make sure the handwriting is legible. The final solutions should be your work. The deadline for completion is February 26th, 2018 by 4pm. Turn in your solutions to Petra Vranjes or Lidija Urek or send me a scanned version of solutions. For any questions contact me by e-mail or call me at +386 41 725 497.

Statement: With my signature I confirm that the solutions are the product of my own work. Name: Signature: .


Mathematical Statistics Final examination

1. (25) Often it is difficult to obtain honest answers from sample subjects to questions like “Have you ever used heroin” or “Have you ever cheated on an exam”. To reduce bias, the method of randomized response is used. The sample subject is given one of the two statements below at random:

(1) “I have property A.”
(2) “I do not have property A.”

The subject responds YES or NO to the given statement. The pollster does not know to which of the two statements the subject is responding. We assume:

• The subjects are a simple random sample of size n from a larger population of size N.
• The statements are assigned to the chosen subjects independently.
• The assignment of statements is independent of the sampling procedure.
• The subjects respond honestly to the statements they are given.

Let

• p be the probability that a subject will be assigned the statement (1). This probability is known and is part of the design.
• q be the proportion of subjects in the population with property A.
• r be the probability that a randomly selected subject responds YES to the statement assigned.
• R be the proportion of subjects in the sample who respond YES.

a. Justify that the probability that a randomly selected subject in the population responds YES to the statement assigned is equal for all subjects. Express this probability with p and q. Show that R is an unbiased estimate of r. Take into account that the assignment of statements is independent of the selection procedure.

b. Suggest an unbiased estimator of q. When is this possible? Express the variance of the estimator with var(R).

c. Let $N_A$ be the random number of sample subjects with property A, and let $N_Y$ be the random number of sample subjects who respond YES. Compute $E(N_Y \mid N_A)$ and $\operatorname{var}(N_Y \mid N_A)$.

d. Compute var(R). Give the standard error for the unbiased estimate of q.


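2. (25) Assume the data pairs $(x_1, y_1), \ldots, (x_n, y_n)$ are an i.i.d. sample from the bivariate normal distribution with parameters

$$\mu = \begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} \quad\text{and}\quad \Sigma = \begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix} .$$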

a. Find the maximum likelihood estimators of the parameters. Fix the estimators if necessary so that they will be unbiased and compute their variances.
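For orientation, the two versions of the covariance estimator that part a. contrasts can be computed side by side. This is a minimal sketch assuming NumPy, with hypothetical parameter values and simulated data. It uses the standard fact that the maximum likelihood estimator of Σ divides the centered sum of products by n, while the usual unbiased version divides by n − 1; the derivation and the variances asked for in the problem are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical true parameters used only to simulate data.
mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.6], [0.6, 1.0]])
n = 50
data = rng.multivariate_normal(mu, Sigma, size=n)   # row i is (x_i, y_i)

mu_hat = data.mean(axis=0)                          # sample mean vector
centered = data - mu_hat

Sigma_mle = centered.T @ centered / n               # divisor n
Sigma_unbiased = centered.T @ centered / (n - 1)    # divisor n - 1

print(mu_hat)
print(Sigma_mle)
print(Sigma_unbiased)
```

The same matrices are returned by `np.cov(data, rowvar=False, bias=True)` and `np.cov(data, rowvar=False)` respectively.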

b. Suppose the parameters $\mu_1$, $\sigma_{11}$ and $\sigma_{12}$ are known. Can you use this information to improve the estimator of $\mu_2$? Compute the variance of the improved estimator.

c. Repeat the argument if only $\mu_1$ and $\sigma_{11}$ are known. What would you do? Can you compute the variance of the new estimator?

d. Suppose the parameters $\mu_1$, $\mu_2$, $\sigma_{11}$ and $\sigma_{12}$ are known. Can you give an improved estimate of $\sigma_{22}$? Prove that it is better than the maximum likelihood estimator.


3. (25) Assume the data $x_1, \ldots, x_n$ are an i.i.d. sample from the multivariate normal distribution with parameters µ and Σ. The problem is to test the hypothesis

$$H_0 : \mu = k\mu_0 \quad\text{versus}\quad H_1 : \mu \text{ and } \mu_0 \text{ are not collinear.}$$

a. Assume that Σ is known and invertible. Find the likelihood ratio test statistic for the above testing problem. What is the approximate distribution of the test statistic if $H_0$ holds?

b. Find the exact distribution of the test statistic from a. if H0 is true.

c. Assume Σ is unknown but assumed to be invertible. Find the likelihood ratio statistic in this case. You may assume the following: if A (p × p) is a given symmetric positive definite matrix, then the positive definite matrix Σ that maximizes the expression

$$\frac{1}{\det(\Sigma)^{n/2}} \cdot \exp\left(-\frac{1}{2}\operatorname{Tr}\left(\Sigma^{-1}A\right)\right)$$

is the matrix

$$\Sigma = \frac{1}{n}A .$$

d. What is the approximate distribution of the test statistic in this case?


4. (25) Assume the usual linear regression model with

$$Y = X\beta + \varepsilon$$

where X is fixed and known and $E(\varepsilon) = 0$ and $\operatorname{var}(\varepsilon) = \sigma^2 I$.

For a given n-dimensional vector $a \neq 0$ we would like to find the best linear unbiased estimator $\hat v = LY$ with the property $E(LY) = 0$ of the quantity $v = a^T\varepsilon$ in the sense that the mean squared error

$$E\left[(\hat v - v)^2\right]$$

is as small as possible.

a. Show that $LY = a^T\left(Y - X\hat\beta\right)$ is the best estimator with the required properties. Here $\hat\beta$ is the least squares estimator of the parameter β.

b. Let $\hat\beta$ be the least squares estimator of β. Compute

$$E\left[\left(a^T\hat\varepsilon - a^T\varepsilon\right)^2\right] ,$$

where $\hat\varepsilon = (I - H)Y$ and $H = X\left(X^TX\right)^{-1}X^T$.

c. Let b be an m-dimensional fixed vector and define

$$v = a^T\varepsilon + b^T\beta .$$

We would like to find the linear estimator $\hat v = LY$ of v such that $E(\hat v) = b^T\beta$ and the mean squared error

$$E\left[(\hat v - v)^2\right]$$

is as small as possible. Show first that for linear estimators with the given properties

$$E\left[(LY - v)^2\right] = E\left[(LY - E(v \mid Y))^2\right] + E\left[(E(v \mid Y) - v)^2\right] .$$

d. Assume ε is multivariate normal, which means that conditional expectations are linear. Use c. to argue that $\hat v = a^T\hat\varepsilon + b^T\hat\beta$ is the best linear estimator of v in the above sense. Is the assumption about normality essential?
