Christopher Dougherty

EC220 - Introduction to econometrics (chapter 8) Slideshow: measurement error

Original citation:

Dougherty, C. (2012) EC220 - Introduction to econometrics (chapter 8). [Teaching Resource]

© 2012 The Author

This version available at: http://learningresources.lse.ac.uk/134/

Available in LSE Learning Resources Online: May 2012

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 License. This license allows the user to remix, tweak, and build upon the work even for commercial purposes, as long as the user credits the author and licenses their new creations under the identical terms. http://creativecommons.org/licenses/by-sa/3.0/

http://learningresources.lse.ac.uk/

MEASUREMENT ERROR

$$Y = \beta_1 + \beta_2 Z + v$$

$$X = Z + w$$

In this sequence we will investigate the consequences of measurement errors in the variables in a regression model. To keep the analysis simple, we will confine it to the simple regression model.

We will start with measurement errors in the explanatory variable. Suppose that Y is determined by a variable Z, but Z is subject to measurement error, w. We will denote the measured explanatory variable X.

$$Y = \beta_1 + \beta_2 (X - w) + v = \beta_1 + \beta_2 X + v - \beta_2 w$$

Substituting for Z from the second equation, we can rewrite the model as shown.

$$Y = \beta_1 + \beta_2 X + u$$

We are thus able to express Y as a linear function of the observable variable X, with the disturbance term being a compound of the disturbance term in the original model and the measurement error.

$$u = v - \beta_2 w$$
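The substitution can be checked numerically. The sketch below generates artificial data and confirms, observation by observation, that Y equals β1 + β2X + u when u = v − β2w. The parameter values, the normal distributions, and the function name `draw_observation` are assumptions made for illustration; none of them comes from the slides.

```python
import random

# Hypothetical parameter values, chosen only for illustration.
BETA1, BETA2 = 1.0, 2.0


def draw_observation(rng):
    """Draw one observation and return (X, Y, u) for the rewritten model."""
    z = rng.gauss(10.0, 2.0)   # true explanatory variable Z
    v = rng.gauss(0.0, 1.0)    # disturbance term in the original model
    w = rng.gauss(0.0, 1.0)    # measurement error in Z
    y = BETA1 + BETA2 * z + v  # Y is generated by Z, not by X
    x = z + w                  # the variable we actually observe
    u = v - BETA2 * w          # compound disturbance term
    return x, y, u


rng = random.Random(0)
max_gap = 0.0
for _ in range(1000):
    x, y, u = draw_observation(rng)
    # Gap between Y and the rewritten model beta1 + beta2 * X + u.
    max_gap = max(max_gap, abs(y - (BETA1 + BETA2 * x + u)))

print(f"largest discrepancy: {max_gap:.2e}")
```

Any discrepancy should be no larger than floating-point rounding error, since the two expressions for Y are algebraically identical.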

However, if we fit this model using OLS, Assumption B.7 will be violated. X has a random component, the measurement error w.

And w is also one of the components of the compound disturbance term. Hence u is not distributed independently of X.
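A small simulation can make the dependence visible. The sketch below compares the sample correlation between Z and v (which should be close to zero) with the sample correlation between X and u (which should not be). The distributions, parameter values, and function names are assumptions made for this illustration only.

```python
import random


def mean(values):
    return sum(values) / len(values)


def correlation(a, b):
    """Sample correlation coefficient of two equal-length lists."""
    ma, mb = mean(a), mean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5


def simulate_correlations(n, beta2=2.0, seed=0):
    """Return (corr(Z, v), corr(X, u)) for one artificial sample."""
    rng = random.Random(seed)
    z = [rng.gauss(10.0, 2.0) for _ in range(n)]  # true variable
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]   # original disturbance
    w = [rng.gauss(0.0, 1.0) for _ in range(n)]   # measurement error
    x = [zi + wi for zi, wi in zip(z, w)]         # X = Z + w
    u = [vi - beta2 * wi for vi, wi in zip(v, w)] # u = v - beta2 * w
    return correlation(z, v), correlation(x, u)


corr_zv, corr_xu = simulate_correlations(20000)
print(f"corr(Z, v) = {corr_zv:.3f}")
print(f"corr(X, u) = {corr_xu:.3f}")
```

With a positive β2 the second correlation should come out negative, because w enters X with a positive sign and u with a negative one.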

$$X = Z + w, \qquad u = v - \beta_2 w$$

We will demonstrate that the OLS estimator of the slope coefficient is inconsistent and that in large samples it is biased downwards if $\beta_2$ is positive, and upwards if $\beta_2$ is negative.
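Before working through the algebra, the claim can be illustrated with artificial data. The sketch below fits OLS to a large sample in which the explanatory variable is measured with error, once with a positive and once with a negative slope. The variances (all equal to 1), the intercept, and the function names are hypothetical choices made for this example.

```python
import random


def ols_slope(x, y):
    """OLS slope coefficient in a simple regression of y on x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    denominator = sum((xi - x_bar) ** 2 for xi in x)
    return numerator / denominator


def slope_with_measurement_error(beta2, n=20000, seed=0):
    """Regress Y on the mismeasured X = Z + w and return b2."""
    rng = random.Random(seed)
    x, y = [], []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)  # true explanatory variable
        v = rng.gauss(0.0, 1.0)  # disturbance term
        w = rng.gauss(0.0, 1.0)  # measurement error
        y.append(1.0 + beta2 * z + v)
        x.append(z + w)
    return ols_slope(x, y)


b2_pos = slope_with_measurement_error(2.0)
b2_neg = slope_with_measurement_error(-2.0)
print(f"true slope  2.0, estimated {b2_pos:.3f}")
print(f"true slope -2.0, estimated {b2_neg:.3f}")
```

In both cases the estimate should lie between zero and the true slope, which is the pattern the derivation goes on to explain.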

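$$
\begin{aligned}
b_2 &= \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2}
= \frac{\sum (X_i - \bar{X})(\beta_2 [X_i - \bar{X}] + u_i - \bar{u})}{\sum (X_i - \bar{X})^2} \\
&= \beta_2 + \frac{\sum (X_i - \bar{X})(u_i - \bar{u})}{\sum (X_i - \bar{X})^2}
= \beta_2 + \frac{\frac{1}{n} \sum (X_i - \bar{X})(u_i - \bar{u})}{\frac{1}{n} \sum (X_i - \bar{X})^2}
\end{aligned}
$$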

We begin by writing down the OLS estimator and substituting for Y from the true model. In this case there are alternative versions of the true model. The analysis is simpler if you use the equation relating Y to X.

Simplifying, we decompose the slope coefficient into the true value and an error term as usual.

We have reached this point many times before. We would like to investigate whether b2 is biased. This means taking the expectation of the error term.

$$b_2 = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \beta_2 + \frac{\sum (X_i - \bar{X})(u_i - \bar{u})}{\sum (X_i - \bar{X})^2}$$

However, it is not possible to obtain a closed-form expression for the expectation of the error term. Both its numerator and its denominator are functions of w and there are no expected value rules that can allow us to simplify.

As a second-best measure, we take plims and investigate what would happen in large samples. The plim quotient rule allows us to write the plim of the error term as the plim of the numerator divided by the plim of the denominator, provided that these plims exist.

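$$\operatorname{plim} b_2 = \beta_2 + \frac{\operatorname{plim} \frac{1}{n} \sum (X_i - \bar{X})(u_i - \bar{u})}{\operatorname{plim} \frac{1}{n} \sum (X_i - \bar{X})^2} = \beta_2 + \frac{\operatorname{cov}(X, u)}{\operatorname{var}(X)}$$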

For the plims to exist, the numerator and denominator must both be divided by n. Otherwise they would increase indefinitely with the sample size.

Having divided by n, it can be shown that the plim of the numerator is cov(X, u) and the plim of the denominator is var(X).
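The role of the division by n can be seen in a small experiment. The sketch below computes the numerator and denominator of the error term for samples of increasing size, first as raw sums and then divided by n. The data-generating values and the function name `sums_for_sample` are assumptions made for illustration.

```python
import random


def sums_for_sample(n, beta2=2.0, seed=0):
    """Return the raw numerator and denominator sums of the error term."""
    rng = random.Random(seed)
    x, u = [], []
    for _ in range(n):
        z = rng.gauss(10.0, 2.0)  # true explanatory variable
        v = rng.gauss(0.0, 1.0)   # original disturbance term
        w = rng.gauss(0.0, 1.0)   # measurement error
        x.append(z + w)
        u.append(v - beta2 * w)
    x_bar = sum(x) / n
    u_bar = sum(u) / n
    numerator = sum((xi - x_bar) * (ui - u_bar) for xi, ui in zip(x, u))
    denominator = sum((xi - x_bar) ** 2 for xi in x)
    return numerator, denominator


results = {}
for n in (400, 4000, 40000):
    num, den = sums_for_sample(n)
    # Store raw sums and their per-observation averages.
    results[n] = (num, den, num / n, den / n)
    print(f"n={n:6d}  raw sums: {num:12.1f} {den:12.1f}  "
          f"divided by n: {num / n:7.3f} {den / n:7.3f}")
```

The raw sums should grow roughly in proportion to n, while the averaged versions should settle down to stable values, which is what allows their plims to exist.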

$$\operatorname{cov}(X, u) = \operatorname{cov}([Z + w], [v - \beta_2 w])$$

We can decompose both the numerator and the denominator of the error term. We will start by substituting for X and u in the numerator.

$$\operatorname{cov}(X, u) = \operatorname{cov}(Z, v) + \operatorname{cov}(w, v) + \operatorname{cov}(Z, -\beta_2 w) + \operatorname{cov}(w, -\beta_2 w)$$

We expand the expression using the first covariance rule.

$$\operatorname{cov}(X, u) = 0 + 0 - 0 - \beta_2 \sigma_w^2 = -\beta_2 \sigma_w^2$$

If we assume that Z, v, and w are distributed independently of each other, the first three terms are 0. The last term gives us $-\beta_2 \sigma_w^2$.
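The decomposition can be checked on artificial data. The sketch below computes each of the four covariance terms separately and compares their sum with the sample covariance of X and u. The parameter values and distributions are hypothetical, and the helper `cov` divides by n rather than n − 1, a choice made here only for simplicity.

```python
import random


def cov(a, b):
    """Sample covariance (dividing by n) of two equal-length lists."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


BETA2 = 2.0    # hypothetical slope coefficient
SIGMA_W = 1.0  # hypothetical standard deviation of the measurement error

rng = random.Random(0)
n = 20000
z = [rng.gauss(10.0, 2.0) for _ in range(n)]
v = [rng.gauss(0.0, 1.0) for _ in range(n)]
w = [rng.gauss(0.0, SIGMA_W) for _ in range(n)]
x = [zi + wi for zi, wi in zip(z, w)]          # X = Z + w
u = [vi - BETA2 * wi for vi, wi in zip(v, w)]  # u = v - beta2 * w

# The four terms from the first covariance rule, in the order used above.
terms = [
    cov(z, v),
    cov(w, v),
    -BETA2 * cov(z, w),
    -BETA2 * cov(w, w),
]
cov_xu = cov(x, u)

for label, value in zip(["cov(Z,v)", "cov(w,v)", "cov(Z,-b2*w)", "cov(w,-b2*w)"], terms):
    print(f"{label:>13}: {value:8.4f}")
print(f"{'sum':>13}: {sum(terms):8.4f}")
print(f"{'cov(X,u)':>13}: {cov_xu:8.4f}")
```

The sum of the four terms matches cov(X, u) exactly in any sample (up to rounding), while the first three terms are only approximately zero because they are sample rather than population covariances.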

We next expand the denominator of the error term. The first two terms are variances. The covariance is 0 if we assume w is distributed independently of Z.

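$$\operatorname{var}(X) = \operatorname{var}(Z + w) = \operatorname{var}(Z) + \operatorname{var}(w) + 2 \operatorname{cov}(Z, w) = \sigma_Z^2 + \sigma_w^2 + 0$$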

$$\operatorname{plim} b_2 = \beta_2 + \frac{\operatorname{cov}(X, u)}{\operatorname{var}(X)} = \beta_2 - \frac{\beta_2 \sigma_w^2}{\sigma_Z^2 + \sigma_w^2}$$

Thus in large samples, b2 is biased towards 0 and the size of the bias depends on the relative sizes of the variances of w and Z.
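As a worked illustration, suppose $\beta_2 = 2$, $\sigma_Z^2 = 4$ and $\sigma_w^2 = 1$. These numbers are hypothetical and chosen only to make the arithmetic easy; they do not come from the slides.

```latex
\operatorname{plim} b_2
  = \beta_2 - \frac{\beta_2 \sigma_w^2}{\sigma_Z^2 + \sigma_w^2}
  = 2 - \frac{2 \times 1}{4 + 1}
  = 1.6
```

If the measurement error variance were instead as large as the variance of Z ($\sigma_w^2 = 4$), the same expression would give $2 - 8/8 = 1$, so the large-sample estimate would be only half the true slope.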

Since b2 is an inconsistent estimator, it is safe to assume that it is biased in finite samples as well.
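A Monte Carlo experiment can give a feel for the finite-sample behaviour, although it illustrates rather than proves the point. The sketch below repeatedly draws small samples (n = 20), fits OLS with the mismeasured regressor, and averages the slope estimates. The true slope of 1, the unit variances, the normal distributions, and the number of replications are all assumptions made for this example.

```python
import random


def ols_slope(x, y):
    """OLS slope coefficient in a simple regression of y on x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    denominator = sum((xi - x_bar) ** 2 for xi in x)
    return numerator / denominator


def mean_slope(n, replications, beta2=1.0, seed=0):
    """Average b2 over many small samples with measurement error in X."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(replications):
        x, y = [], []
        for _ in range(n):
            z = rng.gauss(0.0, 1.0)  # true explanatory variable
            v = rng.gauss(0.0, 1.0)  # disturbance term
            w = rng.gauss(0.0, 1.0)  # measurement error
            x.append(z + w)
            y.append(1.0 + beta2 * z + v)
        total += ols_slope(x, y)
    return total / replications


average_b2 = mean_slope(n=20, replications=5000)
print(f"true slope 1.0, average estimate over replications: {average_b2:.3f}")
```

In this particular setup the average estimate should fall well below the true slope even though each sample has only 20 observations.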

If our assumptions concerning Z, v, and w are incorrect, b2 would almost certainly still be a biased estimator, but the expression for the bias would be more complicated.
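To see why, one can retain the covariance terms that were set to zero above instead of dropping them. The expression below is a sketch obtained by doing exactly that in the expansions of cov(X, u) and var(X); it assumes only that the relevant plims exist, and it is not taken from the slides.

```latex
\operatorname{plim} b_2
  = \beta_2 + \frac{\operatorname{cov}(Z, v) + \operatorname{cov}(w, v)
      - \beta_2 \operatorname{cov}(Z, w) - \beta_2 \sigma_w^2}
     {\sigma_Z^2 + \sigma_w^2 + 2 \operatorname{cov}(Z, w)}
```

Setting the three covariances to zero recovers the simpler expression for the case where Z, v, and w are distributed independently of each other.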

A further consequence of the violation of Assumption B.7 is that the standard errors, t tests and F test are invalid.

$$Q = \beta_1 + \beta_2 X + v$$

$$Y = Q + r$$

Measurement error in the dependent variable has less serious consequences. Suppose that the true dependent variable is Q, that the measured variable is Y, and that the measurement error is r.

$$Y - r = \beta_1 + \beta_2 X + v$$

We can rewrite the model in terms of the observable variables by substituting for Q from the second equation.

$$Y = \beta_1 + \beta_2 X + v + r = \beta_1 + \beta_2 X + u, \qquad u = v + r$$

In this case the presence of the measurement error does not lead to a violation of Assumption B.7. If v satisfies that assumption in the original model, u will satisfy it in the revised one, unless for some strange reason r is not distributed independently of X.
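The contrast with the previous case can be illustrated by simulation. In the sketch below the explanatory variable is measured correctly and the measurement error r is added to the dependent variable only, with r drawn independently of X. All numerical values and function names are hypothetical.

```python
import random


def ols_slope(x, y):
    """OLS slope coefficient in a simple regression of y on x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    denominator = sum((xi - x_bar) ** 2 for xi in x)
    return numerator / denominator


def slope_with_error_in_y(n=20000, beta2=2.0, seed=0):
    """Regress the mismeasured Y = Q + r on a correctly measured X."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(10.0, 2.0)   # explanatory variable, no measurement error
        v = rng.gauss(0.0, 1.0)    # disturbance term
        r = rng.gauss(0.0, 1.0)    # measurement error in the dependent variable
        q = 1.0 + beta2 * x + v    # true dependent variable Q
        xs.append(x)
        ys.append(q + r)           # measured dependent variable Y
    return ols_slope(xs, ys)


b2 = slope_with_error_in_y()
print(f"true slope 2.0, estimated {b2:.3f}")
```

With a large sample the estimate should be close to the true slope, in contrast to the attenuation seen when the error is in the explanatory variable.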

The standard errors and tests will remain valid. However, the standard errors will tend to be larger than they would have been if there had been no measurement error, reflecting the fact that the variances of the coefficients are larger.

$$\sigma_{b_2}^2 = \frac{\sigma_u^2}{n \sigma_X^2} = \frac{\sigma_v^2 + \sigma_r^2}{n \sigma_X^2}$$
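The effect on the variance of the slope estimator can be illustrated with a Monte Carlo sketch. It estimates the variance of b2 across many samples, with and without measurement error in the dependent variable, and prints the values implied by the variance expression for comparison. The sample size, variances, replication count, and function names are assumptions for this example, and the variance expression is treated here as a large-sample approximation when X is itself random.

```python
import random


def ols_slope(x, y):
    """OLS slope coefficient in a simple regression of y on x."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    denominator = sum((xi - x_bar) ** 2 for xi in x)
    return numerator / denominator


def slope_variance(sigma_r, n=50, replications=2000, seed=0):
    """Empirical variance of b2 across replications for a given sd of r."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(replications):
        xs, ys = [], []
        for _ in range(n):
            x = rng.gauss(0.0, 2.0)      # sigma_X^2 = 4
            v = rng.gauss(0.0, 1.0)      # sigma_v^2 = 1
            r = rng.gauss(0.0, sigma_r)  # measurement error in Y
            xs.append(x)
            ys.append(1.0 + 2.0 * x + v + r)
        slopes.append(ols_slope(xs, ys))
    slope_mean = sum(slopes) / len(slopes)
    return sum((b - slope_mean) ** 2 for b in slopes) / len(slopes)


var_without = slope_variance(sigma_r=0.0)
var_with = slope_variance(sigma_r=1.0)

# Values implied by (sigma_v^2 + sigma_r^2) / (n * sigma_X^2).
formula_without = (1.0 + 0.0) / (50 * 4.0)
formula_with = (1.0 + 1.0) / (50 * 4.0)

print(f"no error in Y:   simulated {var_without:.5f}, formula {formula_without:.5f}")
print(f"with error in Y: simulated {var_with:.5f}, formula {formula_with:.5f}")
```

The simulated variance with measurement error should be noticeably larger than without it, in line with the extra $\sigma_r^2$ term in the numerator.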

Copyright Christopher Dougherty 2011.

These slideshows may be downloaded by anyone, anywhere for personal use. Subject to respect for copyright and, where appropriate, attribution, they may be used as a resource for teaching an econometrics course. There is no need to refer to the author.

The content of this slideshow comes from Section 8.4 of C. Dougherty, Introduction to Econometrics, fourth edition 2011, Oxford University Press. Additional (free) resources for both students and instructors may be downloaded from the OUP Online Resource Centre http://www.oup.com/uk/orc/bin/9780199567089/.

Individuals studying econometrics on their own who feel that they might benefit from participation in a formal course should consider the London School of Economics summer school course EC212 Introduction to Econometrics http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx or the University of London International Programmes distance learning course 20 Elements of Econometrics www.londoninternational.ac.uk/lse.
