Post on 28-Dec-2015


[Part 3: Common Effects ] 1/57

Econometric Analysis of Panel Data

William Greene

Department of Economics

Stern School of Business

[Part 3: Common Effects ] 2/57

Benefits of Panel Data

Time and individual variation in behavior unobservable in cross sections or aggregate time series
Observable and unobservable individual heterogeneity
Rich hierarchical structures
More complicated models
Features that cannot be modeled with only cross section or aggregate time series data alone
Dynamics in economic behavior

[Part 3: Common Effects ] 3/57

Short Term Agenda for Simple Effects Models

Models with individual effects
  Interpretation of models
  Computation (practice) and estimation (theory)
Extensions
  Nonstandard panels: Rotating, Pseudo-, Nested
  Generalizing the regression model
  Alternative estimators
Methods
  Least squares: OLS, GLS, FGLS
  MLE and Maximum Simulated Likelihood

[Part 3: Common Effects ] 4/57

Fixed and Random Effects

Unobserved individual effects in regression: E[yit | xit, ci]

Notation:
  $y_{it} = x_{it}'\beta + c_i + \varepsilon_{it}$
  $X_i = [x_{i1}, x_{i2}, \ldots, x_{iT_i}]'$, with $T_i$ rows and K columns

Linear specification:
  Fixed Effects: E[ci | Xi ] = g(Xi). Cov[xit,ci] ≠ 0; effects are correlated with included variables.
  Random Effects: E[ci | Xi ] = μ; effects are uncorrelated with included variables. If Xi contains a constant term, μ = 0 WLOG. Common: Cov[xit,ci] = 0, but E[ci | Xi ] = μ is needed for the full model.

[Part 3: Common Effects ] 5/57

Convenient Notation

Fixed Effects – the ‘dummy variable model’
  $y_{it} = \alpha_i + x_{it}'\beta + \varepsilon_{it}$
  $\alpha_i$: individual specific constant terms.

Random Effects – the ‘error components model’
  $y_{it} = x_{it}'\beta + \varepsilon_{it} + u_i$
  $\varepsilon_{it} + u_i$: compound (“composed”) disturbance

[Part 3: Common Effects ] 6/57

Balanced and Unbalanced Panels

Distinction: Balanced vs. Unbalanced Panels
A notation to help with mechanics: zi,t, i = 1,…,N; t = 1,…,Ti
The role of the assumption
  Mathematical and notational convenience:
    Balanced: n = NT
    Unbalanced: $n = \sum_{i=1}^{N} T_i$
  Is the fixed Ti assumption ever necessary? Almost never. (Baltagi chapter 9 is about algebra, not different models!)
  Is unbalancedness due to nonrandom attrition from an otherwise balanced panel? This will require special considerations.

[Part 3: Common Effects ] 7/57

An Unbalanced Panel: RWM’s GSOEP Data on Health Care

N = 7,293 Households

Some households exited then returned

[Part 3: Common Effects ] 8/57

Exogeneity

Contemporaneous exogeneity: E[εit|xit,ci] = 0
  Not sufficient for regression
  Doesn’t imply how to estimate β
Strict exogeneity – the most common assumption: E[εit|xi1, xi2,…,xiT,ci] = 0
  Can use first difference or fixed effects
  Cannot hold if xit contains lagged values of yit
Sequential exogeneity? E[εit|xi1, xi2,…,xit,ci] = 0

These assumptions are not testable. They are part of the model.

[Part 3: Common Effects ] 9/57

Assumptions for Asymptotics

Convergence of moments involving cross section Xi.
  N increasing, T or Ti assumed fixed.
  “Fixed T asymptotics” (see text, p. 175)
  Time series characteristics are not relevant (may be nonstationary)
  If T is also growing, need to treat as multivariate time series.
Ranks of matrices. X must have full column rank. (Xi may not, if Ti < K.)
Strict exogeneity and dynamics. If xit contains yi,t-1 then xit cannot be strictly exogenous. xit will be correlated with the unobservables in period t-1. (To be revisited later.)
Empirical characteristics of microeconomic data

[Part 3: Common Effects ] 10/57

Estimating β

β is the partial effect of interest
Can it be estimated (consistently) in the presence of (unmeasured) ci?
  Does pooled least squares “work?”
  Strategies for “controlling for ci” using the sample data
  Using a proxy variable.

[Part 3: Common Effects ] 11/57

The Pooled Regression

Presence of omitted effects

$y_{it} = x_{it}'\beta + c_i + \varepsilon_{it}$, observation for person i at time t
$y_i = X_i\beta + c_i\mathbf{i} + \varepsilon_i$, $T_i$ observations in group i ($\mathbf{i}$ is a column of ones)
$\quad = X_i\beta + \mathbf{c}_i + \varepsilon_i$, note $\mathbf{c}_i = (c_i, c_i, \ldots, c_i)'$
$y = X\beta + \mathbf{c} + \varepsilon$, $\sum_{i=1}^{N} T_i$ observations in the sample

Potential bias/inconsistency of OLS – depends on ‘fixed’ or ‘random’

[Part 3: Common Effects ] 12/57

[Part 3: Common Effects ] 13/57

Most Helpful Customer Reviews

31 of 39 people found the following review helpful
Too theoretical and poorly written
By Doktor Faustus on May 7, 2013
Format: Hardcover

"Econometric Analysis" by William Greene is one of the more widely used graduate-level textbooks in econometrics. I used it in my first year PhD econometrics course. This is unfortunate for several reasons. The book states that its first objective is to introduce students to applied econometrics, especially the basic techniques of linear regression. When reading the book, however, what the reader notices first is that the applications are essentially just footnotes; the meat of each chapter is dense econometric theory. An applied textbook would focus on working with data, but Greene's book has exercises that focus on proving obscure statistical properties (e.g., prove that the asymptotic variance of various estimators goes to zero). Useful for theorists, but not for applied work, which is what the book advertises itself as.

Another problem with the book is its impenetrable text. Reading this book is drudgery even when not trying to make sense of the absurdly huge matrix equations. Greene uses academic, elevated language that does not belong in a technical textbook. Where the student needs clear explanation, he instead reads sentences like the following found in a chapter introduction: "We first consider the consequences for the least squares estimator of the more general form of the regression model. This will include assessing the effect of ignoring the complication of the generalized model and of devising an appropriate estimation strategy, still based on least squares". After reading that second sentence several times I still don't understand what Greene is trying to convey.

Finally the book is much too large and expensive for a class textbook. The book is 1200 pages long and includes numerous asides in every chapter. If the objective of the book is to teach econometrics to graduate students (as it says in the book), then it would be better off focusing on important topics and applications, not on topics that are never used by the vast majority of economists. I do not recommend this book for anyone; there are better econometrics textbooks available for undergraduates, graduate students, and professionals.

[Part 3: Common Effects ] 14/57

October 13, 2014
By Daniel Pulido
This review is from: Econometric Analysis (7th Edition) (Hardcover)

The delivery was fine. But the book itself is the worst Econometric Analysis book I have ever come across. No examples. Only a continuous list of theorems. I would not recommend anyone this book.

[Part 3: Common Effects ] 15/57

A Popular Misconception

If only one variable in X is correlated with ε, the other coefficients are consistently estimated. False.

Suppose only the first variable is correlated with ε. Under the assumptions,
$\operatorname{plim}(X'\varepsilon/n) = (\sigma_1, 0, \ldots, 0)'$. Then
$\operatorname{plim} b - \beta = \operatorname{plim}(X'X/n)^{-1}(\sigma_1, 0, \ldots, 0)' = \sigma_1 (q^{11}, q^{21}, \ldots, q^{K1})'$
$= \sigma_1$ times the first column of $Q^{-1}$, where $Q = \operatorname{plim}(X'X/n)$.

The problem is “smeared” over the other coefficients.
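The smearing result can be checked numerically. The following is a minimal sketch, not part of the lecture: the data generating process, the sample size, and all variable names are assumptions chosen so that the probability limits can be worked out by hand. Only x1 is correlated with the disturbance (the constant is column 0 here, so x1 plays the role of the "first variable"), yet the coefficient on x2 is inconsistent as well.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical design: w, h, v, e are independent standard normals.
w, h, v, e = rng.normal(size=(4, n))
x1 = 0.6 * w + h + v      # correlated with eps through h
x2 = w                    # correlated with x1, uncorrelated with eps
eps = h + e

beta = np.array([1.0, 2.0, -1.0])          # constant, x1, x2
X = np.column_stack([np.ones(n), x1, x2])
y = X @ beta + eps

b = np.linalg.solve(X.T @ X, X.T @ y)      # OLS
bias = b - beta

# Population counterpart: Q^{-1} times plim(X'eps/n) = (0, 1, 0)'
Q = np.array([[1.0, 0.0, 0.0],
              [0.0, 2.36, 0.6],            # Var[x1] = 0.36 + 1 + 1
              [0.0, 0.6, 1.0]])
plim_bias = np.linalg.solve(Q, np.array([0.0, 1.0, 0.0]))
print(bias, plim_bias)   # plim_bias is (0, 0.5, -0.3) under this design
```

The nonzero third element of the bias comes only from the off-diagonal element of Q, that is, from the correlation between x1 and x2.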

[Part 3: Common Effects ] 16/57

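OLS with Individual Effects

$b = (X'X)^{-1}X'y = (X'X)^{-1}X'(X\beta + \mathbf{c} + \varepsilon)$
$\;\; = \beta + \left[(1/N)\sum_{i=1}^{N} X_i'X_i\right]^{-1}\left[(1/N)\sum_{i=1}^{N} X_i'\mathbf{c}_i\right]$ (part due to the omitted $c_i$)
$\;\;\quad + \left[(1/N)\sum_{i=1}^{N} X_i'X_i\right]^{-1}\left[(1/N)\sum_{i=1}^{N} X_i'\varepsilon_i\right]$ (covariance of $X_i$ and $\varepsilon_i$ will = 0)

The third term vanishes asymptotically by assumption.

$\operatorname{plim} b = \beta + \operatorname{plim}\left[\frac{1}{N}\sum_{i=1}^{N} X_i'X_i\right]^{-1}\left[\frac{1}{N}\sum_{i=1}^{N} T_i\bar{x}_i c_i\right]$ (left out variable formula)

So, what becomes of $\sum_{i=1}^{N} w_i\bar{x}_i c_i$?
$\operatorname{plim} b = \beta$ if the covariance of $\bar{x}_i$ and $c_i$ converges to zero.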
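A short simulation illustrates the left out variable formula. This is a sketch under assumed values (N, T, β, and the way the effect is generated are all hypothetical): pooled OLS is run once with an effect that is independent of x and once with an effect that depends on the group mean of x.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, beta = 20_000, 5, 1.0

x = rng.normal(size=(N, T))
eps = rng.normal(size=(N, T))
u = rng.normal(size=N)

def pooled_slope(x, y):
    """Slope from pooled OLS of y on a constant and x."""
    xd = x.ravel() - x.mean()
    return xd @ (y.ravel() - y.mean()) / (xd @ xd)

# Case 1: c_i uncorrelated with x_it (the random effects assumption)
c_re = u
b_re = pooled_slope(x, beta * x + c_re[:, None] + eps)

# Case 2: c_i correlated with the group mean of x (the fixed effects case)
c_fe = 2.0 * x.mean(axis=1) + u
b_fe = pooled_slope(x, beta * x + c_fe[:, None] + eps)

print(b_re, b_fe)   # plims under this design: 1.0 and 1.0 + 2/T = 1.4
```

In the second case Cov[x_it, c_i] = 2/T, so the inconsistency shrinks as T grows but does not vanish for fixed T.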

[Part 3: Common Effects ] 17/57

Mundlak’s Estimator

Write $c_i = \bar{x}_i'\delta + u_i$, $E[c_i \mid x_{i1}, x_{i2}, \ldots, x_{iT_i}] = \bar{x}_i'\delta$
Assume $c_i$ contains all time invariant information

$y_i = X_i\beta + c_i\mathbf{i} + \varepsilon_i$, $T_i$ observations in group i
$\quad = X_i\beta + \mathbf{i}\bar{x}_i'\delta + \varepsilon_i + u_i\mathbf{i}$

Looks like random effects.
$\operatorname{Var}[\varepsilon_i + u_i\mathbf{i}] = \Omega_i + \sigma_u^2\mathbf{i}\mathbf{i}'$
May be estimable by 2 step FGLS.

Mundlak, Y., “On the Pooling of Time Series and Cross Section Data,” Econometrica, 46, 1978, pp. 69-85.
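The effect of adding the group means can be sketched in a few lines. The example below uses pooled OLS on the augmented regression, not the two step FGLS mentioned on the slide, and every numerical value and name in it is an assumption made for illustration. In this balanced design the coefficient on x_it coincides with the within (group mean deviations) estimator.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T, beta, delta = 20_000, 5, 1.0, 2.0

x = rng.normal(size=(N, T))
xbar = x.mean(axis=1)
c = delta * xbar + rng.normal(size=N)        # c_i = xbar_i * delta + u_i
y = beta * x + c[:, None] + rng.normal(size=(N, T))

# Pooled OLS of y_it on (1, x_it, xbar_i)
Z = np.column_stack([np.ones(N * T), x.ravel(), np.repeat(xbar, T)])
coef = np.linalg.solve(Z.T @ Z, Z.T @ y.ravel())
b_mundlak, d_mundlak = coef[1], coef[2]

# Within estimator for comparison
xw = (x - xbar[:, None]).ravel()
yw = (y - y.mean(axis=1, keepdims=True)).ravel()
b_within = xw @ yw / (xw @ xw)
print(b_mundlak, d_mundlak, b_within)
```

FGLS would change the weighting and the standard errors; the point of the sketch is only that controlling for the group means removes the correlation between x_it and the remaining effect u_i.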

[Part 3: Common Effects ] 18/57

Chamberlain’s (1982) Approach

Use a linear projection, not necessarily the conditional mean.

$P[c_i \mid x_{i1}, x_{i2}, \ldots, x_{iT}] = x_{i1}'\delta_1 + x_{i2}'\delta_2 + \ldots + x_{iT}'\delta_T$
$c_i = P[c_i \mid x_{i1}, x_{i2}, \ldots, x_{iT}] + u_i$, $\operatorname{cov}[u_i, x_{it}] = 0$
$y_{it} = x_{it}'\beta + x_{i1}'\delta_1 + x_{i2}'\delta_2 + \ldots + x_{iT}'\delta_T + \varepsilon_{it} + u_i$

This “regression” can be computed T times, using one year at a time.

How would we reconcile the multiple estimators of each parameter?

[Part 3: Common Effects ] 19/57

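Chamberlain’s (1982) Approach

$P[c_i \mid x_{i1}, x_{i2}, \ldots, x_{iT}] = x_{i1}'\delta_1 + x_{i2}'\delta_2 + \ldots + x_{iT}'\delta_T$
$c_i = P[c_i \mid x_{i1}, x_{i2}, \ldots, x_{iT}] + u_i$, $\operatorname{cov}[u_i, x_{it}] = 0$
$y_{it} = x_{it}'\beta + x_{i1}'\delta_1 + x_{i2}'\delta_2 + \ldots + x_{iT}'\delta_T + \varepsilon_{it} + u_i$

Period 1
$y_{i1} = x_{i1}'(\beta + \delta_1) + x_{i2}'\delta_2 + \ldots + x_{iT}'\delta_T + \varepsilon_{i1} + u_i$

Period 2
$y_{i2} = x_{i1}'\delta_1 + x_{i2}'(\beta + \delta_2) + \ldots + x_{iT}'\delta_T + \varepsilon_{i2} + u_i$

and so on...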

[Part 3: Common Effects ] 20/57

Proxy Variables

Proxies for unobserved effects: e.g., test score for unobserved ability
Interest is in δ(xit,ci) = ∂E[yit|xit,ci]/∂xit
Since ci is unobserved, we seek APE = Ec[δ(xit,ci)]
Proxy has two characteristics
  Ignorable in the model: E[yit|xit,zi,ci] = E[yit|xit,ci]
  ‘Explains’ ci in that E[ci|zi,xit] = E[ci|zi]. In the presence of zi, xit does not further ‘explain’ ci.
Then, Ec[δ(xit,ci)] = Ez{∂E[yit|xit,zi]/∂xit}
Proof: See Wooldridge, pp. 23-24.
Loose ends:
  Where do you get the proxy?
  What is E[yit|xit,zi]? Use the linear projection and hope for the best.

[Part 3: Common Effects ] 21/57

Estimating the Sampling Variance of b

$s^2(X'X)^{-1}$?
  Correlation across observations
  Heteroscedasticity
A “robust” covariance matrix
  Robust estimation (in general)
  The White estimator
  A robust estimator for OLS.

[Part 3: Common Effects ] 22/57

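A ‘Cluster’ Estimator

$y_{it} = x_{it}'\beta + (c_i + \varepsilon_{it})$
$\quad\; = x_{it}'\beta + v_{it}$, $\operatorname{Cov}[v_{it}, v_{is}] \neq 0$

Pseudo-log likelihood that produces OLS as the estimator:
$\log L^* = (-1/2)\sum_{i=1}^{N}\sum_{t=1}^{T_i}\left(\log\sigma^2 + \log 2\pi + v_{it}^2/\sigma^2\right)$

The solution for $\hat\sigma^2$ will always be $\left[\sum_{i=1}^{N}\sum_{t=1}^{T_i} v_{it}^2\right] / \sum_{i=1}^{N} T_i$, so concentrate on β. The solution will be $b = (X'X)^{-1}X'y$.

$\partial\log L^*/\partial\beta = \sum_{i=1}^{N}\sum_{t=1}^{T_i} x_{it}v_{it}/\sigma^2 = \sum_{i=1}^{N} g_i = g$.
$\partial^2\log L^*/\partial\beta\,\partial\beta' = -\sum_{i=1}^{N}\sum_{t=1}^{T_i} x_{it}x_{it}'/\sigma^2 = -(1/\sigma^2)X'X = H$ and $= E[H]$
$\operatorname{Var}[b] = (-H)^{-1}\operatorname{Var}[g](-H)^{-1}$

Var[g] is usually −H, but not here because of correlation across observations. Approximate Var[g] with $\sum_{i=1}^{N} g_i g_i'$.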
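The sandwich form can be computed directly. The sketch below is an illustration, not the lecture's software: the function name, the simulated data, and all parameter values are assumptions. The factors of 1/σ² in g_i and H cancel in (−H)⁻¹[Σ g_i g_i′](−H)⁻¹, so the code works with X_i′e_i and X′X, with OLS residuals e standing in for v.

```python
import numpy as np

def ols_cluster_cov(X, y, groups):
    """OLS coefficients and the cluster-robust covariance matrix
    (X'X)^{-1} [sum_i g_i g_i'] (X'X)^{-1}, with g_i = X_i'e_i."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ (X.T @ y)
    e = y - X @ b
    meat = np.zeros_like(XtX_inv)
    for g in np.unique(groups):
        rows = groups == g
        g_i = X[rows].T @ e[rows]        # score contribution of group i
        meat += np.outer(g_i, g_i)
    return b, XtX_inv @ meat @ XtX_inv

# Hypothetical data: both x and the disturbance have a group component.
rng = np.random.default_rng(3)
N, T = 500, 5
groups = np.repeat(np.arange(N), T)
x = np.repeat(rng.normal(size=N), T) + rng.normal(size=N * T)
v = np.repeat(rng.normal(size=N), T) + rng.normal(size=N * T)
X = np.column_stack([np.ones(N * T), x])
y = X @ np.array([1.0, 0.5]) + v

b, V_cluster = ols_cluster_cov(X, y, groups)
e = y - X @ b
V_ols = (e @ e / (len(y) - X.shape[1])) * np.linalg.inv(X.T @ X)
print(np.sqrt(np.diag(V_cluster)), np.sqrt(np.diag(V_ols)))
```

With positive within-group correlation in both x and v, the conventional s²(X′X)⁻¹ standard errors understate the sampling variability relative to the cluster estimator. No finite sample degrees of freedom correction is applied in this sketch.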

[Part 3: Common Effects ] 23/57

Cluster Estimator (cont.)

[Part 3: Common Effects ] 24/57

Cornwell and Rupert Data

Cornwell and Rupert Returns to Schooling Data, 595 Individuals, 7 Years
Variables in the file are

EXP = work experience
WKS = weeks worked
OCC = occupation, 1 if blue collar
IND = 1 if manufacturing industry
SOUTH = 1 if resides in south
SMSA = 1 if resides in a city (SMSA)
MS = 1 if married
FEM = 1 if female
UNION = 1 if wage set by union contract
ED = years of education
LWAGE = log of wage = dependent variable in regressions

These data were analyzed in Cornwell, C. and Rupert, P., "Efficient Estimation with Panel Data: An Empirical Comparison of Instrumental Variable Estimators," Journal of Applied Econometrics, 3, 1988, pp. 149-155. See Baltagi, page 122 for further analysis. The data were downloaded from the website for Baltagi's text.

[Part 3: Common Effects ] 25/57

Application: Cornwell and Rupert

[Part 3: Common Effects ] 26/57

Bootstrapping

Some assumptions that underlie it - the sampling mechanism

Method:
1. Estimate using full sample: --> b
2. Repeat R times:
   Draw n observations from the n, with replacement
   Estimate β with b(r).
3. Estimate variance with
   $V = (1/R)\sum_{r=1}^{R} [b(r) - b][b(r) - b]'$
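The three steps translate almost line for line into code. The following is a minimal sketch with simulated data; the function names, the sample size, and the number of replications are assumptions, and it resamples individual observations, so it is the naive bootstrap.

```python
import numpy as np

def ols(X, y):
    return np.linalg.solve(X.T @ X, X.T @ y)

def bootstrap_cov(X, y, R, rng):
    """V = (1/R) * sum_r [b(r) - b][b(r) - b]' from R resamples of n rows."""
    n = len(y)
    b = ols(X, y)                              # step 1: full sample estimate
    V = np.zeros((X.shape[1], X.shape[1]))
    for _ in range(R):                         # step 2: repeat R times
        idx = rng.integers(0, n, size=n)       # draw n observations with replacement
        d = ols(X[idx], y[idx]) - b
        V += np.outer(d, d) / R                # step 3: accumulate the variance
    return b, V

rng = np.random.default_rng(4)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)
b, V = bootstrap_cov(X, y, R=200, rng=rng)
print(b, np.sqrt(np.diag(V)))
```

Deviations are taken around the full sample b, as in the formula, not around the mean of the bootstrap replications.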

[Part 3: Common Effects ] 27/57

Bootstrap Application

matr;bboot=init(7,21,0.)$         Store results here
name;x=one,occ,…,exp$             Define X
regr;lhs=lwage;rhs=x$             Compute b
calc;i=0$                         Counter
Proc                              Define procedure
regr;lhs=lwage;rhs=x;quietly$     … Regression
matr;{i=i+1};bboot(*,i)=b$        ... Store b(r)
Endproc                           Ends procedure
exec;n=20;bootstrap=b$            20 bootstrap reps
matr;list;bboot' $                Display results

[Part 3: Common Effects ] 28/57

Results of Bootstrap Procedure

[Part 3: Common Effects ] 29/57

Bootstrap Replications

Full sample result

Bootstrapped sample results

[Part 3: Common Effects ] 30/57

Bootstrap variance for a panel data estimator

Panel Bootstrap = Block Bootstrap
Data set is N groups of size Ti
Bootstrap sample is N groups of size Ti drawn with replacement.
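A block (panel) bootstrap differs from the naive version only in what is resampled: whole groups, not single rows. The sketch below assumes an unbalanced simulated panel; the function name and all parameter values are hypothetical.

```python
import numpy as np

def ols(X, y):
    return np.linalg.solve(X.T @ X, X.T @ y)

def block_bootstrap_cov(X, y, groups, R, rng):
    """Resample whole groups (all T_i rows of a group together)."""
    ids = np.unique(groups)
    rows_of = [np.flatnonzero(groups == g) for g in ids]
    b = ols(X, y)
    V = np.zeros((X.shape[1], X.shape[1]))
    for _ in range(R):
        draw = rng.integers(0, len(ids), size=len(ids))   # N groups, with replacement
        idx = np.concatenate([rows_of[j] for j in draw])
        d = ols(X[idx], y[idx]) - b
        V += np.outer(d, d) / R
    return b, V

rng = np.random.default_rng(5)
N = 400
T_i = rng.integers(2, 7, size=N)                 # unbalanced: 2 to 6 rows per group
groups = np.repeat(np.arange(N), T_i)
n = len(groups)
x = rng.normal(size=N)[groups] + 0.5 * rng.normal(size=n)
v = rng.normal(size=N)[groups] + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = X @ np.array([1.0, 0.5]) + v

b, V_block = block_bootstrap_cov(X, y, groups, R=200, rng=rng)
e = y - X @ b
V_ols = (e @ e / (n - 2)) * np.linalg.inv(X.T @ X)
print(np.sqrt(np.diag(V_block)), np.sqrt(np.diag(V_ols)))
```

Because each draw keeps a group's rows together, the within-group correlation of the disturbances is preserved in every bootstrap sample, which the row-by-row bootstrap would destroy.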

[Part 3: Common Effects ] 31/57

[Part 3: Common Effects ] 32/57

Bootstrapping

Naïve bootstrap: Why is it naïve?
Cases when it fails
  Time series
  “Clustered data”
  Order statistics
  Parameters on the edge of the parameter space
Alternatives
  Block bootstrap
  “Wild” bootstrap (injects extra randomness)

[Part 3: Common Effects ] 33/57

Using First Differences

$y_{it} = x_{it}'\beta + c_i + \varepsilon_{it}$, observation for person i at time t

Eliminating the heterogeneity

$\Delta y_{it} = y_{it} - y_{i,t-1} = (\Delta x_{it})'\beta + \Delta c_i + \Delta\varepsilon_{it}$
$\qquad = (\Delta x_{it})'\beta + u_{it}$, since $\Delta c_i = 0$

Note: Time invariant variables become zero
      Time trend becomes the constant term
      Time dummy variables become (0,...,1,-1,0,0...)

[Part 3: Common Effects ] 34/57

OLS with First Differences

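With strict exogeneity of (Xi,ci), OLS regression of Δyit on Δxit is unbiased and consistent but inefficient.

$\operatorname{Var}[\Delta\varepsilon_i] = \operatorname{Var}\begin{bmatrix} \varepsilon_{i,2}-\varepsilon_{i,1} \\ \varepsilon_{i,3}-\varepsilon_{i,2} \\ \vdots \\ \varepsilon_{i,T_i}-\varepsilon_{i,T_i-1} \end{bmatrix} = \begin{bmatrix} 2\sigma_\varepsilon^2 & -\sigma_\varepsilon^2 & 0 & \cdots & 0 \\ -\sigma_\varepsilon^2 & 2\sigma_\varepsilon^2 & -\sigma_\varepsilon^2 & \cdots & 0 \\ 0 & -\sigma_\varepsilon^2 & 2\sigma_\varepsilon^2 & \ddots & \vdots \\ \vdots & & \ddots & \ddots & -\sigma_\varepsilon^2 \\ 0 & \cdots & 0 & -\sigma_\varepsilon^2 & 2\sigma_\varepsilon^2 \end{bmatrix}$ (Toeplitz form)

GLS is unpleasantly complicated. In order to compute a first step estimator of $\sigma_\varepsilon^2$ we would use fixed effects. We should just stop there. Or, use OLS in first differences and use Newey-West with one lag.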
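Both claims on this slide, consistency of OLS in first differences and the Toeplitz form of the differenced disturbance covariance, can be checked in a simulation. The sketch assumes a balanced panel with T = 4, σ_ε² = 1, and an effect that is correlated with x; all values and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(6)
N, T, beta = 100_000, 4, 1.0

x = rng.normal(size=(N, T))
c = 2.0 * x.mean(axis=1) + rng.normal(size=N)   # effect correlated with x
eps = rng.normal(size=(N, T))                   # sigma_eps^2 = 1
y = beta * x + c[:, None] + eps

dx = np.diff(x, axis=1).ravel()                 # x_it - x_i,t-1
dy = np.diff(y, axis=1).ravel()
b_fd = dx @ dy / (dx @ dx)                      # OLS in first differences

# Covariance of the differenced disturbances within a group
d_eps = np.diff(eps, axis=1)                    # N x (T-1)
S = np.cov(d_eps, rowvar=False)
toeplitz = np.array([[2.0, -1.0, 0.0],
                     [-1.0, 2.0, -1.0],
                     [0.0, -1.0, 2.0]])
print(b_fd, np.round(S, 2))
```

The −σ_ε² entries next to the diagonal are the first order autocorrelation that a Newey-West correction with one lag is meant to absorb.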

[Part 3: Common Effects ] 35/57

Two Periods

With two periods and strict exogeneity,
$\Delta y_i = y_{i2} - y_{i1} = \beta_0 + (x_{i2} - x_{i1})'\beta + u_i$
This is a classical regression model.

Consider a "treatment, $D_i$," that takes place between time 1 and time 2 for some of the individuals
$\Delta y_i = \beta_0 + (\Delta x_i)'\beta + \beta_1 D_i + u_i$
$D_i$ = the "treatment dummy"

If there are no regressors,
$\hat\beta_1 = \overline{\Delta y}\mid\text{treatment} - \overline{\Delta y}\mid\text{control}$ = "difference in differences" estimator.
$\hat\beta_0 + \hat\beta_1$ = average change in $y_i$ for the "treated" ($\hat\beta_0$ alone is the average change for the controls).

[Part 3: Common Effects ] 36/57

Difference-in-Differences Model

With two periods and strict exogeneity of D and T,
$y_{it} = \beta_0 + \beta_1 D_{it} + \beta_2 T_t + \beta_3 T_t D_{it} + \varepsilon_{it}$
$D_{it}$ = dummy variable for a treatment that takes place between time 1 and time 2 for some of the individuals,
$T_t$ = a time period dummy variable, 0 in period 1, 1 in period 2.

This is a linear regression model. If there are no regressors, using least squares,
$b_3 = (\bar y_2 - \bar y_1)_{D=1} - (\bar y_2 - \bar y_1)_{D=0}$
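The equality between the least squares coefficient on the interaction and the difference of mean differences is exact, which a few lines of code confirm. The sketch uses simulated data with hypothetical coefficient values, and treats D as a group indicator that is constant over the two periods.

```python
import numpy as np

rng = np.random.default_rng(7)
N = 1_000
D = np.repeat(rng.integers(0, 2, size=N), 2)     # group dummy, constant within i
T = np.tile([0, 1], N)                           # 0 in period 1, 1 in period 2
y = 1.0 + 0.5 * D + 0.3 * T + 2.0 * T * D + rng.normal(size=2 * N)

X = np.column_stack([np.ones(2 * N), D, T, T * D])
b = np.linalg.solve(X.T @ X, X.T @ y)

def cell_mean(d, t):
    """Mean of y in the (D = d, T = t) cell."""
    return y[(D == d) & (T == t)].mean()

did = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))
print(b[3], did)   # identical up to rounding error
```

The model with a constant, D, T, and TD is saturated in the four cells, so the fitted values are the four cell means and b3 must equal the double difference.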

[Part 3: Common Effects ] 37/57

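Difference in Differences

$y_{it} = \beta_0 + \beta_1 D_{it} + \beta_2 T_t + \beta_3 D_{it}T_t + \beta'x_{it} + \varepsilon_{it}$, t = 1,2
$\Delta y_{it} = \beta_2 + \beta_3 D_{it} + \beta'(\Delta x_{it}) + \Delta\varepsilon_{it}$
$\qquad = \beta_2 + \beta_3 D_{it} + \beta'(\Delta x_{it}) + u_{it}$
$(\Delta y_{it} \mid D = 1) - (\Delta y_{it} \mid D = 0) = \beta_3 + \beta'[(\Delta x_{it} \mid D = 1) - (\Delta x_{it} \mid D = 0)]$

If the same individual is observed in both states, the second term is zero. If the effect is estimated by averaging individuals with D = 1 and different individuals with D = 0, then part of the 'effect' is explained by change in the covariates, not the treatment.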

[Part 3: Common Effects ] 38/57

http://dera.ioe.ac.uk/14610/1/oft1416.pdf

[Part 3: Common Effects ] 39/57

Outcome is the fees charged.

Activity is collusion on fees.

[Part 3: Common Effects ] 40/57

Treatment Schools: Treatment is an intervention by the Office of Fair Trading

Control Schools were not involved in the conspiracy

Treatment is not voluntary

[Part 3: Common Effects ] 41/57

[Part 3: Common Effects ] 42/57

[Part 3: Common Effects ] 43/57

Treatment (Intervention) Effect = 1 + 2 if SS school

[Part 3: Common Effects ] 44/57

In order to test robustness two versions of the fixed effects model were run. The first is Ordinary Least Squares, and the second is heteroscedasticity and auto-correlation robust (HAC) standard errors in order to check for heteroscedasticity and autocorrelation.

[Part 3: Common Effects ] 45/57

[Part 3: Common Effects ] 46/57

[Part 3: Common Effects ] 47/57

[Part 3: Common Effects ] 48/57

D-in-D Model: Natural Experiment

With two periods and strict exogeneity,
$y_{it} = \beta_0 + \beta_1 D_{it} + \beta_2 T_t + \beta_3 T_t D_{it} + \varepsilon_{it}$
$D_{it}$ = dummy variable for a treatment that takes place between time 1 and time 2 for some of the individuals,
$T_t$ = a time period dummy variable, 0 in period 1, 1 in period 2.

This is a classical regression model. If there are no regressors, using least squares,
$b_3 = (\bar y_2 - \bar y_1)_{D=1} - (\bar y_2 - \bar y_1)_{D=0}$

[Part 3: Common Effects ] 49/57

D-i-D

Card and Krueger: “Minimum Wages and Employment: A Case Study of the Fast Food Industry in New Jersey and Pennsylvania,” AER, 84(4), 1994, 772-793.

Pennsylvania vs. New Jersey
April 1992: NJ raises its minimum wage
Compare the change in employment in PA after the change to the change in employment in NJ after the change.
Differences cancel out other things specific to the state that would explain change in employment.

[Part 3: Common Effects ] 50/57

A Tale of Two Cities

A sharp change in policy can constitute a natural experiment
The Mariel boatlift from Cuba to Miami (May-September, 1980) increased the Miami labor force by 7%. Did it reduce wages or employment of non-immigrants?
Compare Miami to Los Angeles, a comparable (assumed) city.
Card, David, “The Impact of the Mariel Boatlift on the Miami Labor Market,” Industrial and Labor Relations Review, 43, 1990, pp. 245-257.

[Part 3: Common Effects ] 51/57

Difference in Differences

i = individual, T = 0 for no immigration, T = 1 for immigration
$(Y_i \mid T) = Y_{i,T}$ = 1 if unemployed, 0 if employed.
c = city, t = period.
Unemployment rate in city c at time t is $E[Y_{i,0} \mid c,t]$ with no migration
Unemployment rate in city c at time t is $E[Y_{i,1} \mid c,t]$ with migration

Assume $E[Y_{i,0} \mid c,t] = \beta_t + \gamma_c$
$E[Y_{i,1} \mid c,t] = \beta_t + \gamma_c + \delta = E[Y_{i,0} \mid c,t] + \delta$
δ = the effect of the immigration on the unemployment rate.

[Part 3: Common Effects ] 52/57

Applying the Model

c = M for Miami, L for Los Angeles
Immigration occurs in Miami, not Los Angeles
t = 1979, 1981 (pre- and post-)
Sample moment equations: E[Yi|c,t,T]

E[Yi|M,79] = β79 + γM
E[Yi|M,81] = β81 + γM + δ
E[Yi|L,79] = β79 + γL
E[Yi|L,81] = β81 + γL

It is assumed that unemployment growth in the two cities would be the same if there were no immigration.

[Part 3: Common Effects ] 53/57

Implications for Differences

Neither city exposed to migration
E[Yi,0|M,81] - E[Yi,0|M,79] = [β81 + γM] – [β79 + γM] (Miami)
E[Yi,0|L,81] - E[Yi,0|L,79] = [β81 + γL] – [β79 + γL] (LA)

Both cities exposed to migration (between 1979 and 1981)
E[Yi,1|M,81] - E[Yi,0|M,79] = [β81 + γM] – [β79 + γM] + δ (Miami)
E[Yi,1|L,81] - E[Yi,0|L,79] = [β81 + γL] – [β79 + γL] + δ (LA)

One city (Miami) exposed to migration: the difference in differences is
Miami change - Los Angeles change
{E[Yi,1|M,81] - E[Yi,0|M,79]} – {E[Yi,0|L,81] - E[Yi,0|L,79]} = δ (Miami)
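The cancellation can be traced with numbers. The parameter values below are hypothetical and are not estimates from the Mariel study; the function name is likewise an assumption. The sketch evaluates the four moment equations and forms the difference in differences.

```python
# Hypothetical parameter values (not estimates from the study)
beta = {79: 5.0, 81: 7.0}        # period effects
gamma = {"M": 1.5, "L": 3.0}     # city effects
delta = 0.8                      # effect of migration

def rate(city, year, migration):
    """E[Y | c, t] = beta_t + gamma_c (+ delta if exposed to migration)."""
    return beta[year] + gamma[city] + (delta if migration else 0.0)

miami_change = rate("M", 81, True) - rate("M", 79, False)
la_change = rate("L", 81, False) - rate("L", 79, False)
did = miami_change - la_change
print(miami_change, la_change, did)
```

The city effects drop out of each within-city change, and the common period change β81 − β79 drops out of the difference between cities, which leaves δ.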

[Part 3: Common Effects ] 54/57

The Tale

[Figure: time axis 1979 to 1985]

In 79, Miami unemployment is 2.0% lower
In 80, Miami unemployment is 7.1% lower
From 79 to 80, Miami gets 5.1% better

In 81, Miami unemployment is 3.0% lower
In 82, Miami unemployment is 3.3% higher
From 81 to 82, Miami gets 6.3% worse

[Part 3: Common Effects ] 55/57

Application of a Two Period Model

“Hemoglobin and Quality of Life in Cancer Patients with Anemia,”
Finkelstein (MIT), Berndt (MIT), Greene (NYU), Cremieux (Univ. of Quebec)
1998
With Ortho Biotech – seeking to change labeling of already approved drug ‘erythropoietin’ (r-HuEPO)

[Part 3: Common Effects ] 56/57

[Part 3: Common Effects ] 57/57

QOL Study

Quality of life study
  i = 1,… 1200+ clinically anemic cancer patients undergoing chemotherapy, treated with transfusions and/or r-HuEPO
  t = 0 at baseline, 1 at exit. (interperiod survey by some patients was not used)
  yit = self administered quality of life survey, scale = 0,…,100
  xit = hemoglobin level, other covariates
Treatment effects model (hemoglobin level)
Background – r-HuEPO treatment to affect Hg level
Important statistical issues
  Unobservable individual effects
  The placebo effect
  Attrition – sample selection
  FDA mistrust of “community based” – not clinical trial based statistical evidence
Objective – when to administer treatment for maximum marginal benefit

[Part 3: Common Effects ] 58/57

Regression-Treatment Effects Model

$QOL_{it} = \alpha_t + \text{"other covariates"} + \beta_7 Hb^7_{it} + \beta_8 Hb^8_{it} + \beta_9 Hb^9_{it} + \ldots + \beta_{15} Hb^{15}_{it} + c_i + \varepsilon_{it}$

$Hb_{it}$ = hemoglobin level, grams/deciliter, range 3+ to 15
$Hb^7_{it} = 1(3 \le Hb_{it} < 7.5)$ (Base case; $\beta_7 = 0$)
$Hb^8_{it} = 1(7.5 \le Hb_{it} < 8.5)$
...
$Hb^{15}_{it} = 1(14.5 \le Hb_{it} \le 15)$
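The interval dummies can be built mechanically from the hemoglobin level. The sketch below uses made-up levels and assumes that the unstated intermediate intervals follow the same one-unit pattern as the first two (8.5, 9.5, and so on up to 14.5); the variable names are hypothetical.

```python
import numpy as np

hb = np.array([3.2, 7.4, 7.5, 8.49, 11.0, 14.5, 15.0])   # hypothetical levels, g/dL

edges = np.arange(7.5, 15.0, 1.0)            # 7.5, 8.5, ..., 14.5
category = 7 + np.digitize(hb, edges)        # 7 for [3, 7.5), ..., 15 for [14.5, 15]

# Dummies Hb8, ..., Hb15; category 7 is the omitted base case
labels = np.arange(8, 16)
dummies = (category[:, None] == labels[None, :]).astype(float)
print(category)
```

Rows that fall in the base category have all eight dummies equal to zero, which is what sets β7 = 0 in the regression.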

[Part 3: Common Effects ] 59/57

Effects and Covariates

Individual effects that would impact a self reported QOL: Depression, comorbidity factors (smoking), recent financial setback, recent loss of spouse, etc.

Covariates
  Change in tumor status
  Measured progressivity of disease
  Change in number of transfusions
  Presence of pain and nausea
  Change in number of chemotherapy cycles
  Change in radiotherapy types
  Elapsed days since chemotherapy treatment
  Amount of time between baseline and exit

[Part 3: Common Effects ] 60/57

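First Differences Model

$\Delta QOL_i = QOL_{i1} - QOL_{i0}$
$\quad = (\alpha_1 - \alpha_0) + \sum_{j=8}^{15}\beta_j(Hb^j_{i1} - Hb^j_{i0}) + \sum_{k=1}^{K}\gamma_k(x_{ik,1} - x_{ik,0}) + (\varepsilon_{i1} - \varepsilon_{i0})$

Regression to the mean (the "tendency to mediocrity"):
$(\varepsilon_{i1} - \varepsilon_{i0}) = u_i - \phi(QOL_{i0} - \overline{QOL}_0)$. Expect $0 < \phi < 1$.

This implies
$\Delta QOL_i = \alpha + \sum_{j=8}^{15}\beta_j(Hb^j_{i1} - Hb^j_{i0}) + \sum_{k=1}^{K}\gamma_k(x_{ik,1} - x_{ik,0}) - \phi\,QOL_{i0} + u_i$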

[Part 3: Common Effects ] 61/57

Finding

Optimal treatment. Conventional wisdom and assumption of policy.

Study finding

Note the implication of the study for the location of the optimal point for the treatment. Largest marginal benefit moves from the left tail to the center.