
Chapter 1. Linear Panel Models and Heterogeneity

School of Economics and Management - University of Geneva

Christophe Hurlin, University of Orléans

February 2018

C. Hurlin (University of Orléans) Advanced Econometrics II February 2018 1 / 258

Introduction

The outline of this chapter is the following:

Section 1: Specification tests and analysis of covariance

Section 2: Linear unobserved effects panel data models

Section 3: Fixed effects estimation methods

Section 4: Random effects estimation methods

Section 5: Specification tests: random or fixed effects?

Subsection 5.1: Mundlak's specification

Subsection 5.2: Hausman's test

Section 6: Heterogeneous panel data models

Subsection 6.1: Random coefficient models

Subsection 6.2: Other heterogeneous models

Section 1

Specification tests and analysis of covariance

1. Specification tests and analysis of covariance

Objectives

1 Define the concept of homogeneous panel data model.

2 Define the concept of heterogeneous panel data model.

3 Define the concept of individual (unobserved) effects.

4 Introduce the specification tests (Hsiao, 2003).

5 Propose an empirical application to strike days in OECD countries.

Notations

Let us consider the following linear model

$$y_{it} = \alpha_{it} + \beta_{it}' x_{it} + \varepsilon_{it} \qquad \forall\, i = 1, \dots, n, \ \forall\, t = 1, \dots, T$$

$\alpha_{it}$ is a scalar that varies across $i$ and $t$,

$\beta_{it} = (\beta_{1it}, \beta_{2it}, \dots, \beta_{Kit})'$ is a $K \times 1$ vector of parameters that vary across $i$ and $t$,

$x_{it} = (x_{1it}, \dots, x_{Kit})'$ is a $K \times 1$ vector of exogenous variables,

$\varepsilon_{it}$ is an error term.

Different restrictions on the regression coefficients can be tested:

1 the homogeneity of regression slope coefficients

2 the homogeneity of regression intercept coefficients

3 the time stability of parameters (slopes and constants). We will not consider this issue here (since it is not specific to panel data models).

Fact (Time stability)

We assume that the parameters are constant over time (no structural break, no regime switching, etc.), but can vary across individuals.

$$y_{it} = \alpha_i + \beta_i' x_{it} + \varepsilon_{it}$$

Three types of restrictions can be imposed on this model.

1 Regression slope coefficients are identical, and intercepts are not (model with individual / unobserved effects).

$$y_{it} = \alpha_i + \beta' x_{it} + \varepsilon_{it}$$

2 Regression intercepts are the same, and slope coefficients are not (unusual).

$$y_{it} = \alpha + \beta_i' x_{it} + \varepsilon_{it}$$

3 Both slope and intercept coefficients are the same (homogeneous / pooled panel).

$$y_{it} = \alpha + \beta' x_{it} + \varepsilon_{it}$$

Definition (Heterogeneous panel data model)

A heterogeneous panel data model is a model in which all parameters (constant and slope coefficients) vary across individuals.

Definition (Homogeneous panel data model)

A homogeneous panel data model (or pooled model) is a model in which all parameters (constant and slope coefficients) are common to all individuals.

Definition (individual effects)

In a panel data model, the individual (unobserved) effects are captured by the constant terms $\alpha_i$.

How to choose the appropriate specification of the panel data model?

Economic interpretation: is it plausible to assume the homogeneity of the parameters across individuals?

Specification tests: testing strategy proposed by Hsiao (2003), for instance.

Specification Tests

Lemma (Normality assumption)

Under the assumption that the $\varepsilon_{it}$ are independently normally distributed over $i$ and $t$ with mean zero and variance $\sigma_\varepsilon^2$,

$$\varepsilon_{it} \overset{i.i.d.}{\sim} N\left(0, \sigma_\varepsilon^2\right)$$

different Fisher F-tests can be used to test the restrictions on $\beta$ and $\alpha$.

First step (homogeneous / pooled assumption)

Let us consider the general model

$$y_{it} = \alpha_i + \beta_i' x_{it} + \varepsilon_{it}$$

The hypothesis of common intercept and slope can be viewed as a set of $(K+1)(n-1)$ linear restrictions:

$$H_0^1: \beta_i = \beta, \quad \alpha_i = \alpha \quad \forall\, i \in \{1, \dots, n\}$$

$$H_a^1: \exists\, (i,j) \in \{1, \dots, n\}^2 \ \text{such that} \ \beta_i \neq \beta_j \ \text{or} \ \alpha_i \neq \alpha_j$$

Under the alternative $H_a^1$, there are $nK$ estimated slope coefficients for the $n$ vectors $\beta_i$ (each $K \times 1$) and $n$ estimated constants.

Under $H_a^1$, the unrestricted residual sum of squares $RSS_1$ divided by $\sigma_\varepsilon^2$ has a chi-square distribution with $nT - n(K+1)$ degrees of freedom.

Definition (Homogeneity test)

Under the homogeneous assumption $H_0^1$,

$$H_0^1: \underset{(K,1)}{\beta_i} = \underset{(K,1)}{\beta}, \quad \alpha_i = \alpha \quad \forall\, i \in \{1, \dots, n\}$$

the F statistic, denoted $F_1$ and defined by

$$F_1 = \frac{\left(RSS_{1,c} - RSS_1\right) / \left[(n-1)(K+1)\right]}{RSS_1 / \left[nT - n(K+1)\right]}$$

has a Fisher distribution with $(n-1)(K+1)$ and $nT - n(K+1)$ degrees of freedom. $RSS_1$ denotes the residual sum of squares of the unrestricted model and $RSS_{1,c}$ the residual sum of squares of the constrained (pooled) model.

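Remark 1

Under the alternative $H_a^1$, the residual sum of squares is equal to the sum of the $n$ residual sums of squares associated with the $n$ individual regressions:

$$RSS_1 = \sum_{i=1}^{n} RSS_{1,i} = \sum_{i=1}^{n} \sum_{t=1}^{T} \hat{\varepsilon}_{it}^2 = \sum_{i=1}^{n} \left( S_{yy,i} - S_{xy,i}' S_{xx,i}^{-1} S_{xy,i} \right)$$

with

$$\bar{x}_i = \frac{1}{T} \sum_{t=1}^{T} x_{it}, \qquad \bar{y}_i = \frac{1}{T} \sum_{t=1}^{T} y_{it}, \qquad S_{yy,i} = \sum_{t=1}^{T} (y_{it} - \bar{y}_i)^2$$

$$S_{xx,i} = \sum_{t=1}^{T} (x_{it} - \bar{x}_i)(x_{it} - \bar{x}_i)', \qquad S_{xy,i} = \sum_{t=1}^{T} (x_{it} - \bar{x}_i)(y_{it} - \bar{y}_i)$$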

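Remark 2

Under $H_0^1$, the model becomes:

$$y_{it} = \alpha + \beta' x_{it} + \varepsilon_{it}$$

The least-squares regression of the pooled model yields the parameter estimates $\hat{\beta} = S_{xx}^{-1} S_{xy}$, where

$$S_{xx} = \sum_{i=1}^{n} \sum_{t=1}^{T} (x_{it} - \bar{x})(x_{it} - \bar{x})' \quad \text{with} \quad \bar{x} = \frac{1}{nT} \sum_{i=1}^{n} \sum_{t=1}^{T} x_{it}$$

$$S_{xy} = \sum_{i=1}^{n} \sum_{t=1}^{T} (x_{it} - \bar{x})(y_{it} - \bar{y}) \quad \text{with} \quad \bar{y} = \frac{1}{nT} \sum_{i=1}^{n} \sum_{t=1}^{T} y_{it}$$

Under $H_0^1$, the overall RSS is defined by

$$RSS_{1,c} = S_{yy} - S_{xy}' S_{xx}^{-1} S_{xy} \quad \text{with} \quad S_{yy} = \sum_{i=1}^{n} \sum_{t=1}^{T} (y_{it} - \bar{y})^2$$

Note that all deviations here are taken from the overall means $\bar{x}$ and $\bar{y}$, not from the individual means $\bar{x}_i$ and $\bar{y}_i$ used in Remark 1.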
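The first-step statistic can be sketched numerically. The following is a minimal illustration, not the author's code: the function names `rss_ols` and `f1_homogeneity`, the array layout (`y` of shape `(n, T)`, `x` of shape `(n, T, K)`) and the simulated data are all assumptions made for the example.

```python
import numpy as np

def rss_ols(y, X):
    """Residual sum of squares from OLS of y on X (X already holds any constant)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def f1_homogeneity(y, x):
    """F1 for H0: common intercept and slopes. y is (n, T); x is (n, T, K)."""
    n, T, K = x.shape
    ones = np.ones((T, 1))
    # Unrestricted model: one regression per individual, RSS summed over i.
    rss1 = sum(rss_ols(y[i], np.hstack([ones, x[i]])) for i in range(n))
    # Restricted (pooled) model: a single regression on the stacked data.
    X_pool = np.hstack([np.ones((n * T, 1)), x.reshape(n * T, K)])
    rss1_c = rss_ols(y.reshape(n * T), X_pool)
    df1, df2 = (n - 1) * (K + 1), n * T - n * (K + 1)
    f1 = ((rss1_c - rss1) / df1) / (rss1 / df2)
    return f1, df1, df2

# Simulated panel (hypothetical values): heterogeneous intercepts, common slopes.
rng = np.random.default_rng(0)
n, T, K = 5, 20, 2
x = rng.normal(size=(n, T, K))
alpha = rng.normal(size=(n, 1))
y = alpha + x @ np.array([1.0, -0.5]) + rng.normal(size=(n, T))
f1, df1, df2 = f1_homogeneity(y, x)
print(f1, df1, df2)
```

The statistic would then be compared with the critical value of a Fisher distribution with `df1` and `df2` degrees of freedom.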

Second step (individual / unobserved effects)

Let us consider the general model

$$y_{it} = \alpha_i + \beta_i' x_{it} + \varepsilon_{it}$$

The hypothesis of heterogeneous intercepts but homogeneous slopes can be formulated as a set of $(n-1)K$ linear restrictions (no restrictions on $\alpha_i$).

$$H_0^2: \beta_i = \beta \quad \forall\, i = 1, \dots, n$$

Definition (Test for common slope parameters)

Under the assumption $H_0^2$,

$$H_0^2: \beta_i = \beta \quad \forall\, i = 1, \dots, n$$

the F statistic, denoted $F_2$ and defined by

$$F_2 = \frac{\left(RSS_{1,c'} - RSS_1\right) / \left[(n-1)K\right]}{RSS_1 / \left[nT - n(K+1)\right]}$$

has a Fisher distribution with $(n-1)K$ and $nT - n(K+1)$ degrees of freedom under $H_0^2$. $RSS_1$ denotes the residual sum of squares of the unrestricted model and $RSS_{1,c'}$ the residual sum of squares of the constrained model (model with individual effects).

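Under $H_0^2$, the residual sum of squares is:

$$RSS_{1,c'} = \sum_{i=1}^{n} S_{yy,i} - \left( \sum_{i=1}^{n} S_{xy,i} \right)' \left( \sum_{i=1}^{n} S_{xx,i} \right)^{-1} \left( \sum_{i=1}^{n} S_{xy,i} \right)$$

with

$$\bar{x}_i = \frac{1}{T} \sum_{t=1}^{T} x_{it}, \qquad \bar{y}_i = \frac{1}{T} \sum_{t=1}^{T} y_{it}, \qquad S_{yy,i} = \sum_{t=1}^{T} (y_{it} - \bar{y}_i)^2$$

$$S_{xx,i} = \sum_{t=1}^{T} (x_{it} - \bar{x}_i)(x_{it} - \bar{x}_i)', \qquad S_{xy,i} = \sum_{t=1}^{T} (x_{it} - \bar{x}_i)(y_{it} - \bar{y}_i)$$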

Third step: homogeneous constants

If $H_0^2$ is not rejected, one can also apply a conditional test for homogeneous intercepts ($n-1$ linear restrictions).

$$H_0^3: \alpha_i = \alpha \quad \forall\, i = 1, \dots, n \quad \text{given} \ \beta_i = \beta$$

Under the null, the model is homogeneous (pooled) and the restricted residual sum of squares is $RSS_{1,c}$.

Under the alternative, the model is $y_{it} = \alpha_i + \beta' x_{it} + \varepsilon_{it}$, and there are $nT - K - n$ degrees of freedom.

Definition (Test for homogeneous constant terms)

Under the assumption $H_0^3$,

$$H_0^3: \alpha_i = \alpha \quad \forall\, i = 1, \dots, n \quad \text{given} \ \beta_i = \beta$$

the F statistic, denoted $F_3$ and defined by

$$F_3 = \frac{\left(RSS_{1,c} - RSS_{1,c'}\right) / (n-1)}{RSS_{1,c'} / \left[n(T-1) - K\right]}$$

has a Fisher distribution with $n-1$ and $n(T-1) - K$ degrees of freedom under $H_0^3$. $RSS_{1,c'}$ denotes the residual sum of squares of the model with individual effects and $RSS_{1,c}$ the residual sum of squares of the pooled model previously defined.
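The second and third steps can be sketched in the same way. This is a minimal illustration under assumptions: the names `rss_ols` and `covariance_tests`, the array layout and the simulated data are hypothetical, and the individual-effects RSS is obtained from the regression on data demeaned by individual, which gives the same residuals as including one dummy per individual.

```python
import numpy as np

def rss_ols(y, X):
    """Residual sum of squares from OLS of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def covariance_tests(y, x):
    """Return (F2, F3). y is (n, T); x is (n, T, K)."""
    n, T, K = x.shape
    ones = np.ones((T, 1))
    # RSS_1: individual regressions (heterogeneous intercepts and slopes).
    rss1 = sum(rss_ols(y[i], np.hstack([ones, x[i]])) for i in range(n))
    # RSS_{1,c'}: individual effects, common slopes (demeaned regression).
    y_w = (y - y.mean(axis=1, keepdims=True)).reshape(n * T)
    x_w = (x - x.mean(axis=1, keepdims=True)).reshape(n * T, K)
    rss_fe = rss_ols(y_w, x_w)
    # RSS_{1,c}: pooled model.
    X_pool = np.hstack([np.ones((n * T, 1)), x.reshape(n * T, K)])
    rss_pool = rss_ols(y.reshape(n * T), X_pool)
    f2 = ((rss_fe - rss1) / ((n - 1) * K)) / (rss1 / (n * T - n * (K + 1)))
    f3 = ((rss_pool - rss_fe) / (n - 1)) / (rss_fe / (n * (T - 1) - K))
    return f2, f3

# Simulated panel (hypothetical): very different intercepts, common slopes.
rng = np.random.default_rng(0)
n, T, K = 5, 20, 2
x = rng.normal(size=(n, T, K))
alpha = 3.0 * np.arange(n).reshape(n, 1)
y = alpha + x @ np.array([1.0, -0.5]) + rng.normal(size=(n, T))
f2, f3 = covariance_tests(y, x)
print(f2, f3)
```

With data built this way one would expect a small `f2` (common slopes not rejected) and a large `f3` (common intercept rejected), which is the individual-effects case.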

Application: Strikes in OECD

Example (strikes in OECD countries)

Let us consider a simple panel regression model for the total number of strike days in OECD countries. We consider a balanced panel data set for 17 countries ($n = 17$) and annual data from 1951 to 1985 ($T = 35$). General idea: evaluate the link between strikes and some macroeconomic factors (inflation, unemployment, etc.).

We consider the following model

$$s_{it} = \alpha_i + \beta_i u_{it} + \gamma_i p_{it} + \varepsilon_{it}$$

$s_{it}$ the number of strike days per 1,000 workers for country $i$ at time $t$.

$u_{it}$ the unemployment rate

$p_{it}$ the inflation rate

Recall that we have, with $K = 2$ regressors:

$$y_{it} = \alpha_i + \beta_i' x_{it} + \varepsilon_{it}$$

$$H_0^1: \beta_i = \beta, \quad \alpha_i = \alpha \quad \forall\, i \in \{1, \dots, n\}$$

The test statistic satisfies

$$F_1 \underset{H_0^1}{\sim} F(48, 544)$$

since

$$(n-1)(K+1) = (17-1) \times (2+1) = 48$$

$$nT - n(K+1) = 595 - 17 \times (2+1) = 544$$

For the second test:

$$H_0^2: \beta_i = \beta \quad \forall\, i = 1, \dots, n$$

the F statistic, denoted $F_2$, has the following distribution

$$F_2 \underset{H_0^2}{\sim} F(32, 544)$$

since

$$(n-1)K = (17-1) \times 2 = 32$$

$$nT - n(K+1) = 595 - 17 \times (2+1) = 544$$

For the third test:

$$H_0^3: \alpha_i = \alpha \quad \forall\, i = 1, \dots, n \quad \text{given} \ \beta_i = \beta$$

the F statistic, denoted $F_3$, satisfies:

$$F_3 \underset{H_0^3}{\sim} F(16, 576)$$

since

$$(n-1) = 17 - 1 = 16$$

$$n(T-1) - K = 17 \times (35-1) - 2 = 576$$
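The degrees of freedom above follow mechanically from $n$, $T$ and $K$. A small sketch (the function name `hsiao_degrees_of_freedom` is an assumption made for the example) reproduces the arithmetic for the strikes panel:

```python
def hsiao_degrees_of_freedom(n, T, K):
    """Degrees of freedom (numerator, denominator) of the three F tests."""
    df_unrestricted = n * T - n * (K + 1)
    return {
        "F1": ((n - 1) * (K + 1), df_unrestricted),  # common intercept and slopes
        "F2": ((n - 1) * K, df_unrestricted),        # common slopes only
        "F3": (n - 1, n * (T - 1) - K),              # common intercept given common slopes
    }

df = hsiao_degrees_of_freedom(n=17, T=35, K=2)
print(df)  # {'F1': (48, 544), 'F2': (32, 544), 'F3': (16, 576)}
```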

Should we use these specification tests?

These heterogeneity / homogeneity tests of the parameters are valid under specific assumptions (normality of residuals).

More generally, the assumption of heterogeneity / homogeneity of the parameters (slope coefficients and constants) has to be evaluated through economic reasoning.

Example

Is it reasonable to assume that the slope parameters of the production function are the same across countries? What does it imply? Should I impose a common mean for the TFP of France and Germany? The answer is probably no.

Key Concepts Section 1

1 Heterogeneous panel data model

2 Homogeneous panel data model

3 Individual (unobserved) effects

4 Specification tests (Hsiao, 2003)

Section 2

Linear Unobserved Effects Panel Data Models

2. Linear unobserved effects panel data models

Objectives

1 Define the concept of linear unobserved effects panel data model.

2 Define the concept of individual effect.

3 Write the linear model in a vectorial form.

4 Define the notion of fixed effects.

5 Define the notion of random effects.

Definition (linear unobserved effects panel data model)

A linear unobserved (individual) effects panel data model is defined as:

$$y_{it} = \alpha_i + \beta' x_{it} + \varepsilon_{it}$$

where $\alpha_i$ is a scalar, $\beta = (\beta_1, \beta_2, \dots, \beta_K)'$ denotes a $K \times 1$ vector of parameters, $x_{it} = (x_{1it}, \dots, x_{Kit})'$ is a $K \times 1$ vector of exogenous variables, and $\varepsilon_{it}$ is an error term, assumed to be i.i.d., with $\forall\, i = 1, \dots, n$, $\forall\, t = 1, \dots, T$

$$E(\varepsilon_{it}) = 0, \qquad E\left(\varepsilon_{it}^2\right) = \sigma_\varepsilon^2$$

Definition (individual effects)

There are many names for the scalars $\alpha_i$, $i = 1, \dots, n$: (1) unobserved effects, (2) individual effects, (3) unobserved components, and (4) latent variables (for random effects models).

Definition (error terms)

The errors $\varepsilon_{it}$ are called the idiosyncratic errors or idiosyncratic disturbances. They change across $t$ as well as across $i$.

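Vectorial form (1)

Let us denote

$$\underset{(T,1)}{y_i} = \begin{pmatrix} y_{i,1} \\ y_{i,2} \\ \vdots \\ y_{i,T} \end{pmatrix}, \qquad \underset{(T,K)}{X_i} = \begin{pmatrix} x_{1,i,1} & x_{2,i,1} & \dots & x_{K,i,1} \\ x_{1,i,2} & x_{2,i,2} & \dots & x_{K,i,2} \\ \vdots & \vdots & & \vdots \\ x_{1,i,T} & x_{2,i,T} & \dots & x_{K,i,T} \end{pmatrix}$$

Let us denote $e$ a unit vector and $\varepsilon_i$ the vector of errors:

$$\underset{(T,1)}{e} = \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}, \qquad \underset{(T,1)}{\varepsilon_i} = \begin{pmatrix} \varepsilon_{i,1} \\ \varepsilon_{i,2} \\ \vdots \\ \varepsilon_{i,T} \end{pmatrix}, \qquad \underset{(K,1)}{\beta} = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_K \end{pmatrix}$$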

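Definition (vectorial form)

For any individual $i = 1, \dots, n$, the linear unobserved effects panel data model can be defined as follows:

$$y_i = e \alpha_i + X_i \beta + \varepsilon_i$$

$$E(\varepsilon_i) = 0, \qquad E\left(\varepsilon_i \varepsilon_i'\right) = \sigma_\varepsilon^2 I_T, \qquad E\left(\varepsilon_i \varepsilon_j'\right) = \underset{(T,T)}{0} \ \text{if} \ i \neq j$$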

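Example (production function)

Let us consider the case of a Cobb-Douglas production function in logs, as defined previously, for the case $T = 3$ and $K = 2$. We have:

$$y_{it} = \alpha_i + \beta_k k_{it} + \beta_n n_{it} + \varepsilon_{it} \qquad \forall\, i, \ \forall\, t \in \{1, 2, 3\}$$

or in a vectorial form for a country $i$ as:

$$\underset{(3,1)}{y_i} = \underset{(3,1)}{e} \ \underset{(1,1)}{\alpha_i} + \underset{(3,2)}{X_i} \ \underset{(2,1)}{\beta} + \underset{(3,1)}{\varepsilon_i}$$

$$\begin{pmatrix} y_{i,1} \\ y_{i,2} \\ y_{i,3} \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} \alpha_i + \begin{pmatrix} k_{i,1} & n_{i,1} \\ k_{i,2} & n_{i,2} \\ k_{i,3} & n_{i,3} \end{pmatrix} \begin{pmatrix} \beta_k \\ \beta_n \end{pmatrix} + \begin{pmatrix} \varepsilon_{i,1} \\ \varepsilon_{i,2} \\ \varepsilon_{i,3} \end{pmatrix}$$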

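Vectorial form (2)

It is also possible to stack all these vectors/matrices as follows

$$Y = \tilde{e} \tilde{\alpha} + X \beta + \varepsilon$$

$$\underset{(Tn,1)}{Y} = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \qquad \underset{(Tn,K)}{X} = \begin{pmatrix} X_1 \\ X_2 \\ \vdots \\ X_n \end{pmatrix}, \qquad \underset{(Tn,1)}{\varepsilon} = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}$$

$$\underset{(Tn,n)}{\tilde{e}} = I_n \otimes e = \begin{pmatrix} e & 0_T & \dots & 0_T \\ 0_T & e & \dots & 0_T \\ \vdots & \vdots & \ddots & \vdots \\ 0_T & 0_T & \dots & e \end{pmatrix}, \qquad \underset{(n,1)}{\tilde{\alpha}} = \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{pmatrix}$$

where $0_T$ is the null vector $(T, 1)$.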

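Example (production function)

Consider the case of the production function with $T = 3$ and $n = 2$

$$Y = \tilde{e} \tilde{\alpha} + X \beta + \varepsilon$$

$$\begin{pmatrix} y_{1,1} \\ y_{1,2} \\ y_{1,3} \\ y_{2,1} \\ y_{2,2} \\ y_{2,3} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 1 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \end{pmatrix} + \begin{pmatrix} k_{1,1} & n_{1,1} \\ k_{1,2} & n_{1,2} \\ k_{1,3} & n_{1,3} \\ k_{2,1} & n_{2,1} \\ k_{2,2} & n_{2,2} \\ k_{2,3} & n_{2,3} \end{pmatrix} \begin{pmatrix} \beta_k \\ \beta_n \end{pmatrix} + \begin{pmatrix} \varepsilon_{1,1} \\ \varepsilon_{1,2} \\ \varepsilon_{1,3} \\ \varepsilon_{2,1} \\ \varepsilon_{2,2} \\ \varepsilon_{2,3} \end{pmatrix}$$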
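The stacked form can be checked numerically with a Kronecker product. This is a small sketch with made-up numbers; the variable names (`e_tilde`, `X_blocks`, and so on) are assumptions for the example.

```python
import numpy as np

n, T, K = 2, 3, 2
rng = np.random.default_rng(1)

e = np.ones((T, 1))
e_tilde = np.kron(np.eye(n), e)       # (Tn, n) matrix of individual dummies
alpha = np.array([0.5, -1.0])         # hypothetical individual effects
beta = np.array([0.3, 0.7])           # hypothetical slopes (beta_k, beta_n)

X_blocks = [rng.normal(size=(T, K)) for _ in range(n)]
eps_blocks = [rng.normal(size=T) for _ in range(n)]

# Individual-by-individual form: y_i = e * alpha_i + X_i beta + eps_i.
y_blocks = [e[:, 0] * alpha[i] + X_blocks[i] @ beta + eps_blocks[i]
            for i in range(n)]

# Stacked form: Y = e_tilde alpha + X beta + eps.
X = np.vstack(X_blocks)
eps = np.concatenate(eps_blocks)
Y = e_tilde @ alpha + X @ beta + eps

print(e_tilde)
print(np.allclose(Y, np.concatenate(y_blocks)))
```

Both constructions give the same vector, which is all the stacked notation claims.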

Especially in methodological papers, but also in applications, one often sees a discussion about whether the individual effects $\alpha_i$ have to be treated as a random effect or a fixed effect.

Definition (Traditional approach)

In the traditional approach to panel data models, $\alpha_i$ is called a "random effect" when it is treated as a random variable and a "fixed effect" when it is treated as a parameter to be estimated for each cross section observation $i$.

Discussion

For Wooldridge (2010), these discussions about whether the $\alpha_i$ should be treated as random variables or as parameters to be estimated are wrongheaded for microeconomic panels.

With a large number of random draws from the cross section, it almost always makes sense to treat the unobserved effects, $\alpha_i$, as random draws from the population, along with $y_{it}$ and $x_{it}$.

This approach is certainly appropriate from an omitted variables or neglected heterogeneity perspective.

Fact (Mundlak's approach)

As suggested by Mundlak (1978), the key issue involving $\alpha_i$ is whether or not it is uncorrelated with the observed explanatory variables $x_{it}$.

Mundlak Y. (1978), "On the Pooling of Time Series and Cross Section Data", Econometrica, 46, 69-85

Definition (a "modern" approach)

In a modern approach, "random effect" is synonymous with zero correlation between the observed explanatory variables and the unobserved (random) effect $\alpha_i$:

$$cov(x_{it}, \alpha_i) = 0, \quad \forall\, t, \ \forall\, i$$

Remarks

Actually, a stronger conditional mean independence assumption,

$$E(\alpha_i \mid x_{i1}, \dots, x_{iT}) = 0$$

is needed to fully justify statistical inference.

In applied papers, when $\alpha_i$ is referred to as an "individual random effect," then $\alpha_i$ is probably being assumed to be uncorrelated with the $x_{it}$.

Definition (Fixed effects)

In microeconometric applications, the term "fixed effect" does not usually mean that $\alpha_i$ is being treated as nonrandom; rather, it means that one is allowing for arbitrary correlation between the unobserved effect $\alpha_i$ and the observed explanatory variables $x_{it}$.

Remarks

Wooldridge (2010) avoids referring to $\alpha_i$ as a random effect or a fixed effect. Instead, he refers to $\alpha_i$ as unobserved effect, unobserved heterogeneity, and so on.

Nevertheless, later we will label two different estimation methods as random effects estimation and fixed effects estimation methods.

This terminology is so ingrained that it is pointless to try to change it now.

Fixed or random effects?

The economic interpretation of the individual effects generally suggests that they are correlated with the explanatory variables.

But, in case of doubt, it is possible to use a specification test (Hausman's test, 1978).

Example (Production function)

Let us consider the simple example of the Cobb-Douglas production function.

$$y_{it} = \beta_i k_{it} + \gamma_i n_{it} + \alpha_i + v_{it}$$

In this case, $\alpha_i$ corresponds to the unobserved effect on TFP due to country-specific omitted factors (climate, institutions, organization, etc.). We might expect the levels of the factors to be positively correlated with this component of TFP: the more productive a country is, the more it invests in capital, for instance.

$$cov(\alpha_i, k_{it}) > 0, \qquad cov(\alpha_i, n_{it}) > 0$$

Example (Patents and R&D)

Hausman, Hall, and Griliches (1984) estimate (nonlinear) distributed lag models to study the relationship between patents awarded to a firm and current and past levels of R&D spending. A linear version of their model is:

$$patents_{it} = \theta_t + z_{it} \gamma + \delta_0 RD_{it} + \delta_1 RD_{it-1} + \dots + \delta_5 RD_{it-5} + \alpha_i + v_{it}$$

where $RD_{it}$ is spending on R&D for firm $i$ at time $t$ and $z_{it}$ contains other explanatory variables. $\alpha_i$ is a firm heterogeneity term that may influence $patents_{it}$ and that may be correlated with current, past, and future R&D expenditures.

$$cov(\alpha_i, RD_{it-k}) \neq 0 \quad \forall\, k$$

Definition (Hausman's test)

The Hausman test (1978) is a test of the null hypothesis

$$cov(x_{it}, \alpha_i) = 0, \quad \forall\, (i, t)$$

and is generally presented as a specification test (fixed or random) for the unobserved effects.

Key Concepts Section 2

1 Linear unobserved effects panel data model.

2 Vectorial form of the linear panel data model.

3 Individual effects.

4 Unobserved effects.

5 Random effects.

6 Fixed effects.

Section 3

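Fixed Effects Estimation Methods

3. Fixed effects estimation methods

Objectives

1 Specify the linear regression model with fixed effects.

2 Introduce the LSDV (within) estimator.

3 Define the within transformation.

4 Estimate the slope parameters.

5 Estimate the fixed effects.

Notations

Let us denote

$$\underset{(T,1)}{y_i} = \begin{pmatrix} y_{i,1} \\ y_{i,2} \\ \vdots \\ y_{i,T} \end{pmatrix}, \qquad \underset{(T,K)}{X_i} = \begin{pmatrix} x_{1,i,1} & x_{2,i,1} & \dots & x_{K,i,1} \\ x_{1,i,2} & x_{2,i,2} & \dots & x_{K,i,2} \\ \vdots & \vdots & & \vdots \\ x_{1,i,T} & x_{2,i,T} & \dots & x_{K,i,T} \end{pmatrix}$$

Let us denote $e$ a unit vector and $\varepsilon_i$ the vector of errors:

$$\underset{(T,1)}{e} = \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix}, \qquad \underset{(T,1)}{\varepsilon_i} = \begin{pmatrix} \varepsilon_{i,1} \\ \varepsilon_{i,2} \\ \vdots \\ \varepsilon_{i,T} \end{pmatrix}, \qquad \underset{(K,1)}{\beta} = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_K \end{pmatrix}$$

We consider the fixed effects model:

$$y_i = e \alpha_i + X_i \beta + \varepsilon_i \qquad \forall\, i = 1, \dots, n$$

where $\alpha_i$ is assumed to be a constant term or a random variable satisfying $E(\alpha_i \mid x_{i1}, \dots, x_{iT}) = 0$.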

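Assumptions (H1) The error terms $\varepsilon_{it}$ are i.i.d. $\forall\, (i, t)$ with:

$E(\varepsilon_{it}) = 0$

$E(\varepsilon_{it} \varepsilon_{is}) = \begin{cases} \sigma_\varepsilon^2 & t = s \\ 0 & \forall\, t \neq s \end{cases}$, or $E(\varepsilon_i \varepsilon_i') = \sigma_\varepsilon^2 I_T$, where $I_T$ denotes the identity matrix $(T, T)$.

$E(\varepsilon_{it} \varepsilon_{js}) = 0$, $\forall\, j \neq i$, $\forall\, (t, s)$, or $E(\varepsilon_i \varepsilon_j') = 0$, where $0$ denotes the null matrix $(T, T)$.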

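Theorem

Under assumptions H1, the ordinary-least-squares (OLS) estimator of $\beta$ is the best linear unbiased estimator (BLUE).

Definition (LSDV estimator)

In this context, the OLS estimator $\hat{\beta}$ is called the least-squares dummy-variable (LSDV) or Fixed Effect (FE) estimator, because the observed values of the variable attached to the coefficient $\alpha_i$ take the form of dummy variables.

The OLS estimators of $\alpha_i$ and $\beta$ are obtained by minimizing

$$\left\{ \hat{\alpha}_i, \hat{\beta}_{LSDV} \right\} = \underset{\{\alpha_i\}_{i=1}^{n},\, \beta}{\arg\min} \ \sum_{i=1}^{n} \varepsilon_i' \varepsilon_i = \underset{\{\alpha_i\}_{i=1}^{n},\, \beta}{\arg\min} \ \sum_{i=1}^{n} (y_i - e \alpha_i - X_i \beta)' (y_i - e \alpha_i - X_i \beta)$$

FOC1 (with respect to $\alpha_i$) gives:

$$\hat{\alpha}_i = \bar{y}_i - \hat{\beta}_{LSDV}' \bar{x}_i \quad \text{with} \quad \bar{x}_i = \frac{1}{T} \sum_{t=1}^{T} x_{it}, \quad \bar{y}_i = \frac{1}{T} \sum_{t=1}^{T} y_{it}$$

Given the second FOC (with respect to $\beta$) and the previous result, we can derive the formula for $\hat{\beta}_{LSDV}$.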

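Definition (LSDV estimator)

Under assumption H1, the fixed effect estimator or LSDV estimator of $\beta$ is defined by:

$$\hat{\beta}_{LSDV} = \left[ \sum_{i=1}^{n} \sum_{t=1}^{T} (x_{it} - \bar{x}_i)(x_{it} - \bar{x}_i)' \right]^{-1} \left[ \sum_{i=1}^{n} \sum_{t=1}^{T} (x_{it} - \bar{x}_i)(y_{it} - \bar{y}_i) \right]$$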

Remarks

1 The computational procedure for estimating the slope parameters in this model does not require that the dummy variables for the individual (and/or time) effects actually be included in the matrix of explanatory variables.

2 We only need to (1) compute the empirical means of the time-series observations separately for each cross-sectional unit, (2) transform the observed variables by subtracting out these means, and (3) apply the least squares method to the transformed data.

The foregoing procedure is equivalent to premultiplying the $i$th equation

$$y_i = e \alpha_i + X_i \beta + \varepsilon_i$$

by a $T \times T$ idempotent (covariance) transformation matrix (within operator)

$$Q = I_T - \frac{1}{T} e e'$$

to "sweep out" the individual effect $\alpha_i$ so that individual observations are measured as deviations from individual means (over time).

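$$\underset{(T,T)}{Q} = I_T - \frac{1}{T} e e' = \begin{pmatrix} 1 - \frac{1}{T} & -\frac{1}{T} & \dots & -\frac{1}{T} & -\frac{1}{T} \\ -\frac{1}{T} & 1 - \frac{1}{T} & \dots & -\frac{1}{T} & -\frac{1}{T} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ -\frac{1}{T} & -\frac{1}{T} & \dots & 1 - \frac{1}{T} & -\frac{1}{T} \\ -\frac{1}{T} & -\frac{1}{T} & \dots & -\frac{1}{T} & 1 - \frac{1}{T} \end{pmatrix}$$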

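$Q y_i$ and $Q X_i$ correspond to the observations measured as deviations from individual means:

$$Q y_i = \left( I_T - \frac{1}{T} e e' \right) y_i = y_i - e \left( \frac{1}{T} e' y_i \right) = \begin{pmatrix} y_{i,1} \\ y_{i,2} \\ \vdots \\ y_{i,T} \end{pmatrix} - \left( \frac{1}{T} \sum_{t=1}^{T} y_{it} \right) \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix} = y_i - \bar{y}_i e$$

$$Q X_i = X_i - \frac{1}{T} e e' X_i = X_i - \frac{1}{T} \begin{pmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{pmatrix} \begin{pmatrix} \sum_{t=1}^{T} x_{1,it} & \sum_{t=1}^{T} x_{2,it} & \dots & \sum_{t=1}^{T} x_{K,it} \end{pmatrix} = X_i - e \bar{x}_i'$$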

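Finally, when the transformation $Q$ is applied to a vector of constants (or a time-invariant variable), it leads to a null vector.

$$Q e = \left( I_T - \frac{1}{T} e e' \right) e = e - \frac{1}{T} e e' e = e - e = 0$$

since

$$e' e = \begin{pmatrix} 1 & \dots & 1 \end{pmatrix} \begin{pmatrix} 1 \\ \vdots \\ 1 \end{pmatrix} = T$$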

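So, we have:

$$y_i = e \alpha_i + X_i \beta + \varepsilon_i$$

$$\Longleftrightarrow \ Q y_i = Q e \alpha_i + Q X_i \beta + Q \varepsilon_i$$

$$\Longleftrightarrow \ Q y_i = Q X_i \beta + Q \varepsilon_i$$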
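The properties of the within operator used above are easy to verify numerically. A minimal sketch with made-up numbers (the vectors `y_i` and `z_i` are hypothetical):

```python
import numpy as np

T = 4
e = np.ones((T, 1))
Q = np.eye(T) - e @ e.T / T            # within operator Q = I_T - ee'/T

y_i = np.array([2.0, 4.0, 6.0, 8.0])   # a time series for one individual
z_i = np.full(T, 3.0)                  # a time-invariant variable

print(Q @ y_i)                         # deviations from the individual mean
print(np.allclose(Q @ Q, Q))           # Q is idempotent
print(np.allclose(Q @ e, 0.0))         # Q sweeps out the constant
print(np.allclose(Q @ z_i, 0.0))       # and any time-invariant variable
```

Because $Q$ removes anything constant over time, the individual effect $\alpha_i$ disappears from the transformed equation, and so would any time-invariant regressor.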

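Definition (Within - LSDV estimator)

Under assumption H1, the fixed effect estimator or LSDV estimator or Within estimator of the parameter $\beta$ is defined by:

$$\hat{\beta}_{LSDV} = \left( \sum_{i=1}^{n} X_i' Q X_i \right)^{-1} \left( \sum_{i=1}^{n} X_i' Q y_i \right) \quad \text{where} \quad Q = I_T - \frac{1}{T} e e'$$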
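To illustrate why the two names are interchangeable, the sketch below computes the within estimator from the formula and compares it with an explicit regression on individual dummies. It is a minimal example on simulated data; the variable names and parameter values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n, T, K = 6, 10, 2
x = rng.normal(size=(n, T, K))
alpha = rng.normal(size=(n, 1))              # hypothetical individual effects
beta = np.array([1.0, -2.0])                 # hypothetical true slopes
y = alpha + x @ beta + 0.1 * rng.normal(size=(n, T))

# Within estimator: (sum_i X_i' Q X_i)^{-1} (sum_i X_i' Q y_i).
Q = np.eye(T) - np.ones((T, T)) / T
Sxx = sum(x[i].T @ Q @ x[i] for i in range(n))
Sxy = sum(x[i].T @ Q @ y[i] for i in range(n))
beta_within = np.linalg.solve(Sxx, Sxy)

# Fixed effects recovered from the first-order condition for alpha_i.
alpha_hat = y.mean(axis=1) - x.mean(axis=1) @ beta_within

# Explicit dummy-variable regression: y on n dummies and the K regressors.
D = np.kron(np.eye(n), np.ones((T, 1)))
Z = np.hstack([D, x.reshape(n * T, K)])
coef, *_ = np.linalg.lstsq(Z, y.reshape(n * T), rcond=None)

print(beta_within, coef[n:])
```

The slope coefficients coincide, and the dummy coefficients equal $\hat{\alpha}_i = \bar{y}_i - \hat{\beta}_{LSDV}' \bar{x}_i$, so the dummies never need to be built in practice.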

Fact (Time-invariant regressors)

If the explanatory variables contain some time-invariant variables $z_i$, their coefficients cannot be estimated by LSDV, because the covariance transformation eliminates $z_i$.

$$Q z_i = \left( I_T - \frac{1}{T} e e' \right) z_i = z_i - \frac{1}{T} e e' z_i = z_i - \bar{z}_i e = 0_T$$

Example

Let us consider a simple panel regression model for the total number of strike days in OECD countries. We have a balanced panel data set for 17 countries ($n = 17$) and annual data from 1951 to 1985 ($T = 35$). General idea: evaluate the link between strikes and some macroeconomic factors (inflation, unemployment, etc.).

We consider the following model

$$s_{it} = \alpha_i + \beta u_{it} + \gamma p_{it} + \varepsilon_{it}$$

$s_{it}$ the number of strike days per 1,000 workers for country $i$ at time $t$.

$u_{it}$ the unemployment rate

$p_{it}$ the inflation rate

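Theorem

The LSDV estimator $\hat{\beta}$ is unbiased and consistent when either $n$, or $T$, or both tend to infinity.

$$\hat{\beta}_{LSDV} \xrightarrow[nT \to \infty]{p} \beta$$

Theorem

The estimator for the unobserved effects $\hat{\alpha}_i$, although unbiased, is consistent only when $T \to \infty$.

$$\hat{\alpha}_i \xrightarrow[T \to \infty]{p} \alpha_i$$

Theorem

The asymptotic variance-covariance matrix of the LSDV estimator $\hat{\beta}$ is given by:

$$V\left( \hat{\beta}_{LSDV} \right) = \sigma_\varepsilon^2 \left( \sum_{i=1}^{n} X_i' Q X_i \right)^{-1}$$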

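Estimator of the asymptotic covariance matrix

$$\hat{V}\left( \hat{\beta}_{LSDV} \right) = \hat{\sigma}_\varepsilon^2 \left( \sum_{i=1}^{n} X_i' Q X_i \right)^{-1} \quad \text{with} \quad \hat{\sigma}_\varepsilon^2 = \frac{1}{nT - K - n} \sum_{i=1}^{n} \sum_{t=1}^{T} \hat{\varepsilon}_{it}^2$$

In the strikes example ($nT = 595$, $K = 2$, $n = 17$), the residual sum of squares is $0.146958 \times 10^{9}$, so the estimated standard error of the regression is

$$\hat{\sigma}_\varepsilon = \sqrt{\frac{0.146958 \times 10^{9}}{595 - 2 - 17}} = 505.1093$$

Be careful with a simple OLS method applied to the transformed data! It reports

$$\hat{\sigma}_\varepsilon = \sqrt{\frac{1}{nT - K} \sum_{i=1}^{n} \sum_{t=1}^{T} \hat{\varepsilon}_{it}^2} = \sqrt{\frac{0.146958 \times 10^{9}}{595 - 2}} = 497.8165$$

as it does not take into account the correct number of constant terms.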
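The two figures can be reproduced from the reported residual sum of squares. A short sketch (variable names are assumptions; the RSS value is the rounded figure quoted above, so the results match only to a few decimals):

```python
import math

rss = 0.146958e9          # residual sum of squares reported in the example
n, T, K = 17, 35, 2

# Correct degrees of freedom: the n individual constants are estimated too.
sigma_lsdv = math.sqrt(rss / (n * T - K - n))

# Naive OLS on demeaned data ignores the n estimated constants.
sigma_naive = math.sqrt(rss / (n * T - K))

print(round(sigma_lsdv, 2), round(sigma_naive, 2))
```

The naive figure is smaller by the factor $\sqrt{(nT - K - n)/(nT - K)}$, so standard errors taken from a plain OLS fit on demeaned data are slightly too small.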

Key Concepts Section 3

1 Linear unobserved effects panel data model.

2 Fixed effects and assumptions H1.

3 LSDV or within estimator.

4 Within transformation.

5 Properties of the LSDV estimator.

6 Asymptotic variance-covariance of the LSDV estimator.

Section 4

Random Effects Estimation Methods

4. Random effects estimation methods

Objectives

1 Specify the error-component model.

2 Define the Generalized Least Squares (GLS) estimator.

3 Define the between and pooled estimators.

4 Write the GLS estimator as a weighted average of the LSDV and between estimators.

5 Study the properties of the GLS estimator.

6 Define the feasible GLS estimator.

Definition (error-component model)

The random specification of unobserved effects corresponds to a particular case of variance-component or error-component model, in which the error is assumed to consist of three components

$$y_{it} = \beta' x_{it} + \varepsilon_{it} \qquad \forall\, (i, t)$$

$$\varepsilon_{it} = \alpha_i + \lambda_t + v_{it}$$

Terminology

$$\varepsilon_{it} = \alpha_i + \lambda_t + v_{it}$$

$\alpha_i$: individual (random) effect

$\lambda_t$: time (random) effect

$v_{it}$: idiosyncratic error term

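Assumptions (H2) The error terms $\varepsilon_{it} = \alpha_i + \lambda_t + v_{it}$ are i.i.d. $\forall\, (i, t)$ with:

$E(\alpha_i) = E(\lambda_t) = E(v_{it}) = 0$

$E(\alpha_i \lambda_t) = E(\lambda_t v_{it}) = E(\alpha_i v_{it}) = 0$

$E(\alpha_i \alpha_j) = \begin{cases} \sigma_\alpha^2 & i = j \\ 0 & \forall\, i \neq j \end{cases}$

$E(\lambda_t \lambda_s) = \begin{cases} \sigma_\lambda^2 & t = s \\ 0 & \forall\, t \neq s \end{cases}$

$E(v_{it} v_{js}) = \begin{cases} \sigma_v^2 & t = s, \ i = j \\ 0 & \text{otherwise} \end{cases}$

$E(\alpha_i x_{it}') = E(\lambda_t x_{it}') = E(v_{it} x_{it}') = 0$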

Remark

As suggested by Wooldridge (2001), the "fixed effect" specification can be viewed as a case in which $\alpha_i$ is a random parameter with

$$cov\left( \alpha_i, x_{it}' \right) \neq 0$$

whereas the "random effect model" corresponds to the situation in which

$$cov\left( \alpha_i, x_{it}' \right) = 0$$

Definition (error-component model)

Under H2, the variance of $y_{it}$ conditional on $x_{it}$ is equal to:

$$\sigma_{y|x}^2 = \sigma_\varepsilon^2 = \sigma_\alpha^2 + \sigma_\lambda^2 + \sigma_v^2$$

Definition (centered individual effects)

If the individual effects $\alpha_i$ are supposed to have a non-zero mean, with

$$E(\alpha_i) = \mu$$

then we can define centered individual effects $\alpha_i^* = \alpha_i - \mu$ with zero mean, so that $\alpha_i = \mu + \alpha_i^*$. The error-component model is then defined as:

$$y_{it} = \mu + \beta' x_{it} + \varepsilon_{it}$$

$$\varepsilon_{it} = \alpha_i^* + \lambda_t + v_{it}$$

Random coefficient model

In the sequel, for simplicity, we do not introduce any time effects and consider a simple random effect model with

$$\varepsilon_{it} = \alpha_i + v_{it}$$

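Vectorial form

The vectorial expression of the individual effects model is then defined as:

$$\underset{(T,1)}{y_i} = \underset{(T,K+1)}{\tilde{X}_i} \ \underset{(K+1,1)}{\gamma} + \underset{(T,1)}{\varepsilon_i}$$

$$\underset{(T,1)}{\varepsilon_i} = \underset{(T,1)}{e} \ \underset{(1,1)}{\alpha_i} + \underset{(T,1)}{v_i}$$

with $\tilde{X}_i = (e : X_i)$ and $\gamma' = (\mu : \beta')$.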

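Definition (variance-covariance matrix of errors)

Under assumptions H2, the variance-covariance matrix of $\varepsilon_i$ is equal to:

$$V = E\left( \varepsilon_i \varepsilon_i' \right) = E\left[ (\alpha_i e + v_i)(\alpha_i e + v_i)' \right] = \sigma_\alpha^2 e e' + \sigma_v^2 I_T$$

Its inverse is:

$$V^{-1} = \frac{1}{\sigma_v^2} \left[ I_T - \left( \frac{\sigma_\alpha^2}{\sigma_v^2 + T \sigma_\alpha^2} \right) e e' \right]$$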

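Remark

The presence of $\alpha_i$ produces a correlation among the errors of the same cross-sectional unit (autocorrelation), as

$$\underset{(T,T)}{V} = E\left( \varepsilon_i \varepsilon_i' \right) = \sigma_\alpha^2 e e' + \sigma_v^2 I_T = \begin{pmatrix} \sigma_\alpha^2 + \sigma_v^2 & \sigma_\alpha^2 & \dots & \sigma_\alpha^2 \\ \sigma_\alpha^2 & \sigma_\alpha^2 + \sigma_v^2 & \dots & \sigma_\alpha^2 \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_\alpha^2 & \sigma_\alpha^2 & \dots & \sigma_\alpha^2 + \sigma_v^2 \end{pmatrix}$$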

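Remark

If we consider the $nT \times 1$ vector of errors $\varepsilon = (\varepsilon_1', \dots, \varepsilon_n')'$, we have a block-diagonal matrix with $V$ repeated on the diagonal:

$$\underset{(nT,nT)}{V(\varepsilon)} = E\left( \varepsilon \varepsilon' \right) = I_n \otimes V = \begin{pmatrix} \underset{(T,T)}{V} & \underset{(T,T)}{0} & \dots & \underset{(T,T)}{0} \\ \underset{(T,T)}{0} & \underset{(T,T)}{V} & \dots & \underset{(T,T)}{0} \\ \vdots & \vdots & \ddots & \vdots \\ \underset{(T,T)}{0} & \underset{(T,T)}{0} & \dots & \underset{(T,T)}{V} \end{pmatrix}$$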

Within transformation

Regardless of whether the $\alpha_i$ are treated as fixed or as random, the individual-specific effects for a given sample can be swept out by the idempotent (covariance) transformation matrix $Q$

$$Q y_i = Q e \mu + Q X_i \beta + Q e \alpha_i + Q v_i$$

Since $Q e = \left( I_T - T^{-1} e e' \right) e = 0$, we have

$$Q y_i = Q X_i \beta + Q v_i$$

Theorem

Under assumptions H2, when the $\alpha_i$ are treated as random, the LSDV estimator is unbiased and consistent when either $n$, or $T$, or both tend to infinity. However, whereas the LSDV is the BLUE under the assumption that the $\alpha_i$ are fixed constants, it is not the BLUE when the $\alpha_i$ are assumed random. The BLUE in the latter case is the Generalized-Least-Squares (GLS) estimator.

Summary

| Assumptions | LSDV | GLS |
|---|---|---|
| H2 + $E(\alpha_i \mid x_{i1}, \dots, x_{iT}) = 0$ | Unbiased | BLUE |
| H2 + $E(\alpha_i \mid x_{i1}, \dots, x_{iT}) \neq 0$ | BLUE | Biased |

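Notations

Let us consider the model

$$y_i = \tilde{X}_i \gamma + \varepsilon_i \qquad \forall\, i = 1, \dots, n$$

where $\varepsilon_i = \alpha_i e + v_i$, $\tilde{X}_i = (e : X_i)$ and $\gamma' = (\mu : \beta')$.

We assume that the variance-covariance matrix $V = E(\varepsilon_i \varepsilon_i')$ is known.

Definition (GLS estimator)

If the variance-covariance matrix $V$ is known, the GLS estimator of the vector $\gamma$, denoted $\hat{\gamma}_{GLS}$, is defined by:

$$\hat{\gamma}_{GLS} = \left( \sum_{i=1}^{n} \tilde{X}_i' V^{-1} \tilde{X}_i \right)^{-1} \left( \sum_{i=1}^{n} \tilde{X}_i' V^{-1} y_i \right)$$

Under assumptions H2, this estimator is BLUE.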

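Definition (inverse of the variance-covariance matrix)

Following Maddala (1971), we can write $V^{-1}$ as:

$$V^{-1} = \frac{1}{\sigma_v^2} \left[ Q + \psi \frac{1}{T} e e' \right]$$

where $Q = (I_T - e e' / T)$ and where the parameter $\psi$ is defined by:

$$\psi = \frac{\sigma_v^2}{\sigma_v^2 + T \sigma_\alpha^2}$$

Proof

$$V^{-1} V = \frac{1}{\sigma_v^2} \left[ Q + \psi \frac{1}{T} e e' \right] \left[ \sigma_\alpha^2 e e' + \sigma_v^2 I_T \right]$$

$$= \frac{1}{\sigma_v^2} \left[ \sigma_\alpha^2 Q e e' + \sigma_v^2 Q + \psi \frac{\sigma_\alpha^2}{T} e e' e e' + \psi \frac{\sigma_v^2}{T} e e' \right]$$

$$= \frac{1}{\sigma_v^2} \left[ \sigma_v^2 Q + \psi \sigma_\alpha^2 e e' + \psi \frac{\sigma_v^2}{T} e e' \right] \quad \text{as } Q e = 0 \text{ and } e' e = T$$

$$= \frac{1}{\sigma_v^2} \left[ \sigma_v^2 Q + \frac{1}{T} e e' \, \psi \left( T \sigma_\alpha^2 + \sigma_v^2 \right) \right] = I_T - \frac{1}{T} e e' + \frac{1}{T} e e' \, \frac{\psi \left( T \sigma_\alpha^2 + \sigma_v^2 \right)}{\sigma_v^2}$$

Since

$$\psi = \frac{\sigma_v^2}{\sigma_v^2 + T \sigma_\alpha^2}$$

the last factor equals one and we have

$$V^{-1} V = I_T - \frac{1}{T} e e' + \frac{1}{T} e e' = I_T$$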
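The algebra of this proof can be double-checked numerically. A minimal sketch with arbitrary (hypothetical) variance components:

```python
import numpy as np

T = 5
s2a, s2v = 2.0, 0.5                    # hypothetical sigma_alpha^2 and sigma_v^2
e = np.ones((T, 1))
ee = e @ e.T
Q = np.eye(T) - ee / T

V = s2a * ee + s2v * np.eye(T)         # covariance matrix of eps_i
psi = s2v / (s2v + T * s2a)

# Maddala's form of the inverse.
V_inv = (Q + psi * ee / T) / s2v
# Equivalent form given with the definition of V.
V_inv_alt = (np.eye(T) - (s2a / (s2v + T * s2a)) * ee) / s2v

print(np.allclose(V_inv @ V, np.eye(T)))
print(np.allclose(V_inv, V_inv_alt))
```

Both closed forms agree with each other and with a direct numerical inverse, so no $T \times T$ inversion is needed in practice.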

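Given this definition of $V^{-1}$, we have:

$$\hat{\gamma}_{GLS} = \left( \sum_{i=1}^{n} \tilde{X}_i' \left[ Q + \psi \frac{1}{T} e e' \right] \tilde{X}_i \right)^{-1} \left( \sum_{i=1}^{n} \tilde{X}_i' \left[ Q + \psi \frac{1}{T} e e' \right] y_i \right)$$

$$= \left( \sum_{i=1}^{n} \tilde{X}_i' Q \tilde{X}_i + \psi \frac{1}{T} \sum_{i=1}^{n} \tilde{X}_i' e e' \tilde{X}_i \right)^{-1} \left( \sum_{i=1}^{n} \tilde{X}_i' Q y_i + \psi \frac{1}{T} \sum_{i=1}^{n} \tilde{X}_i' e e' y_i \right)$$

with $\tilde{X}_i = (e \ X_i)$ and $\gamma' = (\mu \ \beta')$.

It is possible to show that

$$\begin{pmatrix} \hat{\mu}_{GLS} \\ \hat{\beta}_{GLS} \end{pmatrix} = \begin{pmatrix} \psi n T & \psi T \sum_{i=1}^{n} \bar{x}_i' \\ \psi T \sum_{i=1}^{n} \bar{x}_i & \sum_{i=1}^{n} X_i' Q X_i + \psi T \sum_{i=1}^{n} \bar{x}_i \bar{x}_i' \end{pmatrix}^{-1} \begin{pmatrix} \psi n T \bar{y} \\ \sum_{i=1}^{n} X_i' Q y_i + \psi T \sum_{i=1}^{n} \bar{x}_i \bar{y}_i \end{pmatrix}$$

Using the formula of the partitioned inverse, we can derive $\hat{\beta}_{GLS}$.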

Definition (GLS estimator)

If the variance-covariance matrix $V$ is known, the GLS estimator of $\beta$ is:

$$\hat{\beta}_{GLS} = \left[ \frac{1}{T} \sum_{i=1}^{n} X_i' Q X_i + \psi \sum_{i=1}^{n} (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})' \right]^{-1} \left[ \frac{1}{T} \sum_{i=1}^{n} X_i' Q y_i + \psi \sum_{i=1}^{n} (\bar{x}_i - \bar{x})(\bar{y}_i - \bar{y}) \right]$$

with $\psi = \sigma_v^2 \left( \sigma_v^2 + T \sigma_\alpha^2 \right)^{-1}$.

The GLS estimator can be expressed as a weighted average of the LSDV (OLS) estimator and the between estimator.

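Definition (between group estimator)

The between-group estimator or between estimator $\hat{\beta}_{BE}$ corresponds to the OLS estimator obtained in the model:

$$\bar{y}_i = c + \beta' \bar{x}_i + \bar{\varepsilon}_i \qquad \forall\, i = 1, \dots, n$$

$$\hat{\beta}_{BE} = \left( \sum_{i=1}^{n} (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})' \right)^{-1} \left( \sum_{i=1}^{n} (\bar{x}_i - \bar{x})(\bar{y}_i - \bar{y}) \right)$$

The estimator $\hat{\beta}_{BE}$ is called the between-group estimator because it ignores variation within the group.

Definition (pooled estimator)

The pooled estimator $\hat{\beta}_{pooled}$ corresponds to the OLS estimator obtained in the pooled model:

$$y_{it} = \alpha + \beta' x_{it} + \varepsilon_{it} \qquad \forall\, i = 1, \dots, n, \ \forall\, t = 1, \dots, T$$

$$\hat{\beta}_{pooled} = \left( \sum_{t=1}^{T} \sum_{i=1}^{n} (x_{it} - \bar{x})(x_{it} - \bar{x})' \right)^{-1} \left( \sum_{t=1}^{T} \sum_{i=1}^{n} (x_{it} - \bar{x})(y_{it} - \bar{y}) \right)$$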

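Theorem

Under assumptions H2, the GLS estimator $\hat{\beta}_{GLS}$ is a weighted average of the between-group estimator $\hat{\beta}_{BE}$ and the within-group (LSDV) estimator $\hat{\beta}_{LSDV}$.

$$\hat{\beta}_{GLS} = \Delta \hat{\beta}_{BE} + (I_K - \Delta) \hat{\beta}_{LSDV}$$

where $\Delta$ denotes a weight matrix defined by:

$$\Delta = \psi T \left[ \sum_{i=1}^{n} X_i' Q X_i + \psi T \sum_{i=1}^{n} (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})' \right]^{-1} \sum_{i=1}^{n} (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})'$$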
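The weighted-average result can be verified on simulated data by computing GLS directly from $V^{-1}$ and comparing it with $\Delta \hat{\beta}_{BE} + (I_K - \Delta) \hat{\beta}_{LSDV}$. This is a sketch under assumptions: the variance components are treated as known, and all names and parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n, T, K = 8, 6, 2
s2a, s2v = 1.5, 0.8                               # assumed known variances
x = rng.normal(size=(n, T, K)) + rng.normal(size=(n, 1, K))
y = (1.0 + x @ np.array([0.5, -1.0])
     + np.sqrt(s2a) * rng.normal(size=(n, 1))     # individual effects
     + np.sqrt(s2v) * rng.normal(size=(n, T)))    # idiosyncratic errors
psi = s2v / (s2v + T * s2a)

# Within (LSDV) moments: sum_i X_i'QX_i and sum_i X_i'Qy_i.
xw = (x - x.mean(axis=1, keepdims=True)).reshape(n * T, K)
yw = (y - y.mean(axis=1, keepdims=True)).reshape(n * T)
W_xx, W_xy = xw.T @ xw, xw.T @ yw
# Between moments built from individual means.
dx = x.mean(axis=1) - x.mean(axis=(0, 1))
dy = y.mean(axis=1) - y.mean()
B_xx, B_xy = dx.T @ dx, dx.T @ dy

beta_lsdv = np.linalg.solve(W_xx, W_xy)
beta_be = np.linalg.solve(B_xx, B_xy)
Delta = psi * T * np.linalg.solve(W_xx + psi * T * B_xx, B_xx)
beta_avg = Delta @ beta_be + (np.eye(K) - Delta) @ beta_lsdv

# Direct GLS on (e : X_i) with V = s2a * ee' + s2v * I_T.
e = np.ones((T, 1))
V_inv = np.linalg.inv(s2a * e @ e.T + s2v * np.eye(T))
Xt = [np.hstack([e, x[i]]) for i in range(n)]
A = sum(Xi.T @ V_inv @ Xi for Xi in Xt)
b = sum(Xi.T @ V_inv @ y[i] for i, Xi in enumerate(Xt))
gamma_gls = np.linalg.solve(A, b)

print(gamma_gls[1:], beta_avg)
```

The slope part of the direct GLS estimate and the weighted average agree, which is an algebraic identity for a balanced panel and does not depend on the simulated values.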

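GLS estimator properties

1 If $\psi \to 0$, the GLS estimator converges to the LSDV estimator.

$$\hat{\beta}_{GLS} \xrightarrow[\psi \to 0]{p} \hat{\beta}_{LSDV}$$

2 If $\psi \to 1$, then GLS converges to the OLS pooled estimator.

$$\hat{\beta}_{GLS} \xrightarrow[\psi \to 1]{p} \hat{\beta}_{pooled}$$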

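Proof: case 1

$$\hat{\beta}_{GLS} = \Delta \hat{\beta}_{BE} + (I_K - \Delta) \hat{\beta}_{LSDV}$$

$$\Delta = \psi T \left[ \sum_{i=1}^{n} X_i' Q X_i + \psi T \sum_{i=1}^{n} (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})' \right]^{-1} \sum_{i=1}^{n} (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})'$$

Consider the case $\psi = 0$; then

$$\Delta = 0, \qquad \hat{\beta}_{GLS} = \hat{\beta}_{LSDV}$$

So, if $\psi \to 0$, the GLS estimator converges to the LSDV estimator.

$$\hat{\beta}_{GLS} \xrightarrow[\psi \to 0]{p} \hat{\beta}_{LSDV}$$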

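Proof: case 2

$$\hat{\beta}_{GLS} = \Delta \hat{\beta}_{BE} + (I_K - \Delta) \hat{\beta}_{LSDV}$$

Consider the case $\psi = 1$; we have

$$\Delta = T \left[ \sum_{i=1}^{n} X_i' Q X_i + T \sum_{i=1}^{n} (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})' \right]^{-1} \sum_{i=1}^{n} (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})'$$

$$\hat{\beta}_{BE} = \left( \sum_{i=1}^{n} (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})' \right)^{-1} \sum_{i=1}^{n} (\bar{x}_i - \bar{x})(\bar{y}_i - \bar{y})$$

$$\hat{\beta}_{LSDV} = \left( \sum_{i=1}^{n} \sum_{t=1}^{T} (x_{it} - \bar{x}_i)(x_{it} - \bar{x}_i)' \right)^{-1} \sum_{i=1}^{n} \sum_{t=1}^{T} (x_{it} - \bar{x}_i)(y_{it} - \bar{y}_i)$$

Substituting, and using $\sum_{i} X_i' Q X_i = \sum_{i} \sum_{t} (x_{it} - \bar{x}_i)(x_{it} - \bar{x}_i)'$, we get

$$\hat{\beta}_{GLS} = T \left[ \sum_{i=1}^{n} X_i' Q X_i + T \sum_{i=1}^{n} (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})' \right]^{-1} \sum_{i=1}^{n} (\bar{x}_i - \bar{x})(\bar{y}_i - \bar{y})$$

$$+ \left[ \sum_{i=1}^{n} X_i' Q X_i + T \sum_{i=1}^{n} (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})' \right]^{-1} \sum_{i=1}^{n} \sum_{t=1}^{T} (x_{it} - \bar{x}_i)(y_{it} - \bar{y}_i)$$

The total variation is the sum of the within and between variations, so, if $\psi = 1$, we have

$$\hat{\beta}_{GLS} = \left( \sum_{i=1}^{n} \sum_{t=1}^{T} (x_{it} - \bar{x})(x_{it} - \bar{x})' \right)^{-1} \left( \sum_{i=1}^{n} \sum_{t=1}^{T} (x_{it} - \bar{x})(y_{it} - \bar{y}) \right) = \hat{\beta}_{pooled}$$

So, if $\psi \to 1$, the GLS estimator converges to the OLS pooled estimator.

$$\hat{\beta}_{GLS} \xrightarrow[\psi \to 1]{p} \hat{\beta}_{pooled}$$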

The parameter $\psi = \sigma_v^2 \left( \sigma_v^2 + T \sigma_\alpha^2 \right)^{-1}$ measures the weight given to the between-group variation.

In the LSDV (or fixed-effects model) procedure, this source of variation is completely ignored ($\psi = 0$).

The OLS procedure (pooled model) corresponds to $\psi = 1$. The between-group and within-group variations are just added up.

Fact

The procedure of treating the $\alpha_i$ as random coefficients provides an intermediate solution between treating them all as different (fixed effects, LSDV) and treating them all as equal (pooled model).

Given the definition of $\psi$, we have:

$$\lim_{T \to \infty} \psi = \lim_{T \to \infty} \left( \frac{\sigma_v^2}{\sigma_v^2 + T \sigma_\alpha^2} \right) = 0$$

Theorem (GLS and LSDV)

When $T$ tends to infinity, the GLS estimator converges to the LSDV estimator:

$$\hat{\beta}_{GLS} \xrightarrow[T \to \infty]{} \hat{\beta}_{LSDV}$$

Interpretation

When $T \to \infty$, we have an infinite number of observations for each $i$.

Therefore, we can consider each $\alpha_i$ as a random variable which has been drawn once and forever.

For each $i$, we can then treat them just like fixed parameters.

Definition (Transformation matrix)

Computation of the GLS estimator can be simplified by introducing a transformation matrix $P$ such that

$$P = I_T - \left( 1 - \psi^{1/2} \right) \frac{1}{T} e e'$$

We have

$$V^{-1} = \frac{1}{\sigma_v^2} P' P$$

Premultiplying the model by the transformation matrix $P$, we obtain the GLS estimator by applying the least-squares method to the transformed model (Theil (1971, Chapter 6)).

Transformation matrix

The GLS estimator is equivalent to

1 Transforming the data by subtracting a fraction $\left( 1 - \psi^{1/2} \right)$ of the individual means $\bar{y}_i$ and $\bar{x}_i$ from their corresponding $y_{it}$ and $x_{it}$

2 Regressing $y_{it} - \left( 1 - \psi^{1/2} \right) \bar{y}_i$ on a constant and $x_{it} - \left( 1 - \psi^{1/2} \right) \bar{x}_i$ using simple OLS.
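The two steps above can be sketched as follows and compared with GLS computed directly from $V^{-1}$. This is an illustration on simulated data with variance components assumed known; the names (`theta`, `y_star`, `x_star`) and values are hypothetical. Note that applying $P$ to the constant column gives $\psi^{1/2}$ rather than 1, which only rescales the intercept; the code uses the transformed constant so that both coefficient vectors match exactly.

```python
import numpy as np

rng = np.random.default_rng(4)
n, T, K = 8, 6, 2
s2a, s2v = 1.5, 0.8                               # assumed known variances
x = rng.normal(size=(n, T, K)) + rng.normal(size=(n, 1, K))
y = (1.0 + x @ np.array([0.5, -1.0])
     + np.sqrt(s2a) * rng.normal(size=(n, 1))
     + np.sqrt(s2v) * rng.normal(size=(n, T)))
psi = s2v / (s2v + T * s2a)
theta = 1.0 - np.sqrt(psi)                        # fraction of the mean removed

# Step 1: quasi-demeaning of y and x.
y_star = (y - theta * y.mean(axis=1, keepdims=True)).reshape(n * T)
x_star = (x - theta * x.mean(axis=1, keepdims=True)).reshape(n * T, K)
# Step 2: OLS on the transformed data (transformed constant = sqrt(psi)).
Z = np.hstack([np.full((n * T, 1), np.sqrt(psi)), x_star])
coef, *_ = np.linalg.lstsq(Z, y_star, rcond=None)

# Direct GLS on (e : X_i) for comparison.
e = np.ones((T, 1))
V_inv = np.linalg.inv(s2a * e @ e.T + s2v * np.eye(T))
Xt = [np.hstack([e, x[i]]) for i in range(n)]
A = sum(Xi.T @ V_inv @ Xi for Xi in Xt)
b = sum(Xi.T @ V_inv @ y[i] for i, Xi in enumerate(Xt))
gamma_gls = np.linalg.solve(A, b)

print(coef, gamma_gls)
```

With $\psi = 0$ the transformation reduces to full demeaning (LSDV), and with $\psi = 1$ the data are left untouched (pooled OLS).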

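Definition (Asymptotic variance covariance matrix)

Under assumptions H2, the asymptotic variance-covariance matrix of the GLS estimator is given by:

$$V\left( \hat{\beta}_{GLS} \right) = \sigma_v^2 \left[ \sum_{i=1}^{n} X_i' Q X_i + \psi T \sum_{i=1}^{n} (\bar{x}_i - \bar{x})(\bar{x}_i - \bar{x})' \right]^{-1}$$

Remark

Compare this with the variance of the LSDV estimator, written with the idiosyncratic variance $\sigma_v^2$ of the error-component model:

$$V\left( \hat{\beta}_{LSDV} \right) = \sigma_v^2 \left( \sum_{i=1}^{n} X_i' Q X_i \right)^{-1}$$

As $\psi > 0$, the difference between the covariance matrices of $\hat{\beta}_{LSDV}$ and $\hat{\beta}_{GLS}$ is a positive semidefinite matrix. For $K = 1$, we have:

$$V\left( \hat{\beta}_{GLS} \right) \leq V\left( \hat{\beta}_{LSDV} \right)$$

so the LSDV is not BLUE.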

Definition (feasible GLS)

If the variance components $\sigma_v^2$ and $\sigma_\alpha^2$ are unknown, we can use a two-step GLS estimation procedure, called feasible GLS.

1 In the first step, we estimate the variance components using some consistent estimators.

2 In the second step, we substitute their estimated values into

$$\hat{\gamma}_{GLS} = \left( \sum_{i=1}^{n} \tilde{X}_i' \hat{V}^{-1} \tilde{X}_i \right)^{-1} \left( \sum_{i=1}^{n} \tilde{X}_i' \hat{V}^{-1} y_i \right)$$

or its equivalent form.

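Two-step GLS estimator

Using the between-group equation $\bar y_i = \beta'\bar x_i + \alpha_i + \bar v_i$ and the within-group equation $(y_{it} - \bar y_i) = \beta'(x_{it} - \bar x_i) + (v_{it} - \bar v_i)$, we can use the within- and between-group residuals to estimate $\sigma_v^2$ and $\sigma_\alpha^2$ by

$$\hat\sigma_v^2 = \frac{\sum_{i=1}^n\sum_{t=1}^T\left[(y_{it} - \bar y_i) - \hat\beta_{LSDV}'(x_{it} - \bar x_i)\right]^2}{n(T-1) - K}$$

$$\hat\sigma_\alpha^2 = \frac{\sum_{i=1}^n\left(\bar y_i - \hat\beta_{LSDV}'\bar x_i\right)^2}{n - K - 1} - \frac{1}{T}\hat\sigma_v^2$$

The term $\hat\sigma_v^2/T$ is subtracted because the between-group residual has variance $\sigma_\alpha^2 + \sigma_v^2/T$.

Then, we have an estimate of $\psi$ and $V^{-1}$:

$$\hat\psi = \frac{\hat\sigma_v^2}{\hat\sigma_v^2 + T\hat\sigma_\alpha^2}, \qquad \hat V^{-1} = \frac{1}{\hat\sigma_v^2}\left[Q + \hat\psi\,\frac{1}{T}\, ee'\right]$$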
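The two steps can be sketched in code for a balanced panel. This is a minimal illustration, not a reference implementation: the function name `feasible_gls`, the array layout, and the simulated data are assumptions of ours. Two further choices are also ours: the between residuals are centred to allow for a common intercept, and $\hat\sigma_\alpha^2$ is truncated at zero.

```python
import numpy as np

def feasible_gls(y, X):
    """Two-step GLS for a balanced one-way error-component model.

    y : (n, T) array, X : (n, T, K) array. Returns (slope estimates, psi_hat).
    """
    n, T, K = X.shape
    y_bar = y.mean(axis=1)                      # individual means, shape (n,)
    X_bar = X.mean(axis=1)                      # individual means, shape (n, K)

    # Step 1a: within (LSDV) estimator on demeaned data
    yw = (y - y_bar[:, None]).reshape(n * T)
    Xw = (X - X_bar[:, None, :]).reshape(n * T, K)
    beta_lsdv = np.linalg.lstsq(Xw, yw, rcond=None)[0]

    # Step 1b: variance components from within and between residuals
    sigma2_v = np.sum((yw - Xw @ beta_lsdv) ** 2) / (n * (T - 1) - K)
    between_res = y_bar - X_bar @ beta_lsdv
    between_res = between_res - between_res.mean()   # assumption: absorbs a common intercept
    # Var(alpha_i + v_bar_i) = sigma_alpha^2 + sigma_v^2 / T, hence the division by T
    sigma2_alpha = np.sum(between_res ** 2) / (n - K - 1) - sigma2_v / T
    sigma2_alpha = max(sigma2_alpha, 0.0)            # assumption: truncate at zero

    # Step 2: quasi-demean with theta = 1 - psi^{1/2}, then OLS with a constant
    psi = sigma2_v / (sigma2_v + T * sigma2_alpha)
    theta = 1.0 - np.sqrt(psi)
    ys = (y - theta * y_bar[:, None]).reshape(n * T)
    Xs = (X - theta * X_bar[:, None, :]).reshape(n * T, K)
    Z = np.column_stack([np.ones(n * T), Xs])
    coef = np.linalg.lstsq(Z, ys, rcond=None)[0]
    return coef[1:], psi

# Simulated data (hypothetical): true slopes (1.5, -0.5), sigma_v^2 = sigma_alpha^2 = 1
rng = np.random.default_rng(0)
n, T, K = 200, 5, 2
X = rng.normal(size=(n, T, K))
alpha = rng.normal(size=n)
y = 2.0 + X @ np.array([1.5, -0.5]) + alpha[:, None] + rng.normal(size=(n, T))
beta_fgls, psi_hat = feasible_gls(y, X)
print(beta_fgls, psi_hat)
```

With these simulated variance components the population value of $\psi$ is $1/(1 + 5) \approx 0.167$, so `psi_hat` should be of that order.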

Lemma
When the sample size is large (in the sense of either $n \to \infty$ or $T \to \infty$), the two-step GLS estimator has the same asymptotic efficiency as the GLS procedure with known variance components.

Lemma
Even for moderate sample sizes (for $T \ge 3$, $n - (K+1) \ge 9$; for $T = 2$, $n - (K+1) \ge 10$), the two-step procedure is still more efficient than the LSDV estimator, in the sense that the difference between the covariance matrices of the covariance (LSDV) estimator and the two-step estimator is nonnegative definite.

Example
Let us consider a simple panel regression model for the total number of strike days in OECD countries. We have a balanced panel data set for 17 countries ($n = 17$) and annual data from 1951 to 1985 ($T = 35$).

$$s_{it} = \alpha_i + \beta u_{it} + \gamma p_{it} + \varepsilon_{it}$$

Figure: Random effects method

Here we have

$$\hat\sigma_v^2 = 0.25514 \times 10^{6} = 255{,}140, \qquad \hat\sigma_\alpha^2 = 55{,}401$$

$$\hat\psi = \frac{\hat\sigma_v^2}{\hat\sigma_v^2 + T\hat\sigma_\alpha^2} = \frac{255{,}140}{255{,}140 + 35 \times 55{,}401} = 0.1163$$
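The arithmetic of this example can be reproduced in a few lines. The variance estimates are the ones reported above. The variable names are ours, and `theta_hat` (the share of the individual mean removed by the GLS transformation, $1 - \hat\psi^{1/2}$) is an extra quantity computed here for illustration.

```python
import math

sigma2_v = 255_140.0      # estimated idiosyncratic variance (from the example)
sigma2_alpha = 55_401.0   # estimated variance of the individual effects
T = 35                    # number of years per country

psi_hat = sigma2_v / (sigma2_v + T * sigma2_alpha)
theta_hat = 1.0 - math.sqrt(psi_hat)   # share of the individual mean removed by GLS

print(round(psi_hat, 4))    # 0.1163
print(round(theta_hat, 3))  # 0.659
```

A value of `theta_hat` around 0.66 means that GLS removes about two thirds of each country mean, so the random effects estimates sit between the pooled and the within estimates.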

1 Error-component model.

2 GLS, Between and pooled estimators.

3 Write the GLS estimator as a weighted average.

4 Feasible GLS estimator.

5 Properties of the GLS estimator.

6 Asymptotic variance-covariance matrix of the GLS estimator.

Section 5

Specification tests: Fixed or Random effects?

5. Specification tests

Objectives

1 Define Mundlak's specification.

2 Discuss the independence assumption between random effects and explanatory variables.

3 Show that the GLS estimator may not be consistent when T is fixed.

4 Show that the GLS estimator is always consistent when T tends to infinity.

5 Introduce Hausman's lemma.

6 Define Hausman's specification test.

Fact (large T sample)
Whether to treat the effects as fixed or random makes no difference when T is large, because the LSDV estimator and the generalized least-squares estimator become the same estimator:

$$\hat\beta_{GLS} \xrightarrow[T \to \infty]{} \hat\beta_{LSDV}$$

Fixed or random effects

When T is finite and n is large, whether to treat the effects as fixed or random is not an easy question to answer.

It can make a surprising amount of difference in the estimates of the parameters.

Example (Hausman, 1978)

Hausman (1978) estimates a wage equation using a sample of 629 high school graduates followed over six years by the Michigan income dynamics study. The explanatory variables include a piecewise-linear representation of age, the presence of unemployment or poor health in the previous year, and dummy variables for self-employment, living in the South, or living in a rural area.

In the random-effects framework, there are two fundamental assumptions.

1 The unobserved individual effects $\alpha_i$ are random draws from a common population.

2 The explanatory variables are strictly exogenous: this implies that all the components of the error term are orthogonal to the regressors:

$$E(\varepsilon_{it} \mid x_{i1}, .., x_{iK}) = E(\alpha_i \mid x_{i1}, .., x_{iK}) = E(v_{it} \mid x_{i1}, .., x_{iK}) = 0$$

What happens when this condition is violated?

$$E(\alpha_i \mid x_{i1}, .., x_{iK}) \neq 0 \quad \text{or} \quad E\left(\alpha_i x_{it}'\right) \neq 0$$

1 Mundlak's specification (1978)

2 Hausman's specification test

Subsection 5.1

Mundlak's Specification

5.1. Mundlak's specification

Mundlak's specification

Mundlak (1978) criticized the random-effects formulation on the grounds that it neglects the correlation that may exist between the effects $\alpha_i$ and the explanatory variables $x_{it}$.

There are reasons to believe that in many circumstances $\alpha_i$ and $x_{it}$ are indeed correlated.

Mundlak Y. (1978), "On the Pooling of Time Series and Cross Section Data", Econometrica, 46, 69-85.

The properties of the various estimators we have discussed thus far depend on the existence and extent of the relations between the X's and the effects $\alpha_i$.

Therefore, we have to consider the joint distribution of these variables. However, the $\alpha_i$ are unobservable.

Mundlak (1978) suggests approximating $E(\alpha_i \mid X_i)$ by a linear function.

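Definition (Mundlak's specification)
Let us assume that the individual effects satisfy:

$$\alpha_i = \underbrace{\bar x_i' a}_{\text{component proportional to } \bar x_i} + \underbrace{\alpha_i^*}_{\text{component orthogonal to } \bar x_i}$$

with $a \in \mathbb{R}^K$, $\bar x_i = T^{-1}\sum_{t=1}^T x_{it}$ the $K \times 1$ vector of individual means of the explanatory variables and

$$E\left(\alpha_i^* \bar x_i\right) = 0$$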

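Definition (Mundlak's specification)
With Mundlak's specification, the unobserved effects model becomes:

$$y_{it} = \mu + \beta' x_{it} + \bar x_i' a + \varepsilon_{it}$$

$$\varepsilon_{it} = \alpha_i^* + v_{it}$$

Assumption H3: The error terms $\varepsilon_{it} = \alpha_i^* + v_{it}$ are i.i.d. $\forall (i,t)$ with:

$$E(\alpha_i^*) = E(v_{it}) = 0$$

$$E(\alpha_i^* v_{it}) = 0$$

$$E\left(\alpha_i^* \alpha_j^*\right) = \begin{cases} \sigma^2_{\alpha^*} & i = j \\ 0 & \forall i \neq j \end{cases}$$

$$E\left(v_{it} v_{js}\right) = \begin{cases} \sigma^2_v & t = s,\ i = j \\ 0 & \forall t \neq s,\ \forall i \neq j \end{cases}$$

$$E\left(v_{it} x_{it}'\right) = E\left(\alpha_i^* x_{it}'\right) = 0$$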

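The model can be rewritten as follows:

$$\underset{(T,1)}{y_i} = \underset{(T,2K+1)}{\tilde X_i^*}\ \underset{(2K+1,1)}{\gamma} + \underset{(T,1)}{\varepsilon_i} \qquad \forall i = 1, .., n$$

with

$$\varepsilon_i = \alpha_i^* e + v_i, \qquad \tilde X_i^* = \left(e\bar x_i' : e : X_i\right), \qquad \gamma' = \left(a' : \mu : \beta'\right)$$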

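The variance-covariance matrix of the error term is defined as:

$$E\left(\varepsilon_i \varepsilon_j'\right) = E\left[\left(\alpha_i^* e + v_i\right)\left(\alpha_j^* e + v_j\right)'\right] = \begin{cases} \sigma^2_{\alpha^*}\, ee' + \sigma^2_v I_T = V^* & i = j \\ 0 & i \neq j \end{cases}$$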

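GLS estimator

Utilizing the expression for the inverse of a partitioned matrix, we obtain the GLS estimators of $\mu$, $\beta$, and $a$ as:

$$\hat\mu^*_{GLS} = \bar y - \bar x'\hat\beta_{BE}, \qquad \hat\beta^*_{GLS} = \hat\beta_{LSDV}, \qquad \hat a^*_{GLS} = \hat\beta_{BE} - \hat\beta_{LSDV}$$

In Mundlak's augmented model, the GLS estimator of $\beta$ therefore coincides with the LSDV estimator and does not depend on the variance components.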

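Between estimator

The between estimator $\hat\beta_{BE}$ corresponds to the OLS estimator obtained in the model:

$$\bar y_i = c + (\beta + a)'\bar x_i + \bar\varepsilon_i = c + \theta'\bar x_i + \bar\varepsilon_i \qquad \forall i = 1, .., n$$

$$\hat\theta_{BE} = \left(\sum_{i=1}^n (\bar x_i - \bar x)(\bar x_i - \bar x)'\right)^{-1}\sum_{i=1}^n (\bar x_i - \bar x)(\bar y_i - \bar y)$$

Under H3, the GLS estimator $\hat\beta^*_{GLS}$ is consistent (cf. section 4):

$$\hat\beta^*_{GLS} \xrightarrow[nT \to \infty]{} \beta$$

Besides, we have

$$\hat\beta^*_{GLS} \xrightarrow[T \to \infty]{} \hat\beta_{LSDV}$$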
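A known property of Mundlak's augmented regression in a balanced panel is that the GLS coefficient on $x_{it}$ equals the LSDV estimator, and the coefficient on $\bar x_i$ equals $\hat\beta_{BE} - \hat\beta_{LSDV}$, whatever value of $\psi$ is used in the transformation. The sketch below checks this numerically for $K = 1$. The simulated data, the parameter values, and the helper names `ols` and `mundlak_gls` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 300, 4
x = rng.normal(size=(n, T)) + rng.normal(size=(n, 1))   # regressor with unit-level variation
x_bar = x.mean(axis=1, keepdims=True)                   # individual means, shape (n, 1)
alpha = 1.0 * x_bar + rng.normal(size=(n, 1))           # Mundlak-type effect with a = 1
y = 0.5 + 2.0 * x + alpha + rng.normal(size=(n, T))     # mu = 0.5, beta = 2
y_bar = y.mean(axis=1, keepdims=True)

def ols(Z, w):
    """Least-squares coefficients of w on the columns of Z."""
    return np.linalg.lstsq(Z, w, rcond=None)[0]

# Within (LSDV) and between estimators
b_lsdv = ols((x - x_bar).reshape(-1, 1), (y - y_bar).ravel())[0]
b_be = ols(np.column_stack([np.ones(n), x_bar.ravel()]), y_bar.ravel())[1]

def mundlak_gls(psi):
    """GLS on y_it = mu + beta x_it + a x_bar_i + eps_it for a given psi."""
    theta = 1.0 - np.sqrt(psi)
    ones = np.full(n * T, 1.0 - theta)                   # transformed constant
    xs = (x - theta * x_bar).ravel()                     # transformed x_it
    xbs = np.repeat((1.0 - theta) * x_bar.ravel(), T)    # transformed x_bar_i
    ys = (y - theta * y_bar).ravel()
    return ols(np.column_stack([ones, xs, xbs]), ys)     # (mu, beta, a)

mu_hat, beta_hat, a_hat = mundlak_gls(psi=0.3)
print(beta_hat, b_lsdv)          # equal up to rounding error
print(a_hat, b_be - b_lsdv)      # equal up to rounding error
```

Setting `psi=1.0` gives pooled OLS on the augmented model, which returns the same two coefficients. This is why adding $\bar x_i$ to the regression is a convenient way to obtain within estimates.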

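Question: What is the consequence of neglecting the dependence between $\alpha_i$ and $x_{it}$, and of wrongly considering the following model?

$$\varepsilon_{it} = \alpha_i + v_{it}$$

$$\alpha_i = \bar x_i' a + \alpha_i^*$$

Let us assume that the DGP corresponds to Mundlak's model

$$\alpha_i = \bar x_i' a + \alpha_i^*$$

and we apply GLS to the initial model:

$$\varepsilon_{it} = \alpha_i + v_{it}$$

In general, we have:

$$\hat\beta_{GLS} = \Delta\hat\beta_{BE} + (I_K - \Delta)\hat\beta_{LSDV}$$

It is easy to show that:

$$\hat\beta_{BE} \xrightarrow[n \to \infty]{p} \beta + a, \qquad \hat\beta_{LSDV} \xrightarrow[n \to \infty]{p} \beta$$

Let us assume that

$$\Delta \xrightarrow[n \to \infty]{p} \bar\Delta$$

where

$$\Delta = \psi T\left[\sum_{i=1}^n X_i' Q X_i + \psi T\sum_{i=1}^n (\bar x_i - \bar x)(\bar x_i - \bar x)'\right]^{-1}\sum_{i=1}^n (\bar x_i - \bar x)(\bar x_i - \bar x)'$$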

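When T is fixed and n tends to infinity, the GLS estimator is not consistent if there is a correlation between the individual effects and the explanatory variables:

$$\begin{aligned} \operatorname*{plim}_{n \to \infty}\hat\beta_{GLS} &= \bar\Delta \times \operatorname*{plim}_{n \to \infty}\hat\beta_{BE} + \left(I_K - \bar\Delta\right) \times \operatorname*{plim}_{n \to \infty}\hat\beta_{LSDV} \\ &= \bar\Delta \times (\beta + a) + \left(I_K - \bar\Delta\right) \times \beta \qquad (2) \\ &= \beta + \bar\Delta a \end{aligned}$$

with $\bar\Delta = \operatorname*{plim}_{n \to \infty}\Delta$.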

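Theorem (GLS bias)

If $\alpha_i = \bar x_i' a + \alpha_i^*$ with $a \neq 0$, the GLS estimator is not consistent when T is fixed and n tends to infinity:

$$\hat\beta_{GLS} \xrightarrow[n \to \infty]{p} \beta + \bar\Delta a$$

As usual, the GLS estimator is consistent as T tends to infinity:

$$\hat\beta_{GLS} \xrightarrow[T \to \infty]{p} \beta$$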
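The weighted-average form of the GLS estimator, and the direction of the resulting bias, can be illustrated by simulation for $K = 1$. The sketch below rests on stated assumptions: the data-generating process is hypothetical ($\beta = 1$, $a = 1$, unit variances), and $\psi$ is treated as known. It computes GLS by quasi-demeaning and compares it with $\Delta\hat\beta_{BE} + (1 - \Delta)\hat\beta_{LSDV}$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, T, beta, a = 500, 4, 1.0, 1.0
x = rng.normal(size=(n, T))
x_bar = x.mean(axis=1, keepdims=True)
alpha = a * x_bar + rng.normal(size=(n, 1))     # effects correlated with x (a != 0)
y = beta * x + alpha + rng.normal(size=(n, T))
y_bar = y.mean(axis=1, keepdims=True)

# Within and between sums of squares and cross-products (K = 1)
W_xx = np.sum((x - x_bar) ** 2)
W_xy = np.sum((x - x_bar) * (y - y_bar))
B_xx = np.sum((x_bar - x.mean()) ** 2)
B_xy = np.sum((x_bar - x.mean()) * (y_bar - y.mean()))

b_lsdv = W_xy / W_xx
b_be = B_xy / B_xx

psi = 1.0 / (1.0 + T * 1.0)   # sigma_v^2 = sigma_alpha^2 = 1 treated as known (assumption)
delta = psi * T * B_xx / (W_xx + psi * T * B_xx)

# GLS computed directly: quasi-demean, then OLS with a constant
theta = 1.0 - np.sqrt(psi)
xs = (x - theta * x_bar).ravel()
ys = (y - theta * y_bar).ravel()
Z = np.column_stack([np.ones(n * T), xs])
b_gls = np.linalg.lstsq(Z, ys, rcond=None)[0][1]

b_weighted = delta * b_be + (1.0 - delta) * b_lsdv
print(b_lsdv, b_be, b_gls, b_weighted)
```

In this design the between estimator is centred on $\beta + a = 2$ and the within estimator on $\beta = 1$, so GLS is pulled above the true slope by roughly $\Delta a$.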

Summary

                        E(αi | xi1,..,xiK) = 0       E(αi | xi1,..,xiK) ≠ 0
                        LSDV         GLS             LSDV         GLS

T fixed, n → ∞          Consistent   -               Consistent   Not consistent

T → ∞ and n → ∞         Consistent   BLUE            Consistent   Consistent

Subsection 5.2

Hausman's Specification Test

5.2. Hausman's specification test

Hausman (1978) proposes a general specification test that can be applied, in the specific context of linear panel models, to the issue of the specification of individual effects (fixed or random).

Hausman J.A. (1978), "Specification Tests in Econometrics", Econometrica, 46, 1251-1271

General idea of Hausman's lemma

Let us consider a general model

$$y = f(x; \beta) + \varepsilon$$

and a particular hypothesis $H_0$ on the parameters, error terms, etc.

Let us consider two estimators of the K-vector $\beta$, denoted $\hat\beta_1$ and $\hat\beta_2$, both consistent under $H_0$ and asymptotically normally distributed.

Under $H_0$, the estimator $\hat\beta_1$ reaches the asymptotic Cramer-Rao bound.

Under $H_1$, the estimator $\hat\beta_1$ is biased and not consistent, whereas $\hat\beta_2$ remains consistent.

General idea of Hausman's lemma

By examining the distance between $\hat\beta_1$ and $\hat\beta_2$, it is possible to draw a conclusion about $H_0$:

1 If the distance is small, $H_0$ cannot be rejected.

2 If the distance is large, $H_0$ can be rejected.

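Distance measure

This distance is naturally defined as follows:

$$H = \left(\hat\beta_2 - \hat\beta_1\right)'\left[V\left(\hat\beta_2 - \hat\beta_1\right)\right]^{-1}\left(\hat\beta_2 - \hat\beta_1\right)$$

However, the issue is to compute the variance-covariance matrix $V(\hat\beta_2 - \hat\beta_1)$ of the difference between both estimators. Generally we know $V(\hat\beta_2)$ and $V(\hat\beta_1)$, but not $V(\hat\beta_2 - \hat\beta_1)$.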

Lemma (Hausman, 1978)

Based on a sample of n observations, consider two estimates $\hat\beta_1$ and $\hat\beta_2$ that are both consistent and asymptotically normally distributed, with $\hat\beta_1$ attaining the asymptotic Cramer-Rao bound so that $\sqrt{n}(\hat\beta_1 - \beta)$ is asymptotically normally distributed with variance-covariance matrix $V_1$. Suppose $\sqrt{n}(\hat\beta_2 - \beta)$ is asymptotically normally distributed, with mean zero and variance-covariance matrix $V_2$. Let $\hat q = \hat\beta_2 - \hat\beta_1$. Then the limiting distributions [under the null] of $\sqrt{n}(\hat\beta_1 - \beta)$ and $\sqrt{n}\,\hat q$ have zero covariance:

$$\operatorname{cov}\left(\hat\beta_1, \hat q\right) = 0_K$$

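Theorem
From this lemma, it follows that

$$V\left(\hat\beta_2 - \hat\beta_1\right) = V\left(\hat\beta_2\right) - V\left(\hat\beta_1\right)$$

Thus, Hausman suggests using the test statistic

$$H = \left(\hat\beta_2 - \hat\beta_1\right)'\left[V\left(\hat\beta_2\right) - V\left(\hat\beta_1\right)\right]^{-1}\left(\hat\beta_2 - \hat\beta_1\right)$$

or equivalently

$$H = \hat q'\left(V(\hat q)\right)^{-1}\hat q$$

Under the null hypothesis, the test statistic H has an asymptotic chi-square distribution with K degrees of freedom.

$$H \xrightarrow[n \to \infty]{H_0} \chi^2(K)$$

Under the alternative, it has a noncentral chi-square distribution with noncentrality parameter $\tilde q'\left(V(\hat q)\right)^{-1}\tilde q$, where $\tilde q$ is defined as follows:

$$\tilde q = \operatorname*{plim}_{H_1,\ n \to \infty}\left(\hat\beta_2 - \hat\beta_1\right)$$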

Specification test for fixed versus random effects

Let us apply Hausman's test to discriminate between fixed effects methods and random effects methods.

We assume that the $\alpha_i$ are random variables, and the key assumption tested is defined as:

$$H_0 : E(\alpha_i \mid X_i) = 0$$

$$H_1 : E(\alpha_i \mid X_i) \neq 0$$

Definition (Hausman's specification test)
The Hausman specification test is a test of the null of no dependence between the (random) individual effects and the explanatory variables.

$$H_0 : E(\alpha_i \mid X_i) = 0$$

$$H_1 : E(\alpha_i \mid X_i) \neq 0$$

Specification test for fixed versus random effects

Hausman's test can also be interpreted as a specification test between "fixed effect methods" and "random effect methods".

1 If the null is rejected, the correlation between the individual effects and the explanatory variables induces a bias in the GLS estimates. So, a standard LSDV approach (fixed effects method) has to be preferred.

2 If the null is not rejected, we can use a GLS estimator (random effect method) and specify the individual effects as random variables (random effects model).

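Hausman's specification test

How to implement this test? Let us consider the standard model with random effects ($\mu = 0$):

$$y_i = X_i\beta + e\alpha_i + v_i$$

1 Under $H_0$ (and assumptions H2) we know that $\hat\beta_{LSDV}$ and $\hat\beta_{GLS}$ are consistent and asymptotically normally distributed.

2 Under $H_0$, $\hat\beta_{GLS}$ is BLUE and attains the asymptotic Cramer-Rao bound.

3 Under $H_1$, $\hat\beta_{GLS}$ is not consistent.

According to Hausman's lemma, we have (for $K = 1$):

$$\operatorname{cov}\left(\hat\beta_{GLS}, \hat\beta_{LSDV} - \hat\beta_{GLS}\right) = 0 \iff \operatorname{cov}\left(\hat\beta_{LSDV}, \hat\beta_{GLS}\right) = V\left(\hat\beta_{GLS}\right)$$

Since

$$V\left(\hat\beta_{LSDV} - \hat\beta_{GLS}\right) = V\left(\hat\beta_{LSDV}\right) + V\left(\hat\beta_{GLS}\right) - 2\operatorname{cov}\left(\hat\beta_{LSDV}, \hat\beta_{GLS}\right)$$

we have:

$$V\left(\hat\beta_{LSDV} - \hat\beta_{GLS}\right) = V\left(\hat\beta_{LSDV}\right) - V\left(\hat\beta_{GLS}\right)$$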

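Definition (Hausman's specification test)
The Hausman specification test statistic for individual effects can be defined as follows:

$$H = \left(\hat\beta_{LSDV} - \hat\beta_{GLS}\right)'\left[V\left(\hat\beta_{LSDV}\right) - V\left(\hat\beta_{GLS}\right)\right]^{-1}\left(\hat\beta_{LSDV} - \hat\beta_{GLS}\right)$$

Under $H_0 : E(\alpha_i \mid X_i) = 0$, we have:

$$H \xrightarrow[nT \to \infty]{H_0} \chi^2(K)$$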
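Once the two sets of estimates and their covariance matrices are available, the statistic is a short quadratic form. The sketch below is illustrative only: the function name `hausman_statistic` is ours, and the inputs are made-up numbers for $K = 2$, not results from the strike data.

```python
import numpy as np

def hausman_statistic(b_lsdv, b_gls, V_lsdv, V_gls):
    """Return H = q' [V(b_lsdv) - V(b_gls)]^{-1} q with q = b_lsdv - b_gls."""
    q = np.asarray(b_lsdv, dtype=float) - np.asarray(b_gls, dtype=float)
    V_q = np.asarray(V_lsdv, dtype=float) - np.asarray(V_gls, dtype=float)
    # solve() avoids forming the explicit inverse of V_q
    return float(q @ np.linalg.solve(V_q, q))

# Made-up inputs for K = 2, purely for illustration
H = hausman_statistic(
    b_lsdv=[1.0, 2.0], b_gls=[0.8, 2.1],
    V_lsdv=np.diag([0.05, 0.04]), V_gls=np.diag([0.01, 0.02]),
)
CHI2_95_K2 = 5.991          # 5% critical value of a chi-square with 2 degrees of freedom
reject_h0 = H > CHI2_95_K2
print(H, reject_h0)
```

Here $H = 0.2^2/0.04 + 0.1^2/0.02 = 1.5$, below the 5% critical value, so the null of no correlation is not rejected. In finite samples the estimated difference of covariance matrices need not be positive definite, a case this sketch does not handle.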

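Remarks

$$H = \left(\hat\beta_{LSDV} - \hat\beta_{GLS}\right)'\left[V\left(\hat\beta_{LSDV}\right) - V\left(\hat\beta_{GLS}\right)\right]^{-1}\left(\hat\beta_{LSDV} - \hat\beta_{GLS}\right)$$

Let us assume that $K = 1$; then under the null $H_0 : E(\alpha_i \mid X_i) = 0$

$$V\left(\hat\beta_{LSDV}\right) - V\left(\hat\beta_{GLS}\right) > 0$$

since $\hat\beta_{GLS}$ is the BLUE.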

Remarks

1 When n is fixed and T tends to infinity, $\hat\beta_{LSDV}$ and $\hat\beta_{GLS}$ become identical. However, it was shown by Ahn and Moon (2001) that the numerator and denominator of H approach zero at the same speed. Therefore the ratio remains chi-square distributed. In this situation, however, the fixed-effects and random-effects models become indistinguishable for all practical purposes.

2 The more typical case in practice is that n is large relative to T, so that the differences between the two estimators or the two approaches matter.

Example
Let us consider a simple panel regression model for the total number of strike days in OECD countries. We have a balanced panel data set for 17 countries ($n = 17$) and annual data from 1951 to 1985 ($T = 35$).

1 Mundlak's specification.

2 Dependence between random effects and explanatory variables.

3 The GLS estimator may not be consistent (fixed T, n tends to infinity).

4 The GLS estimator is always consistent when T tends to infinity.

5 Hausman's lemma.

6 Hausman's specification test for fixed or random effects models.

Section 6

Heterogeneous Panel Data Models

6. Heterogeneous panel data models

Objectives

1 Define the heterogeneous panel data model.

2 Introduce the random coefficient model.

3 Introduce Swamy's model.

4 Define the GLS estimator.

There are cases in which changing economic structures or different socioeconomic and demographic background factors imply that the slope parameters $\beta$ vary over time and/or differ across cross-sectional units.

Heterogeneous panel data model

The most general form of a heterogeneous and time-varying coefficient model is:

$$y_{it} = \sum_{k=1}^K \beta_{kit}\, x_{kit} + v_{it} \qquad i = 1, .., n \text{ and } t = 1, .., T$$

In contrast to previous sections, we no longer treat the intercept differently from the other explanatory variables and let $x_{1it} = 1$.

Assumption H4: We assume that the parameters do not vary with time. Then, we have:

$$\beta_{kit} = \beta_{ki}$$

$$y_{it} = \sum_{k=1}^K \beta_{ki}\, x_{kit} + v_{it} \qquad i = 1, .., n \text{ and } t = 1, .., T$$

Panel or not panel?

This model is equivalent to postulating a separate regression for each cross-sectional unit

$$y_{it} = \beta_i' x_{it} + v_{it} \qquad i = 1, .., n$$

where $\beta_i = (\beta_{1i}, \beta_{2i}, ..., \beta_{Ki})'$ is a $K \times 1$ vector of parameters, and $x_{it} = (x_{1it}, ..., x_{Kit})'$ is a $K \times 1$ vector of exogenous variables.

Panel or not panel?

But some "links" between the individuals may require a panel regression model:

The error terms $v_{it}$ are correlated across cross-sectional units.

The slope parameters $\beta_i$ are considered as random variables with a common probability distribution, or at least common moments.

Definition (heterogeneous slope parameters)

The vectors of slope parameters $\beta_i$ are assumed to satisfy

$$\underset{(K,1)}{\beta_i} = \underset{(K,1)}{\beta} + \underset{(K,1)}{\xi_i}$$

for $i = 1, .., n$, where $\beta$ is a $K \times 1$ vector of constants, and $\xi_i$ denotes a $K \times 1$ vector of constants or random variables.

The heterogeneous coefficient model becomes

$$y_{it} = \sum_{k=1}^K (\beta_k + \xi_{ki})\, x_{kit} + v_{it} \qquad i = 1, .., n \text{ and } t = 1, .., T$$

$\beta = (\beta_1, \beta_2, ..., \beta_K)'$ denotes the common mean coefficient $K \times 1$ vector.

$\xi_i = (\xi_{1i}, \xi_{2i}, ..., \xi_{Ki})'$ is the vector of individual deviations from the common mean.

The error terms may be cross-correlated or not, i.e. $\operatorname{cov}(v_{jt}, v_{it}) \neq 0$ or $\operatorname{cov}(v_{jt}, v_{it}) = 0$.

For this type of model we are interested in

1 Estimating the mean coefficient vector $\beta$,

2 Predicting each individual component $\beta_i$,

3 Estimating the dispersion of the individual-parameter vectors $\beta_i$.

Heterogeneous panel models with fixed or random coefficients

1 If individual observations are heterogeneous, then the $\xi_i$ can be treated as fixed constants.

2 If, conditional on $x_{kit}$, individual units can be viewed as random draws from a common population, then the $\xi_i$ are generally treated as random variables having, for instance, zero means and constant variances and covariances.

$$E(\xi_i) = 0 \quad \text{and} \quad V(\xi_i) = \Delta$$

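Definition (Fixed-coefficient model)

When the $\beta_i$ are treated as fixed constants, we can stack the nT observations in the form of the Zellner (1962) seemingly unrelated regression (SURE) model

$$\begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} = \begin{pmatrix} X_1 & 0 & 0 \\ 0 & \ddots & \vdots \\ 0 & \cdots & X_n \end{pmatrix}\begin{pmatrix} \beta_1 \\ \vdots \\ \beta_n \end{pmatrix} + \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix}$$

where $y_i$ and $v_i$ are the $T \times 1$ vectors $(y_{i1}, ..., y_{iT})'$ and $(v_{i1}, ..., v_{iT})'$, and $X_i$ is the $T \times K$ matrix of the time-series observations of the ith individual's explanatory variables, with the tth row equal to $x_{it}'$.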

Heterogeneous panel models with fixed coefficients

1 If the covariances between different cross-sectional units are not zero, e.g. $E(v_i v_j') \neq 0$, the GLS estimator of $(\beta_1', ..., \beta_n')$ is more efficient than the single-equation estimator of $\beta_i$ for each cross-sectional unit. Panel data is useful.

2 If the $X_i$ are identical for all i, or $E(v_i v_i') = \sigma_i^2 I_T$ and $E(v_i v_j') = 0$ for $i \neq j$, the GLS estimator for $(\beta_1', ..., \beta_n')$ is the same as applying least squares separately to the time-series observations of each cross-sectional unit. Panel data is useless.

Definition (random coefficient model)
Alternatively, each regression coefficient can be viewed as a random variable with a common probability distribution:

$$\beta_i \overset{i.i.d.}{\sim} \text{Common Distribution}$$

or at least common moments:

$$E(\beta_i) = \beta \qquad V(\beta_i) = \Delta \qquad \forall i = 1, ..., n$$

Random coefficient model

1 The random-coefficient specification substantially reduces the number of parameters to be estimated, while still allowing the coefficients to differ from unit to unit and/or from time to time.

2 Depending on the type of assumption about the parameter variation, it can be further classified into one of two categories: stationary and nonstationary random-coefficient models.

3 For more details, see Hurwicz (1950), Klein (1953), Theil and Mennes (1959), or Zellner (1966).

Subsection 6.1

Random Coefficient Models

6.1. Random coefficient model

Definition (random coefficient model)

The vectors of slope parameters $\beta_i$ are randomly distributed with a common mean $E(\beta_i) = \beta$, and

$$y_{it} = \sum_{k=1}^K (\beta_k + \xi_{ki})\, x_{kit} + v_{it} \qquad i = 1, .., n$$

with $\beta = (\beta_1, \beta_2, ..., \beta_K)'$ and $\xi_i = (\xi_{1i}, \xi_{2i}, ..., \xi_{Ki})'$. Let us denote

$$\beta_{ki} = \beta_k + \xi_{ki}$$

Remarks

1 The vector $x_{it} = (x_{1it}, .., x_{Kit})'$ includes a constant term. The parameter $\beta_{ki}$ associated with this constant term corresponds to an individual (random) effect.

2 An alternative notation is:

$$y_{it} = \alpha + \sum_{k=2}^K (\beta_k + \xi_{ki})\, x_{kit} + \alpha_i + v_{it}$$

$$E(\alpha_i) = 0$$

Consider the set of assumptions used in the seminal paper of Swamy (1970).

Swamy P.A. (1970), "Efficient Inference in a Random Coefficient Regression Model", Econometrica, 38, 311-323

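Assumption H5 (Swamy's model): Let us assume that

$$E(\xi_i) = 0, \qquad E(v_i) = 0$$

$$E\left(\xi_i \xi_j'\right) = \begin{cases} \Delta & i = j \\ 0 & \forall i \neq j \end{cases}$$

$$E\left(x_{it}\,\xi_j'\right) = 0, \qquad E\left(\xi_i v_j'\right) = 0, \qquad \forall (i,j)$$

$$E\left(v_i v_j'\right) = \begin{cases} \sigma_i^2 I_T & i = j \\ 0 & \forall i \neq j \end{cases}$$

Remark: we assume that the error term $v_i$ is heteroskedastic across units:

$$E\left(v_i v_i'\right) = \sigma_i^2 I_T$$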

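Definition (moments of slope parameters)

The first two moments of the vector of random parameters $\beta_i = \beta + \xi_i$ are defined by, $\forall i = 1, .., n$:

$$\underset{(K,1)}{E(\beta_i)} = \underset{(K,1)}{\beta}$$

$$\underset{(K,K)}{V(\beta_i)} = E\left(\xi_i \xi_i'\right) = \underset{(K,K)}{\Delta}$$

Homogeneous moments

The variance covariance matrix $\Delta$ of the random parameters $\beta_i = (\beta_{1i}, \beta_{2i}, .., \beta_{Ki})'$ is assumed to be common to all cross section units:

$$\underset{(K,K)}{\Delta} = E\left[(\beta_i - \beta)(\beta_i - \beta)'\right] = \begin{pmatrix} \sigma^2_{\beta_1} & \sigma_{\beta_1,\beta_2} & \cdots & \sigma_{\beta_1,\beta_K} \\ \sigma_{\beta_2,\beta_1} & \sigma^2_{\beta_2} & \cdots & \sigma_{\beta_2,\beta_K} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{\beta_K,\beta_1} & \sigma_{\beta_K,\beta_2} & \cdots & \sigma^2_{\beta_K} \end{pmatrix}$$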

Vectorial form

For each cross section unit, we have:

$$y_i = X_i\beta + X_i\xi_i + v_i$$

$$\beta_i = \beta + \xi_i$$

where the matrix $X_i$ includes a constant term (whose mean coefficient is the average of the random individual effects, $\alpha$).

Definition (random coefficient model)
The random coefficient model can be rewritten as follows:

$$y_i = X_i\beta + \varepsilon_i$$

$$\varepsilon_i = X_i\xi_i + v_i = X_i(\beta_i - \beta) + v_i$$

Covariance matrix

For a given cross-sectional unit, the covariance matrix of the composite disturbance term $\varepsilon_i = X_i\xi_i + v_i$ is defined by:

$$\Phi_i = E\left(\varepsilon_i\varepsilon_i'\right) = E\left[(X_i\xi_i + v_i)(X_i\xi_i + v_i)'\right] = X_i E\left(\xi_i\xi_i'\right)X_i' + E\left(v_i v_i'\right) = X_i\Delta X_i' + \sigma_i^2 I_T$$

Definition
For a given cross-sectional unit, the covariance matrix of the composite disturbance term $\varepsilon_i = X_i\xi_i + v_i$ is defined by:

$$\Phi_i = X_i\Delta X_i' + \sigma_i^2 I_T$$

Stacking all nT observations, the covariance matrix of the composite disturbance term is block-diagonal and heteroskedastic.

Remarks

1 Under Swamy's assumptions, the simple regression of y on X will yield an unbiased and consistent estimator of $\beta$ if $(1/nT)X'X$ converges to a nonzero constant matrix.

2 But the estimator is inefficient, and the usual least-squares formula for computing the variance-covariance matrix of the estimator is incorrect, often leading to misleading statistical inferences.

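The best linear unbiased estimator of $\beta$ is the GLS estimator

$$\hat\beta_{GLS} = \left(\sum_{i=1}^n X_i'\Phi_i^{-1}X_i\right)^{-1}\left(\sum_{i=1}^n X_i'\Phi_i^{-1}y_i\right)$$

The GLS estimator $\hat\beta_{GLS}$ is a matrix-weighted average of the least-squares estimators $\hat\beta_i$ for each cross-sectional unit, with the weights inversely proportional to their covariance matrices:

$$\hat\beta_{GLS} = \sum_{i=1}^n \omega_i\hat\beta_i$$

$$\omega_i = \left\{\sum_{j=1}^n\left[\Delta + \sigma_j^2\left(X_j'X_j\right)^{-1}\right]^{-1}\right\}^{-1}\left[\Delta + \sigma_i^2\left(X_i'X_i\right)^{-1}\right]^{-1}$$

$$\hat\beta_i = \left(X_i'X_i\right)^{-1}X_i'y_i$$

The covariance matrix of the GLS estimator is:

$$V\left(\hat\beta_{GLS}\right) = \left(\sum_{i=1}^n X_i'\Phi_i^{-1}X_i\right)^{-1} = \left\{\sum_{i=1}^n\left[\Delta + \sigma_i^2\left(X_i'X_i\right)^{-1}\right]^{-1}\right\}^{-1}$$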

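Feasible GLS estimator

Swamy proposes to use the OLS estimators $\hat\beta_i = (X_i'X_i)^{-1}X_i'y_i$ and their residuals $\hat v_i = y_i - X_i\hat\beta_i$ to obtain unbiased estimators of $\sigma_i^2$ and $\Delta$

$$\hat\sigma_i^2 = \frac{1}{T-K}\, y_i'\left[I_T - X_i\left(X_i'X_i\right)^{-1}X_i'\right]y_i = \frac{1}{T-K}\sum_{t=1}^T \hat v_{it}^2$$

with

$$y_{it} = \beta_i' x_{it} + v_{it}$$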

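Feasible GLS estimator

For the $\Delta$ matrix, we have:

$$\underset{(K,K)}{\hat\Delta} = \frac{1}{n-1}\sum_{i=1}^n\left(\hat\beta_i - \frac{1}{n}\sum_{j=1}^n\hat\beta_j\right)\left(\hat\beta_i - \frac{1}{n}\sum_{j=1}^n\hat\beta_j\right)' - \frac{1}{n}\sum_{i=1}^n\hat\sigma_i^2\left(X_i'X_i\right)^{-1}$$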

Definition (estimator for $\Delta$)

However, the previous estimator $\hat\Delta$ is not necessarily nonnegative definite. In this situation, Swamy (1970) has suggested replacing this estimator by:

$$\underset{(K,K)}{\hat\Delta} = \frac{1}{n-1}\sum_{i=1}^n\left(\hat\beta_i - \frac{1}{n}\sum_{j=1}^n\hat\beta_j\right)\left(\hat\beta_i - \frac{1}{n}\sum_{j=1}^n\hat\beta_j\right)'$$

This estimator, although not unbiased, is nonnegative definite and is consistent when T tends to infinity.

Remarks

1 Swamy proved that substituting $\hat\sigma_i^2$ and $\hat\Delta$ for $\sigma_i^2$ and $\Delta$ yields an asymptotically normal and efficient estimator of $\beta$.

2 The speed of convergence of the GLS estimator is $n^{1/2}$.

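Summary: how to estimate a random coefficient model?

1 Run the n individual regressions $y_{it} = \beta_i' x_{it} + v_{it}$.

2 Compute $\hat\sigma_i^2$ and Swamy's estimator $\hat\Delta$ as follows

$$\underset{(K,K)}{\hat\Delta} = \frac{1}{n-1}\sum_{i=1}^n\left(\hat\beta_i - \frac{1}{n}\sum_{j=1}^n\hat\beta_j\right)\left(\hat\beta_i - \frac{1}{n}\sum_{j=1}^n\hat\beta_j\right)'$$

$$\hat\sigma_i^2 = \frac{1}{T-K}\sum_{t=1}^T\hat v_{it}^2$$

3 Compute the GLS estimate of the mean of the parameters $\beta_i$

$$\hat\beta_{GLS} = \left(\sum_{i=1}^n X_i'\hat\Phi_i^{-1}X_i\right)^{-1}\left(\sum_{i=1}^n X_i'\hat\Phi_i^{-1}y_i\right)$$

with $\hat\Phi_i = X_i\hat\Delta X_i' + \hat\sigma_i^2 I_T$.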
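The three steps above can be sketched as follows. This is an illustration under stated assumptions, not Swamy's own code: the function name `swamy_gls`, the list-based data layout, and the simulated data are ours. The sketch uses the nonnegative definite version of $\hat\Delta$ and computes $\hat\beta_{GLS}$ in its matrix-weighted-average form, which avoids inverting the $T \times T$ matrices $\hat\Phi_i$.

```python
import numpy as np

def swamy_gls(y_list, X_list):
    """Feasible GLS for the mean coefficient vector of a random coefficient model.

    y_list[i] is a (T,) array and X_list[i] a (T, K) array for unit i.
    Returns (beta_gls, delta_hat, estimated covariance of beta_gls).
    """
    n = len(y_list)
    K = X_list[0].shape[1]
    betas, sigma2, XtX_inv = [], [], []
    for y, X in zip(y_list, X_list):
        T = X.shape[0]
        A = np.linalg.inv(X.T @ X)
        b = A @ X.T @ y                       # step 1: unit-by-unit OLS
        resid = y - X @ b
        betas.append(b)
        sigma2.append(resid @ resid / (T - K))
        XtX_inv.append(A)
    betas = np.array(betas)                   # shape (n, K)

    dev = betas - betas.mean(axis=0)
    delta_hat = dev.T @ dev / (n - 1)         # step 2: nonnegative definite Delta_hat

    # Step 3: weights are inverses of Delta_hat + sigma_i^2 (X_i'X_i)^{-1}
    W = [np.linalg.inv(delta_hat + s2 * A) for s2, A in zip(sigma2, XtX_inv)]
    W_sum_inv = np.linalg.inv(sum(W))
    beta_gls = W_sum_inv @ sum(Wi @ b for Wi, b in zip(W, betas))
    return beta_gls, delta_hat, W_sum_inv

# Hypothetical simulated panel: n = 12 units, T = 20, mean coefficients (1, 2)
rng = np.random.default_rng(3)
n, T = 12, 20
X_list = [np.column_stack([np.ones(T), rng.normal(size=T)]) for _ in range(n)]
beta_i = np.array([1.0, 2.0]) + 0.5 * rng.normal(size=(n, 2))
y_list = [X @ b + rng.normal(size=T) for X, b in zip(X_list, beta_i)]
beta_gls, delta_hat, V_gls = swamy_gls(y_list, X_list)
print(beta_gls)
```

The weighted-average form relies on the identity $X_i'\Phi_i^{-1}X_i = [\Delta + \sigma_i^2(X_i'X_i)^{-1}]^{-1}$, so it returns the same estimate as the stacked GLS formula whenever each $X_i$ has full column rank.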

Example (Swamy, 1970)

Swamy (1970) used this model to reestimate the Grunfeld investment function with the annual data of 11 U.S. corporations. His GLS estimates of the common-mean coefficients of the firms' beginning-of-year value of outstanding shares and capital stock are 0.0843 and 0.1961, with asymptotic standard errors 0.014 and 0.0412, respectively. The estimated dispersion measure of these coefficients is

$$\hat\Delta = \begin{pmatrix} 0.0011 & -0.0002 \\ -0.0002 & 0.0187 \end{pmatrix}$$

Predicting Individual Coefficients

1 Sometimes one may wish to predict the individual component $\beta_i$, because it provides information on the behavior of each individual and also because it provides a basis for predicting future values of the dependent variable for a given individual.

2 Swamy (1970, 1971) has shown that the best linear unbiased predictor, conditional on given $x_i$, is the least-squares estimator $\hat\beta_i$.

Definition (individual predictors for $\beta_i$)

Lee and Griffiths (1979) suggest predicting $\beta_i$ by

$$\beta_i^* = \hat\beta_{GLS} + \Delta X_i'\left(X_i\Delta X_i' + \sigma_i^2 I_T\right)^{-1}\left(y_i - X_i\hat\beta_{GLS}\right)$$

This predictor is the best linear unbiased estimator in the sense that $E(\beta_i^* - \beta_i) = 0$, where the expectation is an unconditional one.

Predicting Individual Coefficients: Lindley and Smith (1972)

1 It is also possible to consider a Bayesian approach: the random coefficient model is also called the hierarchical model.

2 In this case, the prior distribution for the $\beta_i$ parameters is specified with the values of $\beta$, $\Delta$ and $\sigma_i^2$.

3 From a Bayesian perspective, the likelihood is combined with priors to generate posterior distributions of the parameters.

Subsection 6.2

Other Heterogeneous Panel Data Models

6.2. Other heterogeneous panel data models

Heterogeneous panel data models

There are many other ways to model the heterogeneity of slope parameters:

1 Mixed fixed and random (MFR) coefficients model.

2 Mean group estimation.

3 Panel threshold regression models.

4 Grouped Patterns of Heterogeneity.

5 etc.

Definition (mixed fixed and random coefficient model)
We assume that each cross section unit is different

$$y_{it} = \sum_{k=1}^K \beta_{ki}\, x_{kit} + \sum_{l=1}^m \gamma_{li}\, w_{lit} + v_{it} \qquad i = 1, .., n$$

where $x_{it}$ and $w_{it}$ are a $K \times 1$ and an $m \times 1$ vector of explanatory variables, respectively, that are independent of the error of the equation $v_{it}$.

Hsiao C. (1989), "Modelling Ontario Regional Electricity System Demand Using a Mixed Fixed and Random Coefficient Approach", Regional Science and Urban Economics, 19, 565-587

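Assumptions

The parameters $\underset{(nK,1)}{\beta} = (\beta_1', \beta_2', ..., \beta_n')'$ are assumed to be randomly distributed.

The parameters $\underset{(nm,1)}{\gamma} = (\gamma_1', \gamma_2', ..., \gamma_n')'$ are fixed.

In a vectorial form, we have

$$\underset{(nT,1)}{y} = \underset{(nT,nK)}{X}\ \underset{(nK,1)}{\beta} + \underset{(nT,nm)}{W}\ \underset{(nm,1)}{\gamma} + \underset{(nT,1)}{v}$$

$$X = \begin{pmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & & \vdots \\ \vdots & & \ddots & \\ 0 & \cdots & & X_n \end{pmatrix} \qquad W = \begin{pmatrix} W_1 & 0 & \cdots & 0 \\ 0 & W_2 & & \vdots \\ \vdots & & \ddots & \\ 0 & \cdots & & W_n \end{pmatrix}$$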

Assumptions on the random coefficients

The coefficients of $x_{it}$ are assumed to be subject to stochastic restrictions of the form

$$\beta = A_1\bar\beta + \zeta$$

$A_1$ is an $nK \times L$ matrix with known elements,

$\bar\beta$ is an $L \times 1$ vector of constants,

$\zeta$ is assumed to be a vector of (normally distributed) random variables with mean 0 and nonsingular constant covariance matrix $C$, independent of $x_i$.

Assumptions on the fixed coefficients

The coefficients of $w_{it}$ are assumed to be subject to

$$\gamma = A_2\bar\gamma$$

$A_2$ is an $nm \times n$ matrix with known elements,

$\bar\gamma$ is an $n \times 1$ vector of constants.

MFR model

Since $A_2$ is known, we can rewrite the model as

$$\underset{(nT,1)}{y} = \underset{(nT,nK)}{X}\ \underset{(nK,1)}{\beta} + \underset{(nT,n)}{\tilde W}\ \underset{(n,1)}{\bar\gamma} + \underset{(nT,1)}{v}$$

where $\tilde W = W A_2$.

Many of the linear panel data models with unobserved individual-specific but time-invariant heterogeneity can be treated as special cases of this model.

Example
A common model for all cross-sectional units. If there is no interindividual difference in behavioral patterns, we may let $X = 0$, $A_2 = e_n \otimes I_m$, so the model becomes

$$y_{it} = w_{it}'\bar\gamma + v_{it}$$

Example
When each individual is considered different, then $X = 0$, $A_2 = I_n \otimes I_m$, and the model becomes

$$y_{it} = w_{it}'\gamma_i + v_{it}$$

Example
When the effects of the individual-specific, time-invariant omitted variables are treated as random variables, just as in the assumption on the effects of other omitted variables, we can let $X_i = e_T$, $\zeta' = (\zeta_1, ..., \zeta_n)$, $A_1 = e_n$, $C = \sigma_\alpha^2 I_n$, $\bar\beta$ be an unknown constant, and $w_{it}$ not contain an intercept term. Then the model becomes:

$$y_{it} = \bar\beta + \bar\gamma' w_{it} + \zeta_i + v_{it}$$


Consider a heterogeneous panel data model.

A consistent estimator of $\beta = E(\beta_i)$ can be obtained under more general assumptions concerning $\beta_i$ and the regressors.

One such possible estimator is the Mean Group (MG) estimator proposed by Pesaran and Smith (1995) for the estimation of dynamic random coefficient models.

Definition (mean group estimator)

The mean group (MG) estimator is defined as the simple average of the OLS estimators $\hat\beta_i$:

$$\hat\beta_{MG} = \frac{1}{n}\sum_{i=1}^n\hat\beta_i$$

Mean Group (MG) estimator

When the regressors are strictly exogenous and the errors are i.i.d., an unbiased estimator of the covariance matrix is given by

$$\hat V\left(\hat\beta_{MG}\right) = \frac{1}{n}\hat\Delta$$

$$\hat\Delta = \frac{1}{n-1}\sum_{i=1}^n\left(\hat\beta_i - \frac{1}{n}\sum_{j=1}^n\hat\beta_j\right)\left(\hat\beta_i - \frac{1}{n}\sum_{j=1}^n\hat\beta_j\right)'$$
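Given the n unit-level OLS estimates, the MG estimator and its covariance matrix are simple sample moments. The helper below is a minimal sketch: the name `mean_group` and the toy input (three units, $K = 2$) are ours and serve only to make the arithmetic easy to follow.

```python
import numpy as np

def mean_group(betas):
    """Mean group estimate and its estimated covariance matrix.

    betas : (n, K) array whose i-th row is the OLS estimate for unit i.
    """
    betas = np.asarray(betas, dtype=float)
    n = betas.shape[0]
    beta_mg = betas.mean(axis=0)
    dev = betas - beta_mg
    delta_hat = dev.T @ dev / (n - 1)     # dispersion of the individual estimates
    return beta_mg, delta_hat / n         # V_hat(beta_MG) = Delta_hat / n

beta_mg, V_mg = mean_group([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
print(beta_mg)   # [3. 4.]
print(V_mg)      # every entry equals 4/3
```

In the toy input the deviations from the mean are $(-2,-2)$, $(0,0)$ and $(2,2)$, so $\hat\Delta$ has all entries equal to 4 and dividing by $n = 3$ gives $4/3$.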

For more details

Pesaran, M.H. and R. Smith (1995), "Estimation of Long-Run Relationships from Dynamic Heterogeneous Panels", Journal of Econometrics, 68, 79-114.

Pesaran, M.H., Y. Shin and R.P. Smith (1999), "Pooled Mean Group Estimation of Dynamic Heterogeneous Panels", Journal of the American Statistical Association, 94, 621-634.


Bonhomme S. and E. Manresa (2015), Grouped Patterns of Heterogeneity in Panel Data, Econometrica, 83(3), 1147-1184

This paper introduces time-varying grouped patterns of heterogeneity in linear panel data models.

A distinctive feature of this approach is that group membership is left unrestricted.

The authors estimate the parameters of the model using a "grouped fixed-effects" estimator that minimizes a least-squares criterion with respect to all possible groupings of the cross-sectional units.

A simple linear model with grouped patterns of heterogeneity takes the following form

$$y_{it} = x_{it}'\theta + \alpha_{g_i t} + v_{it}$$

$\alpha_{g_i t} \in \mathcal{A} \subset \mathbb{R}$ denotes a group-specific unobservable variable.

The group membership variables $g_i \in \{1, ..., G\}$ and the group-specific time effects $\alpha_{g_i t}$ are unrestricted.

Units in the same group share the same time profile $\alpha_{gt}$.

The number of groups G is to be set or estimated by the researcher.

Definition
The grouped fixed-effects estimator in this model is defined as the solution of the following minimization problem:

$$\left(\hat\theta, \hat\alpha, \hat\gamma\right) = \arg\min_{(\theta, \alpha, \gamma)}\sum_{i=1}^n\sum_{t=1}^T\left(y_{it} - x_{it}'\theta - \alpha_{g_i t}\right)^2$$

where the minimum is taken over all possible groupings $\gamma = \{g_1, ..., g_n\}$ of the n units into G groups, common parameters $\theta$, and group-specific time effects $\alpha$.

Definition
For given values of $\theta$ and $\alpha$, the optimal group assignment for each individual unit is given by:

$$\hat g_i(\theta, \alpha) = \arg\min_{g \in \{1, ..., G\}}\sum_{t=1}^T\left(y_{it} - x_{it}'\theta - \alpha_{gt}\right)^2$$

This model can be extended to allow for group-specific effects of covariates (heterogeneous slope coefficients):

$$y_{it} = x_{it}'\theta_{g_i} + \alpha_{g_i t} + v_{it}$$

The authors propose a very intuitive iterative algorithm to:

Estimate the parameters

Determine the group membership
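The alternating logic of such an algorithm can be sketched for the special case without covariates, $y_{it} = \alpha_{g_i t} + v_{it}$. This is a simplified illustration and not the authors' implementation: the function name `grouped_fixed_effects`, the random initialisation, and the simulated data are assumptions of ours. Each iteration reassigns every unit to the closest group profile, then updates each profile as the average over its members.

```python
import numpy as np

def grouped_fixed_effects(y, G, n_iter=20, seed=0):
    """Alternating least-squares sketch for y_it = alpha_{g_i, t} + v_it.

    y : (n, T) array. Returns (groups, alpha, objective_path).
    """
    rng = np.random.default_rng(seed)
    n, T = y.shape
    # Initialise the G time profiles with randomly chosen units (assumption)
    alpha = y[rng.choice(n, size=G, replace=False)].copy()   # shape (G, T)
    path = []
    for _ in range(n_iter):
        # Assignment step: g_i = argmin_g sum_t (y_it - alpha_gt)^2
        dist = ((y[:, None, :] - alpha[None, :, :]) ** 2).sum(axis=2)   # (n, G)
        groups = dist.argmin(axis=1)
        # Update step: alpha_gt = average of y_it over the units assigned to g
        for g in range(G):
            members = groups == g
            if members.any():             # keep the old profile if a group is empty
                alpha[g] = y[members].mean(axis=0)
        path.append(float(((y - alpha[groups]) ** 2).sum()))
    return groups, alpha, path

# Hypothetical data: two groups with flat time profiles at 0 and 5
rng = np.random.default_rng(4)
n, T = 60, 8
true_g = np.repeat([0, 1], n // 2)
profiles = np.vstack([np.zeros(T), 5.0 * np.ones(T)])
y = profiles[true_g] + rng.normal(size=(n, T))
groups, alpha_hat, path = grouped_fixed_effects(y, G=2)
print(path[0], path[-1])
```

Each step weakly lowers the least-squares criterion, so the recorded objective path is non-increasing. As with k-means, the result may depend on the starting values, so several random initialisations are advisable in practice.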

For more details, see

Bonhomme S. and E. Manresa (2015), Grouped Patterns of Heterogeneity inPanel Data, Econometrica, 83(3), 1147-1184

1 Heterogeneous panel data model.

2 Random coe¢ cient model.

3 GLS estimator.

4 Hierarchical model.

5 Mixed fixed and random coefficients model.

6 Mean group estimator.

End of Chapter 1

Christophe Hurlin (University of Orléans)
