
Zhenlin Yang, http://www.mysmu.edu/faculty/zlyang/

Chapter 5: Panel Estimation Under Heteroskedasticity and Serial Correlation

This chapter introduces various panel estimation methods that take into account the possible existence of heteroskedasticity and/or serial correlation. The discussions draw on Cameron & Trivedi: Microeconometrics Using Stata, 2009. Main methods include:

Cluster-robust estimation for short panels
• Pooled OLS or FGLS estimators, or population-averaged estimators
• Fixed effects estimators: FE, within, LSDV, first-difference
• Random effects estimators: RE, BE

Robust estimation for long panels
• Heteroskedasticity and serial correlation
• Unit roots and cointegration

Robust estimation for large panels


ECON6002, Term II 2020-21 © Zhenlin Yang, SMU


5.1. Introduction

Consider the general panel data model that has been studied:

$y_{it} = \alpha + X_{it}'\beta + u_{it}, \quad u_{it} = \mu_i + \lambda_t + v_{it},$ (5.1)

with cross-sections i = 1, …, N, and time periods t = 1, …, T.

Heteroskedasticity means that the variance of $u_{it}$ (or $v_{it}$ in the case of FE models) changes over i or t or both, in particular over i, as the cross-sectional units may be of varying size.

Serial correlation means that the $u_{it}$ are correlated over time beyond the equicorrelation induced by the random effects, because an unobserved shock in one period often affects the behavioral relationship for at least the next few periods.

Recall: a short panel has large N and small T; a long panel has small N and large T; and a large panel has both N and T large.

The methods for handling these two issues differ depending on the type of panel on which the estimation is based.


Heteroskedasticity in OLS Regression

Consider the multiple linear regression model:

$y_i = \alpha + X_i'\beta + u_i = \mathbf{X}_i'\boldsymbol\beta + u_i, \quad i = 1, \ldots, n,$

or in matrix form: $y = \mathbf{X}\boldsymbol\beta + u$, where dim($\beta$) = k. The ordinary least squares (OLS) estimator of $\boldsymbol\beta$ is

$\hat{\boldsymbol\beta}_{OLS} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'y,$

which minimizes the sum of squared errors,

$\sum_{i=1}^{n} (y_i - \mathbf{X}_i'\boldsymbol\beta)^2 = (y - \mathbf{X}\boldsymbol\beta)'(y - \mathbf{X}\boldsymbol\beta).$

The standard assumptions for OLS regression are

(i) $E(u_i | X_i) = 0$ (exogeneity of regressors),

(ii) $E(u_i^2 | X_i) = \sigma^2$ (conditional homoskedasticity),

(iii) $E(u_i u_j | X_i, X_j) = 0$, $i \neq j$ (conditional zero correlation).


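The condition for $\hat{\boldsymbol\beta}_{OLS}$ to be valid (unbiased, consistent) is (i). Under (i),

$E(\hat{\boldsymbol\beta}_{OLS} | \mathbf{X}) = E[(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'y \,|\, \mathbf{X}] = \boldsymbol\beta + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'E(u | \mathbf{X}) = \boldsymbol\beta,$

implying that $E(\hat{\boldsymbol\beta}_{OLS}) = \boldsymbol\beta$ (unbiased).

The conditions for $\hat{\boldsymbol\beta}_{OLS}$ to be efficient are (ii) and (iii), under which $Var(\hat{\boldsymbol\beta}_{OLS}) = \sigma^2(\mathbf{X}'\mathbf{X})^{-1}$.

Heteroskedasticity-robust standard errors. If the homoskedasticity assumption (ii) is violated, i.e., $E(u_i^2 | X_i) = \sigma_i^2$ (heteroskedasticity), then the OLS estimator $\hat{\boldsymbol\beta}_{OLS}$ remains valid (unbiased, consistent), but $Var(\hat{\boldsymbol\beta}_{OLS}) \neq \sigma^2(\mathbf{X}'\mathbf{X})^{-1}$; instead,

$Var(\hat{\boldsymbol\beta}_{OLS}) = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\,\mathrm{diag}(\sigma_i^2)\,\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}.$

A heteroskedasticity-robust estimator of $Var(\hat{\boldsymbol\beta}_{OLS})$ is

$\hat V_{robust}(\hat{\boldsymbol\beta}_{OLS}) = (\mathbf{X}'\mathbf{X})^{-1}\left(\frac{n}{n-k-1}\sum_{i=1}^{n} \hat u_i^2\,\mathbf{X}_i\mathbf{X}_i'\right)(\mathbf{X}'\mathbf{X})^{-1},$

where $\hat u_i$ are the OLS residuals, i.e., $\hat u_i = y_i - \mathbf{X}_i'\hat{\boldsymbol\beta}_{OLS}$.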


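GLS Regression

If $E(uu' | \mathbf{X}) = \sigma^2\Omega$, where $\Omega \neq I$ but is a known correlation matrix ((ii) and/or (iii) violated), the generalized least-squares (GLS) estimator is:

$\hat{\boldsymbol\beta}_{GLS} = (\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1}\mathbf{X}'\Omega^{-1}y,$

which minimizes the weighted sum of squares $(y - \mathbf{X}\boldsymbol\beta)'\Omega^{-1}(y - \mathbf{X}\boldsymbol\beta)$, and

$Var(\hat{\boldsymbol\beta}_{GLS}) = \sigma^2(\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1}.$

Both $\hat{\boldsymbol\beta}_{OLS}$ and $\hat{\boldsymbol\beta}_{GLS}$ are unbiased and consistent.

But $\hat{\boldsymbol\beta}_{GLS}$ is more efficient than $\hat{\boldsymbol\beta}_{OLS}$, because $Var(\hat{\boldsymbol\beta}_{GLS}) = \sigma^2(\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1}$ is “less than” $Var(\hat{\boldsymbol\beta}_{OLS}) = \sigma^2(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\Omega\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}$, in the sense that their difference is positive semidefinite.

In the case where $\Omega$ is known up to a finite number of parameters $\gamma$, i.e., $\Omega = \Omega(\gamma)$, and a consistent estimator of $\gamma$, say $\hat\gamma$, is available, a feasible GLS (FGLS) estimator of $\boldsymbol\beta$ and its variance estimator are:

$\hat{\boldsymbol\beta}_{FGLS} = (\mathbf{X}'\hat\Omega^{-1}\mathbf{X})^{-1}\mathbf{X}'\hat\Omega^{-1}y$, where $\hat\Omega = \Omega(\hat\gamma)$;

$\widehat{Var}(\hat{\boldsymbol\beta}_{FGLS}) = \hat\sigma^2(\mathbf{X}'\hat\Omega^{-1}\mathbf{X})^{-1}$, where $\hat\sigma^2$ is a consistent estimator of $\sigma^2$.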


5.2. Cluster-Robust Estimation for Short Panels

For short panels, as T is small, it is common to let the time effects $\lambda_t$ be fixed effects. Then Model (5.1) reduces to the one-way model

$y_{it} = \alpha + \mathbf{X}_{it}'\boldsymbol\beta + \mu_i + v_{it}$, i = 1, …, N, t = 1, …, T, (5.2)

if the regressors $\mathbf{X}_{it}$ include a set of time dummies (with one time dummy dropped to avoid the dummy variable trap).

Under this parameterization, $\mathbf{X}_{it}$ is $(K + T - 1) \times 1$, and the $\mu_i$ are subject to $\sum_{i=1}^{N} \mu_i = 0$. Writing (5.2) in vector form for each i, or the ith cluster,

$y_i = \iota_T\alpha + \mathbf{X}_i\boldsymbol\beta + \iota_T\mu_i + v_i$, i = 1, 2, …, N,

and applying the transformation $Q_T = I_T - \frac{1}{T}\iota_T\iota_T'$ gives

$Q_T y_i = Q_T\mathbf{X}_i\boldsymbol\beta + Q_T v_i$, or $y_i^* = \mathbf{X}_i^*\boldsymbol\beta + v_i^*$, i = 1, 2, …, N. (5.3)

We next consider the cluster-robust (CR) VC matrix of the within estimator $\hat{\boldsymbol\beta}$ of (5.3).


The Cluster Robust Method

Assume in (5.3) that (i) the $v_i^*$ are independent, and (ii) $Var(v_i^*) = \Omega_i$, a general T×T positive definite matrix. A robust estimator of the variance-covariance (VC) matrix of the within estimator $\hat{\boldsymbol\beta}$ is:

$\widehat{Var}(\hat{\boldsymbol\beta}) = (\mathbf{X}^{*\prime}\mathbf{X}^*)^{-1}\left(\sum_{i=1}^{N} \mathbf{X}_i^{*\prime}\hat v_i^*\hat v_i^{*\prime}\mathbf{X}_i^*\right)(\mathbf{X}^{*\prime}\mathbf{X}^*)^{-1},$

where $\hat v_i^* = y_i^* - \mathbf{X}_i^*\hat{\boldsymbol\beta}$, and $\mathbf{X}^*$ is NT×(K+T−1), which stacks the $\mathbf{X}_i^*$.

This is the result given in (3.12), and is valid only for short panels, i.e., the case of large N and small T.

It allows arbitrary correlation among the elements of $v_i^*$ for each i, but requires independence of the $v_i^*$ over i.

Most importantly, the result applies not only to Model (5.3), obtained from the one-way FE model after the within transformation, but to any model of that form.


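Generalization. Consider the form $y_i^* = \mathbf{X}_i^*\boldsymbol\beta + u_i^*$, i = 1, 2, …, N. Assume $E(u_i^* | \mathbf{X}_i^*) = 0$, $Var(u_i^* | \mathbf{X}_i^*) = \Omega_i$, and $E(u_i^* u_j^{*\prime} | \mathbf{X}_i^*, \mathbf{X}_j^*) = 0$, i ≠ j. If dim($y_i^*$) is fixed, the OLS estimator $\hat{\boldsymbol\beta} = (\mathbf{X}^{*\prime}\mathbf{X}^*)^{-1}\mathbf{X}^{*\prime}y^*$ is valid, and a cluster-robust estimator of the VC matrix of $\hat{\boldsymbol\beta}$ (robust to heteroskedasticity and to within-cluster correlation) is:

$\widehat{Var}(\hat{\boldsymbol\beta}) = (\mathbf{X}^{*\prime}\mathbf{X}^*)^{-1}\left(\sum_{i=1}^{N} \mathbf{X}_i^{*\prime}\hat u_i^*\hat u_i^{*\prime}\mathbf{X}_i^*\right)(\mathbf{X}^{*\prime}\mathbf{X}^*)^{-1},$ (5.4)

where $\hat u_i^* = y_i^* - \mathbf{X}_i^*\hat{\boldsymbol\beta}$, $y^*$ is the stacked $y_i^*$, and $\mathbf{X}^*$ the stacked $\mathbf{X}_i^*$.

It can be shown that the OLS estimator $\hat{\boldsymbol\beta}$ remains valid under serial correlation and cross-sectional heteroskedasticity of unknown form, and the VC matrix estimate $\widehat{Var}(\hat{\boldsymbol\beta})$ in (5.4) is robust against both. Therefore, $\hat{\boldsymbol\beta}$ and $\widehat{Var}(\hat{\boldsymbol\beta})$ together provide a set of inference methods that are robust against unknown serial correlation and cross-sectional heteroskedasticity. Various applications of (5.4) are presented next.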
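To make the sandwich form of (5.4) concrete, here is a minimal Python sketch (not Stata) for a model with two regressors, so that 2×2 algebra suffices. The function names and the toy data are illustrative assumptions, not part of the lecture material, and no finite-sample adjustment is applied, so the numbers would not match software output that rescales the estimate.

```python
# Minimal sketch of the cluster-robust (sandwich) VC matrix in (5.4),
# restricted to two regressors. Names and data are illustrative assumptions.

def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[r][k] * q[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def cluster_robust_ols(clusters):
    """clusters: list of (X_i, y_i); X_i is a list of T rows of length 2,
    y_i a list of T responses. Returns (beta_hat, V_hat) as in (5.4),
    without any finite-sample adjustment."""
    xtx = [[0.0, 0.0], [0.0, 0.0]]
    xty = [0.0, 0.0]
    for X_i, y_i in clusters:
        for row, y in zip(X_i, y_i):
            for a in range(2):
                xty[a] += row[a] * y
                for b in range(2):
                    xtx[a][b] += row[a] * row[b]
    bread = inv2(xtx)  # (X'X)^{-1}
    beta = [bread[a][0] * xty[0] + bread[a][1] * xty[1] for a in range(2)]

    meat = [[0.0, 0.0], [0.0, 0.0]]  # sum over clusters of X_i' u_i u_i' X_i
    for X_i, y_i in clusters:
        score = [0.0, 0.0]  # X_i' u_i_hat for this cluster
        for row, y in zip(X_i, y_i):
            resid = y - (row[0] * beta[0] + row[1] * beta[1])
            for a in range(2):
                score[a] += row[a] * resid
        for a in range(2):
            for b in range(2):
                meat[a][b] += score[a] * score[b]
    return beta, matmul2(matmul2(bread, meat), bread)

# Toy data: 3 clusters, T = 2, columns = (constant, x).
toy = [
    ([[1.0, 0.0], [1.0, 1.0]], [1.1, 2.9]),
    ([[1.0, 2.0], [1.0, 3.0]], [5.2, 6.8]),
    ([[1.0, 4.0], [1.0, 5.0]], [9.1, 11.2]),
]
beta_hat, V_hat = cluster_robust_ols(toy)
se = [V_hat[0][0] ** 0.5, V_hat[1][1] ** 0.5]  # cluster-robust standard errors
```

Residuals are multiplied only within a cluster, which is why arbitrary within-cluster correlation is allowed while independence across clusters is required.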


Applications of the Cluster Robust Method

Various applications of the result (5.4) are presented. The key Stata command/option for implementing (5.4) is vce(cluster id):

• Pooled OLS or population-averaged estimators;
• Pooled FGLS or population-averaged estimators;
• Within estimation;
• Within estimation, allowing time dummies;
• Least-squares dummy-variables regression;
• First-difference estimation, allowing time dummies;
• One-way individual RE estimation, allowing time dummies;
• Between estimation;
• Comparison of panel estimators based on short panels.


Pooled OLS or Population-Averaged Estimators

Pooled OLS estimators simply regress 𝑦𝑦𝑖𝑖𝑖𝑖 on 𝐗𝐗𝑖𝑖𝑖𝑖, using both between (cross-section) and within (time-series) variation in the data, and assuming the disturbances 𝑢𝑢𝑖𝑖𝑖𝑖 are iid.

• The resulting OLS estimator $\hat{\boldsymbol\beta}$ of the coefficients $\boldsymbol\beta$ of $\mathbf{X}_{it}$ is consistent if $\mathbf{X}_{it}$ is uncorrelated with $u_{it}$, and inconsistent otherwise.

• Clearly, for RE models, the OLS estimator of 𝛃𝛃 is consistent, whereas for FE models, the OLS estimator of 𝛃𝛃 is inconsistent.

• Even in the case where �𝛃𝛃 is consistent, the VC matrix of �𝛃𝛃 obtained from an OLS regression may not be correct, as 𝑢𝑢𝑖𝑖𝑖𝑖 may not be iid, leading to misleading inferences.


To further motivate the need for a cluster-robust estimator of the VC matrix of an OLS estimator, consider Model (5.2):

$y_{it} = \alpha + \mathbf{X}_{it}'\boldsymbol\beta + \mu_i + v_{it}.$

Consistency of OLS requires that the error term $u_{it} = \mu_i + v_{it}$ be uncorrelated with $\mathbf{X}_{it}$. So the pooled OLS estimator $\hat{\boldsymbol\beta}$ is consistent if the $\mu_i$ are RE but inconsistent if the $\mu_i$ are FE.

As $\mathbf{X}_{it}$ and $u_{it}$ are uncorrelated, the intercept parameter $\alpha$ and other time-invariant regressors are allowed. Absorb these parameters into the $\mathbf{X}_{it}'\boldsymbol\beta$ term and write the model as

$y_{it} = \mathbf{X}_{it}'\boldsymbol\beta + u_{it}$ or $y_i = \mathbf{X}_i\boldsymbol\beta + u_i$.

With $\Omega_i = Var(u_i)$, $Var(\hat{\boldsymbol\beta})$ is not $\sigma^2(\mathbf{X}'\mathbf{X})^{-1}$, but $(\mathbf{X}'\mathbf{X})^{-1}\left(\sum_{i=1}^{N} \mathbf{X}_i'\Omega_i\mathbf{X}_i\right)(\mathbf{X}'\mathbf{X})^{-1}$.

In the statistics literature, the pooled estimators are called population-averaged (PA) estimators.




We use the “Returns to Schooling Data” to demonstrate pooled OLS (or PA) with cluster-robust standard errors (CRSE).

. * Pooled OLS with cluster-robust standard errors
. regress lwage exp expsq wks ed, vce(cluster id)

Linear regression                               Number of obs     =      4,165
                                                F(4, 594)         =      72.58
                                                Prob > F          =     0.0000
                                                R-squared         =     0.2836
                                                Root MSE          =     .39082

                                   (Std. Err. adjusted for 595 clusters in id)
------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |    .044675   .0054385     8.21   0.000     .0339941     .055356
       expsq |  -.0007156   .0001285    -5.57   0.000    -.0009679   -.0004633
         wks |    .005827   .0019284     3.02   0.003     .0020396    .0096144
          ed |   .0760407   .0052122    14.59   0.000     .0658042    .0862772
       _cons |   4.907961   .1399887    35.06   0.000     4.633028    5.182894
------------------------------------------------------------------------------

The coefficient estimates are identical to those from xtreg, pa, to be given later. The standard errors are almost the same as well.



The default standard errors assume that the regression errors are iid:

. * Pooled OLS with incorrect default standard errors
. regress lwage exp expsq wks ed

      Source |       SS           df       MS      Number of obs   =     4,165
-------------+----------------------------------   F(4, 4160)      =    411.62
       Model |  251.491445         4  62.8728613   Prob > F        =    0.0000
    Residual |  635.413457     4,160  .152743619   R-squared       =    0.2836
-------------+----------------------------------   Adj R-squared   =    0.2829
       Total |  886.904902     4,164  .212993492   Root MSE        =    .39082

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |    .044675   .0023929    18.67   0.000     .0399838    .0493663
       expsq |  -.0007156   .0000528   -13.56   0.000    -.0008191   -.0006121
         wks |    .005827   .0011827     4.93   0.000     .0035084    .0081456
          ed |   .0760407   .0022266    34.15   0.000     .0716754     .080406
       _cons |   4.907961   .0673297    72.89   0.000     4.775959    5.039963
------------------------------------------------------------------------------

• Wages increase with experience until a peak at 31 years [$\frac{0.0447}{2 \times 0.0007 \times 31} \approx 1$];
• Wages increase by 0.6% with each additional week worked;
• And wages increase by 7.6% with each additional year of education.
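The interpretation of the pooled OLS coefficients can be checked with a few lines of arithmetic. The Python sketch below (not Stata) uses the coefficient values from the regress output; the variable names are illustrative, and the percentage effects rely on the usual approximation that a small change in log wage is read as a proportional change in wage.

```python
# Turning point of the quadratic experience profile implied by pooled OLS.
b_exp = 0.044675       # coefficient on exp (from the regress output)
b_expsq = -0.0007156   # coefficient on expsq

# d(lwage)/d(exp) = b_exp + 2 * b_expsq * exp, which is zero at the peak.
peak_exp = -b_exp / (2 * b_expsq)

# Approximate percentage effects of one more week worked and one more
# year of education (log-wage coefficients times 100).
pct_wks = 100 * 0.005827
pct_ed = 100 * 0.0760407

print(f"peak experience: {peak_exp:.1f} years")
print(f"one more week: {pct_wks:.1f}%, one more year of education: {pct_ed:.1f}%")
```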



• These default standard errors are misleadingly small: 0.0024, 0.00005, 0.0012, and 0.0022 for exp, expsq, wks, and ed, compared with the cluster-robust standard errors (CRSE) of 0.0054, 0.0001, 0.0019, and 0.0052;

• Therefore, it is essential that the OLS standard errors be corrected for clustering on individuals.
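The size of the understatement is easiest to see as a ratio. The following Python sketch (not Stata) divides the cluster-robust standard errors by the default ones, with both sets of values copied from the two regress outputs for the Returns to Schooling data; the dictionary names are illustrative.

```python
# Ratio of cluster-robust to default OLS standard errors.
default_se = {"exp": 0.0023929, "expsq": 0.0000528, "wks": 0.0011827, "ed": 0.0022266}
cluster_se = {"exp": 0.0054385, "expsq": 0.0001285, "wks": 0.0019284, "ed": 0.0052122}

se_ratio = {name: cluster_se[name] / default_se[name] for name in default_se}

# A t statistic scales inversely with its standard error, so each default
# t statistic is overstated by the same factor.
for name, ratio in se_ratio.items():
    print(f"{name:6s} cluster/default SE ratio = {ratio:.2f}")
```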

The pooled OLS estimator can also be obtained using the (xtreg, pa) command, with options corr(independent) and vce(robust) nolog:

. * Pooled OLS with cluster-robust standard errors using the general xtreg, pa procedure
. xtreg lwage exp expsq wks ed, pa corr(independent) vce(robust) nolog

…
                                     (Std. Err. adjusted for clustering on id)
------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |    .044675   .0054358     8.22   0.000      .034021    .0553291
       expsq |  -.0007156   .0001284    -5.57   0.000    -.0009673    -.000464
         wks |    .005827   .0019275     3.02   0.003     .0020491    .0096048
          ed |   .0760407   .0052097    14.60   0.000     .0658299    .0862515
       _cons |   4.907961   .1399214    35.08   0.000      4.63372    5.182202
------------------------------------------------------------------------------


Pooled FGLS or Population-Averaged Estimators

Pooled FGLS estimation can lead to a more efficient estimator of the parameters of the pooled model $y_{it} = \mathbf{X}_{it}'\boldsymbol\beta + u_{it}$ than the OLS estimator. This is achieved by modelling the $T \times T$ error correlation matrix of $u_i$, assumed constant over i (the $y_i$'s are independent over i and $N$ is large).

The pooled estimator, or PA estimator, is obtained using the (xtreg, pa) command, with two key additional options:

corr( ): places different restrictions on the error correlation;

vce(robust): obtains cluster-robust standard errors that are valid even if corr( ) does not specify the correct correlation model.

Let $\rho_{ts} = Cor(u_{it}, u_{is})$ be the correlation of the errors at time periods t and s, for individual i. Note the restriction that $\rho_{ts}$ does not vary with i. Also, the corr( ) options all set $\rho_{tt} = 1$.

There are potentially T(T−1)/2 unique off-diagonal values in the $T \times T$ error correlation matrix, since the matrix is symmetric ($\rho_{ts} = \rho_{st}$).



Typical options for corr( ) include:

• corr(independence): sets $\rho_{ts} = 0$ for $s \neq t$. Then the PA estimator equals the pooled OLS estimator;

• corr(exchangeable): sets $\rho_{ts} = \rho$ for $s \neq t$. Then errors are equicorrelated and (xtreg, pa) is asymptotically equivalent to (xtreg, re).

• corr(ar k): specifies an autoregressive process of order k, or AR(k), for $u_{it}$.

• corr(stationary g): specifies a moving average process, or MA(g), for $u_{it}$.

• corr(unstructured): places no restrictions on $\rho_{ts}$. For small T, this may be the best model for correlations over time, but can fail for a larger T.

• The nolog option prevents the display of an iteration log.

In the statistics literature, the PA estimator is also called the generalized estimating equations (GEE) estimator.

The (xtreg, pa) command is a special case of xtgee with family(gaussian) option.


We demonstrate the application of the (xtreg, pa) command using the “Returns to Schooling Data”.

. * PA or pooled FGLS estimation with AR(2) and cluster-robust standard errors
. xtreg lwage exp expsq wks ed, pa corr(ar 2) vce(robust) nolog

GEE population-averaged model                   Number of obs     =      4,165
Group and time vars:               id year      Number of groups  =        595
Link:                             identity      Obs per group:
Family:                           Gaussian                    min =          7
Correlation:                         AR(2)                    avg =        7.0
                                                              max =          7
                                                Wald chi2(4)      =     873.28
Scale parameter:                  .1966639      Prob > chi2       =     0.0000

                                     (Std. Err. adjusted for clustering on id)
------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |   .0718915    .003999    17.98   0.000     .0640535    .0797294
       expsq |  -.0008966   .0000933    -9.61   0.000    -.0010794   -.0007137
         wks |   .0002964   .0010553     0.28   0.779     -.001772    .0023647
          ed |   .0905069   .0060161    15.04   0.000     .0787156    .1022982
       _cons |   4.526381   .1056897    42.83   0.000     4.319233    4.733529
------------------------------------------------------------------------------



Compared with the results from pooled OLS, the coefficients change considerably, due to the use of the AR(2) model. The cluster-robust standard errors are smaller than those from pooled OLS for all regressors except ed, showing the efficiency gain.

The estimated error correlation matrix is stored in e(R). We have

. * Estimated error correlation matrix after xtreg, pa
. matrix list e(R)

symmetric e(R)[7,7]
           c1         c2         c3         c4         c5         c6         c7
r1          1
r2  .89722058          1
r3  .84308581  .89722058          1
r4  .78392846  .84308581  .89722058          1
r5  .73064474  .78392846  .84308581  .89722058          1
r6   .6806209  .73064474  .78392846  .84308581  .89722058          1
r7  .63409777   .6806209  .73064474  .78392846  .84308581  .89722058          1

$\hat\rho_{ts}$ changes only with the value of |t−s|, as an AR model is used.
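The banded pattern of e(R) can be rebuilt from its first two autocorrelations. The Python sketch below (not Stata) assumes the working correlations follow the Yule-Walker recursion of a stationary AR(2) process; this is an assumption about how the matrix is built, not a statement of Stata's documented algorithm, although the reported entries appear consistent with it. Variable names are illustrative.

```python
# Sketch: rebuild the AR(2) working correlations from the first two
# autocorrelations via the Yule-Walker equations (an assumed construction).
rho1 = 0.89722058   # e(R)[2,1], lag-1 correlation from the output
rho2 = 0.84308581   # e(R)[3,1], lag-2 correlation from the output

# Solve rho1 = phi1 + phi2*rho1 and rho2 = phi1*rho1 + phi2 for (phi1, phi2).
phi1 = rho1 * (1 - rho2) / (1 - rho1 ** 2)
phi2 = (rho2 - rho1 ** 2) / (1 - rho1 ** 2)

# Higher lags follow the recursion rho_k = phi1*rho_{k-1} + phi2*rho_{k-2}.
T = 7
rho = [1.0, rho1, rho2]
for k in range(3, T):
    rho.append(phi1 * rho[k - 1] + phi2 * rho[k - 2])

# Toeplitz correlation matrix: entry (t, s) depends only on |t - s|.
R = [[rho[abs(t - s)] for s in range(T)] for t in range(T)]
for row in R:
    print(" ".join(f"{value:.4f}" for value in row))
```

Only two parameters drive all 21 off-diagonal entries, which is the source of the efficiency gain over an unrestricted correlation matrix when the AR(2) restriction is reasonable.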



If an unstructured error correlation matrix is specified, we have

. xtreg lwage exp expsq wks south, pa corr(unstructured) vce(robust) nolog
...
-------------+----------------------------------------------------------------
         exp |   .0635066   .0044502    14.27   0.000     .0547845    .0722288
       expsq |    -.00076   .0001004    -7.57   0.000    -.0009567   -.0005633
         wks |   .0004141   .0009523     0.43   0.664    -.0014524    .0022805
       south |  -.0556712   .0516129    -1.08   0.281    -.1568307    .0454883
       _cons |   5.810748   .0591324    98.27   0.000     5.694851    5.926645
------------------------------------------------------------------------------

. matrix list e(R)

symmetric e(R)[7,7]
           c1         c2         c3         c4         c5         c6         c7
r1          1
r2  .91725004          1
r3  .87482529  .85342628          1
r4  .81187266  .81020598  .94111792          1
r5  .74119645  .75303939   .8840834  .91577823          1
r6  .66331271  .68468392  .83678661  .88524751  .91405078          1
r7   .6242693  .65875491  .83721361  .89982964  .93506435  .96584732          1

$\hat\rho_{ts}$ changes with the values of both t and s, as an unstructured error correlation is specified.


. * PA or pooled FGLS estimation with MA(6) and cluster-robust standard errors
. xtreg lwage exp expsq wks ed, pa corr(stationary 6) vce(robust) nolog

GEE population-averaged model                   Number of obs     =      4,165
Group and time vars:               id year      Number of groups  =        595
Link:                             identity      Obs per group:
Family:                           Gaussian                    min =          7
Correlation:                 stationary(6)                    avg =        7.0
                                                              max =          7
                                                Wald chi2(4)      =     596.42

-------------+----------------------------------------------------------------
         exp |   .0608033   .0039931    15.23   0.000     .0529769    .0686297
       expsq |  -.0008603   .0000929    -9.26   0.000    -.0010424   -.0006781
         wks |   .0005029   .0010016     0.50   0.616    -.0014602     .002466
          ed |     .07985   .0053356    14.97   0.000     .0693923    .0903076
       _cons |   4.846687    .097278    49.82   0.000     4.656026    5.037349
------------------------------------------------------------------------------

The results are similar to those based on AR(2).


Time Series Autocorrelation for Panel Data

Some Stata commands are useful in analyzing the correlation of errors over time. First, set both the panel and time identifiers with xtset.

• L1.lwage or L.lwage: lwage lagged once;
• L2.lwage: lwage lagged twice;
• D.lwage: the first difference in lwage (equals lwage − L.lwage);
• LD.lwage: the difference lagged once;
• L2D.lwage: the difference lagged twice.

. correlate lwage L1.lwage L2.lwage L3.lwage L4.lwage L5.lwage L6.lwage
(obs=595)

             |                 L.       L2.      L3.      L4.      L5.      L6.
             |    lwage    lwage    lwage    lwage    lwage    lwage    lwage
-------------+---------------------------------------------------------------
       lwage |
         --. |   1.0000
         L1. |   0.9238   1.0000
         L2. |   0.9083   0.9271   1.0000
         L3. |   0.8753   0.8843   0.9067   1.0000
         L4. |   0.8471   0.8551   0.8833   0.8990   1.0000
         L5. |   0.8261   0.8347   0.8721   0.8641   0.8667   1.0000
         L6. |   0.8033   0.8163   0.8518   0.8465   0.8594   0.9418   1.0000

The sample correlation $r_{ts}$ varies with both t and s, not only with |t−s|.


Within Estimator with Cluster-Robust SE

The within (or FE) estimator of a one-way FE model is obtained by running an OLS regression on the within-transformed model (5.3), or equivalently an OLS regression of the within equation:

$y_{it} - \bar y_{i\cdot} = (X_{it} - \bar X_{i\cdot})'\beta + (v_{it} - \bar v_{i\cdot}).$

The (xtreg, fe) command computes this estimator assuming the $v_{it}$ are iid. The vce(robust) option relaxes the iid assumption and provides cluster-robust standard errors (CRSE), valid when the $y_{it}$'s are independent over i and $N$ is large.

• The FE or within estimator controls for the fixed effects $\mu_i$ by using within-i deviations from the time average, so that the $\mu_i$ are eliminated;

• However, the within estimation method is unable to estimate the coefficients of time-invariant regressors, and

• The within estimator will be relatively imprecise for time-varying regressors that vary little over time.

• Further, within estimation is relatively less efficient because the within transformation effectively uses up one period of data per individual ($Q_T$ has rank T−1).


. xtreg lwage exp expsq wks, fe vce(cluster id)

Fixed-effects (within) regression               Number of obs     =      4,165
Group variable: id                              Number of groups  =        595

R-sq:                                           Obs per group:
     within  = 0.6566                                         min =          7
     between = 0.0276                                         avg =        7.0
     overall = 0.0476                                         max =          7

                                                F(3,594)          =    1059.72
corr(u_i, Xb)  = -0.9107                        Prob > F          =     0.0000

-------------+----------------------------------------------------------------
         exp |   .1137879   .0040289    28.24   0.000     .1058753    .1217004
       expsq |  -.0004244   .0000822    -5.16   0.000    -.0005858   -.0002629
         wks |   .0008359   .0008697     0.96   0.337    -.0008721    .0025439
       _cons |   4.596396   .0600887    76.49   0.000     4.478384    4.714408
-------------+----------------------------------------------------------------
     sigma_u |  1.0362039
     sigma_e |  .15220316
         rho |  .97888036   (fraction of variance due to u_i)
------------------------------------------------------------------------------

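Compared with pooled OLS, the standard errors of exp and expsq are larger than the default ones (0.0040 against 0.0024 for exp) but smaller than the cluster-robust ones (0.0040 against 0.0054 for exp). The time-invariant variable ed cannot be included.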


Within Estimator with CRSE and Time Dummies

. xtreg lwage exp expsq wks i.year, fe vce(cluster id)
note: 7.year omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =      4,165
Group variable: id                              Number of groups  =        595

                                                F(8,594)          =     412.33
corr(u_i, Xb)  = -0.9089                        Prob > F          =     0.0000

-------------+----------------------------------------------------------------
         exp |   .1119927   .0041184    27.19   0.000     .1039043    .1200812
       expsq |  -.0004051   .0000834    -4.86   0.000    -.0005688   -.0002413
         wks |     .00068   .0008812     0.77   0.441    -.0010506    .0024105
             |
        year |
           2 |  -.0083984   .0049321    -1.70   0.089    -.0180849    .0012881
           3 |   .0259652   .0084359     3.08   0.002     .0093974    .0425329
           4 |   .0289134   .0078093     3.70   0.000     .0135762    .0442506
           5 |   .0239406   .0065275     3.67   0.000     .0111208    .0367604
           6 |   .0069955   .0064617     1.08   0.279    -.0056949     .019686
           7 |          0  (omitted)
             |
       _cons |   4.618339   .0599451    77.04   0.000     4.500609    4.736069
-------------+----------------------------------------------------------------
     sigma_u |  1.0268811
     sigma_e |  .15159041



Least-Squares Dummy-Variables Regression

The within estimator of $\beta$ can be shown to equal the estimator obtained from a direct OLS estimation of $\mu_1, \ldots, \mu_N$ and $\beta$ in the individual effects model $y_{it} = X_{it}'\beta + \mu_i + v_{it}$, using the command areg:

. areg lwage exp expsq wks, absorb(id) vce(cluster id)

Linear regression, absorbing indicators         Number of obs     =      4,165
                                                F(3, 594)         =     908.44
                                                Prob > F          =     0.0000
                                                R-squared         =     0.9068
                                                Adj R-squared     =     0.8912
                                                Root MSE          =     0.1522

-------------+----------------------------------------------------------------
         exp |   .1137879   .0043514    26.15   0.000     .1052418    .1223339
       expsq |  -.0004244   .0000888    -4.78   0.000    -.0005988     -.00025
         wks |   .0008359   .0009393     0.89   0.374    -.0010089    .0026806
       _cons |   4.596396   .0648993    70.82   0.000     4.468936    4.723856
-------------+----------------------------------------------------------------
          id |   absorbed                                     (595 categories)

The coefficient estimates are the same as those from xtreg, fe. The cluster-robust standard errors differ from those of xtreg, fe and are invalid here, as areg is designed for long panels.


First-Difference Estimator

Consistent estimation in the one-way FE model requires elimination of $\mu_1, \ldots, \mu_N$, which is achieved by the within transformation to give the within estimator. An orthogonal transformation method was introduced in Chapter 2. Another way to do so is through the first difference:

$y_{it} - y_{i,t-1} = (X_{it} - X_{i,t-1})'\beta + (v_{it} - v_{i,t-1}),$

where the time-invariant $\mu_i$ are eliminated through differencing. An OLS estimation of this model yields consistent estimates of $\beta$.

• The FD estimator is not provided as an option to xtreg. Instead, it can be computed using regress and the Stata time-series operator D. to compute the first differences.

• Similar to the within estimator, time dummies (fixed time effects) can be added to the model.

• The robust standard errors can be calculated using vce(cluster id), valid when observations are independent over i and 𝑁𝑁 → ∞.
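The way both transformations remove the individual effect can be verified numerically. The Python sketch below (not Stata) builds one individual's series with a known $\mu_i$ and checks that what remains after demeaning or differencing involves only the transformed idiosyncratic errors. All numbers and function names are made up for illustration.

```python
# Sketch: the within and first-difference transformations both remove a
# time-invariant individual effect mu_i. Toy numbers are made up.
beta = 2.0
mu_i = 5.0                        # individual effect (unobserved in practice)
x_i = [1.0, 2.0, 4.0, 7.0]        # regressor for one individual, T = 4
v_i = [0.1, -0.2, 0.05, 0.0]      # idiosyncratic errors
y_i = [beta * x + mu_i + v for x, v in zip(x_i, v_i)]

def demean(series):
    """Within transformation: subtract the individual's time average."""
    mean = sum(series) / len(series)
    return [value - mean for value in series]

def first_difference(series):
    """First difference: value at t minus value at t-1 (loses one period)."""
    return [curr - prev for prev, curr in zip(series, series[1:])]

# After either transformation, y* - beta*x* equals the transformed v only,
# so the gaps below are zero up to floating-point rounding.
within_gap = [y - beta * x - v for y, x, v in
              zip(demean(y_i), demean(x_i), demean(v_i))]
fd_gap = [dy - beta * dx - dv for dy, dx, dv in
          zip(first_difference(y_i), first_difference(x_i),
              first_difference(v_i))]
```

The differenced series has T − 1 observations per individual, which matches the drop from 4,165 to 3,570 observations in a first-difference regression on a 595 × 7 panel.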


. regress D.(lwage exp expsq wks ed), vce(cluster id) noconstant
note: D.ed omitted because of collinearity

Linear regression                               Number of obs     =      3,570
                                                F(3, 594)         =    1035.19
                                                Prob > F          =     0.0000
                                                R-squared         =     0.2209
                                                Root MSE          =     .18156

             |               Robust
     D.lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |
         D1. |   .1170654   .0040974    28.57   0.000     .1090182    .1251126
             |
       expsq |
         D1. |  -.0005321   .0000808    -6.58   0.000    -.0006908   -.0003734
             |
         wks |
         D1. |  -.0002683   .0011783    -0.23   0.820    -.0025824    .0020459
             |
          ed |
         D1. |          0  (omitted)
------------------------------------------------------------------------------


One-Way Random Effects Estimator with CRSE

Recall the one-way random effects model given in Ch. 2:

$y_{it} = \alpha + X_{it}'\beta + \mu_i + v_{it}$, i = 1, …, N and t = 1, …, T.

The default of the (xtreg, re) command returns the RE estimator of this model under $\mu_i \sim IID(0, \sigma_\mu^2)$ and $v_{it} \sim IID(0, \sigma_v^2)$, independent of each other, with $X_{it}$ independent of $\mu_i$ and $v_{it}$ for all i and t.

• For the disturbances $u_{it} = \mu_i + v_{it}$, it is easy to see that

$\rho_{ts} = Cor(u_{it}, u_{is}) = \sigma_\mu^2 / (\sigma_\mu^2 + \sigma_v^2) = \rho$, for all $s \neq t$.

• The RE model has equicorrelated/exchangeable errors, which is realized by the Stata command xtreg with option re.

• The options re, mle, and pa corr(exchangeable) give asymptotically equivalent estimators of $\beta$, but different estimators of $\sigma_\mu^2$ and $\sigma_v^2$.

• The robust standard errors can be calculated using vce(cluster id), valid when observations are independent over i and $N \to \infty$.


. xtreg lwage exp expsq wks ed, re vce(cluster id) theta

Random-effects GLS regression                   Number of obs     =      4,165
Group variable: id                              Number of groups  =        595

R-sq:                                           Obs per group:
     within  = 0.6340                                         min =          7
     between = 0.1716                                         avg =        7.0
     overall = 0.1830                                         max =          7

                                                Wald chi2(4)      =    1598.50
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
theta          = .82280511

-------------+----------------------------------------------------------------
         exp |   .0888609   .0039992    22.22   0.000     .0810227    .0966992
       expsq |  -.0007726   .0000896    -8.62   0.000    -.0009481    -.000597
         wks |   .0009658   .0009259     1.04   0.297     -.000849    .0027806
          ed |   .1117099   .0083954    13.31   0.000     .0952552    .1281647
       _cons |   3.829366   .1333931    28.71   0.000     3.567921    4.090812
-------------+----------------------------------------------------------------
     sigma_u |  .31951859
     sigma_e |  .15220316



. xtreg lwage exp expsq wks ed, mle

Random-effects ML regression                    Number of obs     =      4,165
Group variable: id                              Number of groups  =        595

Random effects u_i ~ Gaussian                   Obs per group:
                                                              min =          7
                                                              avg =        7.0
                                                              max =          7

                                                LR chi2(4)        =    2828.12
Log likelihood  =  293.69563                    Prob > chi2       =     0.0000

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |   .1079955   .0024806    43.54   0.000     .1031335    .1128574
       expsq |  -.0005202   .0000546    -9.53   0.000    -.0006272   -.0004132
         wks |   .0008365   .0006042     1.38   0.166    -.0003477    .0020208
          ed |   .1378558   .0125933    10.95   0.000     .1131735    .1625382
       _cons |   2.989859   .1720638    17.38   0.000      2.65262    3.327097
-------------+----------------------------------------------------------------
    /sigma_u |   .8509013   .0278622                      .7980078    .9073006
    /sigma_e |   .1536109   .0018574                      .1500132    .1572949
         rho |   .9684385    .002199                      .9638788    .9725117
------------------------------------------------------------------------------
LR test of sigma_u=0: chibar2(01) = 4576.13            Prob >= chibar2 = 0.000


. xtreg lwage exp expsq wks ed, pa corr(exchangeable)

GEE population-averaged model                   Number of obs     =      4,165
Group variable:                         id      Number of groups  =        595
Link:                             identity      Obs per group:
Family:                           Gaussian                    min =          7
Correlation:                  exchangeable                    avg =        7.0
                                                              max =          7
                                                Wald chi2(4)      =    6160.57

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |   .1079955   .0024527    44.03   0.000     .1031883    .1128026
       expsq |  -.0005202   .0000543    -9.59   0.000    -.0006266   -.0004139
         wks |   .0008365   .0006042     1.38   0.166    -.0003477    .0020208
          ed |   .1378558   .0125814    10.96   0.000     .1131968    .1625149
       _cons |    2.98986   .1711799    17.47   0.000     2.654353    3.325366
------------------------------------------------------------------------------


Between Estimator with CRSE

The between estimator is the OLS estimator of the between model:

$\bar y_{i\cdot} = \bar X_{i\cdot}'\beta + (\mu_i + \bar v_{i\cdot}).$

Consistency of the OLS estimator $\hat\beta$ requires that the ‘disturbance’ term $(\mu_i + \bar v_{i\cdot})$ be uncorrelated with $X_{it}$. This is the case if $\mu_i$ is a random effect but not if $\mu_i$ is a fixed effect.

• The between estimator is obtained by specifying the be option of the xtreg command. This is essentially a cross-section regression.

• Therefore, cross-sectional heteroskedasticity is the issue of concern. There is no explicit option for heteroskedasticity-robust standard errors, except the vce(bootstrap) option.

• The between estimator is based on averages over t, i.e., on the between-i variation only. Hence it is less efficient than other estimators such as RE and MLE.



. xtreg lwage exp expsq wks ed, be

Between regression (regression on group means)  Number of obs     =      4,165
Group variable: id                              Number of groups  =        595

                                                F(4,590)          =      71.48
sd(u_i + avg(e_i.))=  .324656                   Prob > F          =     0.0000

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |    .038153   .0056967     6.70   0.000     .0269647    .0493412
       expsq |  -.0006313   .0001257    -5.02   0.000    -.0008781   -.0003844
         wks |   .0130903   .0040659     3.22   0.001     .0051048    .0210757
          ed |   .0737838   .0048985    15.06   0.000     .0641632    .0834044
       _cons |   4.683039   .2100989    22.29   0.000     4.270407    5.095672
------------------------------------------------------------------------------



. xtreg lwage exp expsq wks ed, be vce(bootstrap)
(running xtreg on estimation sample)

Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50

Between regression (regression on group means)  Number of obs     =      4,165
Group variable: id                              Number of groups  =        595

…

                                    (Replications based on 595 clusters in id)
------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |    .038153   .0056931     6.70   0.000     .0269946    .0493113
       expsq |  -.0006313   .0001292    -4.89   0.000    -.0008845   -.0003781
         wks |   .0130903   .0035953     3.64   0.000     .0060437    .0201369
          ed |   .0737838    .005292    13.94   0.000     .0634116     .084156
       _cons |   4.683039   .2019226    23.19   0.000     4.287278      5.0788
------------------------------------------------------------------------------


Comparison of Panel Estimators based on Short Panels

Recall from Chap 3 the three 𝑅𝑅2 measures reported in Stata:Within 𝑅𝑅2: 𝜌𝜌2{ 𝑦𝑦𝑖𝑖𝑖𝑖 − �𝑦𝑦𝑖𝑖� , 𝑋𝑋𝑖𝑖𝑖𝑖 − �𝑋𝑋𝑖𝑖� �̂�𝛽}Between 𝑅𝑅2: 𝜌𝜌2(�𝑦𝑦𝑖𝑖�, �𝑋𝑋𝑖𝑖��̂�𝛽)Overall 𝑅𝑅2: 𝜌𝜌2(𝑦𝑦𝑖𝑖𝑖𝑖 , 𝑋𝑋𝑖𝑖𝑖𝑖�̂�𝛽)

where 𝜌𝜌2(𝑥𝑥,𝑦𝑦) denotes the squared correlation between x and y, and �̂�𝛽is obtained from one of the xtreg options (be, fe, or re).

Also, Stata reports:
sigma_u: the standard deviation of the individual effects $\mu_i$
sigma_e: the standard deviation of the idiosyncratic error $v_{it}$
rho: the fraction of variance due to $\mu_i$, i.e., $\rho = \sigma_\mu^2 / (\sigma_\mu^2 + \sigma_v^2)$.

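In RE estimation (FGLS, the Stata default for xtreg, re), the theta option reports the estimate of

$\theta = 1 - \sqrt{\sigma_v^2 / (T\sigma_\mu^2 + \sigma_v^2)}$,

the weight that turns FGLS into OLS: the RE estimator is OLS applied to the quasi-demeaned data $y_{it} - \hat{\theta}\bar{y}_{i\cdot}$ and $X_{it} - \hat{\theta}\bar{X}_{i\cdot}$.

For pooled OLS estimation: $\hat{\theta} = 0$; for within estimation: $\hat{\theta} = 1$; for RE, $\hat{\theta} \to 1$ as T or $\sigma_\mu^2$ gets large.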
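As a rough check of these formulas, the following Python sketch computes rho and theta from the RE values of sigma_u and sigma_e reported in the comparison table of this section (.3195 and .1522). The function names are hypothetical, and T = 7 is an assumption inferred from 4,165 observations on 595 individuals.

```python
from math import sqrt


def rho(sigma_u, sigma_e):
    """Fraction of the error variance due to the individual effects."""
    return sigma_u ** 2 / (sigma_u ** 2 + sigma_e ** 2)


def theta(sigma_u, sigma_e, T):
    """Quasi-demeaning weight used by the RE (FGLS) estimator."""
    return 1.0 - sqrt(sigma_e ** 2 / (T * sigma_u ** 2 + sigma_e ** 2))


# RE estimates from the comparison table; T = 4165 / 595 = 7 (assumed balanced).
rho_re = rho(0.3195, 0.1522)
theta_re = theta(0.3195, 0.1522, 7)
print(round(rho_re, 4), round(theta_re, 4))

# Limiting cases: no individual effects gives pooled OLS (theta = 0),
# and a longer panel pushes theta towards the within estimator (theta = 1).
print(theta(0.0, 0.1522, 7), theta(0.3195, 0.1522, 700))
```

The computed rho agrees with the reported value of 0.8151 up to the rounding of the inputs.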


We compare some of the panel estimators and the associated standard errors, variance-component estimates, and R-squared measures.

Note: pooled OLS is the same as the xtreg command with the pa and corr(independent) options.

The Stata commands are:

. * Compare OLS, BE, FE, RE estimators, and methods to compare standard errors
. global xlist exp expsq wks ed
. quietly regress lwage $xlist, vce(cluster id)
. estimates store OLS_rob
. quietly xtreg lwage $xlist, be
. estimates store BE
. quietly xtreg lwage $xlist, fe
. estimates store FE
. quietly xtreg lwage $xlist, fe vce(robust)
. estimates store FE_rob
. quietly xtreg lwage $xlist, re
. estimates store RE
. quietly xtreg lwage $xlist, re vce(robust)
. estimates store RE_rob
. estimates table OLS_rob BE FE FE_rob RE RE_rob,
> b se stats(N r2 r2_o r2_b r2_w sigma_u sigma_e rho) b(%7.4f)


----------------------------------------------------------------------------
Variable |    OLS_rob         BE         FE     FE_rob         RE     RE_rob
---------+------------------------------------------------------------------
     exp |     0.0447     0.0382     0.1138     0.1138     0.0889     0.0889
         |     0.0054     0.0057     0.0025     0.0040     0.0028     0.0040
   expsq |    -0.0007    -0.0006    -0.0004    -0.0004    -0.0008    -0.0008
         |     0.0001     0.0001     0.0001     0.0001     0.0001     0.0001
     wks |     0.0058     0.0131     0.0008     0.0008     0.0010     0.0010
         |     0.0019     0.0041     0.0006     0.0009     0.0007     0.0009
      ed |     0.0760     0.0738  (omitted)  (omitted)     0.1117     0.1117
         |     0.0052     0.0049                           0.0061     0.0084
   _cons |     4.9080     4.6830     4.5964     4.5964     3.8294     3.8294
         |     0.1400     0.2101     0.0389     0.0601     0.0936     0.1334
---------+------------------------------------------------------------------
       N |       4165       4165       4165       4165       4165       4165
      r2 |     0.2836     0.3264     0.6566     0.6566
    r2_o |                0.2723     0.0476     0.0476     0.1830     0.1830
    r2_b |                0.3264     0.0276     0.0276     0.1716     0.1716
    r2_w |                0.1357     0.6566     0.6566     0.6340     0.6340
 sigma_u |                           1.0362     1.0362     0.3195     0.3195
 sigma_e |                           0.1522     0.1522     0.1522     0.1522
     rho |                           0.9789     0.9789     0.8151     0.8151
----------------------------------------------------------------------------

5.3. Robust Estimation for Long Panels

The methods considered up to now have focused on short panels. Now we consider long panels with many time periods for few individuals (N is small and T is large).

• The individual fixed effects, if desired, can be easily handled by including dummy variables for each individual as regressors.

• With long panels ($T \to \infty$), there is an issue of stationarity. Here we consider only methods for stationary errors; the cases of unit roots and cointegration are mentioned only briefly.

• When T is large, one cannot rely on cluster-robust standard errors (as in the short-panel case). Instead, it is necessary to specify a model for serial correlation in the error.

• Typical Stata commands for analyzing long panels include xtregar, xtpcse, xtgls, and xtscc, with their respective options.

• We will use the well-known cigarette demand data for illustrations.


Cigarette Demand Data

Recall the cigarette demand data introduced in Chap. 1: a panel of 46 states in the United States over 30 years (1963-1992), given on the Wiley website for Baltagi (2005): https://www.wiley.com/legacy/wileychi/baltagi3e/.

Variables (columns) in the data file Cigar.txt are:
(1) State = State abbreviation.
(2) Year = Year, 1963 to 1992.
(3) Price = Price per pack of cigarettes.
(4) Pop = Population.
(5) Pop16 = Population above the age of 16.
(6) CPI = Consumer price index (1983 = 100).
(7) NDI = Per capita disposable income.
(8) C = Cigarette sales in packs per capita.
(9) PIMIN = Minimum price in adjoining states per pack of cigarettes.

• Several time dummies corresponding to the major policy interventions in 1965, 1968 and 1971 can be added to the model.

• To reflect the long-panel setting (small N, large T), we use only the first 10 states.

Define:
LnC = Ln(C)
LnP = Ln(Price)
LnNDI = Ln(NDI)
LnPmin = Ln(PIMIN)


Serial Correlation & Heteroskedasticity in Long Panels

Consider the one-way effects model:

$y_{it} = \alpha + X_{it}'\beta + \mu_i + v_{it}$, i = 1, …, N, t = 1, …, T.

As N is now small, the individual effects can be absorbed into $X_{it}$ in the form of dummies, so that the model reduces to:

$y_{it} = \mathbf{X}_{it}'\boldsymbol{\beta} + v_{it}$, i = 1, …, N, t = 1, …, T, (5.5)

where the regressors $\mathbf{X}_{it}$ include an intercept, and may also include individual dummies, time, and possibly time squared, giving a model that looks like a regular multiple linear regression model.

The focal point for a long panel is the serial correlation of $v_{it}$ over t. Because T is large, a model for this correlation has to be specified.

As N is small, one can be more flexible about the cross-sectional relations: heteroskedasticity and cross-sectional correlation.


A simple way to model serial correlation is to allow for first-order autoregressive, i.e., AR(1), disturbances in (5.5):

$v_{it} = \rho_i v_{i,t-1} + \epsilon_{it}$, i = 1, …, N, t = 1, …, T,

where the autoregressive parameter may vary with i, with $|\rho_i| < 1$. The remainder errors $\epsilon_{it}$ are assumed to be normal with mean zero and a general variance-covariance (VC) matrix that allows for possible heteroskedasticity and cross-sectional correlation:

$\mathrm{E}(\epsilon\epsilon' \mid \mathbf{X}) = \Sigma \otimes I_T$, where $\epsilon' = (\epsilon_{11}, \dots, \epsilon_{1T}, \dots, \epsilon_{N1}, \dots, \epsilon_{NT})$,

and $\Sigma$ is $N \times N$ with elements $\sigma_{ij}^2$.

Two special cases are of interest:
• $\Sigma$ is diagonal (i.e., $\sigma_{ij}^2 = 0$ for $i \neq j$): heteroskedasticity only.
• $\Sigma$ is diagonal and, further, all of the $\rho_i$ are equal to $\rho$.

Under exogeneity of $\mathbf{X}_{it}$ in (5.5), the OLS estimator is unbiased and consistent.
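To make the Kronecker structure of the remainder-error covariance concrete, here is a minimal Python sketch. The helper names (`kron`, `identity`, `cov`) and the 2 × 2 matrix `Sigma` are hypothetical, chosen only for illustration; the ordering of the stacked error vector (unit by unit, with t running fastest) follows the definition of the stacked vector in the text.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    return [
        [a_ij * b_kl for a_ij in a_row for b_kl in b_row]
        for a_row in a
        for b_row in b
    ]


def identity(n):
    """n x n identity matrix."""
    return [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]


# Hypothetical contemporaneous covariance matrix for N = 2 units.
Sigma = [[1.0, 0.3],
         [0.3, 2.0]]
T = 3
V = kron(Sigma, identity(T))  # (N*T) x (N*T) covariance of the stacked errors


def cov(i, t, j, s):
    """Covariance between eps_{it} and eps_{js}, using 0-based indices."""
    return V[i * T + t][j * T + s]


print(cov(0, 1, 1, 1))  # same period, different units: entry (0, 1) of Sigma
print(cov(0, 1, 1, 2))  # different periods: zero
print(cov(1, 0, 1, 0))  # variance of unit 1: entry (1, 1) of Sigma
```

The sketch shows that the remainder errors are correlated across units only within the same period; all dependence over time enters through the AR(1) recursion.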



. xtgls LnC LnP LnNDI LnPmin Year, panels(correlated) corr(psar1)

Cross-sectional time-series FGLS regression

Coefficients:  generalized least squares
Panels:        heteroskedastic with cross-sectional correlation
Correlation:   panel-specific AR(1)

Estimated covariances      =        55          Number of obs     =        300
Estimated autocorrelations =        10          Number of groups  =         10
Estimated coefficients     =         5          Time periods      =         30
                                                Wald chi2(4)      =     415.15
                                                Prob > chi2       =     0.0000

------------------------------------------------------------------------------
         LnC |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         LnP |  -.3582815   .0218597   -16.39   0.000    -.4011258   -.3154373
       LnNDI |   .5221231   .0573434     9.11   0.000     .4097321     .634514
      LnPmin |   -.019819   .0289541    -0.68   0.494    -.0765681      .03693
        Year |  -.0273553   .0052343    -5.23   0.000    -.0376144   -.0170961
       _cons |   3.847968   .2192518    17.55   0.000     3.418242    4.277693
------------------------------------------------------------------------------



The xtset command must be executed before running xtgls:

. xtset State Year
       panel variable:  State (strongly balanced)
        time variable:  Year, 63 to 92
                delta:  1 unit

• All regressors have the expected effects.
• The estimated price elasticity of demand for cigarettes is −.3583.
• The income elasticity is estimated to be .5221.
• Demand declines by 2.7% per year (the coefficient of Year is a semielasticity because the dependent variable is in logs).
• The minimum price in the adjoining states does not have a significant effect on the demand in the current state.

There are 10 states, so there are 10×11/2 = 55 unique entries in the 10×10 contemporaneous error covariance matrix Σ, and 10 autocorrelation parameters $\rho_i$ are estimated!
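The parameter counts and the semielasticity reading can be verified with a short Python sketch. The function name `n_covariances` is hypothetical; the coefficient −.0273553 is the Year estimate from the xtgls output, and the exact percentage change is computed as exp(b) − 1.

```python
from math import exp


def n_covariances(n_units):
    """Number of unique entries in a symmetric n_units x n_units matrix."""
    return n_units * (n_units + 1) // 2


n_states = 10
n_cov = n_covariances(n_states)  # unique entries of Sigma
n_rho = n_states                 # one AR(1) parameter per state under psar1

# Exact percentage change in demand per year implied by the Year coefficient.
pct_change = 100.0 * (exp(-0.0273553) - 1.0)

print(n_cov, n_rho, round(pct_change, 2))
```

With 300 observations, 55 covariances and 10 autocorrelations are a substantial number of nuisance parameters, and the count grows quadratically in N; this is why the approach suits only panels with small N relative to T.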



The xtgls command performs pooled OLS or FGLS estimation when the data are from a long panel. It allows the errors $v_{it}$ in the model to be correlated over i, allows the use of AR(1) models for $v_{it}$ over t, and allows $v_{it}$ to be heteroskedastic over i.

An alternative Stata command, xtpcse, yields (long) panel-corrected standard errors (PCSE) for the pooled OLS estimator, as well as for the pooled least-squares estimator with an AR(1) model for $v_{it}$.

A third choice, xtscc, generalizes xtpcse by allowing serial correlation of a general form up to m lags, rather than restricting the errors to be AR(1). It gives Driscoll and Kraay (1998) standard errors for coefficients estimated by pooled OLS/WLS or fixed-effects (within) regression.

Note: xtscc is not installed automatically with Stata. It can be found and installed by following these steps:
• Go to help −> search; type xtscc.
• In the pop-up window, click the link xtscc from http://fmwww.bc.edu/RePEc/bocode/x.
• Then click on “click here to install”.



Options for xtgls:
panels( ): specifies the error structure across individuals:
• iid: $v_{it}$ are iid;
• heteroskedastic: $v_{it}$ are independent over i, with variance $\sigma_i^2$ varying over i;
• correlated: additionally allows correlation over individuals, with independence over time for a given individual.
corr( ): specifies the serial correlation of the errors for each individual:
• independent: no serial correlation;
• ar1: constant $\rho$; and
• psar1: different $\rho_i$.

Options for xtpcse:
correlation( ): specifies the serial correlation of the errors, with choices:
• independent: no serial correlation (pooled OLS);
• ar1: constant $\rho$; and
• psar1: different $\rho_i$.
By default, panel-corrected standard errors (PCSE) are reported, which allow heteroskedasticity and contemporaneous correlation over i. Two further options restrict this:
• hetonly: $v_{it}$ are independent but heteroskedastic over i;
• independent: $v_{it}$ are iid.



Options for xtscc:
• lag(#): set the maximum lag order of autocorrelation; the default is m(T) = floor[4(T/100)^(2/9)];
• fe: perform fixed-effects (within) regression;
• re: perform GLS random-effects regression;
• pooled: perform pooled OLS/WLS regression; default;
• noconstant: suppress the regression constant in pooled OLS/WLS regressions;
• ase: return (asymptotic) Driscoll-Kraay SE without small-sample adjustment.
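The default lag rule quoted above is easy to evaluate by hand. The following Python sketch (the function name `default_lag` is hypothetical) simply applies the formula m(T) = floor[4(T/100)^(2/9)] as stated in the option list.

```python
from math import floor


def default_lag(T):
    """Default maximum lag m(T) = floor(4 * (T / 100) ** (2 / 9))."""
    return floor(4 * (T / 100) ** (2 / 9))


# T = 30 matches the cigarette panel; the other values are for comparison.
for T in (7, 30, 100):
    print(T, default_lag(T))
```

For the 30-year cigarette panel the rule gives 3 lags, so the lag(4) used in this chapter's examples is slightly more generous than the default.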



We now use xtpcse, xtgls, and the user-written xtscc (which needs separate installation) to obtain the following pooled estimators and the associated standard errors:
1) pooled OLS with iid errors;
2) pooled OLS with standard errors assuming correlation over states;
3) pooled OLS with standard errors assuming general serial correlation in the error (up to 4 lags) and correlation over states;
4) pooled OLS that assumes an AR(1) error, with standard errors that additionally permit correlation over states;
5) pooled FGLS with standard errors assuming an AR(1) error; and
6) pooled FGLS assuming an AR(1) error and correlation across states.
In all AR(1) cases, a common autoregressive parameter is imposed: $\rho_i = \rho$.

. * Comparison of various pooled OLS and GLS estimators
. quietly xtpcse LnC LnP LnNDI LnPmin Year, corr(ind) independent nmk
. estimates store OLS_iid
. quietly xtpcse LnC LnP LnNDI LnPmin Year, corr(ind)
. estimates store OLS_cor
. quietly xtscc LnC LnP LnNDI LnPmin Year, lag(4)
. estimates store OLS_DK
. quietly xtpcse LnC LnP LnNDI LnPmin Year, corr(ar1)
. estimates store AR1_cor
. quietly xtgls LnC LnP LnNDI LnPmin Year, corr(ar1) panels(iid)
. estimates store FGLSAR1
. quietly xtgls LnC LnP LnNDI LnPmin Year, corr(ar1) panels(correlated)
. estimates store FGLSCAR



. estimates table OLS_iid OLS_cor OLS_DK AR1_cor FGLSAR1 FGLSCAR, b(%7.3f) se

----------------------------------------------------------------------
Variable |   OLS_iid   OLS_cor    OLS_DK   AR1_cor   FGLSAR1   FGLSCAR
---------+------------------------------------------------------------
     LnP |    -0.786    -0.786    -0.786    -0.311    -0.308    -0.360
         |     0.130     0.172     0.262     0.050     0.049     0.026
   LnNDI |     0.412     0.412     0.412     0.458     0.460     0.490
         |     0.047     0.075     0.154     0.118     0.088     0.077
  LnPmin |    -0.049    -0.049    -0.049    -0.003    -0.001    -0.009
         |     0.123     0.157     0.243     0.067     0.060     0.034
    Year |     0.015     0.015     0.015    -0.024    -0.024    -0.023
         |     0.007     0.006     0.012     0.011     0.008     0.007
   _cons |     3.469     3.469     3.469     3.930     3.931     3.778
         |     0.174     0.245     0.462     0.418     0.310     0.302
----------------------------------------------------------------------
                                                          legend: b/se

For pooled OLS with iid errors, the nmk option normalizes the VCE by N−k rather than N (where N is the number of observations and k the number of estimated parameters), so that the output is exactly the same as that from regress with default standard errors. The same could be obtained by using xtgls with the corr(ind) panels(iid) nmk options.



A final illustration is xtscc with the fe option. (The default is pooled.)

. xtscc LnC LnP LnNDI LnPmin Year, fe lag(4)

Regression with Driscoll-Kraay standard errors   Number of obs     =       300
Method: Fixed-effects regression                 Number of groups  =        10
Group variable (i): State                        F(  4,    29)     =    117.61
maximum lag: 4                                   Prob > F          =    0.0000
                                                 within R-squared  =    0.6392

------------------------------------------------------------------------------
             |             Drisc/Kraay
         LnC |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         LnP |  -1.238273   .1367816    -9.05   0.000    -1.518023   -.9585236
       LnNDI |   .7553486   .1595965     4.73   0.000     .4289371     1.08176
      LnPmin |   .4683234   .1679333     2.79   0.009     .1248613    .8117856
        Year |  -.0152654   .0126641    -1.21   0.238    -.0411663    .0106355
       _cons |   2.562785    .463528     5.53   0.000     1.614764    3.510806
------------------------------------------------------------------------------

Compared with the results from xtscc LnC LnP LnNDI LnPmin Year, lag(4), we see that LnPmin becomes significant.
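The significance of LnPmin can be traced through the arithmetic of the output. The Python sketch below reproduces the t statistic and the 95% confidence interval for LnPmin from its coefficient and Driscoll-Kraay standard error. The function names are hypothetical, and it is an assumption here that the interval uses a t distribution with the 29 degrees of freedom shown in the F( 4, 29) line; the critical value 2.04523 is the standard table value for t(29).

```python
T_CRIT_29 = 2.04523  # 97.5th percentile of t(29), taken from a standard table


def t_stat(coef, se):
    """t statistic for the null hypothesis that the coefficient is zero."""
    return coef / se


def conf_int(coef, se, crit):
    """Two-sided confidence interval coef +/- crit * se."""
    return coef - crit * se, coef + crit * se


# LnPmin row of the xtscc, fe lag(4) output.
coef, se = 0.4683234, 0.1679333
t = t_stat(coef, se)
lo, hi = conf_int(coef, se, T_CRIT_29)
print(round(t, 2), round(lo, 4), round(hi, 4))
```

The interval excludes zero, which is what "significant at the 5% level" means for this coefficient.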

Unit Roots and Cointegration

The methods considered for long panels depend on the stationarity of the time series, i.e., $|\rho_i| < 1$, i = 1, …, N.

The literature on panel methods for unit roots and cointegration is large, and it remains an active area of research.

In standard applications of long-panel methods, it is of interest to test for the existence of unit roots and cointegration.

Panel unit-root tests:
The Stata command xtunitroot (https://www.stata.com/features/overview/panel-data-unit-root-tests/) provides tests appropriate for all types of panel data: short, long, or large panels. A detailed treatment of these tests is beyond the scope of this course.

Panel cointegration tests:
The Stata command xtcointtest (https://www.stata.com/new-in-stata/panel-data-cointegration-tests/) implements a variety of tests for panel data with large N and large T. This appears to be a feature added in Stata 15. Again, a detailed treatment of this topic is beyond the scope of this course.

5.4. Robust Estimation for Large Panels

Consider the one-way effects model:

$y_{it} = \alpha + X_{it}'\beta + \mu_i + v_{it}$, i = 1, …, N, t = 1, …, T,

where both N and T are ‘large’.

The xtreg command works for large panels under iid assumptions on $v_{it}$.

An alternative and more flexible procedure, xtregar, allows an AR(1) error with a common autoregressive parameter: $v_{it} = \rho v_{i,t-1} + \epsilon_{it}$, $|\rho| < 1$.
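To see what the AR(1) specification adds over the iid case, the implied second moments can be written out. This is a sketch under stated assumptions: a common autoregressive parameter $\rho$ with $|\rho| < 1$, a stationary process, and serially uncorrelated $\epsilon_{it}$ with constant variance, denoted here by $\sigma_\epsilon^2$ (a symbol introduced only for this illustration).

```latex
\begin{align*}
\mathrm{Var}(v_{it}) &= \rho^2\,\mathrm{Var}(v_{i,t-1}) + \sigma_\epsilon^2
  \;\Longrightarrow\; \mathrm{Var}(v_{it}) = \frac{\sigma_\epsilon^2}{1-\rho^2}, \\
\mathrm{Cov}(v_{it}, v_{i,t-s}) &= \rho^{s}\,\frac{\sigma_\epsilon^2}{1-\rho^2},
  \qquad s = 0, 1, 2, \dots, \\
\mathrm{Corr}(v_{it}, v_{i,t-s}) &= \rho^{s}.
\end{align*}
```

The correlation between errors s periods apart decays geometrically in s, in contrast to the constant (equicorrelated) pattern induced by the individual effects $\mu_i$ alone; setting $\rho = 0$ recovers the iid case.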

. * Comparison of various RE and FE estimators with full cigarette demand data

. quietly xtscc LnC LnP LnNDI LnPmin, lag(4)

. estimates store OLS_DK

. quietly xtreg LnC LnP LnNDI LnPmin, fe

. estimates store FE_REG

. quietly xtreg LnC LnP LnNDI LnPmin, re

. estimates store RE_REG

. quietly xtregar LnC LnP LnNDI LnPmin, fe

. estimates store FE_REGAR

. quietly xtregar LnC LnP LnNDI LnPmin, re

. estimates store RE_REGAR

. quietly xtscc LnC LnP LnNDI LnPmin, fe lag(4)

. estimates store FE_DK


. estimates table OLS_DK FE_REG RE_REG FE_REGAR RE_REGAR FE_DK, b(%7.3f) se

----------------------------------------------------------------------
Variable |    OLS_DK    FE_REG    RE_REG   FE_RE~R   RE_RE~R     FE_DK
---------+------------------------------------------------------------
     LnP |    -1.107    -0.886    -0.889    -0.388    -0.413    -0.886
         |     0.049     0.037     0.037     0.024     0.024     0.072
   LnNDI |     0.569     0.512     0.512     0.213     0.250     0.512
         |     0.048     0.014     0.014     0.026     0.019     0.031
  LnPmin |     0.358     0.207     0.210     0.017     0.016     0.207
         |     0.049     0.037     0.037     0.026     0.026     0.067
   _cons |     2.908     3.111     3.108     4.452     4.221     3.111
         |     0.208     0.060     0.064     0.017     0.099     0.137
----------------------------------------------------------------------
                                                          legend: b/se

Indeed, xtregar yields smaller standard errors than xtreg for LnP and LnPmin (though not for LnNDI), consistent with an efficiency gain from modelling the AR(1) error.

The last set of results, from “xtscc LnC LnP LnNDI LnPmin, fe lag(4)”, are the standard within estimates, but with standard errors that are robust to both spatial (cross-sectional) and temporal correlation of the error.

However, the standard errors produced by xtscc are much larger, … .
