
[email protected] http://www.mysmu.edu/faculty/zlyang/ Zhenlin Yang

Chapter 5: Panel Estimation Under Heteroskedasticity and Serial Correlation

This chapter introduces panel estimation methods that take into account the possible existence of heteroskedasticity and/or serial correlation. The discussion draws on Cameron & Trivedi, Microeconometrics Using Stata (2009). The main methods are:

Cluster-robust estimation for short panels
• Pooled OLS or FGLS estimators, or population-averaged estimators
• Fixed effects estimators: FE, within, LSDV, first-difference
• Random effects estimators: RE, BE

Robust estimation for long panels
• Heteroskedasticity and serial correlation
• Unit roots and cointegration

Robust estimation for large panels


ECON6002, Term II 2020-21 © Zhenlin Yang, SMU

5.1. Introduction

Consider the general panel data model studied so far:

$$y_{it} = \alpha + X_{it}'\beta + u_{it}, \qquad u_{it} = \mu_i + \lambda_t + v_{it}, \tag{5.1}$$

with cross-sections i = 1, …, N and time periods t = 1, …, T.

Heteroskedasticity means that the variance of $u_{it}$ (or of $v_{it}$ in FE models) changes over i, over t, or over both. Variation over i is especially common, because the cross-sectional units may differ in size.

Serial correlation means that the $u_{it}$ are correlated over time beyond the equicorrelation induced by the random effects. It arises because an unobserved shock in one period often affects the behavioral relationship for at least the next few periods.

Recall: a short panel has large N and small T, a long panel has small N and large T, and a large panel has both N and T large. The methods for handling heteroskedasticity and serial correlation differ with the type of panel on which the estimation is based.


Heteroskedasticity in OLS Regression

Consider the multiple linear regression model:

$$y_i = \alpha + X_i'\beta + u_i = \mathbf{X}_i'\boldsymbol{\beta} + u_i, \qquad i = 1, \dots, n,$$

or in matrix form $y = \mathbf{X}\boldsymbol{\beta} + u$, where dim($\beta$) = k. The ordinary least squares (OLS) estimator of $\boldsymbol{\beta}$ is

$$\hat{\boldsymbol{\beta}}_{OLS} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'y,$$

which minimizes the sum of squared errors,

$$\sum_{i=1}^{n} (y_i - \mathbf{X}_i'\boldsymbol{\beta})^2 = (y - \mathbf{X}\boldsymbol{\beta})'(y - \mathbf{X}\boldsymbol{\beta}).$$

The standard assumptions for OLS regression are

(i) $E(u_i \mid X_i) = 0$ (exogeneity of regressors),

(ii) $E(u_i^2 \mid X_i) = \sigma^2$ (conditional homoskedasticity),

(iii) $E(u_i u_j \mid X_i, X_j) = 0$, $i \neq j$ (conditional zero correlation).


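The condition for $\hat{\boldsymbol{\beta}}_{OLS}$ to be valid (unbiased, consistent) is (i). Under (i),

$$E(\hat{\boldsymbol{\beta}}_{OLS} \mid \mathbf{X}) = E\big[(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'y \mid \mathbf{X}\big] = \boldsymbol{\beta} + (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'E(u \mid \mathbf{X}) = \boldsymbol{\beta},$$

implying that $E(\hat{\boldsymbol{\beta}}_{OLS}) = \boldsymbol{\beta}$ (unbiased).

The conditions for $\hat{\boldsymbol{\beta}}_{OLS}$ to be efficient are (ii) and (iii), under which $Var(\hat{\boldsymbol{\beta}}_{OLS}) = \sigma^2(\mathbf{X}'\mathbf{X})^{-1}$ (efficient).

Heteroskedasticity-robust standard errors. If the homoskedasticity assumption (ii) is violated, i.e., $E(u_i^2 \mid X_i) = \sigma_i^2$ (heteroskedasticity), then the OLS estimator $\hat{\boldsymbol{\beta}}_{OLS}$ remains valid (unbiased, consistent), but $Var(\hat{\boldsymbol{\beta}}_{OLS}) \neq \sigma^2(\mathbf{X}'\mathbf{X})^{-1}$. Instead,

$$Var(\hat{\boldsymbol{\beta}}_{OLS}) = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\,\mathrm{diag}(\sigma_i^2)\,\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}.$$

A heteroskedasticity-robust estimator of $Var(\hat{\boldsymbol{\beta}}_{OLS})$ is

$$\hat{V}_{robust}(\hat{\boldsymbol{\beta}}_{OLS}) = (\mathbf{X}'\mathbf{X})^{-1}\Big(\frac{n}{n-k-1}\sum_{i=1}^{n}\hat{u}_i^2\,\mathbf{X}_i\mathbf{X}_i'\Big)(\mathbf{X}'\mathbf{X})^{-1},$$

where $\hat{u}_i$ are the OLS residuals, i.e., $\hat{u}_i = y_i - \mathbf{X}_i'\hat{\boldsymbol{\beta}}_{OLS}$.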
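As a minimal sketch of how the heteroskedasticity-robust variance estimator is requested in Stata, the following compares default and robust standard errors. The dataset (Stata's bundled auto data) and the variables price, weight and mpg are illustrative assumptions and are not part of this chapter's data.

```stata
* Illustrative only: auto.dta ships with Stata; it is not the lecture's dataset
sysuse auto, clear

* Default standard errors, valid under assumptions (ii) and (iii)
regress price weight mpg
estimates store ols_default

* Heteroskedasticity-robust (sandwich) standard errors
regress price weight mpg, vce(robust)
estimates store ols_robust

* Same coefficients, different standard errors
estimates table ols_default ols_robust, b(%9.4f) se(%9.4f)
```

With vce(robust), regress applies a degrees-of-freedom factor of n divided by n minus the number of estimated coefficients (intercept included), which corresponds to n/(n − k − 1) when k counts the slopes only.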


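GLS Regression

If $E(uu' \mid \mathbf{X}) = \sigma^2\Omega$, where $\Omega \neq I$ is a known correlation matrix (so that (ii) and/or (iii) is violated), the generalized least-squares (GLS) estimator is

$$\hat{\boldsymbol{\beta}}_{GLS} = (\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1}\mathbf{X}'\Omega^{-1}y,$$

which minimizes the sum of squares $(y - \mathbf{X}\boldsymbol{\beta})'\Omega^{-1}(y - \mathbf{X}\boldsymbol{\beta})$, and

$$Var(\hat{\boldsymbol{\beta}}_{GLS}) = \sigma^2(\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1}.$$

Both $\hat{\boldsymbol{\beta}}_{OLS}$ and $\hat{\boldsymbol{\beta}}_{GLS}$ are unbiased and consistent.

But $\hat{\boldsymbol{\beta}}_{GLS}$ is more efficient than $\hat{\boldsymbol{\beta}}_{OLS}$. When $\Omega \neq I$, $Var(\hat{\boldsymbol{\beta}}_{OLS}) = \sigma^2(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\Omega\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}$, and $Var(\hat{\boldsymbol{\beta}}_{GLS}) = \sigma^2(\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1}$ is "less than" this matrix in the sense that the difference of the two is positive semidefinite.

In the case where $\Omega$ is known up to a finite number of parameters $\gamma$, i.e., $\Omega = \Omega(\gamma)$, and a consistent estimator $\hat{\gamma}$ of $\gamma$ is available, a feasible GLS (FGLS) estimator of $\boldsymbol{\beta}$ and its variance estimator are:

$$\hat{\boldsymbol{\beta}}_{FGLS} = (\mathbf{X}'\hat{\Omega}^{-1}\mathbf{X})^{-1}\mathbf{X}'\hat{\Omega}^{-1}y, \quad \text{where } \hat{\Omega} = \Omega(\hat{\gamma});$$

$$\widehat{Var}(\hat{\boldsymbol{\beta}}_{FGLS}) = \hat{\sigma}^2(\mathbf{X}'\hat{\Omega}^{-1}\mathbf{X})^{-1}, \quad \text{where } \hat{\sigma}^2 \text{ is a consistent estimator of } \sigma^2.$$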
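A worked special case may help fix ideas. Suppose, purely as an illustrative assumption, that the errors are uncorrelated but heteroskedastic with known weights, so that $\Omega = \mathrm{diag}(\omega_1, \dots, \omega_n)$. The GLS formula then collapses to weighted least squares:

```latex
\hat{\boldsymbol{\beta}}_{GLS}
  = \Big(\sum_{i=1}^{n} \omega_i^{-1}\,\mathbf{X}_i\mathbf{X}_i'\Big)^{-1}
    \sum_{i=1}^{n} \omega_i^{-1}\,\mathbf{X}_i y_i ,
\qquad\text{i.e., OLS of } \frac{y_i}{\sqrt{\omega_i}} \text{ on } \frac{\mathbf{X}_i}{\sqrt{\omega_i}} .
```

The transformed errors $u_i/\sqrt{\omega_i}$ have constant variance $\sigma^2$, which is why the usual OLS variance formula applies to the transformed regression.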


5.2. Cluster-Robust Estimation for Short Panels

For short panels, as T is small, it is common to let the time effects $\lambda_t$ be fixed effects. Model (5.1) then reduces to the one-way model

$$y_{it} = \alpha + \mathbf{X}_{it}'\boldsymbol{\beta} + \mu_i + v_{it}, \qquad i = 1, \dots, N,\; t = 1, \dots, T, \tag{5.2}$$

if the regressors $\mathbf{X}_{it}$ include a set of time dummies (with one time dummy dropped to avoid the dummy variable trap).

Under this parameterization, $\mathbf{X}_{it}$ is $(K + T - 1) \times 1$, and the $\mu_i$ are subject to $\sum_{i=1}^{N}\mu_i = 0$. Writing (5.2) in vector form for each i, or the ith cluster,

$$y_i = \iota_T\alpha + \mathbf{X}_i\boldsymbol{\beta} + \iota_T\mu_i + v_i, \qquad i = 1, 2, \dots, N,$$

and applying the transformation $Q_T = I_T - \frac{1}{T}\iota_T\iota_T'$ gives

$$Q_T y_i = Q_T\mathbf{X}_i\boldsymbol{\beta} + Q_T v_i, \quad\text{or}\quad y_i^* = \mathbf{X}_i^*\boldsymbol{\beta} + v_i^*, \qquad i = 1, 2, \dots, N. \tag{5.3}$$

The within estimator $\hat{\boldsymbol{\beta}}$ is the OLS estimator of (5.3); its cluster-robust (CR) VC matrix is based on this transformed model.
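To see why the within transformation removes both $\alpha$ and $\mu_i$, it may help to write $Q_T$ out for the smallest case, T = 2 (an illustrative choice):

```latex
Q_2 = I_2 - \tfrac{1}{2}\,\iota_2\iota_2'
    = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix},
\qquad
Q_2\,\iota_2 = \begin{pmatrix} 0 \\ 0 \end{pmatrix},
\qquad
Q_2\, y_i = \begin{pmatrix} y_{i1} - \bar{y}_{i\cdot} \\ y_{i2} - \bar{y}_{i\cdot} \end{pmatrix},
\quad \bar{y}_{i\cdot} = \tfrac{1}{2}(y_{i1} + y_{i2}).
```

Since $Q_T\iota_T = 0$ for any T, every term multiplied by $\iota_T$ (here $\iota_T\alpha$ and $\iota_T\mu_i$) drops out, and each remaining variable is replaced by its deviation from the individual mean.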


The Cluster Robust Method

Assume in (5.3) that (i) the $v_i^*$ are independent over i, and (ii) $Var(v_i^*) = \Omega_i$, a general $T \times T$ variance-covariance matrix. A robust estimator of the variance-covariance (VC) matrix of the within estimator $\hat{\boldsymbol{\beta}}$ is:

$$\widehat{Var}(\hat{\boldsymbol{\beta}}) = (\mathbf{X}^{*\prime}\mathbf{X}^*)^{-1}\Big(\sum_{i=1}^{N}\mathbf{X}_i^{*\prime}\,\hat{v}_i^*\hat{v}_i^{*\prime}\,\mathbf{X}_i^*\Big)(\mathbf{X}^{*\prime}\mathbf{X}^*)^{-1},$$

where $\hat{v}_i^* = y_i^* - \mathbf{X}_i^*\hat{\boldsymbol{\beta}}$, and $\mathbf{X}^*$ is the $NT \times (K + T - 1)$ matrix that stacks the $\mathbf{X}_i^*$.

This is the result given in (3.12). It is valid only for short panels, i.e., the case of large N and small T.

It allows arbitrary correlation among the elements of $v_i^*$ for each i, but requires the independence of $v_i^*$ over i.

Most importantly, the result applies not only to Model (5.3), obtained from the one-way FE model after the within transformation, but to any model of that form.


Generalization. Consider a model of the form $y_i^* = \mathbf{X}_i^*\boldsymbol{\beta} + u_i^*$, i = 1, 2, ..., N. Assume $E(u_i^* \mid \mathbf{X}_i^*) = 0$, $Var(u_i^* \mid \mathbf{X}_i^*) = \Omega_i$, and $E(u_i^* u_j^{*\prime} \mid \mathbf{X}_i^*, \mathbf{X}_j^*) = 0$ for $i \neq j$. If dim($y_i^*$) is fixed, the OLS estimator $\hat{\boldsymbol{\beta}} = (\mathbf{X}^{*\prime}\mathbf{X}^*)^{-1}\mathbf{X}^{*\prime}y^*$ is valid, and a heteroskedasticity- and cluster-robust estimator of the VC matrix of $\hat{\boldsymbol{\beta}}$ is:

$$\widehat{Var}(\hat{\boldsymbol{\beta}}) = (\mathbf{X}^{*\prime}\mathbf{X}^*)^{-1}\Big(\sum_{i=1}^{N}\mathbf{X}_i^{*\prime}\,\hat{u}_i^*\hat{u}_i^{*\prime}\,\mathbf{X}_i^*\Big)(\mathbf{X}^{*\prime}\mathbf{X}^*)^{-1}, \tag{5.4}$$

where $\hat{u}_i^* = y_i^* - \mathbf{X}_i^*\hat{\boldsymbol{\beta}}$, $y^*$ is the stacked $y_i^*$, and $\mathbf{X}^*$ is the stacked $\mathbf{X}_i^*$.

It can be shown that the OLS estimator $\hat{\boldsymbol{\beta}}$ is robust against serial correlation and cross-sectional heteroskedasticity of unknown form. The VC matrix estimate $\widehat{Var}(\hat{\boldsymbol{\beta}})$ is likewise robust against unknown serial correlation and cross-sectional heteroskedasticity. Therefore, $\hat{\boldsymbol{\beta}}$ and $\widehat{Var}(\hat{\boldsymbol{\beta}})$ together provide a set of inference methods that are robust against unknown serial correlation and cross-sectional heteroskedasticity. Various applications of (5.4) are presented next.
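It can be useful to expand the middle term of the cluster-robust sandwich observation by observation. Writing $x_{it}'$ for the row of $\mathbf{X}_i^*$ at period t and $\hat{u}_{it}$ for the corresponding residual (notation introduced here only for this illustration):

```latex
\mathbf{X}_i^{*\prime}\,\hat{u}_i^*\hat{u}_i^{*\prime}\,\mathbf{X}_i^*
  = \sum_{t=1}^{T}\sum_{s=1}^{T} \hat{u}_{it}\,\hat{u}_{is}\; x_{it}\,x_{is}' ,
\qquad\text{and for } T = 1:\quad
\sum_{i=1}^{N} \hat{u}_{i}^{2}\; x_{i}\,x_{i}' .
```

The cross terms with $s \neq t$ are what pick up within-cluster (serial) correlation. With a single observation per cluster they vanish, and the estimator reduces to the heteroskedasticity-robust estimator of the OLS case, apart from the degrees-of-freedom factor.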


Applications of the Cluster Robust Method

Various applications of the result (5.4) are presented. The key Stata option for implementing (5.4) is vce(cluster id):

• Pooled OLS or population-averaged estimators;
• Pooled FGLS or population-averaged estimators;
• Within estimation;
• Within estimation, allowing time dummies;
• Least-squares dummy-variables regression;
• First-difference estimation, allowing time dummies;
• One-way individual RE estimation, allowing time dummies;
• Between estimation;
• Comparison of panel estimators based on short panels.


Pooled OLS or Population-Averaged Estimators

Pooled OLS estimators simply regress $y_{it}$ on $\mathbf{X}_{it}$, using both between (cross-section) and within (time-series) variation in the data, and assuming the disturbances $u_{it}$ are iid.

• The resulting OLS estimator $\hat{\boldsymbol{\beta}}$ of the coefficients $\boldsymbol{\beta}$ of $\mathbf{X}_{it}$ is consistent if $\mathbf{X}_{it}$ is uncorrelated with $u_{it}$, and inconsistent otherwise.

• Clearly, for RE models, the OLS estimator of $\boldsymbol{\beta}$ is consistent, whereas for FE models, the OLS estimator of $\boldsymbol{\beta}$ is inconsistent.

• Even in the case where $\hat{\boldsymbol{\beta}}$ is consistent, the VC matrix of $\hat{\boldsymbol{\beta}}$ obtained from an OLS regression may not be correct, as the $u_{it}$ may not be iid, leading to misleading inferences.


To further motivate the need for a cluster-robust estimator of the VC matrix of an OLS estimator, consider Model (5.2):

$$y_{it} = \alpha + \mathbf{X}_{it}'\boldsymbol{\beta} + \mu_i + v_{it}.$$

Consistency of OLS requires that the error term $u_{it} = \mu_i + v_{it}$ be uncorrelated with $\mathbf{X}_{it}$. So pooled OLS $\hat{\boldsymbol{\beta}}$ is consistent if the $\mu_i$ are RE but inconsistent if the $\mu_i$ are FE.

As $\mathbf{X}_{it}$ and $u_{it}$ are uncorrelated, the intercept parameter $\alpha$ and some other time-invariant regressors are allowed. Absorb these parameters into the $\mathbf{X}_{it}'\boldsymbol{\beta}$ term and write the model as

$$y_{it} = \mathbf{X}_{it}'\boldsymbol{\beta} + u_{it} \quad\text{or}\quad y_i = \mathbf{X}_i\boldsymbol{\beta} + u_i.$$

Because $u_{it} = \mu_i + v_{it}$ is correlated over t for a given i, $Var(\hat{\boldsymbol{\beta}})$ is not $\sigma^2(\mathbf{X}'\mathbf{X})^{-1}$, but

$$Var(\hat{\boldsymbol{\beta}}) = (\mathbf{X}'\mathbf{X})^{-1}\Big(\sum_{i=1}^{N}\mathbf{X}_i'\,\Omega_i\,\mathbf{X}_i\Big)(\mathbf{X}'\mathbf{X})^{-1}, \qquad \Omega_i = Var(u_i).$$

In the statistics literature, the pooled estimators are called population-averaged (PA) estimators.


We use the "Returns to Schooling Data" to demonstrate pooled OLS (or PA) with cluster-robust standard errors (CRSE).

. * Pooled OLS with cluster-robust standard errors
. regress lwage exp expsq wks ed, vce(cluster id)

Linear regression                               Number of obs     =      4,165
                                                F(4, 594)         =      72.58
                                                Prob > F          =     0.0000
                                                R-squared         =     0.2836
                                                Root MSE          =     .39082

                                   (Std. Err. adjusted for 595 clusters in id)
------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |    .044675   .0054385     8.21   0.000     .0339941     .055356
       expsq |  -.0007156   .0001285    -5.57   0.000    -.0009679   -.0004633
         wks |    .005827   .0019284     3.02   0.003     .0020396    .0096144
          ed |   .0760407   .0052122    14.59   0.000     .0658042    .0862772
       _cons |   4.907961   .1399887    35.06   0.000     4.633028    5.182894
------------------------------------------------------------------------------

The coefficient estimates are identical to those from xtreg, pa, given later. The standard errors are almost the same as well.


The default standard errors assume that the regression errors are iid:

. * Pooled OLS with incorrect default standard errors
. regress lwage exp expsq wks ed

      Source |       SS           df       MS      Number of obs   =     4,165
-------------+----------------------------------   F(4, 4160)      =    411.62
       Model |  251.491445         4  62.8728613   Prob > F        =    0.0000
    Residual |  635.413457     4,160  .152743619   R-squared       =    0.2836
-------------+----------------------------------   Adj R-squared   =    0.2829
       Total |  886.904902     4,164  .212993492   Root MSE        =    .39082

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |    .044675   .0023929    18.67   0.000     .0399838    .0493663
       expsq |  -.0007156   .0000528   -13.56   0.000    -.0008191   -.0006121
         wks |    .005827   .0011827     4.93   0.000     .0035084    .0081456
          ed |   .0760407   .0022266    34.15   0.000     .0716754     .080406
       _cons |   4.907961   .0673297    72.89   0.000     4.775959    5.039963
------------------------------------------------------------------------------

• Wages increase with experience until a peak at about 31 years [0.0447 / (2 × 0.00072 × 31) ≈ 1];
• Wages increase by 0.6% with each additional week worked;
• Wages increase by 7.6% with each additional year of education.
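The peak of the experience profile follows from setting the derivative of the fitted quadratic to zero: with coefficients b1 on exp and b2 on expsq, the turning point is at exp = −b1/(2·b2). A minimal Stata sketch, assuming the Returns to Schooling data are in memory for the second part:

```stata
* Turning point of the quadratic experience profile, using the reported
* pooled OLS coefficients on exp (.044675) and expsq (-.0007156)
display .044675/(2*.0007156)

* The same quantity with a delta-method standard error, computed from the
* estimation results (assumes the data set is loaded)
regress lwage exp expsq wks ed, vce(cluster id)
nlcom -_b[exp]/(2*_b[expsq])
```

The first line evaluates to roughly 31.2 years, in line with the peak quoted for the pooled OLS estimates.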

• The default standard errors are misleadingly small: 0.002393, 0.000053, 0.001183, and 0.002227 for exp, expsq, wks, and ed, compared with cluster-robust standard errors of 0.0054, 0.0001, 0.0019, and 0.0052. The cluster-robust values are roughly 1.6 to 2.4 times larger;

• Therefore, it is essential that the OLS standard errors be corrected for clustering on individuals.

The pooled OLS estimator can also be obtained using the (xtreg, pa) command, with the options corr(independent) and vce(robust):

. * Pooled OLS with cluster-robust standard errors using the general xtreg, pa procedure
. xtreg lwage exp expsq wks ed, pa corr(independent) vce(robust) nolog

...
                                     (Std. Err. adjusted for clustering on id)
------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |    .044675   .0054358     8.22   0.000      .034021    .0553291
       expsq |  -.0007156   .0001284    -5.57   0.000    -.0009673    -.000464
         wks |    .005827   .0019275     3.02   0.003     .0020491    .0096048
          ed |   .0760407   .0052097    14.60   0.000     .0658299    .0862515
       _cons |   4.907961   .1399214    35.08   0.000      4.63372    5.182202
------------------------------------------------------------------------------
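To view the three sets of pooled results side by side, the estimates can be stored and tabulated. This is a sketch, assuming the Returns to Schooling data are in memory and have been declared a panel with xtset id year; the stored names ols_iid, ols_cluster and pa_indep are arbitrary labels chosen here.

```stata
* Default (iid) standard errors
quietly regress lwage exp expsq wks ed
estimates store ols_iid

* Cluster-robust standard errors, clustering on the individual
quietly regress lwage exp expsq wks ed, vce(cluster id)
estimates store ols_cluster

* Population-averaged estimator with independent working correlation
quietly xtreg lwage exp expsq wks ed, pa corr(independent) vce(robust)
estimates store pa_indep

* Coefficients coincide; only the standard errors differ
estimates table ols_iid ols_cluster pa_indep, b(%9.6f) se(%9.6f)
```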


Pooled FGLS or Population-Averaged Estimators

Pooled FGLS estimation can lead to a more efficient estimator of the parameters of the pooled model $y_{it} = \mathbf{X}_{it}'\boldsymbol{\beta} + u_{it}$ than the OLS estimator. This is achieved by modelling the $T \times T$ error correlation matrix of $u_i$, assumed constant over i (the $y_i$ are independent over i and N is large).

The pooled estimator, or PA estimator, is obtained using the (xtreg, pa) command, with two key additional options:

• corr( ): places different restrictions on the error correlation;

• vce(robust): gives cluster-robust standard errors that are valid even if corr( ) does not specify the correct correlation model.

Let $\rho_{ts} = Cor(u_{it}, u_{is})$ be the correlation of the errors at time periods t and s, for individual i. Note the restriction that $\rho_{ts}$ does not vary with i. Also, all corr( ) options set $\rho_{tt} = 1$.

Because $Cor(u_{it}, u_{is}) = Cor(u_{is}, u_{it})$, the $T \times T$ error correlation matrix is symmetric, so it has potentially T(T−1)/2 unique off-diagonal values.


Typical options for corr( ) include:

• corr(independent): sets $\rho_{ts} = 0$ for $s \neq t$. Then the PA estimator equals the pooled OLS estimator;

• corr(exchangeable): sets $\rho_{ts} = \rho$ for $s \neq t$. Then the errors are equicorrelated and (xtreg, pa) is asymptotically equivalent to (xtreg, re);

• corr(ar k): specifies an autoregressive process of order k, or AR(k), for $u_{it}$;

• corr(stationary g): specifies a moving average process, or MA(g), for $u_{it}$;

• corr(unstructured): places no restrictions on $\rho_{ts}$. For small T, this may be the best model for correlations over time, but it can fail for a larger T.

The nolog option prevents the display of an iteration log.

In the statistics literature, the PA estimator is also called the generalized estimating equations (GEE) estimator.

The (xtreg, pa) command is a special case of xtgee with the family(gaussian) option.
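To make the restrictions concrete, the working correlation matrices implied by three of these options can be written out for T = 3 (an illustrative choice; AR(1) is shown as the simplest autoregressive case):

```latex
R_{\text{exchangeable}} =
\begin{pmatrix} 1 & \rho & \rho \\ \rho & 1 & \rho \\ \rho & \rho & 1 \end{pmatrix},
\qquad
R_{\text{AR}(1)} =
\begin{pmatrix} 1 & \rho & \rho^{2} \\ \rho & 1 & \rho \\ \rho^{2} & \rho & 1 \end{pmatrix},
\qquad
R_{\text{unstructured}} =
\begin{pmatrix} 1 & \rho_{21} & \rho_{31} \\ \rho_{21} & 1 & \rho_{32} \\ \rho_{31} & \rho_{32} & 1 \end{pmatrix}.
```

The exchangeable and AR(1) forms each involve a single parameter regardless of T, while the unstructured form has one free parameter per pair of periods, which is why it becomes hard to estimate as T grows.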


We demonstrate the (xtreg, pa) command using the "Returns to Schooling Data".

. * PA or pooled FGLS estimation with AR(2) and cluster-robust standard errors
. xtreg lwage exp expsq wks ed, pa corr(ar 2) vce(robust) nolog

GEE population-averaged model                   Number of obs     =      4,165
Group and time vars:               id year      Number of groups  =        595
Link:                         identity          Obs per group:
Family:                       Gaussian                        min =          7
Correlation:                     AR(2)                        avg =        7.0
                                                              max =          7
                                                Wald chi2(4)      =     873.28
Scale parameter:              .1966639          Prob > chi2       =     0.0000

                                     (Std. Err. adjusted for clustering on id)
------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |   .0718915    .003999    17.98   0.000     .0640535    .0797294
       expsq |  -.0008966   .0000933    -9.61   0.000    -.0010794   -.0007137
         wks |   .0002964   .0010553     0.28   0.779     -.001772    .0023647
          ed |   .0905069   .0060161    15.04   0.000     .0787156    .1022982
       _cons |   4.526381   .1056897    42.83   0.000     4.319233    4.733529
------------------------------------------------------------------------------


Compared with the results from pooled OLS, the coefficients change considerably because of the AR(2) error model. The cluster-robust standard errors are smaller than those from pooled OLS for all regressors except ed, showing the efficiency gain.

The estimated error correlation matrix is stored in e(R):

. * Estimated error correlation matrix after xtreg, pa
. matrix list e(R)

symmetric e(R)[7,7]
           c1         c2         c3         c4         c5         c6         c7
r1          1
r2  .89722058          1
r3  .84308581  .89722058          1
r4  .78392846  .84308581  .89722058          1
r5  .73064474  .78392846  .84308581  .89722058          1
r6   .6806209  .73064474  .78392846  .84308581  .89722058          1
r7  .63409777   .6806209  .73064474  .78392846  .84308581  .89722058          1

$\hat{\rho}_{ts}$ changes only with the value of |t − s|, as an AR model is used.
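As a check that the reported entries have the AR(2) form, one can assume the stationary Yule-Walker relation $\rho_k = \phi_1\rho_{k-1} + \phi_2\rho_{k-2}$ and back out the implied autoregressive coefficients from the first two autocorrelations. The values $\phi_1$ and $\phi_2$ below are computed here for illustration; they are not reported in the Stata output.

```latex
\phi_2 = \frac{\rho_2 - \rho_1^{2}}{1 - \rho_1^{2}}
       = \frac{0.84309 - 0.80500}{1 - 0.80500} \approx 0.1953,
\qquad
\phi_1 = \rho_1\,(1 - \phi_2) \approx 0.8972 \times 0.8047 \approx 0.7220,
\\[4pt]
\rho_3 = \phi_1\rho_2 + \phi_2\rho_1
       \approx 0.7220 \times 0.8431 + 0.1953 \times 0.8972 \approx 0.7839 .
```

The implied third-order correlation agrees with the reported entry .78392846, which is consistent with the higher-order entries being generated from the first two by the AR(2) recursion.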


If an unstructured error correlation matrix is specified, we have:

. xtreg lwage exp expsq wks south, pa corr(unstructured) vce(robust) nolog
...
                                     (Std. Err. adjusted for clustering on id)
------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |   .0635066   .0044502    14.27   0.000     .0547845    .0722288
       expsq |    -.00076   .0001004    -7.57   0.000    -.0009567   -.0005633
         wks |   .0004141   .0009523     0.43   0.664    -.0014524    .0022805
       south |  -.0556712   .0516129    -1.08   0.281    -.1568307    .0454883
       _cons |   5.810748   .0591324    98.27   0.000     5.694851    5.926645
------------------------------------------------------------------------------

. matrix list e(R)

symmetric e(R)[7,7]
           c1         c2         c3         c4         c5         c6         c7
r1          1
r2  .91725004          1
r3  .87482529  .85342628          1
r4  .81187266  .81020598  .94111792          1
r5  .74119645  .75303939   .8840834  .91577823          1
r6  .66331271  .68468392  .83678661  .88524751  .91405078          1
r7   .6242693  .65875491  .83721361  .89982964  .93506435  .96584732          1

$\hat{\rho}_{ts}$ changes with the values of both t and s, as an unstructured error correlation is specified.


. * PA or pooled FGLS estimation with MA(6) and cluster-robust standard errors
. xtreg lwage exp expsq wks ed, pa corr(stationary 6) vce(robust) nolog

GEE population-averaged model                   Number of obs     =      4,165
Group and time vars:               id year      Number of groups  =        595
Link:                         identity          Obs per group:
Family:                       Gaussian                        min =          7
Correlation:             stationary(6)                        avg =        7.0
                                                              max =          7
                                                Wald chi2(4)      =     596.42
Scale parameter:              .1650487          Prob > chi2       =     0.0000

                                     (Std. Err. adjusted for clustering on id)
------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |   .0608033   .0039931    15.23   0.000     .0529769    .0686297
       expsq |  -.0008603   .0000929    -9.26   0.000    -.0010424   -.0006781
         wks |   .0005029   .0010016     0.50   0.616    -.0014602     .002466
          ed |     .07985   .0053356    14.97   0.000     .0693923    .0903076
       _cons |   4.846687    .097278    49.82   0.000     4.656026    5.037349
------------------------------------------------------------------------------

The results are similar to those based on AR(2).


Time Series Autocorrelation for Panel Data

Some Stata commands are useful in analyzing the correlation of errors over time. First, set both the panel and time identifiers with xtset. The time-series operators are then available:

• L1.lwage or L.lwage: lwage lagged once;
• L2.lwage: lwage lagged twice;
• D.lwage: the first difference of lwage (equals lwage − L.lwage);
• LD.lwage: the difference lagged once;
• L2D.lwage: the difference lagged twice.

. correlate lwage L1.lwage L2.lwage L3.lwage L4.lwage L5.lwage L6.lwage
(obs=595)

             |                 L.       L2.       L3.       L4.       L5.       L6.
             |    lwage     lwage     lwage     lwage     lwage     lwage     lwage
-------------+---------------------------------------------------------------
       lwage |
         --. |   1.0000
         L1. |   0.9238   1.0000
         L2. |   0.9083   0.9271   1.0000
         L3. |   0.8753   0.8843   0.9067   1.0000
         L4. |   0.8471   0.8551   0.8833   0.8990   1.0000
         L5. |   0.8261   0.8347   0.8721   0.8641   0.8667   1.0000
         L6. |   0.8033   0.8163   0.8518   0.8465   0.8594   0.9418   1.0000

The sample correlation $r_{ts}$ changes with the values of both t and s, not only with the lag |t − s|: the first-order autocorrelations range from 0.8667 to 0.9418.


Within Estimator with Cluster-Robust SE

The within (or FE) estimator of a one-way FE model is obtained by running an OLS regression on the within-transformed model (5.3), that is, an OLS regression of the within equation:

$$(y_{it} - \bar{y}_{i\cdot}) = (X_{it} - \bar{X}_{i\cdot})'\beta + (v_{it} - \bar{v}_{i\cdot}).$$

The (xtreg, fe) command computes this estimator, with default standard errors that assume the $v_{it}$ are iid. The vce(robust) option relaxes the iid assumption and provides cluster-robust standard errors (CRSE), valid when the observations are independent over i and N is large.

• The FE or within estimator controls for the fixed effects $\mu_i$ by using deviations from the individual means, so that the $\mu_i$ are eliminated;

• However, the within estimation method is unable to estimate the coefficients of time-invariant regressors;

• The within estimator will be relatively imprecise for time-varying regressors that vary little over time;

• Further, the within estimator will be relatively less efficient, because demeaning uses up one degree of freedom per individual, which is equivalent to losing one period of data.
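The within transformation can be carried out by hand to see what (xtreg, fe) does internally. The following is a sketch, assuming the Returns to Schooling data are in memory and declared with xtset id year; the variables prefixed m_ and w_ are hypothetical helpers created here.

```stata
* Individual means and deviations from them for each variable
foreach v of varlist lwage exp expsq wks {
    bysort id: egen double m_`v' = mean(`v')
    generate double w_`v' = `v' - m_`v'
}

* OLS on the demeaned data: slope estimates equal the within estimates
regress w_lwage w_exp w_expsq w_wks, noconstant vce(cluster id)

* Built-in within estimator for comparison
xtreg lwage exp expsq wks, fe vce(cluster id)
```

The slope coefficients from the two commands should coincide. The reported standard errors can differ slightly, because the two commands apply different finite-sample corrections.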


. xtreg lwage exp expsq wks, fe vce(cluster id)

Fixed-effects (within) regression               Number of obs     =      4,165
Group variable: id                              Number of groups  =        595

R-sq:                                           Obs per group:
     within  = 0.6566                                         min =          7
     between = 0.0276                                         avg =        7.0
     overall = 0.0476                                         max =          7

                                                F(3,594)          =    1059.72
corr(u_i, Xb)  = -0.9107                        Prob > F          =     0.0000

                                   (Std. Err. adjusted for 595 clusters in id)
------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |   .1137879   .0040289    28.24   0.000     .1058753    .1217004
       expsq |  -.0004244   .0000822    -5.16   0.000    -.0005858   -.0002629
         wks |   .0008359   .0008697     0.96   0.337    -.0008721    .0025439
       _cons |   4.596396   .0600887    76.49   0.000     4.478384    4.714408
-------------+----------------------------------------------------------------
     sigma_u |  1.0362039
     sigma_e |  .15220316
         rho |  .97888036   (fraction of variance due to u_i)
------------------------------------------------------------------------------

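Compared with pooled OLS with cluster-robust standard errors, the coefficient on exp more than doubles (0.114 versus 0.045), while the standard errors are smaller for all three regressors. The ed variable cannot be included, because it is time-invariant and is eliminated by the within transformation.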


Within Estimator with CRSE and Time Dummies

. xtreg lwage exp expsq wks i.year, fe vce(cluster id)
note: 7.year omitted because of collinearity

Fixed-effects (within) regression               Number of obs     =      4,165
Group variable: id                              Number of groups  =        595

R-sq:                                           Obs per group:
     within  = 0.6599                                         min =          7
     between = 0.0275                                         avg =        7.0
     overall = 0.0480                                         max =          7

                                                F(8,594)          =     412.33
corr(u_i, Xb)  = -0.9089                        Prob > F          =     0.0000

                                   (Std. Err. adjusted for 595 clusters in id)
------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |   .1119927   .0041184    27.19   0.000     .1039043    .1200812
       expsq |  -.0004051   .0000834    -4.86   0.000    -.0005688   -.0002413
         wks |     .00068   .0008812     0.77   0.441    -.0010506    .0024105
             |
        year |
          2  |  -.0083984   .0049321    -1.70   0.089    -.0180849    .0012881
          3  |   .0259652   .0084359     3.08   0.002     .0093974    .0425329
          4  |   .0289134   .0078093     3.70   0.000     .0135762    .0442506
          5  |   .0239406   .0065275     3.67   0.000     .0111208    .0367604
          6  |   .0069955   .0064617     1.08   0.279    -.0056949     .019686
          7  |          0  (omitted)
             |
       _cons |   4.618339   .0599451    77.04   0.000     4.500609    4.736069
-------------+----------------------------------------------------------------
     sigma_u |  1.0268811
     sigma_e |  .15159041
         rho |  .97867247   (fraction of variance due to u_i)
------------------------------------------------------------------------------


Least-Squares Dummy-Variables Regression

The within estimator of $\beta$ can be shown to equal the estimator obtained from a direct OLS estimation of $\mu_1, \dots, \mu_N$ and $\beta$ in the individual effects model $y_{it} = X_{it}'\beta + \mu_i + v_{it}$, using the command areg:

. areg lwage exp expsq wks, absorb(id) vce(cluster id)

Linear regression, absorbing indicators         Number of obs     =      4,165
                                                F(   3,    594)   =     908.44
                                                Prob > F          =     0.0000
                                                R-squared         =     0.9068
                                                Adj R-squared     =     0.8912
                                                Root MSE          =     0.1522

                                   (Std. Err. adjusted for 595 clusters in id)
------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |   .1137879   .0043514    26.15   0.000     .1052418    .1223339
       expsq |  -.0004244   .0000888    -4.78   0.000    -.0005988     -.00025
         wks |   .0008359   .0009393     0.89   0.374    -.0010089    .0026806
       _cons |   4.596396   .0648993    70.82   0.000     4.468936    4.723856
-------------+----------------------------------------------------------------
          id |   absorbed                                     (595 categories)

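The coefficient estimates are the same as those from xtreg, fe. The cluster-robust standard errors differ because areg applies a different finite-sample correction: its inference is designed for long panels, so for short panels the standard errors from xtreg, fe should be used.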
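The size of the gap between the two sets of standard errors can be checked against a degrees-of-freedom argument. The sketch below rests on an assumption stated here rather than in the output: that areg counts the 594 absorbed group indicators as estimated parameters in its correction, while xtreg, fe counts only the three slopes and the constant.

```stata
* N = 4,165 observations; 4 coefficients in xtreg, fe; 594 further
* absorbed indicators counted by areg (assumed explanation)
display sqrt((4165 - 4)/(4165 - 4 - 594))

* Ratios of the reported standard errors, areg over xtreg, fe
display .0043514/.0040289
display .0009393/.0008697
```

All three numbers come out at about 1.08, so the reported standard errors are consistent with a pure rescaling of the same underlying sandwich estimate.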


First-Difference Estimator

Consistent estimation in the one-way FE model requires elimination of $\mu_1, \dots, \mu_N$, which is achieved by the within transformation to give the within estimator. An orthogonal transformation method was introduced in Chapter 2. Another way to eliminate the $\mu_i$ is through the first difference:

$$y_{it} - y_{i,t-1} = (X_{it} - X_{i,t-1})'\beta + (v_{it} - v_{i,t-1}),$$

where the time-invariant $\mu_i$ are eliminated through differencing. An OLS estimation of this model yields consistent estimates of $\beta$.

• The FD estimator is not provided as an option to xtreg. Instead, it can be computed using regress together with the Stata time-series operator D., which computes the first difference.

• As with the within estimator, time dummies (fixed time effects) can be added to the model.

• The robust standard errors can be calculated using vce(cluster id), valid when observations are independent over i and $N \to \infty$.
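Cluster-robust standard errors matter for the first-difference estimator even in the most favorable case. Assuming, for illustration, that the $v_{it}$ are iid with variance $\sigma_v^2$, the differenced errors $\Delta v_{it} = v_{it} - v_{i,t-1}$ are still serially correlated:

```latex
Var(\Delta v_{it}) = 2\sigma_v^{2},
\qquad
Cov(\Delta v_{it}, \Delta v_{i,t-1})
   = E\big[(v_{it} - v_{i,t-1})(v_{i,t-1} - v_{i,t-2})\big] = -\sigma_v^{2},
\qquad
Cor(\Delta v_{it}, \Delta v_{i,t-1}) = -\tfrac{1}{2},
```

with zero covariance at lags of two or more. The default OLS standard errors of the differenced regression ignore this first-order correlation, whereas vce(cluster id) allows for it.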


. regress D.(lwage exp expsq wks ed), vce(cluster id) noconstant
note: D.ed omitted because of collinearity

Linear regression                               Number of obs     =      3,570
                                                F(3, 594)         =    1035.19
                                                Prob > F          =     0.0000
                                                R-squared         =     0.2209
                                                Root MSE          =     .18156

                                   (Std. Err. adjusted for 595 clusters in id)
------------------------------------------------------------------------------
             |               Robust
     D.lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |
         D1. |   .1170654   .0040974    28.57   0.000     .1090182    .1251126
             |
       expsq |
         D1. |  -.0005321   .0000808    -6.58   0.000    -.0006908   -.0003734
             |
         wks |
         D1. |  -.0002683   .0011783    -0.23   0.820    -.0025824    .0020459
             |
          ed |
         D1. |          0  (omitted)
------------------------------------------------------------------------------


One-Way Random Effects Estimator with CRSE

Recall the one-way random effects model given in Ch. 2:

$$y_{it} = \alpha + X_{it}'\beta + \mu_i + v_{it}, \qquad i = 1, \dots, N \text{ and } t = 1, \dots, T.$$

By default, the (xtreg, re) command returns the RE estimator of this model under $\mu_i \sim IID(0, \sigma_\mu^2)$ and $v_{it} \sim IID(0, \sigma_v^2)$, independent of each other, with $X_{it}$ independent of $\mu_i$ and $v_{it}$ for all i and t.

• For the disturbances $u_{it} = \mu_i + v_{it}$, it is easy to see that

$$\rho_{ts} = Cor(u_{it}, u_{is}) = \sigma_\mu^2 / (\sigma_\mu^2 + \sigma_v^2) = \rho, \quad\text{for all } s \neq t.$$

• The RE model has equicorrelated/exchangeable errors, which is implemented by the Stata command xtreg with option re.

• The options re, mle, and pa corr(exchangeable) give asymptotically equivalent estimators of $\beta$, but different estimators of $\sigma_\mu^2$ and $\sigma_v^2$.

• The robust standard errors can be calculated using vce(cluster id), valid when observations are independent over i and $N \to \infty$.
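The equicorrelation of the RE disturbances $u_{it} = \mu_i + v_{it}$ follows in two lines from the assumptions that $\mu_i$ and $v_{it}$ are IID and mutually independent:

```latex
Var(u_{it}) = Var(\mu_i) + Var(v_{it}) = \sigma_\mu^{2} + \sigma_v^{2},
\qquad
Cov(u_{it}, u_{is}) = Var(\mu_i) = \sigma_\mu^{2} \quad (s \neq t),
\\[4pt]
\Rightarrow\quad
Cor(u_{it}, u_{is}) = \frac{\sigma_\mu^{2}}{\sigma_\mu^{2} + \sigma_v^{2}} \quad\text{for all } s \neq t .
```

The correlation does not depend on how far apart t and s are, which is exactly the exchangeable structure. Serial correlation in $v_{it}$ would break this pattern, and is the case that cluster-robust standard errors guard against.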


One-Way Random Effects Estimator with CRSE

. xtreg lwage exp expsq wks ed, re vce(cluster id) theta

Random-effects GLS regression                   Number of obs     =      4,165
Group variable: id                              Number of groups  =        595

R-sq:                                           Obs per group:
     within  = 0.6340                                         min =          7
     between = 0.1716                                         avg =        7.0
     overall = 0.1830                                         max =          7

                                                Wald chi2(4)      =    1598.50
corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000
theta          = .82280511

                                   (Std. Err. adjusted for 595 clusters in id)
------------------------------------------------------------------------------
             |               Robust
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |   .0888609   .0039992    22.22   0.000     .0810227    .0966992
       expsq |  -.0007726   .0000896    -8.62   0.000    -.0009481    -.000597
         wks |   .0009658   .0009259     1.04   0.297     -.000849    .0027806
          ed |   .1117099   .0083954    13.31   0.000     .0952552    .1281647
       _cons |   3.829366   .1333931    28.71   0.000     3.567921    4.090812
-------------+----------------------------------------------------------------
     sigma_u |  .31951859
     sigma_e |  .15220316
         rho |  .81505521   (fraction of variance due to u_i)
------------------------------------------------------------------------------


One-Way Random Effects Estimator with CRSE

. xtreg lwage exp expsq wks ed, mle

Random-effects ML regression                    Number of obs     =      4,165
Group variable: id                              Number of groups  =        595

Random effects u_i ~ Gaussian                   Obs per group:
                                                              min =          7
                                                              avg =        7.0
                                                              max =          7

                                                LR chi2(4)        =    2828.12
Log likelihood  =  293.69563                    Prob > chi2       =     0.0000
------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |   .1079955   .0024806    43.54   0.000     .1031335    .1128574
       expsq |  -.0005202   .0000546    -9.53   0.000    -.0006272   -.0004132
         wks |   .0008365   .0006042     1.38   0.166    -.0003477    .0020208
          ed |   .1378558   .0125933    10.95   0.000     .1131735    .1625382
       _cons |   2.989859   .1720638    17.38   0.000      2.65262    3.327097
-------------+----------------------------------------------------------------
    /sigma_u |   .8509013   .0278622                      .7980078    .9073006
    /sigma_e |   .1536109   .0018574                      .1500132    .1572949
         rho |   .9684385    .002199                      .9638788    .9725117
------------------------------------------------------------------------------
LR test of sigma_u=0: chibar2(01) = 4576.13            Prob >= chibar2 = 0.000


One-Way Random Effects Estimator with CRSE

. xtreg lwage exp expsq wks ed, pa corr(exchangeable)

GEE population-averaged model                   Number of obs     =      4,165
Group variable:                         id      Number of groups  =        595
Link:                             identity      Obs per group:
Family:                           Gaussian                    min =          7
Correlation:                  exchangeable                    avg =        7.0
                                                              max =          7
                                                Wald chi2(4)      =    6160.57
Scale parameter:                  .7476287      Prob > chi2       =     0.0000

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |   .1079955   .0024527    44.03   0.000     .1031883    .1128026
       expsq |  -.0005202   .0000543    -9.59   0.000    -.0006266   -.0004139
         wks |   .0008365   .0006042     1.38   0.166    -.0003477    .0020208
          ed |   .1378558   .0125814    10.96   0.000     .1131968    .1625149
       _cons |    2.98986   .1711799    17.47   0.000     2.654353    3.325366
------------------------------------------------------------------------------


Between Estimator with CRSE

The between estimator is the OLS estimator of the between model:

$\bar{y}_{i\cdot} = \bar{X}_{i\cdot}'\beta + (\mu_i + \bar{v}_{i\cdot})$.

Consistency of the OLS estimator $\hat\beta$ requires that the 'disturbance' term $(\mu_i + \bar{v}_{i\cdot})$ is uncorrelated with $X_{it}$. This is the case if $\mu_i$ is a random effect but not if $\mu_i$ is a fixed effect.

• The between estimator is obtained by specifying the be option of the xtreg command. This is essentially a cross-section regression.

• Therefore, cross-sectional heteroskedasticity is the issue of concern. There is no explicit option for heteroskedasticity-robust standard errors, except the vce(bootstrap) option.

• The between estimator is based on averages over t (the group means), i.e., on the between-i variation only. Hence it is less efficient than other estimators such as RE and MLE.
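The two steps behind the between estimator (average over t, then run a cross-section OLS on the group means) can be sketched as follows. The panel is hypothetical, has a single regressor, and an intercept is included in the regression on means, as in Stata's output; none of this is the wage data.

```python
from statistics import mean

# Hypothetical panel: unit id -> list of (x_it, y_it) pairs over t
panel = {
    1: [(1.0, 3.1), (2.0, 4.9)],
    2: [(3.0, 7.2), (5.0, 10.8)],
    3: [(6.0, 13.0), (8.0, 17.0)],
}

# Step 1: average over t for each unit i (the group means)
x_bar = [mean(x for x, _ in obs) for obs in panel.values()]
y_bar = [mean(y for _, y in obs) for obs in panel.values()]

# Step 2: OLS of y_bar on x_bar (one regressor plus an intercept)
mx, my = mean(x_bar), mean(y_bar)
beta_be = (sum((x - mx) * (y - my) for x, y in zip(x_bar, y_bar))
           / sum((x - mx) ** 2 for x in x_bar))
alpha_be = my - beta_be * mx

print(x_bar, y_bar)
print(alpha_be, beta_be)
```

Only N = 3 data points enter the regression here, however many periods are observed; all within-unit variation is discarded, which is the source of the efficiency loss noted in the text.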


Between Estimator with CRSE

. xtreg lwage exp expsq wks ed, be

Between regression (regression on group means)  Number of obs     =      4,165
Group variable: id                              Number of groups  =        595

R-sq:                                           Obs per group:
     within  = 0.1357                                         min =          7
     between = 0.3264                                         avg =        7.0
     overall = 0.2723                                         max =          7

                                                F(4,590)          =      71.48
sd(u_i + avg(e_i.))=  .324656                   Prob > F          =     0.0000

------------------------------------------------------------------------------
       lwage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |    .038153   .0056967     6.70   0.000     .0269647    .0493412
       expsq |  -.0006313   .0001257    -5.02   0.000    -.0008781   -.0003844
         wks |   .0130903   .0040659     3.22   0.001     .0051048    .0210757
          ed |   .0737838   .0048985    15.06   0.000     .0641632    .0834044
       _cons |   4.683039   .2100989    22.29   0.000     4.270407    5.095672
------------------------------------------------------------------------------


Between Estimator with CRSE

. xtreg lwage exp expsq wks ed, be vce(bootstrap)
(running xtreg on estimation sample)

Bootstrap replications (50)
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
..................................................    50

Between regression (regression on group means)  Number of obs     =      4,165
Group variable: id                              Number of groups  =        595

…

                                    (Replications based on 595 clusters in id)
------------------------------------------------------------------------------
             |   Observed   Bootstrap                         Normal-based
       lwage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         exp |    .038153   .0056931     6.70   0.000     .0269946    .0493113
       expsq |  -.0006313   .0001292    -4.89   0.000    -.0008845   -.0003781
         wks |   .0130903   .0035953     3.64   0.000     .0060437    .0201369
          ed |   .0737838    .005292    13.94   0.000     .0634116     .084156
       _cons |   4.683039   .2019226    23.19   0.000     4.287278      5.0788
------------------------------------------------------------------------------


Comparison of Panel Estimators based on Short Panels

Recall from Chap 3 the three $R^2$ measures reported in Stata:
Within $R^2$: $\rho^2\{(y_{it} - \bar{y}_{i\cdot}),\ (X_{it} - \bar{X}_{i\cdot})'\hat\beta\}$
Between $R^2$: $\rho^2(\bar{y}_{i\cdot},\ \bar{X}_{i\cdot}'\hat\beta)$
Overall $R^2$: $\rho^2(y_{it},\ X_{it}'\hat\beta)$

where $\rho^2(x, y)$ denotes the squared correlation between x and y, and $\hat\beta$ is obtained from one of the xtreg options (be, fe, or re).
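The three measures differ only in which version of the data enters the squared correlation. A minimal sketch on a hypothetical one-regressor panel follows; the slope value is an assumed stand-in for an xtreg estimate, not a result from the wage data.

```python
from statistics import mean


def sq_corr(a, b):
    """Squared sample correlation between two equal-length lists."""
    ma, mb = mean(a), mean(b)
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab ** 2 / (saa * sbb)


# Hypothetical panel: unit -> (x_i1..x_iT, y_i1..y_iT)
panel = {
    1: ([1.0, 2.0, 3.0], [2.0, 2.5, 3.9]),
    2: ([2.0, 4.0, 6.0], [5.1, 6.0, 7.2]),
    3: ([5.0, 6.0, 7.0], [6.8, 8.1, 8.0]),
}
beta_hat = 0.9  # assumed slope estimate (hypothetical)

x_all, y_all, x_dev, y_dev, x_bar, y_bar = [], [], [], [], [], []
for xs, ys in panel.values():
    mx, my = mean(xs), mean(ys)
    x_bar.append(mx)
    y_bar.append(my)
    x_all += xs
    y_all += ys
    x_dev += [x - mx for x in xs]   # deviations from unit means
    y_dev += [y - my for y in ys]


def fitted(xs):
    """Fitted values X * beta_hat for a single regressor."""
    return [beta_hat * x for x in xs]


r2_within = sq_corr(y_dev, fitted(x_dev))
r2_between = sq_corr(y_bar, fitted(x_bar))
r2_overall = sq_corr(y_all, fitted(x_all))
print(r2_within, r2_between, r2_overall)
```

With a single regressor the value of $\hat\beta$ only rescales the fitted values and so does not affect the squared correlations; with several regressors the choice of estimator (be, fe, or re) does change all three measures, as the comparison table in this section shows.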

Also, Stata reports:
sigma_u: the standard deviation of the individual effects $\mu_i$
sigma_e: the standard deviation of the idiosyncratic error $v_{it}$
rho: the fraction of variance due to $\mu_i$, i.e., $\rho = \sigma_\mu^2 / (\sigma_\mu^2 + \sigma_v^2)$.

In RE estimation (the Stata default for xtreg), the theta option reports the estimate of

$\theta = 1 - \sqrt{\sigma_v^2 / (T\sigma_\mu^2 + \sigma_v^2)}$,

the weight of the transformation that turns FGLS into OLS: the RE estimator is OLS applied to $y_{it} - \theta\bar{y}_{i\cdot}$ and $X_{it} - \theta\bar{X}_{i\cdot}$.

For pooled OLS estimation: $\hat\theta = 0$; for within estimation: $\hat\theta = 1$; for RE, $\hat\theta \to 1$ as T and $\sigma_\mu^2$ get large.
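As a quick check, rho and theta can be recomputed from the sigma_u and sigma_e values reported in the xtreg, re output earlier in this chapter (T = 7). This is only a sketch of the arithmetic; the variable names are ours.

```python
import math

# Values copied from the xtreg, re vce(cluster id) theta output
sigma_u = 0.31951859   # std. dev. of the individual effects mu_i
sigma_e = 0.15220316   # std. dev. of the idiosyncratic error v_it
T = 7                  # periods per individual (balanced panel)

rho = sigma_u ** 2 / (sigma_u ** 2 + sigma_e ** 2)
theta = 1 - math.sqrt(sigma_e ** 2 / (T * sigma_u ** 2 + sigma_e ** 2))

print(round(rho, 4))    # compare with rho in the Stata output
print(round(theta, 4))  # compare with theta in the Stata output
```

Both numbers agree with the reported rho = .81505521 and theta = .82280511 up to rounding, so the RE estimator here is much closer to the within estimator ($\theta = 1$) than to pooled OLS ($\theta = 0$).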


Comparison of Panel Estimators based on Short Panels

We compare some of the panel estimators and the associated standard errors, variance components estimates, and $R^2$. Note: pooled OLS is the same as the xtreg command with the pa and corr(independent) options.

The Stata commands are:

. * Compare OLS, BE, FE, RE estimators, and methods to compare standard errors
. global xlist exp expsq wks ed
. quietly regress lwage $xlist, vce(cluster id)
. estimates store OLS_rob
. quietly xtreg lwage $xlist, be
. estimates store BE
. quietly xtreg lwage $xlist, fe
. estimates store FE
. quietly xtreg lwage $xlist, fe vce(robust)
. estimates store FE_rob
. quietly xtreg lwage $xlist, re
. estimates store RE
. quietly xtreg lwage $xlist, re vce(robust)
. estimates store RE_rob
. estimates table OLS_rob BE FE FE_rob RE RE_rob,
> b se stats(N r2 r2_o r2_b r2_w sigma_u sigma_e rho) b(%7.4f)


Comparison of Panel Estimators based on Short Panels

----------------------------------------------------------------------------
Variable |  OLS_rob       BE         FE       FE_rob       RE       RE_rob
---------+------------------------------------------------------------------
     exp |   0.0447     0.0382     0.1138     0.1138     0.0889     0.0889
         |   0.0054     0.0057     0.0025     0.0040     0.0028     0.0040
   expsq |  -0.0007    -0.0006    -0.0004    -0.0004    -0.0008    -0.0008
         |   0.0001     0.0001     0.0001     0.0001     0.0001     0.0001
     wks |   0.0058     0.0131     0.0008     0.0008     0.0010     0.0010
         |   0.0019     0.0041     0.0006     0.0009     0.0007     0.0009
      ed |   0.0760     0.0738   (omitted)  (omitted)    0.1117     0.1117
         |   0.0052     0.0049                           0.0061     0.0084
   _cons |   4.9080     4.6830     4.5964     4.5964     3.8294     3.8294
         |   0.1400     0.2101     0.0389     0.0601     0.0936     0.1334
---------+------------------------------------------------------------------
       N |     4165       4165       4165       4165       4165       4165
      r2 |   0.2836     0.3264     0.6566     0.6566
    r2_o |              0.2723     0.0476     0.0476     0.1830     0.1830
    r2_b |              0.3264     0.0276     0.0276     0.1716     0.1716
    r2_w |              0.1357     0.6566     0.6566     0.6340     0.6340
 sigma_u |                         1.0362     1.0362     0.3195     0.3195
 sigma_e |                         0.1522     0.1522     0.1522     0.1522
     rho |                         0.9789     0.9789     0.8151     0.8151
----------------------------------------------------------------------------

5.3. Robust Estimation for Long Panels

The methods considered up to now have focused on short panels. Now we consider long panels with many time periods for few individuals (N is small and T is large).

• The individual fixed effects, if desired, can be easily handled by including dummy variables for each individual as regressors.

• With long panels ($T \to \infty$), there is an issue of stationarity. Here we consider only methods for stationary errors, with the cases of unit roots and cointegration being briefly mentioned.

• When N is small and T is large, one cannot rely on cluster-robust standard errors (as in the short panel case), because their validity requires $N \to \infty$. Instead, it is necessary to specify a model for serial correlation in the error.

• Typical Stata commands for analyzing long panels include xtregar, xtpcse, xtgls, and xtscc, with their respective options.

• We will use the well-known cigarette demand data for illustrations.


Cigarette Demand Data

Recall the cigarette demand data introduced in Chap. 1: a panel of 46 states in the United States over 30 years (1963-1992), given on the Wiley website for Baltagi (2005): https://www.wiley.com/legacy/wileychi/baltagi3e/.

Variables (columns) in the data file Cigar.txt are:
(1) State = State abbreviation.
(2) Year = Year 1963 to 1992.
(3) Price = Price per pack of cigarettes.
(4) Pop = Population.
(5) Pop16 = Population above the age of 16.
(6) CPI = Consumer price index (1983 = 100).
(7) NDI = Per capita disposable income.
(8) C = Cigarette sales in packs per capita.
(9) PIMIN = Minimum price in adjoining states per pack of cigarettes.

• Several time dummies corresponding to the major policy interventions in 1965, 1968 and 1971 can be added into the model.

• To reflect the long panel nature, we choose only the first 10 states.

Define:
LnC = Ln(C)
LnP = Ln(Price)
LnNDI = Ln(NDI)
LnPmin = Ln(PIMIN)


Serial Correlation & Heteroskedasticity in Long Panels

Consider the one-way effects model:

$y_{it} = \alpha + X_{it}'\beta + \mu_i + v_{it}$, i = 1, …, N, t = 1, …, T.

As N is now small, the individual effects can be merged into $X_{it}$ in the form of dummies, so that the model reduces to:

$y_{it} = \mathbf{X}_{it}'\boldsymbol{\beta} + v_{it}$, i = 1, …, N, t = 1, …, T, (5.5)

where the regressors $\mathbf{X}_{it}$ include an intercept, and may also include individual dummies, time, and possibly time squared, giving a model like a regular multiple linear regression model.

The focal point for a long panel is the serial correlation of $v_{it}$ over t. A model for it has to be specified, as T is large.

As N is small, one can be more flexible about the cross-sectional relations: heteroskedasticity and cross-section correlation.

Under exogeneity of $\mathbf{X}_{it}$ in (5.5), the OLS estimator is unbiased and consistent.

A simple way to model serial correlation is to allow for first-order autoregressive, i.e., AR(1), disturbances in (5.5):

$v_{it} = \rho_i v_{i,t-1} + \epsilon_{it}$, i = 1, …, N, t = 1, …, T,

where the autoregressive parameter may vary with i, with $|\rho_i| < 1$. Also, the remainder errors $\epsilon_{it}$ are assumed to be normal with mean zero and a general VC matrix that allows for possible heteroskedasticity and cross-sectional correlation:

$E(\epsilon\epsilon') = \Sigma \otimes I_T$, where $\epsilon' = (\epsilon_{11}, \ldots, \epsilon_{1T}, \ldots, \epsilon_{N1}, \ldots, \epsilon_{NT})$,

and $\Sigma$ is $N \times N$ with elements $\sigma_{ij}^2$.

Two special cases are of interest:
• $\Sigma$ is diagonal (i.e., $\sigma_{ij}^2 = 0$ for $i \neq j$): only heteroskedasticity.
• $\Sigma$ is diagonal and, further, all of the $\rho_i$ are equal to $\rho$.
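The structure $\Sigma \otimes I_T$ says that the remainder errors of two individuals are correlated only within the same period. A minimal sketch for N = 2 and T = 3 follows; the entries of the matrix are illustrative assumptions.

```python
def kron_with_identity(sigma, T):
    """Return Sigma (kron) I_T as an (N*T) x (N*T) nested list.

    Rows and columns are ordered as (i, t) = (1,1), ..., (1,T), ..., (N,1), ..., (N,T),
    matching the stacking of the error vector in the text.
    """
    N = len(sigma)
    return [[sigma[i][j] if t == s else 0.0
             for j in range(N) for s in range(T)]
            for i in range(N) for t in range(T)]


# Illustrative 2 x 2 contemporaneous covariance matrix (assumed values)
sigma = [[1.0, 0.3],
         [0.3, 2.0]]
V = kron_with_identity(sigma, T=3)

for row in V:
    print(row)
```

In the printed matrix, the two diagonal blocks carry the individual variances (heteroskedasticity over i), and the off-diagonal blocks carry the cross-sectional covariance on their diagonals only; setting the off-diagonal entry of the input to zero gives the first special case.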


Serial Correlation & Heteroskedasticity in Long Panels

. xtgls LnC LnP LnNDI LnPmin Year, panels(correlated) corr(psar1)

Cross-sectional time-series FGLS regression

Coefficients:  generalized least squares
Panels:        heteroskedastic with cross-sectional correlation
Correlation:   panel-specific AR(1)

Estimated covariances      =        55          Number of obs     =        300
Estimated autocorrelations =        10          Number of groups  =         10
Estimated coefficients     =         5          Time periods      =         30
                                                Wald chi2(4)      =     415.15
                                                Prob > chi2       =     0.0000

------------------------------------------------------------------------------
         LnC |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         LnP |  -.3582815   .0218597   -16.39   0.000    -.4011258   -.3154373
       LnNDI |   .5221231   .0573434     9.11   0.000     .4097321     .634514
      LnPmin |   -.019819   .0289541    -0.68   0.494    -.0765681      .03693
        Year |  -.0273553   .0052343    -5.23   0.000    -.0376144   -.0170961
       _cons |   3.847968   .2192518    17.55   0.000     3.418242    4.277693
------------------------------------------------------------------------------


Serial Correlation & Heteroskedasticity in Long Panels

Note that the xtset command is executed before running xtgls:

. xtset State Year
       panel variable:  State (strongly balanced)
        time variable:  Year, 63 to 92
                delta:  1 unit

• All regressors have the expected effects.
• The estimated price elasticity of demand for cigarettes is −.3583.
• The income elasticity is estimated to be .5221.
• Demand declines by 2.7% per year (the coefficient of Year is a semielasticity because the dependent variable is in logs).
• The minimum price in the adjoining states does not have a significant effect on the demand in the current state.

There are 10 states, so there are 10×11/2 = 55 unique entries in the 10×10 contemporaneous error covariance matrix $\Sigma$, and 10 autocorrelation parameters $\rho_i$ are estimated!
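Two of the numbers quoted here are simple arithmetic and can be reproduced directly. The sketch below counts the unique entries of a symmetric N × N matrix and converts the Year coefficient from the xtgls output into a percentage change; the function name is ours.

```python
import math


def n_covariance_entries(n_panels):
    """Unique entries (variances plus covariances) of a symmetric N x N matrix."""
    return n_panels * (n_panels + 1) // 2


n_states = 10
n_cov = n_covariance_entries(n_states)
print(n_cov)   # matches "Estimated covariances" in the xtgls header

# Semielasticity: with log demand as the dependent variable, a one-year
# increase changes demand by 100 * (exp(b) - 1) percent, roughly 100 * b.
year_coef = -0.0273553   # coefficient of Year from the xtgls output
pct_change_per_year = 100 * (math.exp(year_coef) - 1)
print(round(pct_change_per_year, 2))
```

With 65 nuisance parameters (55 covariances and 10 autocorrelations) estimated from 300 observations, the FGLS standard errors should be read with some caution.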


Serial Correlation & Heteroskedasticity in Long Panels

The xtgls command does the pooled OLS or FGLS estimation when data are from a long panel. It allows the errors $v_{it}$ in the model to be correlated over i, allows the use of AR(1) models for $v_{it}$ over t, and allows $v_{it}$ to be heteroskedastic over i.

An alternative Stata command, xtpcse, yields (long) panel-corrected standard errors (PCSE) for the pooled OLS estimator, as well as for a pooled least-squares estimator with an AR(1) model for $v_{it}$.

A third choice, xtscc, generalizes xtpcse by allowing more general serial correlation in the errors, up to m lags. It gives Driscoll and Kraay (1998) standard errors for coefficients estimated by pooled OLS/WLS or fixed-effects (within) regression.

Note: xtscc is not automatically installed with Stata. It can be found and installed by following these steps:
• Go to Help −> Search; type xtscc.
• In the pop-up window, click the link "xtscc from http://fmwww.bc.edu/RePEc/bocode/x".
• Then click on "click here to install".


Serial Correlation & Heteroskedasticity in Long Panels

Options for xtgls:
panels( ): specifies the error structure across individuals:
• iid: $v_{it}$ are iid;
• heteroskedastic: $v_{it}$ are independent over i, with variance $\sigma_i^2$ changing over i;
• correlated: additionally allows correlation over individuals, with independence over time for a given individual.
corr( ): specifies the serial correlation of errors for each individual:
• ar1: constant $\rho$; and
• psar1: different $\rho_i$.

Options for xtpcse:
correlation( ): specifies the serial correlation of errors, with choices:
• independent: no autocorrelation;
• ar1: constant $\rho$; and
• psar1: different $\rho_i$.
hetonly: $v_{it}$ are independent but heteroskedastic over i;
independent: $v_{it}$ are independent over i with a common variance.

Unless hetonly or independent is specified, panel-corrected standard errors (PCSE) are reported, which allow heteroskedasticity and contemporaneous correlation over i.


Serial Correlation & Heteroskedasticity in Long Panels

Options for xtscc:
• lag(#): set maximum lag order of autocorrelation; default is m(T) = floor[4(T/100)^(2/9)];
• fe: perform fixed effects (within) regression;
• re: perform GLS random effects regression;
• pooled: perform pooled OLS/WLS regression; default;
• noconstant: suppress regression constant in pooled OLS/WLS regressions;
• ase: return (asymptotic) Driscoll-Kraay SE without small sample adjustment.
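The default lag rule m(T) = floor[4(T/100)^(2/9)] is easy to evaluate by hand. The sketch below simply codes the formula as stated in the options list; the function name is ours and is not part of xtscc.

```python
import math


def default_dk_lag(T):
    """Default maximum lag m(T) = floor[4 * (T/100)^(2/9)] as stated above."""
    return math.floor(4 * (T / 100) ** (2 / 9))


for T in (7, 30, 100):
    print(T, default_dk_lag(T))
```

For the cigarette panel with T = 30 the rule gives m = 3, so the lag(4) used in the illustrations of this section is a slightly more generous choice than the default.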


Serial Correlation & Heteroskedasticity in Long Panels

. * Comparison of various pooled OLS and GLS estimators

. quietly xtpcse LnC LnP LnNDI LnPmin Year, corr(ind) independent nmk

. estimates store OLS_iid

. quietly xtpcse LnC LnP LnNDI LnPmin Year, corr(ind)

. estimates store OLS_cor

. quietly xtscc LnC LnP LnNDI LnPmin Year, lag(4)

. estimates store OLS_DK

. quietly xtpcse LnC LnP LnNDI LnPmin Year, corr(ar1)

. estimates store AR1_cor

. quietly xtgls LnC LnP LnNDI LnPmin Year, corr(ar1) panels(iid)

. estimates store FGLSAR1

. quietly xtgls LnC LnP LnNDI LnPmin Year, corr(ar1) panels(correlated)

. estimates store FGLSCAR

The commands above use xtpcse, xtgls, and the user-written xtscc (which needs separate installation) to obtain the following pooled estimators and the associated standard errors:
1) pooled OLS with iid errors;
2) pooled OLS with standard errors assuming correlation over states;
3) pooled OLS with standard errors allowing general serial correlation in the error (up to 4 lags) and correlation over states;
4) pooled OLS that assumes an AR(1) error and gets standard errors that additionally permit correlation over states;
5) pooled FGLS with standard errors assuming an AR(1) error; and
6) pooled FGLS assuming an AR(1) error and correlation across states.
In the AR(1) cases, $\rho_i = \rho$.


Serial Correlation & Heteroskedasticity in Long Panels

. estimates table OLS_iid OLS_cor OLS_DK AR1_cor FGLSAR1 FGLSCAR, b(%7.3f) se

--------------------------------------------------------------------------
    Variable | OLS_iid   OLS_cor    OLS_DK   AR1_cor   FGLSAR1   FGLSCAR
-------------+------------------------------------------------------------
         LnP |  -0.786    -0.786    -0.786    -0.311    -0.308    -0.360
             |   0.130     0.172     0.262     0.050     0.049     0.026
       LnNDI |   0.412     0.412     0.412     0.458     0.460     0.490
             |   0.047     0.075     0.154     0.118     0.088     0.077
      LnPmin |  -0.049    -0.049    -0.049    -0.003    -0.001    -0.009
             |   0.123     0.157     0.243     0.067     0.060     0.034
        Year |   0.015     0.015     0.015    -0.024    -0.024    -0.023
             |   0.007     0.006     0.012     0.011     0.008     0.007
       _cons |   3.469     3.469     3.469     3.930     3.931     3.778
             |   0.174     0.245     0.462     0.418     0.310     0.302
--------------------------------------------------------------------------
                                                              legend: b/se

For pooled OLS with iid errors, the nmk option normalizes the VCE by N−k rather than N, so that the output is exactly the same as that from regress with default standard errors. The same could be obtained by using xtgls with the corr(ind) panels(iid) nmk options.


Serial Correlation & Heteroskedasticity in Long Panels

. xtscc LnC LnP LnNDI LnPmin Year, fe lag(4)

Regression with Driscoll-Kraay standard errors   Number of obs     =       300
Method: Fixed-effects regression                 Number of groups  =        10
Group variable (i): State                        F(  4,    29)     =    117.61
maximum lag: 4                                   Prob > F          =    0.0000
                                                 within R-squared  =    0.6392

------------------------------------------------------------------------------
             |             Drisc/Kraay
         LnC |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         LnP |  -1.238273   .1367816    -9.05   0.000    -1.518023   -.9585236
       LnNDI |   .7553486   .1595965     4.73   0.000     .4289371     1.08176
      LnPmin |   .4683234   .1679333     2.79   0.009     .1248613    .8117856
        Year |  -.0152654   .0126641    -1.21   0.238    -.0411663    .0106355
       _cons |   2.562785    .463528     5.53   0.000     1.614764    3.510806
------------------------------------------------------------------------------

A final illustration is xtscc with the fe option (the default is pooled).

Compared with the results from xtscc LnC LnP LnNDI LnPmin Year, lag(4), we see that LnPmin becomes significant.
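The change in significance of LnPmin can be seen by comparing t ratios. The sketch below uses the FE figures from the output above and the pooled figures from the OLS_DK column of the earlier estimates table; the critical value 2.045 (the 97.5% quantile of a t distribution with 29 degrees of freedom) is applied to both as a rough, assumed benchmark.

```python
# LnPmin under xtscc with fe and lag(4): coefficient and Driscoll-Kraay SE
coef_fe, se_fe = 0.4683234, 0.1679333
# LnPmin under pooled xtscc with lag(4): OLS_DK column of the estimates table
coef_pooled, se_pooled = -0.049, 0.243

t_crit = 2.045   # approx. 97.5% quantile of t(29); assumed benchmark

t_fe = coef_fe / se_fe
t_pooled = coef_pooled / se_pooled

print(round(t_fe, 2), abs(t_fe) > t_crit)
print(round(t_pooled, 2), abs(t_pooled) > t_crit)
```

The FE t ratio reproduces the 2.79 shown in the output and exceeds the benchmark, while the pooled ratio is close to zero; the sign of the coefficient also changes once state fixed effects are included.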

Unit Roots and Cointegration

The long panel methods considered so far depend on the stationarity of the time series, i.e., $|\rho_i| < 1$, i = 1, …, N.

The literature on panel methods for unit roots and cointegration is large, and it remains an active area of research.

In standard applications of long panel methods, it is of interest to test for the existence of unit roots and cointegration.

Panel unit-root tests: The Stata command xtunitroot (https://www.stata.com/features/overview/panel-data-unit-root-tests/) provides tests appropriate for all types of panel data: short, long, or large panels. A detailed treatment of these tests is beyond the scope of the course.

Panel cointegration tests: The Stata command xtcointtest (https://www.stata.com/new-in-stata/panel-data-cointegration-tests/) implements a variety of tests for panel data with large N and large T. This seems to be a feature added in Stata 15. Again, a detailed treatment of this topic is beyond the scope of the course.
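Why the condition $|\rho_i| < 1$ matters can be seen from the stationary variance of an AR(1) process, $\sigma_\epsilon^2 / (1 - \rho^2)$, a standard result. The sketch below evaluates it for a few illustrative values of $\rho$; the function name is ours.

```python
def ar1_stationary_variance(rho, sigma_eps2=1.0):
    """Variance of a stationary AR(1) process v_t = rho * v_{t-1} + eps_t."""
    if abs(rho) >= 1:
        raise ValueError("no stationary variance when |rho| >= 1")
    return sigma_eps2 / (1 - rho ** 2)


for rho in (0.0, 0.5, 0.9, 0.99):
    print(rho, round(ar1_stationary_variance(rho), 2))
```

The variance grows without bound as $\rho$ approaches one, and at $\rho = 1$ (a unit root) it is not defined; this is the case in which the AR(1)-based methods of this section no longer apply and unit-root tests become relevant.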

5.4. Robust Estimation for Large Panels

Consider the one-way effects model:

$y_{it} = \alpha + X_{it}'\beta + \mu_i + v_{it}$, i = 1, …, N, t = 1, …, T,

where both N and T are 'large'.

The xtreg command works for large panels under iid assumptions on $v_{it}$.

An alternative and better procedure, xtregar, allows an AR(1) error $v_{it} = \rho v_{i,t-1} + \epsilon_{it}$, with a single autoregressive parameter $\rho$ common to all i.

. * Comparison of various RE and FE estimators with full cigarette demand data

. quietly xtscc LnC LnP LnNDI LnPmin, lag(4)

. estimates store OLS_DK

. quietly xtreg LnC LnP LnNDI LnPmin, fe

. estimates store FE_REG

. quietly xtreg LnC LnP LnNDI LnPmin, re

. estimates store RE_REG

. quietly xtregar LnC LnP LnNDI LnPmin, fe

. estimates store FE_REGAR

. quietly xtregar LnC LnP LnNDI LnPmin, re

. estimates store RE_REGAR

. quietly xtscc LnC LnP LnNDI LnPmin, fe lag(4)

. estimates store FE_DK

Robust Estimation for Large Panels

. estimates table OLS_DK FE_REG RE_REG FE_REGAR RE_REGAR FE_DK, b(%7.3f) se

--------------------------------------------------------------------------
    Variable |  OLS_DK    FE_REG    RE_REG   FE_RE~R   RE_RE~R    FE_DK
-------------+------------------------------------------------------------
         LnP |  -1.107    -0.886    -0.889    -0.388    -0.413    -0.886
             |   0.049     0.037     0.037     0.024     0.024     0.072
       LnNDI |   0.569     0.512     0.512     0.213     0.250     0.512
             |   0.048     0.014     0.014     0.026     0.019     0.031
      LnPmin |   0.358     0.207     0.210     0.017     0.016     0.207
             |   0.049     0.037     0.037     0.026     0.026     0.067
       _cons |   2.908     3.111     3.108     4.452     4.221     3.111
             |   0.208     0.060     0.064     0.017     0.099     0.137
--------------------------------------------------------------------------
                                                              legend: b/se

Indeed, xtregar gives more efficient estimators than xtreg does.

The last set of results, from "xtscc LnC LnP LnNDI LnPmin, fe lag(4)", are the standard within estimates, but with standard errors that are robust to both spatial and temporal correlation of the error.

However, the standard errors produced by xtscc are much larger, … .
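How much larger the Driscoll-Kraay standard errors are can be quantified from the FE_REG and FE_DK columns of the table. This is only a sketch of the arithmetic on the rounded figures shown there.

```python
# Standard errors of the within estimator, copied from the estimates table
se_fe_default = {"LnP": 0.037, "LnNDI": 0.014, "LnPmin": 0.037}   # FE_REG
se_fe_dk = {"LnP": 0.072, "LnNDI": 0.031, "LnPmin": 0.067}        # FE_DK

ratios = {name: se_fe_dk[name] / se_fe_default[name] for name in se_fe_default}

for name, ratio in ratios.items():
    print(name, round(ratio, 2))
```

The point estimates in the two columns are identical; only the standard errors differ, and the robust ones are roughly twice the default ones for every slope coefficient.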