Inference issues in OLS. Amine Ouazad, Assistant Professor of Economics.

TRANSCRIPT

  • Slide 1
  • Inference issues in OLS Amine Ouazad Ass. Prof. of Economics
  • Slide 2
  • Outline: 1. Heteroscedasticity. 2. Clustering. 3. Generalized Least Squares (a. for heteroscedasticity, b. for autocorrelation).
  • Slide 3
  • HETEROSCEDASTICITY
  • Slide 4
  • Issue. The issue arises whenever the variance of the residuals depends on the observation, or on the value of the covariates.
  • Slide 5
  • Example #1
  • Slide 6
  • Example #2. Here Var(y|x) is clearly increasing in x. Notice how the homoscedastic formula underestimates the width of the confidence intervals.
  • Slide 7
  • Visual checks with multiple variables. Use the vector of estimates b and predict E(Y|X) with the Stata command predict xb, xb. Then draw the scatter plot of the dependent variable y against the prediction Xb on the horizontal axis.
  • Slide 8
  • Causes. An unobservable that affects the variance of the residuals, but not their mean conditional on x. Take y = a + bx + e with e = zη. The shock η satisfies E(η|x) = 0, so E(e|x) = 0 (exogeneity holds), but Var(e|x) = Var(zη|x) depends on x through the unobservable z (previous example #1). In practice, most regressions have heteroskedastic residuals.
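The mechanism on this slide can be checked in a short simulation. This is a sketch, not part of the original slides: the scale z is taken to be the hypothetical function 1 + 2x purely for illustration, so that the mean of the residual is zero everywhere while its variance grows with x.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
x = rng.uniform(0, 1, n)
z = 1 + 2 * x                 # hypothetical unobservable scale, varies with x
eta = rng.normal(0, 1, n)     # shock with E(eta|x) = 0
e = z * eta                   # residual: conditional mean zero, variance depends on x

# split the sample by x: exogeneity holds everywhere, but the spread differs
lo, hi = e[x < 0.2], e[x > 0.8]
mean_lo, mean_hi = lo.mean(), hi.mean()   # both close to 0
var_lo, var_hi = lo.var(), hi.var()       # var_hi clearly larger
```

Comparing `var_lo` and `var_hi` reproduces the pattern of example #1: E(e|x) = 0 for every x, yet Var(e|x) increases with x.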
  • Slide 9
  • Examples. The variability of stock returns depends on the industry: StockReturn_it = a + b MarketReturn_t + ε_it. The variability of unemployment depends on the state/country: Unemployment_it = a + b GDPGrowth_t + ε_it. Notice that both the inclusion of industry/state dummies and controlling for heteroskedasticity may be necessary.
  • Slide 10
  • Heteroscedasticity: the framework. Write Var(ε_i|x) = σ²ω_i, with the ω_i all positive and normalized so that their sum is equal to n. The trace of the matrix Ω = diag(ω_1, …, ω_n) (see matrix appendix) is therefore equal to n.
  • Slide 11
  • Consequences. 1. The OLS estimator is still unbiased, consistent, and asymptotically normal (this only depends on A1-A3). 2. But the OLS estimator is then inefficient (the proof of the Gauss-Markov theorem relies on homoscedasticity). 3. And the confidence intervals calculated assuming homoscedasticity typically overestimate the precision of the estimates, i.e. underestimate the width of the confidence intervals.
  • Slide 12
  • Variance-covariance matrix of the estimator. At finite and fixed sample size, Var(b|X) = σ²(X′X)⁻¹(Σ_i ω_i x_i x_i′)(X′X)⁻¹, where x_i is the i-th vector of covariates, a vector of size K; an analogous sandwich formula holds asymptotically. Notice that if the ω_i are all equal to 1, we are back to the homoscedastic case and get Var(b|X) = σ²(X′X)⁻¹. We use the finite-sample formula to design an estimator of the variance-covariance matrix.
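The sandwich formula above can be written out numerically; a minimal numpy sketch, using simulated covariates as a stand-in for real data, also verifies the collapse to σ²(X′X)⁻¹ when all ω_i = 1:

```python
import numpy as np

rng = np.random.default_rng(1)
n, K = 200, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
sigma2 = 2.0
omega = np.ones(n)            # homoscedastic case: all omega_i equal to 1

XtX_inv = np.linalg.inv(X.T @ X)
# Var(b|X) = sigma^2 (X'X)^-1 (sum_i omega_i x_i x_i') (X'X)^-1
middle = (X * omega[:, None]).T @ X
V = sigma2 * XtX_inv @ middle @ XtX_inv

# with omega_i = 1 the sandwich collapses to sigma^2 (X'X)^-1
assert np.allclose(V, sigma2 * XtX_inv)
```

Replacing `omega` by any other positive weights gives the general heteroscedastic variance-covariance matrix.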
  • Slide 13
  • White heteroscedasticity-consistent estimator of the variance-covariance matrix: Est. Asy. Var(b) = (X′X)⁻¹(Σ_i e_i² x_i x_i′)(X′X)⁻¹. The formula uses the estimated residuals e_i of each observation, obtained with the OLS estimator of the coefficients. This formula is consistent (plim Est. Asy. Var(b) = Var(b)), but may yield excessively large standard errors for small sample sizes. This is the formula used by the Stata robust option. The square root of the k-th diagonal element is the standard error of the k-th coefficient.
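The White formula is short enough to implement directly. This is an illustrative numpy sketch of the same HC0 sandwich that underlies Stata's robust option (the function name `white_se` and the simulated data are mine, not from the slides):

```python
import numpy as np

def white_se(X, y):
    """OLS coefficients and White (HC0) robust standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    e = y - X @ b                              # estimated residuals e_i
    meat = (X * (e ** 2)[:, None]).T @ X       # sum_i e_i^2 x_i x_i'
    V = XtX_inv @ meat @ XtX_inv               # sandwich formula
    return b, np.sqrt(np.diag(V))              # square ROOT of the diagonal

# heteroscedastic example: residual spread grows with x
rng = np.random.default_rng(2)
n = 5000
x = rng.uniform(size=n)
y = 1 + 2 * x + (1 + 3 * x) * rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
b, se = white_se(X, y)
```

The coefficients are the usual OLS estimates; only the standard errors change relative to the homoscedastic formula.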
  • Slide 14
  • Test for heteroscedasticity. Null hypothesis H0: σ_i² = σ² for all i = 1, 2, …, n. Alternative hypothesis Ha: at least one residual has a different variance. Steps: 1. Estimate the OLS and predict the residuals e_i. 2. Regress the square of the residuals on a constant, the covariates, their squares, and their cross products (P covariates). 3. Under the null, all of the coefficients should be equal to 0, and NR² of the regression is distributed as a χ² with P − 1 degrees of freedom.
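The three steps above can be sketched in Python for the single-covariate case (so the auxiliary regressors are just a constant, x, and x², i.e. P = 3; the data-generating process is an assumption for illustration):

```python
import numpy as np
from scipy.stats import chi2

def white_test(X, y):
    """White's test, single-covariate case: regress e^2 on (1, x, x^2); NR^2 ~ chi2(P-1)."""
    n = len(y)
    b = np.linalg.lstsq(X, y, rcond=None)[0]      # step 1: OLS
    e2 = (y - X @ b) ** 2                         #          squared residuals
    x = X[:, 1]
    Z = np.column_stack([np.ones(n), x, x ** 2])  # step 2: auxiliary regressors, P = 3
    g = np.linalg.lstsq(Z, e2, rcond=None)[0]
    resid = e2 - Z @ g
    R2 = 1 - resid.var() / e2.var()
    stat = n * R2                                 # step 3: NR^2
    return stat, chi2.sf(stat, df=Z.shape[1] - 1) # P - 1 degrees of freedom

rng = np.random.default_rng(3)
n = 2000
x = rng.uniform(size=n)
X = np.column_stack([np.ones(n), x])
y_het = 1 + 2 * x + (1 + 3 * x) * rng.normal(size=n)  # clearly heteroscedastic
stat, p = white_test(X, y_het)
```

With residual spread growing in x, the test rejects the null: `p` comes out far below 0.05.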
  • Slide 15
  • Suggests another visual check Examples #1 and #2 with one covariate. Example with two covariates.
  • Slide 16
  • Stata takeaways. Always use robust standard errors; the robust option is available for most regressions. This is regardless of the use of covariates: adding a covariate does not free you from the burden of heteroscedasticity. Test for heteroscedasticity: hettest reports the chi-squared statistic with P − 1 degrees of freedom and the p-value; a p-value lower than 0.05 rejects the null at 95%. The test may be used with small sample sizes, to avoid the use of robust standard errors.
  • Slide 17
  • CLUSTERING
  • Slide 18
  • Clustering, example #1. The typical problem with clustering is the existence of a common unobservable component, common to all observations in a country, a state, a year, etc. Take y_it = x_it β + ε_it, a panel dataset where the residual ε_it = u_i + η_it. Exercise: calculate the variance-covariance matrix of the residuals.
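One way to see the answer to the exercise: with ε_it = u_i + η_it and independent components, the residual covariance matrix is block-diagonal, with σ_u² + σ_η² on the diagonal and σ_u² within each group. A sketch with made-up variances and group sizes:

```python
import numpy as np

# residual e_it = u_i + eta_it, with p groups of m observations each
sigma_u2, sigma_eta2 = 1.0, 2.0   # illustrative variance components
p, m = 3, 4

# one within-group block: sigma_u^2 everywhere, plus sigma_eta^2 on the diagonal
block = sigma_u2 * np.ones((m, m)) + sigma_eta2 * np.eye(m)
# full matrix: block-diagonal, zero covariance across groups
V = np.kron(np.eye(p), block)
```

The diagonal entries equal σ_u² + σ_η², within-group off-diagonal entries equal σ_u², and entries across groups are zero.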
  • Slide 19
  • Clustering, example #2. Another occurrence of clustering is the use of data at a higher level of aggregation than the individual observation. Example: y_ij = x_ij β + z_j γ + ε_ij. In practice (though not as a matter of theory), this implies that Cov(ε_ij, ε_i′j) is nonzero. Examples: performance_it = c + β policy_j(i) + ε_it; stock return_it = constant + β Market_t + ε_it.
  • Slide 20
  • Moulton paper
  • Slide 21
  • The clustering model. Notice that the variance-covariance matrix can be written in blocks this way. In this model, the OLS estimator is unbiased and consistent, but inefficient, and the estimated variance-covariance matrix is biased.
  • Slide 22
  • True variance-covariance matrix. With all the covariates fixed within group, the variance-covariance matrix of the estimator is inflated by the factor 1 + (m − 1)ρ, where m = n/p is the number of observations per group and ρ the within-group correlation of the residuals. This formula is not exact when there are individual-specific covariates, but the term 1 + (m − 1)ρ can still be used as an approximate correction factor.
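The size of the Moulton correction is easy to underestimate; a one-line sketch (the numbers 50 and 0.1 are illustrative, not from the slides):

```python
import numpy as np

def moulton_corrected_se(se_ols, m, rho):
    """Inflate a conventional OLS standard error by the Moulton factor
    sqrt(1 + (m - 1) * rho): m observations per group, rho = within-group
    residual correlation."""
    return se_ols * np.sqrt(1 + (m - 1) * rho)

# even a modest rho = 0.1 with 50 observations per group
# inflates the standard error by sqrt(5.9), i.e. it more than doubles
se = moulton_corrected_se(0.02, m=50, rho=0.1)
```

This is why a small within-group correlation can overturn significance results computed with conventional standard errors.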
  • Slide 23
  • Descriptive Statistics
  • Slide 24
  • Slide 25
  • Stata: regress y x, cluster(unit) robust. Clustering and robust s.e.s should be used at the same time. This is the OLS estimator with corrected standard errors. If x includes unit-specific variables, we cannot add a unit (state/firm/industry) dummy as well.
  • Slide 26
  • Multi-way clustering: "Robust Inference with Multi-Way Clustering", Cameron, Gelbach and Miller, NBER Technical Working Paper 327 (2006). It has very recently become the norm. Example: clustering by year and state. y_it = x_it β + z_i γ + w_t δ + ε_it. What do you expect? ivreg2, cluster(id year). Install with ssc install ivreg2.
  • Slide 27
  • GENERALIZED LEAST SQUARES
  • Slide 28
  • OLS is BLUE only under A4. OLS is not BLUE if the variance-covariance matrix of the residuals is not spherical (σ²I). What should we do? Take the general OLS model Y = Xβ + ε and assume that Var(ε) = σ²Ω. Then take the square root of the matrix, Ω⁻¹/²: this is a matrix that satisfies Ω⁻¹ = (Ω⁻¹/²)′Ω⁻¹/². Such a matrix exists for any positive definite matrix Ω.
  • Slide 29
  • Sphericized model. The sphericized model is: Ω⁻¹/²Y = Ω⁻¹/²Xβ + Ω⁻¹/²ε. This model satisfies A4 since Var(Ω⁻¹/²ε|X) = σ²I.
  • Slide 30
  • Generalized Least Squares. The GLS estimator is b_GLS = (X′Ω⁻¹X)⁻¹X′Ω⁻¹Y. This estimator is BLUE: it is the efficient estimator of the parameter β. The estimator is also consistent and asymptotically normal. Exercise: prove that the estimator is unbiased and that it is consistent.
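The two previous slides translate directly into code: sphericize with Ω⁻¹/², then run OLS on the transformed model. A numpy sketch with a known, assumed diagonal Ω (heteroscedastic variances proportional to exp(x), chosen purely for illustration):

```python
import numpy as np

def gls(X, y, Omega):
    """GLS estimator b = (X' Omega^-1 X)^-1 X' Omega^-1 y,
    computed by OLS on the sphericized model Omega^(-1/2) Y = Omega^(-1/2) X b + ..."""
    # symmetric square root via eigendecomposition (Omega positive definite)
    w, Q = np.linalg.eigh(Omega)
    Om_half_inv = Q @ np.diag(w ** -0.5) @ Q.T
    Xs, ys = Om_half_inv @ X, Om_half_inv @ y   # transformed model satisfies A4
    return np.linalg.lstsq(Xs, ys, rcond=None)[0]

rng = np.random.default_rng(4)
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
omega = np.exp(x)                               # Var(e_i) proportional to exp(x_i)
y = 1 + 2 * x + np.sqrt(omega) * rng.normal(size=n)
b = gls(X, y, np.diag(omega))
```

Running OLS on the sphericized data and applying the closed-form (X′Ω⁻¹X)⁻¹X′Ω⁻¹Y give the same numbers; the eigendecomposition route is just one way to build Ω⁻¹/².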
  • Slide 31
  • Feasible Generalized Least Squares. The matrix Ω is in general unknown. We estimate W using a procedure (see later) so that plim W = Ω. Then the FGLS estimator b_FGLS = (X′W⁻¹X)⁻¹X′W⁻¹Y is a consistent estimator of β. The typical problem is the estimation of Ω: there is no one-size-fits-all estimation procedure.
  • Slide 32
  • GLS for heteroscedastic models. Take the formula of the GLS estimator with a diagonal variance-covariance matrix: each observation is weighted by the inverse of ω_i (or the inverse of σ_i²; scaling the weights has no impact). Stata application exercise: calculate the weights and use the weighted OLS estimator, regress y x [aweight=w], to calculate the heteroscedastic GLS estimator on a dataset of your choice.
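The weighted-OLS view of heteroscedastic GLS, and the claim that rescaling the weights changes nothing, can both be checked in a few lines (the variance function (1 + 2x)² is assumed known here purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 1000
x = rng.uniform(size=n)
sig2 = (1 + 2 * x) ** 2                 # assumed-known residual variances
y = 1 + 2 * x + np.sqrt(sig2) * rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

# weighted OLS: scale each row of X and y by 1/sigma_i,
# i.e. weight squared errors by 1/sigma_i^2
w = 1 / np.sqrt(sig2)
b_wls = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)[0]

# multiplying every weight by the same constant leaves the estimate unchanged
b_scaled = np.linalg.lstsq(X * (5 * w)[:, None], y * (5 * w), rcond=None)[0]
```

`b_wls` and `b_scaled` coincide, which is why only relative weights matter in Stata's [aweight=w].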
  • Slide 33
  • GLS for autocorrelation. Autocorrelation is pervasive in finance. Assume that ε_t = ρε_{t−1} + u_t (we say that ε_t is AR(1)), where u_t is the innovation, uncorrelated with ε_{t−1}. The problem is the estimation of ρ. A natural estimator of ρ is the coefficient of the regression of e_t on e_{t−1}. Exercise 1 (for adv. students): find the inverse of Ω. Exercise 2 (for adv. students): find Ω for an AR(2) process. Exercise 3 (for adv. students): what about MA(2)? Variation: panel-specific AR(1) structure.
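The "natural estimator" of ρ on this slide is a one-line regression without a constant. A simulation sketch (ρ = 0.6 is an arbitrary illustrative value):

```python
import numpy as np

rng = np.random.default_rng(6)
T, rho = 10_000, 0.6
u = rng.normal(size=T)               # innovations, uncorrelated with past residuals
e = np.zeros(T)
for t in range(1, T):
    e[t] = rho * e[t - 1] + u[t]     # AR(1) residuals

# regress e_t on e_{t-1} (no constant): slope = sum e_t e_{t-1} / sum e_{t-1}^2
rho_hat = (e[1:] @ e[:-1]) / (e[:-1] @ e[:-1])
```

With a long series, `rho_hat` lands close to the true ρ; the estimated ρ is then plugged into Ω for FGLS.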
  • Slide 34
  • Autocorrelation example
  • Slide 35
  • GLS for clustered models. Correlation within each group. Exercise: write down the variance-covariance matrix of the residuals and put forward an estimator of Ω. What is the GLS estimator of β in Y = Xβ + ε with clustering? Estimation using xtreg, re.
  • Slide 36
  • Applications of GLS. The Generalized Least Squares model is seldom used: in practice, the variance of the OLS estimator is corrected for heteroscedasticity or clustering. Take-away: use regress, cluster(.) robust. Otherwise: xtgls, panels(hetero); xtgls, panels(correlated); xtgls, panels(hetero) corr(ar1). The GLS is mostly used for the estimation of random-effects models: xtreg, re.
  • Slide 37
  • CONCLUSION: NO WORRIES
  • Slide 38
  • Takeaways for this session. 1. Use regress, robust always, unless the sample size is small. 2. Use regress, robust cluster(unit) if you believe there are common shocks at the unit level, or you have included unit-level covariates. 3. Use ivreg2, cluster(unit1 unit2) for two-way clustering. 4. Use xtgls for the efficient FGLS estimator with correlated, AR(1), or heteroscedastic residuals. This might allow you to shrink the confidence intervals further, but beware that it is less standard than the previous methods.