
Econometrics

Alex’s ECON241 Notes

Alex Cooper

June 8, 2011

Contents

1 Basics

2 Two-Variable Regression

3 Nonlinear models

4 Analysis of variance (ANOVA)

5 Multiple regression

6 Reporting a regression model

7 Heteroscedasticity

8 Auto-correlation

9 Multicollinearity

10 Qualitative Analysis: Dummy Variables


1 Basics

1.1 Parameters

Sample mean X̄:

X̄ = (1/n) ∑ᵢ Xᵢ

Population mean µ, for discrete and continuous X respectively:

µ = ∑ X · P(X)

µ = E[X] = ∫ x · p(x) dx   (integrating over −∞ < x < ∞)

Sample variance S²:

S² = (1/(n−1)) ∑ᵢ (Xᵢ − X̄)²

Population variance σ²:

σ² = ∑ (X − µ)² · P(X)

σ² = E[(X − µ)²] = ∫ (x − µ)² · p(x) dx
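These definitions are easy to check by hand. A quick sketch in Python (the sample values are made up purely for illustration):

```python
# Hypothetical sample, made up purely for illustration.
X = [2.0, 4.0, 6.0, 8.0]
n = len(X)

# Sample mean: X_bar = (1/n) * sum_i X_i
X_bar = sum(X) / n

# Sample variance: S^2 = (1/(n-1)) * sum_i (X_i - X_bar)^2.
# The (n-1) divisor is what makes S^2 an unbiased estimator of sigma^2.
S2 = sum((x - X_bar) ** 2 for x in X) / (n - 1)

print(X_bar, S2)  # X_bar = 5.0, S2 = 20/3
```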

1.2 Properties of Estimators

An estimator θ̂ of a population parameter θ is unbiased if its expected value equals the parameter's true value:

E[θ̂] = θ

If P(θ̂ < θ) = P(θ̂ > θ), then θ is the median of the sampling distribution of θ̂. If E[θ̂] = θ, then θ is the mean of the sampling distribution of θ̂.

In the case that E[θ̂] ≠ θ, call θ̂ biased. The bias of θ̂ is the difference between its expectation and the true value:

BIAS(θ̂) = E[θ̂] − θ

Efficiency is both a relative and absolute concept.

Consider two unbiased estimators, θ̂₁ and θ̂₂. If V[θ̂₁] < V[θ̂₂], then θ̂₁ is relatively more efficient than θ̂₂.

θ̂ is absolutely efficient if it is at least as efficient as any other unbiased estimator of θ.

For X ∼ [µ, σ²], it can be shown that the minimum variance of an unbiased estimator of µ is σ²/n. Therefore, X̄ is the efficient estimator of µ.


An estimator is said to be linear if it is a linear function of all the sample values. Thus X̄ is a linear estimator of µ:

X̄ = (1/n) ∑ᵢ Xᵢ = (1/n)X₁ + (1/n)X₂ + · · · + (1/n)Xₙ

An estimator is a best linear unbiased estimator (B.L.U.E.) if it is (1) linear,(2) unbiased, and (3) efficient.


2 Two-Variable Regression

The model is given as a Population Regression Function, where εᵢ is the ith value of an unobservable random error which has an average value of zero.

Yi = (β1 + β2Xi) + εi

i = 1, 2, . . . , n

We estimate this model with a Sample Regression Function:

Yᵢ = b₁ + b₂Xᵢ + eᵢ

where b₁ estimates β₁, b₂ estimates β₂, and eᵢ estimates εᵢ. eᵢ is referred to as the ith residual.

This yields the estimated model:

Ŷᵢ = b₁ + b₂Xᵢ

And note that the residual is the difference between the actual value Yᵢ and the estimated value Ŷᵢ:

eᵢ = Yᵢ − Ŷᵢ

2.1 Assumptions

1. Y is an approximate linear function of X:

Yi = β1 + β2Xi + εi

2. The expected value of the error term is zero

E[εi] = 0

3. The variance of the error term is constant

V [εi] = σ2

4. The error terms are statistically independent:

E[εᵢ · εⱼ] = 0 (i ≠ j)

5. The explanatory variable X and the error term ε are statistically independent:

E[Xᵢ · εⱼ] = 0 (∀ i, j)

6. The error terms are normally distributed:

ε ∼ Normal

Note that assumptions 2, 3, 4 and 6 jointly imply that:

ε ∼ N.I.D.[0, σ²]


2.2 Terminology

Y : Explained variable, dependent variable, regressand
X : Explanatory variable, independent variable, regressor
ε : Random error term, disturbance term
e : Estimated value of ε, residual

2.3 Estimation of regression model

Ordinary least squares (OLS) estimators for β1 and β2:

b₂ = (∑XY − nX̄Ȳ) / (∑X² − nX̄²)

b₁ = Ȳ − b₂X̄

Different forms of the b₂ equation:

b₂ = (∑XY − nX̄Ȳ) / (∑X² − nX̄²)
   = (n∑XY − (∑X)(∑Y)) / (n∑X² − (∑X)²)
   = ∑(X − X̄)(Y − Ȳ) / ∑(X − X̄)²

The last version is often written as ∑xy / ∑x².
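A small numerical check that the three forms of b₂ really are algebraically equivalent. The (X, Y) sample is invented for illustration:

```python
# Hypothetical (X, Y) sample, used only to check that the three
# algebraically equivalent forms of b2 agree.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 3.0, 5.0, 4.0, 6.0]
n = len(X)
Xb = sum(X) / n
Yb = sum(Y) / n

sXY = sum(x * y for x, y in zip(X, Y))
sX2 = sum(x * x for x in X)

# Form 1: (sum XY - n*Xbar*Ybar) / (sum X^2 - n*Xbar^2)
b2_a = (sXY - n * Xb * Yb) / (sX2 - n * Xb ** 2)

# Form 2: (n*sum XY - sumX*sumY) / (n*sum X^2 - (sumX)^2)
b2_b = (n * sXY - sum(X) * sum(Y)) / (n * sX2 - sum(X) ** 2)

# Form 3 (deviation form): sum xy / sum x^2
b2_c = sum((x - Xb) * (y - Yb) for x, y in zip(X, Y)) / sum((x - Xb) ** 2 for x in X)

b1 = Yb - b2_a * Xb  # intercept: b1 = Ybar - b2 * Xbar
```

For this sample all three forms give b₂ = 0.9 and b₁ = 1.3.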

If the assumptions of the regression model are valid, then the OLS estimatesare BLUE:

1. Both coefficients b1 and b2 are linear functions of Y.

2. OLS estimates are unbiased:

E[bj ] = βj

3. OLS estimators are efficient (Gauss-Markov Theorem):

V[bⱼ] ≤ V[β̃ⱼ]

where β̃ⱼ is any other linear unbiased estimator of βⱼ.


2.4 Estimation of the error variance σ²

The sample estimator S²ε is an unbiased estimator of the error variance σ²:

S²ε = (1/(n−2)) ∑ᵢ e²ᵢ
    = (1/(n−2)) ∑ᵢ (Yᵢ − Ŷᵢ)²
    = (1/(n−2)) ∑ᵢ (Yᵢ − b₁ − b₂Xᵢ)²

2.5 Coefficient of determination R²

Measures the "goodness of fit" of the regression model.

R² = (variation in Y explained by model) / (total variation in Y)
   = (total variation in Y − unexplained variation in Y) / (total variation in Y)
   = (∑(Y − Ȳ)² − ∑e²) / ∑(Y − Ȳ)²
   = 1 − ∑e² / ∑(Y − Ȳ)²

R² is the proportion of the variability in Y explained by the regression model, i.e. by the variability of X.
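A sketch of computing R² from the residuals, on the same kind of made-up sample as above:

```python
# Invented sample; fit by OLS, then compute R^2 = 1 - RSS/TSS.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 3.0, 5.0, 4.0, 6.0]
n = len(X)
Xb, Yb = sum(X) / n, sum(Y) / n
b2 = sum((x - Xb) * (y - Yb) for x, y in zip(X, Y)) / sum((x - Xb) ** 2 for x in X)
b1 = Yb - b2 * Xb

# Residuals e_i = Y_i - (b1 + b2 * X_i)
e = [y - (b1 + b2 * x) for x, y in zip(X, Y)]
rss = sum(ei ** 2 for ei in e)            # unexplained variation
tss = sum((y - Yb) ** 2 for y in Y)       # total variation
R2 = 1 - rss / tss
```

For this sample RSS = 1.9, TSS = 10, so R² = 0.81: 81% of the variability in Y is explained by the model.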

2.6 Issues

1. The relationship between X and Y may not be approximately linear.

2. Y may be an approximate linear function of more than one explanatory variable.

3. The assumptions involving the error term, ε, may not be valid.

Also, we need at least two points to define a line. But if n = 2, then S²ε is not defined (division by zero). So using OLS, we can only estimate the two coefficients if n > 2. In practice, we need n much larger than that.

2.7 Assumptions about X

We don’t require that X is fixed in repeated samples (often required for basicregression). However, it is required that:

1. X is statistically independent of the error term, and


2. X is a stationary variable.

The population mean and variance of X and Y need to be constant over time. Nonstationarity can result in spurious regression, which is beyond the scope of this course.

(See Gujarati section 6.1.)

2.8 Estimating β2

For the slope β₂, we can show that:

1. The population mean of b₂ is β₂:

E[b₂] = β₂

2. The population variance of b₂, σ²_{b₂}, is:

V[b₂] = σ²_{b₂} = σ² / ∑ᵢ(Xᵢ − X̄)²

3. And b₂ is normally distributed:

b₂ ∼ N[β₂, σ²_{b₂}]

(b₂ − β₂)/σ_{b₂} ∼ Z

Since σ²_{b₂} is unknown, estimate it with S²_{b₂}, which is an unbiased estimator:

S²_{b₂} = S²ε / ∑ᵢ(Xᵢ − X̄)²   where   S²ε = (1/(n−2)) ∑ᵢ e²ᵢ

Note the (n − 2): because we're estimating a line, and a line needs two points to define it, estimating the two coefficients uses up two degrees of freedom.

It can be shown that:

(b₂ − β₂)/S_{b₂} = t ∼ t_{n−2}

Notation: S_{b₂} = se(b₂).

100(1 − α)% C.I. for β₂ = b₂ ± t_{n−2, α/2} · se(b₂)

Use the same se(b₂) in the test statistic when drawing inferences about β₂. Under H₀ : β₂ = β₂*, we have:

(b₂ − β₂*)/S_{b₂} = t ∼ t_{n−2}
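Putting the pieces of 2.8 together on an invented sample. The critical value t with 3 d.f. at α/2 = 0.025 is about 3.182 (from standard t tables):

```python
# Invented data; n - 2 = 3 degrees of freedom.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 3.0, 5.0, 4.0, 6.0]
n = len(X)
Xb, Yb = sum(X) / n, sum(Y) / n
Sxx = sum((x - Xb) ** 2 for x in X)
b2 = sum((x - Xb) * (y - Yb) for x, y in zip(X, Y)) / Sxx
b1 = Yb - b2 * Xb

# S^2_e = RSS / (n - 2), then S^2_b2 = S^2_e / sum_i (X_i - X_bar)^2
rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))
S2e = rss / (n - 2)
se_b2 = (S2e / Sxx) ** 0.5

t = (b2 - 0.0) / se_b2   # test statistic under H0: beta2 = 0
t_crit = 3.182           # t_{n-2, alpha/2} for alpha = 0.05 (from t tables)
ci = (b2 - t_crit * se_b2, b2 + t_crit * se_b2)

# Here t is about 3.58 > 3.182, so H0: beta2 = 0 is rejected at the 5% level,
# and the 95% C.I. for beta2 excludes zero.
```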


2.9 Estimating β1

It can be shown that:

1. The population mean of b₁ is β₁:

E[b₁] = β₁

2. The population variance of b₁, σ²_{b₁}, is:

V[b₁] = σ²_{b₁} = σ² ∑ᵢX²ᵢ / ( n ∑ᵢ(Xᵢ − X̄)² )

3. And b₁ is normally distributed.

Therefore, we have:

b₁ ∼ N[β₁, σ²_{b₁}]

(b₁ − β₁)/σ_{b₁} ∼ Z

Since σ²_{b₁} is unknown, estimate it with S²_{b₁}, which is unbiased:

S²_{b₁} = S²ε ∑ᵢX²ᵢ / ( n ∑ᵢ(Xᵢ − X̄)² )   where   S²ε = (1/(n−2)) ∑ᵢ e²ᵢ

We can also write S_{b₁} as se(b₁).

100(1 − α)% C.I. for β₁ = b₁ ± t_{n−2, α/2} · se(b₁)

Use the same se(b₁) in the test statistic when drawing inferences about β₁. Under H₀ : β₁ = β₁*, we have:

(b₁ − β₁*)/S_{b₁} = t ∼ t_{n−2}

2.10 Prediction interval for Y

Predicted values are found by plugging a value for X into the estimated model, i.e. X = Xₚ:

Ŷₚ = b₁ + b₂Xₚ

But since b₁ and b₂ are random variables, so is Ŷₚ. The forecast error (Yₚ − Ŷₚ) is the difference between the actual value Yₚ at Xₚ and the predicted value Ŷₚ.

It can be shown that:

(Yₚ − Ŷₚ) ∼ N[0, σ²_{FE}]


where the population variance of the forecast error is:

σ²_{FE} = σ² [ 1 + 1/n + (Xₚ − X̄)² / ∑ᵢ(Xᵢ − X̄)² ]

Standardization gives:

(Yₚ − Ŷₚ)/σ_{FE} ∼ Z

Since σ²_{FE} is unknown, we estimate it with the unbiased estimator S²_{FE}:

S²_{FE} = S²ε [ 1 + 1/n + (Xₚ − X̄)² / ∑ᵢ(Xᵢ − X̄)² ]   where   S²ε = (1/(n−2)) ∑ᵢ e²ᵢ

It can be shown that:

(Yₚ − Ŷₚ)/S_{FE} = t ∼ t_{n−2}

Therefore, when X = Xₚ, the 100(1 − α)% prediction interval is given by:

100(1 − α)% P.I. for Yₚ = Ŷₚ ± t_{n−2, α/2} · S_{FE}

Prediction bands are variable in size: they are smallest at Xₚ = X̄, and increase in magnitude the further Xₚ is from X̄.

Note: this is the prediction interval for Yₚ, and not for E[Yₚ], which isn't part of this course.
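A sketch of the prediction-band behaviour on invented data: S_{FE} is smallest at Xₚ = X̄ and grows as Xₚ moves away from X̄:

```python
# Invented sample; fit OLS, then evaluate the forecast-error standard error.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 3.0, 5.0, 4.0, 6.0]
n = len(X)
Xb, Yb = sum(X) / n, sum(Y) / n
Sxx = sum((x - Xb) ** 2 for x in X)
b2 = sum((x - Xb) * (y - Yb) for x, y in zip(X, Y)) / Sxx
b1 = Yb - b2 * Xb
S2e = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y)) / (n - 2)

def S_FE(Xp):
    """Estimated standard error of the forecast error at X = Xp."""
    return (S2e * (1 + 1 / n + (Xp - Xb) ** 2 / Sxx)) ** 0.5

# The band is narrowest at the sample mean of X and widens away from it.
widths = [S_FE(Xb), S_FE(Xb + 1), S_FE(Xb + 2)]
```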

2.11 Significance testing

We test the significance of the model by testing whether the slope β₂ is zero:

H₀ : β₂ = 0
H₁ : β₂ ≠ 0

t = (b₂ − β₂)/se(b₂) ∼ t_{n−2}

which under H₀ reduces to t = b₂/se(b₂).


3 Nonlinear models

Theory might suggest that the relationship between regressand and regressors is not linear. However, we can often transform the relationship into one that is approximately linear.

Lin-log model:

Y = β₁ + β₂ ln(X)

Log-log model (aka log-linear or double-log):

ln(Y ) = β1 + β2 ln(X)

For quadratics, we regress:

Y = β1 + β2X + β3Z + ε where Z = X2

We just require that Y has a linear relationship with each regressor. In other words, for each regressor Xᵢ, ∂Y/∂Xᵢ is a constant.

NB: for many relationships, OLS cannot be used to estimate β₁ and β₂:

Y = β1Xβ2 + ε

Y = (β1 +X)β2 + ε

Example of a log-log relationship given in lectures: the Cobb-Douglas production function. This is a short-run production function (short-run because the capital stock is presumed to be fixed):

Q = A·L^β·e^ε

which can be easily transformed into a double-log model:

ln Q = ln A + β ln L + ε

Writing β₁ = ln A and β₂ = β gives:

ln Q = β₁ + β₂ ln L + ε

3.1 Elasticity

Elasticity is defined as the percentage change in Y with respect to a percentage change in X. This turns out to be:

ε = (dY/dX) · (X/Y) = d ln Y / d ln X

In a double-log model ln Y = β₁ + β₂ ln X + ε, the elasticity of Y w.r.t. X corresponds to the slope coefficient, β₂. Note the elasticity is constant. A 1% change in X leads to a change in Y of β₂% (on average).
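A sketch of the double-log idea: if Q is generated exactly from a Cobb-Douglas function (parameter values invented, no error term), then OLS on (ln L, ln Q) recovers the elasticity as the slope:

```python
import math

# Invented Cobb-Douglas parameters and labour inputs (no noise).
A, beta = 2.0, 0.7
L = [1.0, 2.0, 4.0, 8.0, 16.0]
Q = [A * l ** beta for l in L]

# Transform to the double-log model: ln Q = ln A + beta * ln L
x = [math.log(l) for l in L]
y = [math.log(q) for q in Q]
n = len(x)
xb, yb = sum(x) / n, sum(y) / n

b2 = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y)) / sum((xi - xb) ** 2 for xi in x)
b1 = yb - b2 * xb

# b2 is the (constant) elasticity of Q w.r.t. L; exp(b1) recovers A.
```

With no error term the fit is exact, so b₂ = 0.7 and exp(b₁) = 2 up to rounding.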


3.1.1 Interpretation of β2

A change in lnX of 1 unit leads to a change in lnY of β2 units on average.

β₂ is the elasticity of Y w.r.t. X.

A 1% change in X leads to a change in Y of β2%, on average.

3.1.2 Interpretation of β1

If lnX is equal to zero (X = 1), then on average lnY will have a value equal to β₁.

3.1.3 Interpretation of b2

A change in lnX of 1 unit leads to a change in the estimated value of lnY of b₂ units, on average.

b2 is the estimated elasticity of Y w.r.t. X.

A 1% change in X leads to a change in the estimated value of Y of b₂%, on average.

3.1.4 Interpretation of b1

If lnX is equal to zero (X = 1), then on average the predicted or estimated value of lnY will have a value equal to b₁.


4 Analysis of variance (ANOVA)

Sample variance of dependent variable Y:

S²_Y = (1/(n−1)) ∑ᵢ (Yᵢ − Ȳ)²

The total variation in Y decomposes as:

∑(Yᵢ − Ȳ)² = ∑(Ŷᵢ − Ȳ)² + ∑e²ᵢ

Total variability of Y = Explained variability of Y + Unexplained variability of Y
Total SS = Explained SS + Unexplained SS
Total SS = SS from regression + SS of error term
Total SS = Explained SS + Residual SS

TSS = ESS + RSS

Degrees of freedom associated with these measures:

Total = Explained + Unexplained
(n − 1) = (k − 1) + (n − k)

Coefficient of determination:

R² = Explained variation / Total variation = ESS/TSS = ∑ᵢ(Ŷᵢ − Ȳ)² / ∑ᵢ(Yᵢ − Ȳ)²

4.1 ANOVA table

Anova from mean, k ≥ 2.

Source       SS               df        MSS = SS/df              F
Regression   ∑ᵢ(Ŷᵢ − Ȳ)²     (k − 1)   ∑ᵢ(Ŷᵢ − Ȳ)²/(k − 1)     F
Error        ∑ᵢ e²ᵢ          (n − k)   ∑ᵢ e²ᵢ/(n − k)           -
Total        ∑ᵢ(Yᵢ − Ȳ)²     (n − 1)   ∑ᵢ(Yᵢ − Ȳ)²/(n − 1)     -

Note the MSS for the regression doesn't really have a name, because there's no particular use for that number.

Estimator of σ²_Y:

S²_Y = (1/(n−1)) ∑(Y − Ȳ)²

Estimator of σ²ε:

S²ε = (1/(n−k)) ∑ e²ᵢ

And significance measures:


R² = ESS/TSS = ∑ᵢ(Ŷᵢ − Ȳ)² / ∑ᵢ(Yᵢ − Ȳ)²

F = (ESS/(k − 1)) / (RSS/(n − k)) = ( ∑ᵢ(Ŷᵢ − Ȳ)²/(k − 1) ) / ( ∑ᵢ e²ᵢ/(n − k) )

4.2 ANOVA test

H0 : ρXY = 0

H1 : ρXY ≠ 0

If H0 is true, then it can be shown that the sample statistic F has an F distri-bution with (k− 1) DoF in the numerator and (n−k) DoF in the denominator:

F = (Explained variation/(k − 1)) / (Unexplained variation/(n − k)) ∼ F_{k−1, n−k}

Note the relationship between F and R²:

R² = Explained variation / Total variation

F = (Explained variation/(k − 1)) / (Unexplained variation/(n − k))
  = (Explained variation/(k − 1)) / ((Total variation − Explained variation)/(n − k))
  = ((Explained variation/Total variation)/(k − 1)) / ((1 − Explained variation/Total variation)/(n − k))
  = (R²/(k − 1)) / ((1 − R²)/(n − k))

Note that if the numerator or denominator degrees of freedom for the F critical value are not available in the tables, then we need to interpolate. Linear interpolation is fine.

TODO: lots left out about rho

4.3 Two-variable case

TODO: lots left out about the relationship between F and the t statistic in thetwo variable case.

TODO: see p.25 in topic 13

TODO: ANOVA table


5 Multiple regression

The k-variable multiple linear regression model:

Y = β1 + β2X2 + β3X3 + · · ·+ βKXK + ε

Individual instances of this model can be written as:

Yi = β1 + β2Xi2 + β3Xi3 + · · ·+ βKXiK + εi

Sample multiple regression equation:

Yi = b1 + b2Xi2 + b3Xi3 + · · ·+ bKXiK + ei

Estimated multiple regression model:

Ŷᵢ = b₁ + b₂Xᵢ₂ + b₃Xᵢ₃ + · · · + bₖXᵢₖ

We choose b₁, b₂, . . . , bₖ to minimize RSS, which is a quadratic function of the parameters. Let the computer do it.
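A minimal sketch of what "let the computer do it" means: build the normal equations (X′X)b = X′Y and solve them. The data are invented, and Y is built exactly as 1 + 2·X₂ + 3·X₃ so the answer is known in advance; the `solve` helper is just a hand-rolled Gauss-Jordan elimination, not a library call:

```python
# Invented regressors (linearly independent) and an exact linear Y.
X2 = [1.0, 2.0, 3.0, 4.0, 5.0]
X3 = [1.0, 0.0, 2.0, 1.0, 3.0]
Y = [1 + 2 * x2 + 3 * x3 for x2, x3 in zip(X2, X3)]

# Design matrix rows: [1, X2, X3]
rows = [[1.0, x2, x3] for x2, x3 in zip(X2, X3)]

# Normal equations: (X'X) b = X'Y
XtX = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
XtY = [sum(r[i] * y for r, y in zip(rows, Y)) for i in range(3)]

def solve(A, c):
    # Gauss-Jordan elimination with partial pivoting; fine for a tiny system.
    m = len(A)
    M = [row[:] + [c[i]] for i, row in enumerate(A)]
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(m):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [M[i][m] / M[i][i] for i in range(m)]

b = solve(XtX, XtY)  # recovers [1, 2, 3] up to rounding
```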

5.1 Assumptions

Assumptions of the multiple regression model (pretty much the same as for two variables):

1. Y is an approximate linear function of X2, X3, . . . , XK .

2. The expected value of the error term is zero

E[εi] = 0

3. The population variance of the error term is constant

V [εi] = σ2

4. The error terms are statistically independent:

E[εᵢ · εⱼ] = 0 (i ≠ j)

5. The explanatory variables X2, X3, . . . , XK are linearly independent.

6. The explanatory variables are statistically independent of the error term.

7. The error terms are normally distributed:

ε ∼ Normal

Note that assumptions 2, 3, 4 and 7 jointly imply that the error terms are normally and independently distributed with a mean of zero and variance σ²:

ε ∼ N.I.D.[0, σ2]

OLS estimates of the regression coefficients in a multiple regression model areBLUE.


5.2 Variance of estimated parameters

It can be shown that S²ε is an unbiased estimator of σ²ε:

S²ε = (1/(n−k)) ∑ᵢ e²ᵢ = (1/(n−k)) ∑ᵢ (Yᵢ − Ŷᵢ)²

It can be shown that:

tᵢ = (bᵢ − βᵢ)/se(bᵢ) ∼ t_{n−k}

100(1 − α)% C.I. for βᵢ = bᵢ ± t_{n−k, α/2} · se(bᵢ)

Coefficient of determination R², where r_{Y,Ŷ} is the sample correlation coefficient between the actual values of Y and the predicted values Ŷ:

R² = r²_{Y,Ŷ}

0 ≤ R² ≤ 1

5.3 F test for significance of model

The F test is used to test the joint significance of the parameters, except β₁:

H₀ : β₂ = 0 and β₃ = 0 . . . and βₖ = 0
H₁ : β₂ ≠ 0 or β₃ ≠ 0 . . . or βₖ ≠ 0

If H0 is true, then none of the (k − 1) variables is significant in determining Y .

If H1 is true, then at least one of the variables is significant in determining Y .

Under H0:

F = (Explained variation/(k − 1)) / (Unexplained variation/(n − k)) = (R²/(k − 1)) / ((1 − R²)/(n − k)) ∼ F_{k−1, n−k}

Note that this is a test of whether R2 is significantly greater than 0.

The model can be significant even when R² is small: you can get perfectly good and useful models with very small R².

5.4 Interpretation of the F test

The F test is a test of the overall significance of the multiple regression model in explaining the variability of Y.

If H0 is rejected, then b2, b3, . . . , bk are said to be jointly significant. (At leastone of β2, β3, . . . , βk is likely to be non-zero.)

Alternatively, if H₀ is rejected then X₂, X₃, . . . , Xₖ are said to be jointly significant in explaining the variability of Y. (At least one of X₂, X₃, . . . , Xₖ is a statistically significant factor in determining Y.)

Also, if H₀ is rejected, we can conclude on the basis of the sample evidence that the Sample Coefficient of Determination is significantly greater than zero.


6 Reporting a regression model

6.1 Two variable case

Note that only one of the se and t rows is required: dividing a coefficient by its standard error gives its t value, and vice versa. In economics/finance usually just t is given; in statistics, se is usually reported.

Ŷ = b₁ + b₂X
(s.e.)   se(b₁)   se(b₂)
(t)      t_{b₁}   t_{b₂}

n = . . .   R² = . . .

6.2 Pro-forma report

The linear model explains 81.41% of the variation in the Number of Houses Sold in terms of variation in the Mortgage Interest Rate.

Both the estimated slope coefficient, b₂, and the estimated intercept, b₁, are strongly significant. (At the 5% significance level, the critical value is 2.306; at the 1% level it is 3.355.)

The Mortgage Interest Rate is a highly significant variable in determining theNumber of Houses Sold.

The estimated slope coefficient, 8.1310, indicates that if the Mortgage Interest Rate falls by 1 unit (e.g. from 13% to 12%), the Number of Houses Sold in one month will increase by 813, on average.

The estimated slope coefficient, 8.1310, indicates that if the Mortgage InterestRate is increased by 1 unit (e.g. from 13% to 14%), the Number of Houses Soldin one month will fall by 813, on average.

If the Mortgage Interest Rate is zero, the estimated model indicates, on average, that the Number of Houses Sold in one month (measured in 100s) is 127.34. That is, if the Mortgage Interest Rate is zero, then the actual Number of Houses Sold in one month is estimated or predicted to be 12,734.

However, it is not sensible to consider a housing loans market where the Mortgage Interest Rate is zero. It is also the case that zero is well below the observed range of sample values for the Mortgage Interest Rate on which the analysis is based. We have no sample information on the Number of Houses Sold when the Mortgage Interest Rate is below 10.5%. In these circumstances, interpretation of the estimated intercept is not meaningful.

The plot of the Number of Houses Sold against the Mortgage Interest Rate suggests that a curvilinear relationship exists between these two variables, with a negative but marginally increasing slope.

However, the plot of the residuals against the explanatory variable is inconclusive. The scatter of points may be consistent with a random scatter, which would suggest that the assumptions that are the basis of regression analysis are reasonably valid in this case, or they may be consistent with a nonrandom scatter, which would indicate that the assumptions underlying the regression analysis are not valid in this case.


With such a small sample, only 10 observations, caution has to be exercised inexamining the graphical evidence of these plots.

Taken together, these two plots place in doubt the validity of the assumptionson which regression analysis is based for the linear model.

On theoretical grounds, a linear model is unsatisfactory because it suggests that at high mortgage interest rates the Number of Houses Sold in one month would be negative. For this reason, together with the evidence from the first plot in particular, i.e. the scatter diagram of the Number of Houses Sold against the Mortgage Interest Rate, a nonlinear relationship should be investigated.

6.3 Multiple regression

Ŷ = b₁ + b₂X₂ + b₃X₃ + . . . + bₖXₖ
(s.e.)   se(b₁)   se(b₂)   se(b₃)   . . .   se(bₖ)
(t)      t_{b₁}   t_{b₂}   t_{b₃}   . . .   t_{bₖ}

n = . . .   R² = . . .   F = . . .


7 Heteroscedasticity

We assume that the variance of the error term ε is constant:

V [εi] = σ2

This is called the assumption of homoscedasticity. If this assumption is correct, the error terms are said to be homoscedastic; otherwise they are said to be heteroscedastic.

Heteroscedasticity: the variance of the error term is not constant:

V [εi] = σ2i

The consequences of heteroscedasticity are:

1. OLS estimates for the regression coefficients are inefficient (but remain unbiased).

2. Statistical inference is invalid (hypothesis tests, confidence intervals and prediction intervals).

7.1 BPG (Koenker) Test

Consider population regression model:

Yi = β1 + β2Xi2 + β3Xi3 + · · ·+ βkXik + εi

The null hypothesis is that the error terms are homoscedastic. The alternative hypothesis is that they are heteroscedastic (i.e. the population variance of at least one error term differs from the population variance of at least one other error term).

Test procedure:

1. Estimate the regression model using OLS:

Yi = b1 + b2Xi2 + b3Xi3 + · · ·+ bkXik

2. Construct the residuals from the estimated regression model, and square them:

e²ᵢ = (Yᵢ − Ŷᵢ)²

3. Use OLS to estimate the following auxiliary regression model against the (k − 1) auxiliary variables:

e²ᵢ = α₁ + α₂Xᵢ₂ + α₃Xᵢ₃ + · · · + αₖXᵢₖ + νᵢ

4. Construct the BPG (Koenker) test statistic:

LM = nR2


5. Under the null hypothesis of homoscedastic error terms in the original regression model, the BPG (Koenker) statistic has a χ² distribution with (k − 1) degrees of freedom:

LM = nR² ∼ χ²_{k−1}

6. For a given significance level α, compare the LM test statistic with the critical value from the χ²_{k−1} distribution, with a proportion α in the right-hand tail of the distribution.
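A sketch of the steps above on a tiny invented data set (far too small for the asymptotics to be trusted, but it shows the mechanics). The 5% critical value of χ² with 1 d.f. is 3.841:

```python
# Step 1: OLS of Y on a single X (invented data, so k = 2).
X = [1.0, 2.0, 3.0, 4.0]
Y = [5.0, -2.0, 5.0, 6.0]
n = len(X)

def ols(x, y):
    m = len(x)
    xb, yb = sum(x) / m, sum(y) / m
    b2 = sum((xi - xb) * (yi - yb) for xi, yi in zip(x, y)) / sum((xi - xb) ** 2 for xi in x)
    return yb - b2 * xb, b2

b1, b2 = ols(X, Y)

# Step 2: squared residuals from the original regression.
e2 = [(y - b1 - b2 * x) ** 2 for x, y in zip(X, Y)]

# Step 3: auxiliary regression of e^2 on X, and its R^2.
a1, a2 = ols(X, e2)
e2b = sum(e2) / n
rss_aux = sum((v - a1 - a2 * x) ** 2 for x, v in zip(X, e2))
tss_aux = sum((v - e2b) ** 2 for v in e2)
R2_aux = 1 - rss_aux / tss_aux

# Step 4: LM = n * R^2 of the auxiliary regression.
LM = n * R2_aux

# Steps 5-6: compare with the chi-squared critical value (1 d.f., 5%: 3.841).
reject = LM > 3.841
```

Here LM = 1.2 < 3.841, so we would not reject homoscedasticity (and with n = 4 the asymptotic test isn't valid anyway).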

NOTE: The original BPG and the BPG (Koenker) test are asymptotic tests—they are not valid in small samples.

Under BPG, the error terms are assumed to be normally distributed.

The Koenker variant does not depend on normality of the error terms (we say it is robust to the assumption of normality). Hence we prefer the Koenker variant.

Can use BPG/BPG (Koenker) to test for other forms of heteroscedasticity. Notconsidered in this course.

7.2 White’s test

White (1980) proposed a more robust test for heteroscedasticity, which depends on a large number of explanatory variables in the auxiliary regression model. This results in a loss of degrees of freedom, which is a disadvantage.

White’s test not considered in this course.


8 Auto-correlation

Autocorrelation is a violation of the regression assumption that the error terms are statistically independent. It is very common in time series data. First-order autocorrelation is represented by the model:

εt = ρεt−1 + ηt

1. ρ is a constant

2. −1 < ρ < 1

3. ηt ∼ [0, σ2]

4. Values of ηₜ are statistically independent: E[ηₜ · ηₛ] = 0 (t ≠ s)

Note that ρ is the population correlation coefficient between εt and εt−1.

Therefore, if ρ = 0 then εt and εt−1 are uncorrelated, so

εₜ = ηₜ ∼ [0, σ²]

However, if ρ ≠ 0 then εₜ and εₜ₋₁ are correlated and the assumptions of the regression model don't hold.

In this case, the errors are said to be first-order autocorrelated. Second and higher orders do exist, where error terms depend on more than one previous error term. Higher orders are useful for seasonal time-series (e.g. quarterly) data:

εt = ρ1εt−1 + ρ2εt−2 + ηt

The consequences of autocorrelated errors are as mentioned above:

1. OLS estimates for regression coefficients are inefficient.

2. The estimator for the variance of the error terms is invalid, and therefore statistical inference is invalid:

(a) Hypothesis test invalid

(b) Confidence intervals invalid

(c) Prediction intervals invalid

8.1 Durbin-Watson test

Tests for 1st-order autocorrelation. An older test which is widely available, though gradually being replaced in the literature. Despite this, much of the existing econometric and statistical literature includes references to it.

Assumptions for DW test are very similar to that of regression:

εt = ρεt−1 + ηt


−1 < ρ < 1

ηt ∼ N [0, σ2]

E[ηₜ · ηₛ] = 0 (t ≠ s)

Hypotheses (NB: one-sided alternatives ρ > 0 or ρ < 0 can also be used):

H₀ : ρ = 0
H₁ : ρ ≠ 0

Test statistic:

d = DW = ∑ₜ₌₂ⁿ (eₜ − eₜ₋₁)² / ∑ₜ₌₁ⁿ e²ₜ

Note that εₜ = ρεₜ₋₁ + ηₜ resembles a regression model. You could just regress eₜ on eₜ₋₁ and obtain an OLS estimate of ρ. Unfortunately the assumptions for the t-test break down, so we use the DW test as a proxy instead.

The D-W statistic is approximately related to ρ in large samples:

d ≈ 2 (1− ρ)

If ρ = 1 then d ≈ 0, and so on. Approximately, 0 < d < 4.

Note that the distribution of d depends on n, α and k′, so we have to look up two critical values. If d lies outside both critical values, then there is definitely a problem. Between the critical values, the "test is inconclusive", but in reality this indicates a problem.

0 < dL < dU < 2 < (4− dU ) < (4− dL) < 4

Procedure: reject H0 if d < dL or d > (4− dL).

If dL < d < dU or (4− dU ) < d < (4− dL) then the test is inconclusive.

Tables give the lower-tail value (for positive ρ); if the upper tail is needed, subtract the table value from 4.

Degrees of freedom k′ is the number of explanatory variables, excluding the constant:

k′ = k − 1
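The D-W statistic itself is easy to compute from the residuals. The residual series below is invented; its alternating pattern gives d > 2, which points to negative autocorrelation:

```python
# Invented OLS residuals with an alternating (negatively autocorrelated) pattern.
e = [1.0, -1.0, 2.0, -2.0, 1.0, -1.0]

# d = sum_{t=2}^{n} (e_t - e_{t-1})^2 / sum_{t=1}^{n} e_t^2
d = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e))) / sum(et ** 2 for et in e)

# Rough estimate of rho, to illustrate d ~ 2(1 - rho) in large samples
# (the approximation is poor at n = 6, but the sign comes out right).
rho_hat = sum(e[t] * e[t - 1] for t in range(1, len(e))) / sum(et ** 2 for et in e)

print(d)  # 3.5: well above 2, consistent with negative autocorrelation
```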


8.2 Breusch-Godfrey (BG) Test for 1st-order Autocorrelation

Population regression function:

Yt = β1 + β2Xt2 + β3Xt3 + · · ·+ βkXtk + εt (t = 1, 2, . . . , n)

Estimated regression equation:

Ŷₜ = b₁ + b₂Xₜ₂ + b₃Xₜ₃ + · · · + bₖXₜₖ

eₜ = Yₜ − Ŷₜ

If first-order autocorrelation occurs, i.e. ρ ≠ 0, then:

εt = ρεt−1 + ηt

−1 < ρ < 1

ηt ∼ n.i.d.[0, σ2]

NB: there is no problem with bias in the coefficient estimates (although they are inefficient), BUT the estimate of the error variance is biased. So all bets are off with inference.

If autocorrelation exists, then:

eₜ = Yₜ − Ŷₜ
   = (β₁ + β₂Xₜ₂ + β₃Xₜ₃ + · · · + βₖXₜₖ + εₜ) − (b₁ + b₂Xₜ₂ + b₃Xₜ₃ + · · · + bₖXₜₖ)
   = (β₁ − b₁) + (β₂ − b₂)Xₜ₂ + (β₃ − b₃)Xₜ₃ + · · · + (βₖ − bₖ)Xₜₖ + (ρεₜ₋₁ + ηₜ)

Note that the above is a regression equation. Remember that there is no bias in the regression coefficient estimates, so each (βⱼ − bⱼ) term approximately equals zero.

We can also write it as:

et = α1 + α2Xt2 + α3Xt3 + · · ·+ αkXtk + ρεt−1 + ηt

Since εₜ₋₁ is not known, estimate it with eₜ₋₁. The resulting auxiliary regression equation:

et ' α1 + α2Xt2 + α3Xt3 + · · ·+ αkXtk + ρet−1 + ηt

Steps to conduct the test:

1. Perform regression with the equation

Yt = b1 + b2Xt2 + b3Xt3 + · · ·+ bkXtk

2. Calculate OLS residuals:et = Yt − Yt


3. Perform regression on the auxiliary equation:

et ' α1 + α2Xt2 + α3Xt3 + · · ·+ αkXtk + ρet−1 + ηt

4. Calculate the test statistic from the coefficient of determination R² of the auxiliary regression, noting that we lose one observation:

LM = NR2

N = (n− 1)

5. Test against:

LM = NR² ∼ χ²₁

BG test is a two-tailed test of the following hypothesis:

H0 : ρ = 0

H₁ : ρ ≠ 0

Compare with D-W which is a one-sided test.

Note that this is a large-sample test. It may not be valid (or may have high error) in small samples.

NB: you can test for pth-order autocorrelation with LM = (N − p)R² ∼ χ²ₚ, which is slightly different to our version, but there's no appreciable difference in large samples. Not considered in this course.

Conclusion: The sample evidence is consistent with the existence of a seriousproblem of first-order autocorrelation in the error term.

8.3 Durbin-Watson versus Breusch-Godfrey test

The Durbin-Watson test is only valid for 1st-order autocorrelation if:

1. Regression model includes an intercept

2. Explanatory variables are fixed in repeated samples, i.e. non-stochastic.

3. Error terms cannot follow a pth-order autocorrelation pattern where p > 1.

4. Error terms must be normally distributed.

5. The population regression model doesn't include any lagged values of the dependent variable as explanatory variables.

If these assumptions are satisfied, then D-W works even for small samples. The BG test is only valid in large samples.

The BG test remains valid even if the explanatory variables include lagged values of the dependent variable. It can also be extended to test for pth-order autocorrelation, where p > 1.


9 Multicollinearity

In the multiple regression model we assume that the explanatory variables X₂, X₃, . . . , Xₖ are linearly independent: we can't express any variable as an exact linear combination of the others.

Y = β1 + β2X2 + β3X3 + · · ·+ βKXK + ε

If this assumption is violated, the explanatory variables are said to be exactly collinear or exactly multicollinear. In multiple regression where this happens exactly or approximately, it is called multicollinearity (used interchangeably with collinearity).

Two variables are exactly collinear if the correlation coefficient between them is 1 or −1 (and then you won't even be able to estimate the model). If the correlation coefficient is 0, they are not collinear.

Assume 3 explanatory variables, X2, X3, X4. Regress X2 against X3, X4.

1. R2 = 1: exactly collinear.

2. 0 < R2 < 1: linear association exists, but only approximate.

3. R2 = 0: relationship doesn’t exist. But, this very rarely happens.

Multicollinearity is a potential problem in all regression models based on economic and financial data.

Multicollinearity is a problem with the data, not with the model.

9.1 Consequences

If exact multicollinearity exists, the regression coefficients cannot be estimated (division by zero).

If explanatory variables are closely related but collinearity not exact:

1. Regression coefficients can be estimated, but have large standard errors. (The coefficients aren't biased; they're in fact still B.L.U.E.)

2. t-tests of significance may be misleading, because they lack power. Relevant explanatory variables may appear to be statistically insignificant.

• Perfect multicollinearity: serious problem.

• Near-perfect multicollinearity: If the degree is high, then likely to be aserious problem.


10 Qualitative Analysis: Dummy Variables

Dummy variables take the values {0, 1}. They can be used to assess how economic models respond to structural change.

They also let you estimate one model across two different sets of observations.

Model with an Additive Dummy Variable:

Y = β1 + β2X2 + δD + εi

where D takes the value 0 or 1.

This effectively yields two separate, parallel regression models with different intercepts:

Y = β₁ + β₂X₂ + εᵢ          when D = 0
Y = (β₁ + δ) + β₂X₂ + εᵢ    when D = 1

Model with a Multiplicative Dummy Variable:

Y = β1 + β2X + δDX + εi

The two resulting models have the same intercept, but different slopes:

Y = β₁ + β₂X + εᵢ           when D = 0
Y = β₁ + (β₂ + δ)X + εᵢ     when D = 1

And of course you can mix and match the two types:

Y = β₁ + β₂X + δ₁D + δ₂DX + εᵢ

Extremely useful for modelling structural change at some point in the sample period (e.g. a strike, liberalization of the dollar). The significance of these changes can be assessed statistically:

H0 : δ1 = 0

H1 : δ1 > 0

Dummy Variable Trap: including unnecessary dummy variables in the model can lead to perfect multicollinearity, which renders the model unestimable. For example, including variables for both male and female.

You can get away with two dummy variables that are perfect complements by eliminating the regression constant. All of these models are fine:

Y = β1 + β2X2 + δmDm + εi

Y = β1 + β2X2 + δfDf + εi

Y = βX + δmDm + δfDf + εi

where D_f and D_m are indicators for biological sex. It would not be possible, however, to include the constant and both dummy variables.
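The trap can be seen directly in the normal equations: with an intercept plus dummies for both male and female (invented 0/1 data), the columns of X are linearly dependent (D_m + D_f equals the intercept column), so X′X is singular and has determinant zero. Dropping one dummy fixes it:

```python
# Invented sample: two "male" observations, two "female".
Dm = [1, 1, 0, 0]
Df = [0, 0, 1, 1]
const = [1, 1, 1, 1]  # intercept column; note const = Dm + Df elementwise

cols = [const, Dm, Df]
XtX = [[sum(a * b for a, b in zip(ci, cj)) for cj in cols] for ci in cols]

def det3(M):
    # Determinant of a 3x3 matrix by cofactor expansion.
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

print(det3(XtX))  # 0: X'X is singular, so the model cannot be estimated

# Drop Df: the 2x2 X'X for (const, Dm) is nonsingular, so OLS goes through.
XtX2 = [[sum(a * b for a, b in zip(ci, cj)) for cj in (const, Dm)] for ci in (const, Dm)]
det2 = XtX2[0][0] * XtX2[1][1] - XtX2[0][1] * XtX2[1][0]
```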


10.1 Time-Series Analysis

You can use dummy variables to model summer/winter, or quarters. Remember not to specify all of them, or you get perfect collinearity. For summer/winter, use 1 dummy; for quarters, use 3 dummies.

Data are often spiky, with e.g. seasonal peaks. Very often we need to be aware of trends, irrespective of seasonal variations: we want "seasonally adjusted" values.

Note that if there is a seasonal variation, and the specification ignores that variation (i.e. specification error), you'll almost certainly get a high D-W value and a significant ρ.

You might model quarterly data as follows:

Y = β1 + β2TIME + δ1Q1 + δ2Q2 + δ3Q3 + εi

where Q₁ is 1 in the September quarter and 0 otherwise, and so on. We omit Q₄ or we'll get perfect multicollinearity. (Or, suppress the intercept.)

You still need to examine the correlation between the dummy variables; it's entirely possible to get low t-values because of high correlation. If there are no low t-values, then there's no multicollinearity problem.
