
Chapter 6: More on the Multiple Regression Model

The F-test – Joint Hypothesis Tests

Consider the linear regression equation:

(1)  $y_i = \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i} + \beta_4 x_{4i} + e_i$   for i = 1, 2, . . . , N

The t-statistic gives a test of significance of an individual explanatory

variable, given the other variables in the regression equation.

An example of a joint hypothesis test is:

$H_0: \beta_3 = \beta_4 = 0$

$H_1$: at least one is not zero

Why not use separate t-tests on each of the null hypotheses

$H_0: \beta_3 = 0$ and $H_0: \beta_4 = 0$ ?

Typically $\mathrm{cov}(b_3, b_4) \neq 0$.

That is, the slope estimators may be correlated.

Therefore, testing a series of single hypotheses is not equivalent to

testing hypotheses jointly.

An equation that assumes the null hypothesis is true and

incorporates the restrictions $\beta_3 = 0$ and $\beta_4 = 0$ is:

(2)  $y_i = \beta_1 + \beta_2 x_{2i} + v_i$   ($v_i$ is another random error)

Model (1) is called the unrestricted model.

Model (2) is called the restricted model.


The test method can proceed as follows.

STEP 1 Estimate the unrestricted model and get the sum of

squared residuals:

$SSE_U = \sum_{i=1}^{N} \hat{e}_i^2$

STEP 2 Estimate the restricted model and compute:

$SSE_R = \sum_{i=1}^{N} \hat{v}_i^2$

Note: $SSE_U \leq SSE_R$

STEP 3 Construct the F-statistic:

$F = \dfrac{(SSE_R - SSE_U)/J}{SSE_U/(N-K)}$

where J is the number of restrictions.

In this example, J = 2 and K = 4.

The F-statistic is the ratio of two sums of squares, each divided by its degrees of freedom.

The numerator degrees of freedom is J and

the denominator degrees of freedom is N – K.
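As an illustration, the three steps might be carried out in Stata roughly as follows. This is a minimal sketch; the variable names y, x2, x3 and x4 are assumptions made for the example, not names from a particular data set.

    * Sketch of the restricted/unrestricted F-test (variable names assumed).
    regress y x2 x3 x4                            // unrestricted model
    scalar SSE_U = e(rss)                         // sum of squared residuals
    scalar df_U  = e(df_r)                        // N - K
    regress y x2                                  // restricted model (beta3 = beta4 = 0)
    scalar SSE_R = e(rss)
    scalar J = 2                                  // number of restrictions
    scalar Fstat = ((SSE_R - SSE_U)/J) / (SSE_U/df_U)
    display "F = " Fstat "    p-value = " Ftail(J, df_U, Fstat)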


The F-statistic can be compared with the F-distribution with

(J , N – K) degrees of freedom.

The F-distribution is defined for positive values and has a skewed

shape. The shape depends on the numerator and denominator

degrees of freedom.

[Figure: probability density functions of the F-distribution for F(2,16), F(3,16) and F(5,16).]


Statistical tables included in textbook appendixes report critical

values $F_c$ such that

$P(F_{(m_1,\,m_2)} > F_c) = 0.05$   or   $P(F_{(m_1,\,m_2)} > F_c) = 0.01$

where $m_1$ is the numerator degrees of freedom and $m_2$ is the

denominator degrees of freedom.

By setting the significance level of the test at either 0.05 (5%) or 0.01

(1%), the decision rule is to reject the null hypothesis if the calculated

F-statistic exceeds the critical value $F_c$.
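As an alternative to the printed tables, critical values can be computed directly. A sketch, using the (2, 16) degrees of freedom of the later example, in Stata and in Microsoft Excel:

    display invFtail(2, 16, 0.05)       // 5% critical value in Stata
    display invFtail(2, 16, 0.01)       // 1% critical value in Stata

    F.INV.RT(0.05, 2, 16)               // 5% critical value in Excel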


Example

Consider the joint hypothesis:

$H_0: \beta_3 = \beta_4 = 0$

$H_1$: at least one is not zero

For an application with N=20, the calculated F-statistic is F = 7.4.

From the statistical tables, the 1% critical value from the

F-distribution with J=2 and N – K = 20 – 4 = 16 degrees of freedom is

$F_c$ = 6.23.

The F-statistic exceeds the critical value and therefore, there is

evidence to reject the null hypothesis.

This suggests that the p-value for the test must be less than 0.01.

The p-value is the probability:

$p = P(F_{(m_1,\,m_2)} > F) = P(F_{(2,\,16)} > 7.4)$

An exact p-value can be calculated with the Microsoft Excel function:

F.DIST.RT(7.4, 2, 16)

(the RT suffix selects the right tail of the distribution)

This gives the answer p = 0.005303.
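The same right-tail probability is available in Stata; a one-line sketch:

    display Ftail(2, 16, 7.4)           // returns approximately 0.0053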


With Stata, the test command can be used for joint hypothesis tests.

On the Stata results, the F-statistic is reported with an accompanying

p-value. The p-value can then be interpreted to make a decision.

Application of the F-test method with one restriction J = 1 is

equivalent to a t-test. In this special case the random variable

$t = \sqrt{F}$ has a t-distribution with (N – K) degrees of freedom.
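Returning to the joint hypothesis of the earlier example, a sketch of how the test command might be used after regress (variable names assumed):

    regress y x2 x3 x4
    test x3 x4                          // F-test of H0: beta3 = beta4 = 0, with p-value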


Testing the Significance of the Model

To test the overall significance of the regression consider a joint test

that all slope coefficients are zero. Test:

$H_0: \beta_2 = \beta_3 = \cdots = \beta_K = 0$

$H_1$: at least one is not zero

In this case, the restricted model is:

$y_i = \beta_1 + v_i$

Least squares estimation of the restricted model gives: $b_1 = \bar{y}$

Therefore, $SSE_R = SST = \sum_{i=1}^{N} (y_i - \bar{y})^2$

Recognize that $SSE_U = SSE$.

The number of restrictions is K – 1.

The test statistic and the accompanying p-value are:

$F = \dfrac{(SST - SSE)/(K-1)}{SSE/(N-K)}$

$p = P(F_{(K-1,\,N-K)} > F)$
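Stata's regress command reports this overall F-statistic automatically. As a check, it can also be rebuilt from the stored sums of squares; a sketch with assumed variable names:

    regress y x2 x3 x4
    display "reported F = " e(F)
    display "computed F = " (e(mss)/e(df_m)) / (e(rss)/e(df_r))   // (SST-SSE)/(K-1) over SSE/(N-K)
    display "p-value    = " Ftail(e(df_m), e(df_r), e(F))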


There is a relationship between the $R^2$ and the F-test for the overall

significance of the regression.

Recall that  $R^2 = 1 - \dfrac{SSE}{SST} = \dfrac{SST - SSE}{SST}$

Express the F-statistic as:

$F = \dfrac{N-K}{K-1} \cdot \dfrac{SST - SSE}{SSE}$

Divide both numerator and denominator by SST to get:

$F = \dfrac{N-K}{K-1} \cdot \dfrac{(SST - SSE)/SST}{SSE/SST} = \dfrac{N-K}{K-1} \cdot \dfrac{R^2}{1 - R^2}$

This shows that if $R^2 = 0$ then F = 0.

As $R^2$ increases the F-statistic also increases.

When $R^2 = 1$, F is infinite.

The F-test for the overall significance of the regression can be viewed

as a test of significance of the $R^2$.
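This equivalence can be verified numerically after any regression; a sketch using Stata's stored results (regression variables assumed):

    regress y x2 x3 x4
    display "F from R-squared = " (e(df_r)/e(df_m)) * (e(r2)/(1 - e(r2)))
    display "F reported       = " e(F)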


The Use of Nonsample Information

It may be sensible to incorporate restrictions in the model estimation.

That is, better use of the information may give better estimates.

This is illustrated by continuing with the Cobb-Douglas production

function example that was introduced in the Chapter 5 lecture notes.

The linear regression equation is:

$\ln(Q_i) = \beta_1 + \beta_2 \ln(L_i) + \beta_3 \ln(K_i) + e_i$

Nonsample information (that is, additional information provided by

economic theory) is the constant returns to scale restriction:

$\beta_2 + \beta_3 = 1$

This restriction can be expressed as:

$\beta_3 = 1 - \beta_2$


The restricted model is obtained by substituting the restriction into

the equation. Substitution gives:

$\ln(Q_i) = \beta_1 + \beta_2 \ln(L_i) + (1 - \beta_2)\ln(K_i) + e_i$

Rearrange terms to get:

$\ln(Q_i) - \ln(K_i) = \beta_1 + \beta_2 [\ln(L_i) - \ln(K_i)] + e_i$

The restricted model can be stated as:

$\ln\!\left(\dfrac{Q_i}{K_i}\right) = \beta_1 + \beta_2 \ln\!\left(\dfrac{L_i}{K_i}\right) + e_i$

The introduction of the restriction means that one parameter is

eliminated and so the number of degrees of freedom increases by

one.

Least squares estimation of the restricted model is called

restricted least squares.
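A sketch of restricted least squares for this example, assuming variables named q, l and k hold output, labour and capital (the names are assumptions for the illustration):

    * Restricted least squares under constant returns to scale.
    gen lnQK = ln(q/k)                  // dependent variable ln(Q/K)
    gen lnLK = ln(l/k)                  // regressor ln(L/K)
    regress lnQK lnLK                   // slope estimate is b2
    display "implied estimate of beta3 = " 1 - _b[lnLK]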


Some results:

• If the restrictions are true then the restricted estimator gives an

unbiased estimation rule. But if the restrictions are not correct

then the restricted estimator may be biased since the first

assumption of the standard set of assumptions is violated.

• The additional information in the restrictions leads to more

precise estimation. That is, suppose

$b_U$ is the estimator of $\beta_2$ from the unrestricted model, and

$b_R$ is the estimator of $\beta_2$ from the restricted model.

A result is: $\mathrm{var}(b_R) < \mathrm{var}(b_U)$

However, since $SSE_U \leq SSE_R$ this means:

$\hat{\sigma}^2_U \leq \hat{\sigma}^2_R$

Therefore, it is possible to get numerical estimation results that

show:

$\widehat{\mathrm{var}}(b_R) > \widehat{\mathrm{var}}(b_U)$

This follows since different estimates of the error variance $\sigma^2$

are used for the unrestricted and restricted model.


Model Specification

Suppose the true model of economic behaviour is:

(1)  $y_i = \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i} + e_i$

where the slope coefficients are non-zero.

But the applied work uses the equation:

(2)  $y_i = \beta_1 + \beta_2 x_{2i} + v_i$

where $v_i$ has the role of the random error.

An important variable has been excluded.

This is called a specification error.

Model (1) is the unrestricted model.

Model (2) is the restricted model.

The restriction imposed is $\beta_3 = 0$.


How does specification error affect the properties of the least squares

estimator ?

If (1) is the correct model then the error term of equation (2) can be

stated as:

$v_i = \beta_3 x_{3i} + e_i$

and

$E(v_i) = E(\beta_3 x_{3i} + e_i) = \beta_3 x_{3i} + E(e_i) = \beta_3 x_{3i} \neq 0$

That is, for the restricted model, $E(v_i) \neq 0$.

This is a violation of one of the standard assumptions.

Therefore, the least squares estimator of 2ββββ from the restricted

model (2) may be a biased estimator.


The magnitude of the bias can be found as follows.

The slope estimator from model (2) is:

$b_2 = \sum_{i=1}^{N} w_i\, y_i$   where   $w_i = \dfrac{x_{2i} - \bar{x}_2}{\sum_{i=1}^{N} (x_{2i} - \bar{x}_2)^2}$

with  $\sum_{i=1}^{N} w_i = 0$  and  $\sum_{i=1}^{N} w_i\, x_{2i} = 1$

With model (1) as the true model:

$b_2 = \sum_{i=1}^{N} w_i \left( \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i} + e_i \right) = \beta_2 + \beta_3 \sum_{i=1}^{N} w_i\, x_{3i} + \sum_{i=1}^{N} w_i\, e_i$


Take expectations to get:

$E(b_2) = \beta_2 + \beta_3 \dfrac{\sum_{i=1}^{N} (x_{2i} - \bar{x}_2)\, x_{3i}}{\sum_{i=1}^{N} (x_{2i} - \bar{x}_2)^2} = \beta_2 + \beta_3 \dfrac{\mathrm{cov}(x_2, x_3)}{\mathrm{var}(x_2)}$

The second term gives a measure of the bias.

With $\beta_3 \neq 0$ the estimator of $\beta_2$ from model (2) is biased unless the

variables $x_2$ and $x_3$ have zero covariance (that is, the variables are

uncorrelated).

Note – The above used the result:

$\sum_{i=1}^{N} (x_{2i} - \bar{x}_2)(x_{3i} - \bar{x}_3) = \sum_{i=1}^{N} (x_{2i} - \bar{x}_2)\, x_{3i}$

Conclusion: The exclusion of important variables generally leads to

biased estimators of the slope coefficients. Therefore, use economic

theory as a guide to including all important variables in the model.
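The bias formula can be illustrated with a small simulation; the parameter values and sample size below are invented purely for this sketch:

    * Simulation sketch of omitted-variable bias (all numbers illustrative).
    clear
    set obs 500
    set seed 12345
    gen x2 = rnormal()
    gen x3 = 0.8*x2 + rnormal()         // x2 and x3 are correlated
    gen y  = 1 + 2*x2 + 3*x3 + rnormal()
    regress y x2 x3                     // correct model: b2 is close to 2
    regress y x2                        // x3 omitted: b2 is biased upward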


Now look at the situation where the true value for $\beta_3$ is zero.

That is, $x_3$ is an irrelevant variable so that the restricted model is

the true model.

If the unrestricted model is used for parameter estimation, it turns

out that all the standard model assumptions are valid.

Therefore, the least squares principle will give an unbiased

estimation rule for the model parameters.

However, the estimator for $\beta_2$ will have a larger variance compared

to the estimator from the restricted model that incorporates the

correct restrictions.

That is, estimation of the unrestricted model gives an estimator for

$\beta_2$ that is unbiased but ‘inefficient’.

The estimator is not as precise as could be obtained by estimating the

model with the correct restrictions imposed.


Model Selection Criteria

The Adjusted Coefficient of Determination

The $R^2$ can be used as a guide for selecting between two competing

models.

For example, suppose two economists propose two different theories

described by the models:

Model 1:  $y_i = \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i} + e_i$

Model 2:  $y_i = \alpha_1 + \alpha_2 z_i + v_i$

A model selection method is to choose the model that yields the

highest $R^2$. However, an $R^2$ comparison is inappropriate when:

• The models have different dependent variables.

For example, $y_i$ versus $\ln(y_i)$.

• The models have different numbers of explanatory variables.

For the example here, the two models have the same dependent

variable. Therefore the SST is identical for the two models.

But for Model 1, K=3 and for Model 2, K = 2.


For a regression equation, as the number of explanatory variables

increases there will tend to be a decrease in the SSE (the sum of

squared residuals). This then gives a higher $R^2$.

That is, the $R^2$ tends to increase even if added variables are

irrelevant (a test of significance does not reject the null hypothesis of

a zero coefficient).

To impose a penalty for increasing the number of explanatory

variables, define the Adjusted $R^2$ as:

$\bar{R}^2 = 1 - \dfrac{SSE/(N-K)}{SST/(N-1)}$

Adjustments are made for the degrees of freedom associated with the

sum of squares.

Features of the Adjusted $R^2$ are:

• $\bar{R}^2$ need not increase when new explanatory variables are

added to the regression equation.

• $\bar{R}^2$ can be negative.
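In Stata the adjusted $R^2$ is stored after each regression, so two competing models can be compared directly; a sketch with assumed variable names:

    regress y x2 x3
    display "Model 1 adjusted R-squared = " e(r2_a)
    regress y z
    display "Model 2 adjusted R-squared = " e(r2_a)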


The RESET Test ~ Testing for Model Misspecification

‘Diagnostic testing’ is a popular research topic for econometricians.

The interest is in identifying better modelling approaches.

An example of a diagnostic test, developed by James B. Ramsey, is

the RESET test (REgression Specification Error Test).

This test is designed to detect omitted variables and incorrect

functional form.

A model is:

$y_i = \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i} + e_i$

The least squares estimates of the parameters are $b_1$, $b_2$ and $b_3$.

The fitted or predicted values are:

$\hat{y}_i = b_1 + b_2 x_{2i} + b_3 x_{3i}$

Is the model a good specification ?

To answer this question, create an artificial model that includes the

extra explanatory variable $\hat{y}_i^2$.

If the coefficient on this extra variable is significantly different from

zero this will suggest that the original model is inadequate.


The artificial model is:

(1)  $y_i = \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i} + \gamma_1 \hat{y}_i^2 + u_i$   ($u_i$ is a random error)

The artificial model can be estimated by least squares.

The hypothesis of interest is:

$H_0: \gamma_1 = 0$   against   $H_1: \gamma_1 \neq 0$

A t-statistic and p-value can be obtained from the least squares

estimation output.

Rejection of the null hypothesis says the test has detected

misspecification.
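A sketch of this artificial regression in Stata (variable names assumed). Stata's built-in estat ovtest command carries out a version of the RESET test based on powers of the fitted values:

    regress y x2 x3
    predict yhat, xb                    // fitted values
    gen yhat2 = yhat^2
    regress y x2 x3 yhat2               // artificial model (1)
    test yhat2                          // test of gamma1 = 0 (equivalent to the t-test)

    * built-in version, using powers of the fitted values
    regress y x2 x3
    estat ovtest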


Two other variations for the artificial model are:

(2)  $y_i = \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i} + \gamma_1 \hat{y}_i^2 + \gamma_2 \hat{y}_i^3 + u_i$

(3)  $y_i = \beta_1 + \beta_2 x_{2i} + \beta_3 x_{3i} + \gamma_1 \hat{y}_i^2 + \gamma_2 \hat{y}_i^3 + \gamma_3 \hat{y}_i^4 + u_i$

From artificial model (2) the hypothesis of interest is:

$H_0: \gamma_1 = \gamma_2 = 0$   against   $H_1$: at least one is not zero

This is a joint hypothesis test and so an F-test statistic is required.

The F-test statistic can be compared with an F-distribution with

(2, N – 5) degrees of freedom.

From artificial model (3) the hypothesis of interest is:

$H_0: \gamma_1 = \gamma_2 = \gamma_3 = 0$   against   $H_1$: at least one is not zero

In this case, the F-test statistic can be compared with an F-distribution

with (3, N – 6) degrees of freedom.

In each case, rejection of the null hypothesis suggests some general

model misspecification in the original regression equation.


Example

The lecture notes for Chapter 5 gave two competing models that

explained sales of a fast food franchise by price and advertising:

$sales_i = \beta_1 + \beta_2\, p_i + \beta_3\, a_i + e_i$

$\ln(sales_i) = \alpha_1 + \alpha_2 \ln(p_i) + \alpha_3 \ln(a_i) + v_i$

For model selection purposes, the Ramsey RESET test statistics may

reveal specification problems in one, both, or neither of the functional

forms above.

The model estimation uses N = 50 observations.

For the linear model, the RESET test statistics from the artificial

models (1), (2) and (3) are:

$H_0$                                     F-statistic     p-value
$\gamma_1 = 0$                            2.71            0.107
$\gamma_1 = \gamma_2 = 0$                 1.88            0.164
$\gamma_1 = \gamma_2 = \gamma_3 = 0$      2.76            0.053

For all test statistics, at a 5% significance level, the p-values give no

evidence to reject the null hypothesis of no misspecification.

For the given data set, the linear model appears adequate.


Note that the first RESET test statistic is calculated as

$F = t^2 = (1.646)^2 = 2.709$

where t = 1.646 is the t-statistic for testing $H_0: \gamma_1 = 0$ in the artificial

regression (1). The degrees of freedom are 50 – 4 = 46.

With Microsoft Excel the p-value for the test can be found with the

function:

T.DIST.2T(1.646, 46) = 0.107

This is equivalent to comparing $t^2$ with an F-distribution with (1, 46)

degrees of freedom. That is, the same result for the p-value

calculation can be found with the Microsoft Excel function:

F.DIST.RT(2.709, 1, 46) = 0.107

The p-value is above the significance level of 0.05.

There is no evidence to suggest the linear model is inadequate.


For the log-log model, the RESET test statistics are:

$H_0$                                     F-statistic     p-value
$\gamma_1 = 0$                            2.91            0.095
$\gamma_1 = \gamma_2 = 0$                 2.01            0.145
$\gamma_1 = \gamma_2 = \gamma_3 = 0$      2.03            0.123

Again, all calculated p-values for the RESET tests are above the level

of 0.05. The null hypothesis of no model misspecification is not

rejected by the data.

Therefore, the log-log model also appears to describe the data.

In this example, the RESET test was not able to say which functional

form may be better.

Calculation Note:

The magnitude of the powers of the predictions $\hat{y}_i^2$, $\hat{y}_i^3$ and $\hat{y}_i^4$

can become very large to give an ill-conditioned data set.

Least squares estimation of the artificial regression can then fail.

To avoid such numerical problems a solution is to rescale the

dependent variable before equation estimation.


Multi-collinearity

Exact or perfect collinearity means an explanatory variable can be

written as a linear combination of other explanatory variables.

The result is that a numerical solution for the parameter estimates is

not possible from the least squares method.

Suppose the explanatory variables are highly correlated so that there

is almost perfect collinearity.

This situation is called multi-collinearity.

How are the least squares estimation results affected by

multi-collinearity ?

None of the standard assumptions of the Gauss-Markov theorem are

violated. Therefore, the least squares estimator is the best (minimum

variance) estimator compared to any other linear unbiased estimator.

That is, the least squares estimator is BLUE.

The problem is that it may be difficult to isolate the separate effects of

the individual explanatory variables with any precision.

That is, the individual parameter estimates will show relatively large

standard errors and, for some of the slope coefficients, it will be

difficult to reject $H_0: \beta_k = 0$.

However, an F-statistic for testing the overall significance of the

regression will strongly reject the null hypothesis that all slope

coefficients are zero.


How can multi-collinearity be detected ?

• Before model estimation

Look at the sample correlations among the explanatory

variables. A correlation greater than, say, 0.8 or 0.9 in absolute value

may suggest the presence of multi-collinearity.

• After model estimation

• for t-statistics for tests of significance, high p-values

mean that a statistically significant relationship between

the dependent variable y and the explanatory variable

cannot be shown.

• the F-statistic for the overall significance of the

regression shows that the explanatory variables are

important in explaining the dependent variable.
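A sketch of these before- and after-estimation checks in Stata (variable names assumed):

    * Before estimation: look at pairwise correlations among the regressors.
    correlate x2 x3 x4
    * After estimation: compare the individual t-tests (high p-values)
    * with the overall F-statistic reported by regress.
    regress y x2 x3 x4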

Multi-collinearity is a problem with the data set.

Another sample from the population may show none of the signs of

multi-collinearity.

A related problem is when the numerical data for an explanatory

variable shows little variation in the sample. It will be difficult to

identify the impact of this variable even though it may be important

to the economic model.


Example: The Cobb-Douglas production function, introduced in

earlier lecture notes, is stated as the regression equation:

$\ln(Q_i) = \beta_1 + \beta_2 \ln(L_i) + \beta_3 \ln(K_i) + e_i$

For the data set of 33 firms, the sample correlation between the

explanatory variables $\ln(L)$ and $\ln(K)$ is 0.986.

The correlation close to one is a signal of multi-collinearity.

Now consider how this is revealed in the estimation results.

The estimated model is:

$\widehat{\ln(Q_i)} = -0.129 + 0.559 \ln(L_i) + 0.488 \ln(K_i)$

t-statistics:  (−0.24)   (0.68)    (0.69)

p-values:      (0.815)   (0.499)   (0.494)

The reported t-statistics and p-values are for a test of the null

hypothesis that the corresponding coefficient is zero against a two-

sided alternative.

For each slope coefficient the hypothesis of a zero coefficient is not

rejected. This implies that the estimates are unreliable.

Testing the overall significance of the model with an F-test for the

joint hypothesis $H_0: \beta_2 = \beta_3 = 0$ gives an F-statistic of 33.12

with a p-value less than 0.005. (With the Stata regress command

these results accompany the $R^2$ (R-squared).)

This gives strong evidence to reject the null hypothesis and conclude

that labour and capital inputs are important in explaining output.


By introducing the constant returns to scale restriction $\beta_2 + \beta_3 = 1$

the restricted model, discussed in earlier lecture notes, is stated:

$\ln(Q_i / K_i) = \beta_1 + \beta_2 \ln(L_i / K_i) + e_i$

The estimated equation is:

$\widehat{\ln(Q_i/K_i)} = 0.020 + 0.398 \ln(L_i/K_i)$

t-statistics:  (0.38)    (0.71)

p-values:      (0.707)   (0.482)

The slope coefficient estimate of 0.398 gives the output elasticity

with respect to labour. Again, a test of significance for the slope

coefficient has a p-value greater than usual significance levels (such

as 0.10 or 0.05) to indicate that the estimation is still imprecise.

It appears that, in the data set, labour and capital are used in a

relatively fixed proportion leading to a multicollinearity problem.

For estimation of the restricted model the explanatory variable

$\ln(L/K)$ does not have the variability required to identify a labour

elasticity with any precision.


Another Example

A data set for the manufacturing sector has quarterly time series on

inventories (y) and sales (x). A model recognizes that inventories

depend on the sales in the current quarter as well as the sales in the

three previous quarters:

$y_t = \beta_1 + \beta_2 x_t + \beta_3 x_{t-1} + \beta_4 x_{t-2} + \beta_5 x_{t-3} + e_t$

for t = 4, 5, . . . , T

where $x_{t-1}$ is a one-period lag,

$x_{t-2}$ is a two-period lag, and

$x_{t-3}$ is a three-period lag.

Note that the t subscript is used for time series observations.

The total number of observations is T.

With lagged explanatory variables, the initial observations will be

undefined. In this example, the first three observations are undefined

and therefore, the sample period for the estimation starts at

observation t = 4.

This is called a distributed lag model.

It is typical that the explanatory variables in this model are highly

correlated. Therefore, it will be difficult to estimate the individual

slope coefficients with any useful precision.
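A sketch of setting up the unrestricted distributed lag model in Stata, assuming a time variable t and series named y and x:

    tsset t                             // declare the time variable
    regress y x L.x L2.x L3.x           // the lags drop the first three observations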


One solution is to incorporate parameter restrictions in the model.

For this example, it may be reasonable to consider that the response

of inventories to sales further in the past will be relatively small

compared to sales in the current or previous quarter. Parameter

restrictions that will give this type of behaviour are:

$\beta_3 = \tfrac{1}{2}\beta_2$,   $\beta_4 = \tfrac{1}{4}\beta_2$   and   $\beta_5 = \tfrac{1}{6}\beta_2$

This scheme of declining weights has been proposed for simplicity.

Substitution gives the restricted model:

$y_t = \beta_1 + \beta_2 \left( x_t + \tfrac{1}{2} x_{t-1} + \tfrac{1}{4} x_{t-2} + \tfrac{1}{6} x_{t-3} \right) + e_t$

$\;\;\;\, = \beta_1 + \beta_2 z_t + e_t$

where  $z_t = x_t + \tfrac{1}{2} x_{t-1} + \tfrac{1}{4} x_{t-2} + \tfrac{1}{6} x_{t-3}$

Least squares estimation of the restricted model gives parameter

estimates $b_1$ and $b_2$.

Estimates for the other model parameters are then calculated as:

$b_3 = \tfrac{1}{2} b_2$,   $b_4 = \tfrac{1}{4} b_2$   and   $b_5 = \tfrac{1}{6} b_2$
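A sketch of the restricted estimation and the recovery of the remaining coefficients (variable names assumed, and the time variable already declared with tsset):

    gen z = x + 0.5*L.x + 0.25*L2.x + (1/6)*L3.x
    regress y z
    display "b3 = " 0.5*_b[z] "   b4 = " 0.25*_b[z] "   b5 = " (1/6)*_b[z]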


The motivation of restricted estimation is to get more precise

estimates.

However, if the restrictions are incorrect then the restricted estimator

gives a biased estimation rule.

This suggests that, before model estimation, the economic theory

must be carefully studied, and, after model estimation, various

testing exercises are of interest to reveal possible model inadequacies.