
  • Lecture 9: Hypothesis Tests and Confidence Intervals in Multiple Regression

    Zheng Tian


  • Outline

    1 Hypothesis Tests and Confidence Intervals For a Single Coefficient

    2 Tests of Joint Hypotheses

    3 Confidence Sets for multiple coefficients

    4 Warm-up Exercises

    5 Model Specification for Multiple Regression



  • Hypothesis Tests and Confidence Intervals For a Single Coefficient

    The basic multiple regression model

    Consider the following model

    Y = β0ι + β1X1 + β2X2 + · · · + βkXk + u    (1)

    Y, X1, X2, . . . , Xk, and u are n × 1 vectors of the dependent variable, the regressors, and the errors.
    β0, β1, β2, . . . , βk are parameters.
    ι is the n × 1 vector of 1s.
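    As a concrete illustration, here is a minimal numpy sketch that builds the matrices in Equation (1) on simulated data and computes the OLS estimate β̂ = (X′X)⁻¹X′Y. The sample size, parameter values, and variable names are hypothetical choices for illustration, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 420, 2                        # hypothetical sample size and number of regressors

iota = np.ones((n, 1))               # the n x 1 vector of 1s
X1 = rng.normal(20, 2, (n, 1))       # a hypothetical STR-like regressor
X2 = rng.normal(15, 10, (n, 1))      # a hypothetical percentage-type regressor
u = rng.normal(0, 10, (n, 1))        # the error vector

beta_true = np.array([[686.0], [-1.1], [-0.65]])   # assumed parameter values
X = np.hstack([iota, X1, X2])                      # the n x (k + 1) regressor matrix
Y = X @ beta_true + u                              # Equation (1) with k = 2

beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)       # OLS: (X'X)^(-1) X'Y
print(beta_hat.ravel())              # close to the assumed (686.0, -1.1, -0.65)
```

    The blocks below reuse these names (X, Y, beta_hat) as a running example.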

  • Hypothesis Tests and Confidence Intervals For a Single Coefficient

    Review of Var(β̂|X)

    The homoskedasticity-only covariance matrix, if ui is homoskedastic:

    Var(β̂|X) = σ²u(X′X)⁻¹    (2)

    The heteroskedasticity-robust covariance matrix, if ui is heteroskedastic:

    Varh(β̂|X) = (X′X)⁻¹Σ(X′X)⁻¹    (3)

    where Σ = X′ΩX and Ω = Var(u|X).

  • Hypothesis Tests and Confidence Intervals For a Single Coefficient

    The multivariate normal distribution of β̂

    We know that if the least squares assumptions hold, β̂ has an asymptotic multivariate normal distribution:

    β̂ →d N(β, Σβ̂)    (4)

    where Σβ̂ = Var(β̂|X), computed using Equation (2) in the homoskedastic case and Equation (3) in the heteroskedastic case.

  • Hypothesis Tests and Confidence Intervals For a Single Coefficient

    The estimator of Var(β̂|X)

    The estimator of σ²u

    s²u = (1/(n − k − 1)) ∑ᵢ₌₁ⁿ û²ᵢ    (5)

    Thus, the estimator of the homoskedasticity-only covariance matrix is

    V̂ar(β̂) = s²u(X′X)⁻¹    (6)

    The estimator of Σ

    Σ̂ = (n/(n − k − 1)) ∑ᵢ₌₁ⁿ XiX′i û²ᵢ    (7)

    where Xi is the vector of the ith observation of the (k + 1) regressors. The heteroskedasticity-consistent (robust) covariance matrix estimator is

    V̂arh(β̂) = (X′X)⁻¹Σ̂(X′X)⁻¹    (8)
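    Continuing the numpy sketch from Equation (1), the estimators in Equations (5) through (8) can be computed directly from the fitted residuals; this is a sketch on the same hypothetical data and names as before.

```python
u_hat = Y - X @ beta_hat                         # OLS residuals
df = n - k - 1                                   # degrees of freedom

s2_u = (u_hat.T @ u_hat).item() / df             # Equation (5)
XtX_inv = np.linalg.inv(X.T @ X)

var_homo = s2_u * XtX_inv                        # Equation (6)

# Equation (7): Sigma_hat = n/(n - k - 1) * sum_i X_i X_i' u_hat_i^2
Sigma_hat = (n / df) * (X * u_hat**2).T @ X
var_robust = XtX_inv @ Sigma_hat @ XtX_inv       # Equation (8)
```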

  • Hypothesis Tests and Confidence Intervals For a Single Coefficient

    The estimator of SE(β̂j)

    We can get the standard error of β̂j as the square root of the jth diagonal element of V̂ar(β̂) for homoskedasticity and of V̂arh(β̂) for heteroskedasticity. That is,

    Homoskedasticity-only standard error: SE(β̂j) = ([V̂ar(β̂)](j,j))^(1/2)

    Heteroskedasticity-robust standard error: SE(β̂j) = ([V̂arh(β̂)](j,j))^(1/2)
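    In the same sketch, the standard errors are just the square roots of the diagonal elements of the two estimated covariance matrices:

```python
se_homo = np.sqrt(np.diag(var_homo))      # homoskedasticity-only SE(beta_hat_j)
se_robust = np.sqrt(np.diag(var_robust))  # heteroskedasticity-robust SE(beta_hat_j)
```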

  • Hypothesis Tests and Confidence Intervals For a Single Coefficient

    The t-statistic

    We can perform a two-sided hypothesis test as

    H0 : βj = βj,0 vs. H1 : βj ≠ βj,0

    We still use the t-statistic, computed as t = (β̂j − βj,0)/SE(β̂j), where SE(β̂j) is the standard error of β̂j.

    Under the null hypothesis, in large samples, t ~a N(0, 1). Therefore, the p-value can still be computed as 2Φ(−|t^act|). The null hypothesis is rejected at the 5% significance level when the p-value is less than 0.05, or equivalently, when |t^act| > 1.96. (Replace the critical value with 1.64 at the 10% level and 2.58 at the 1% level.)
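    A sketch of the test, and of the confidence interval on the next slide, using the robust standard errors computed above; the tested coefficient and the null value are arbitrary choices.

```python
from scipy import stats

j = 1                                         # test the coefficient on X1
beta_j0 = 0.0                                 # hypothesized value under H0
t = (beta_hat[j, 0] - beta_j0) / se_robust[j]
p_value = 2 * stats.norm.cdf(-abs(t))         # 2 * Phi(-|t|)
reject_5pct = abs(t) > 1.96

# 95% confidence interval for beta_j (the formula on the next slide)
ci = (beta_hat[j, 0] - 1.96 * se_robust[j],
      beta_hat[j, 0] + 1.96 * se_robust[j])
```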

  • Hypothesis Tests and Confidence Intervals For a Single Coefficient

    Confidence intervals for a single coefficient

    The confidence interval for a single coefficient can be constructed as before using the t-statistic.

    Given large samples, a 95% two-sided confidence interval for the coefficient βj is

    [β̂j − 1.96 SE(β̂j), β̂j + 1.96 SE(β̂j)]

  • Hypothesis Tests and Confidence Intervals For a Single Coefficient

    Application to test scores and the student-teacher ratio

    The estimated model can be written as follows

    ̂TestScore = 686.0 − 1.10 × STR − 0.650 × PctEL
                 (8.7)    (0.43)       (0.031)

    We test H0 : β1 = 0 vs. H1 : β1 ≠ 0. The t-statistic for this test is t = (−1.10 − 0)/0.43 = −2.54 < −1.96, and the p-value is 2Φ(−2.54) = 0.011 < 0.05. Based on either the t-statistic or the p-value, we can reject the null hypothesis at the 5% level.

    A 95% confidence interval for β1 is −1.10 ± 1.96 × 0.43 = (−1.95, −0.26).

  • Hypothesis Tests and Confidence Intervals For a Single Coefficient

    Adding expenditure per pupil to the equation

    Now we add a new explanatory variable to the regression, Expn, the expenditure per pupil in the district in thousands of dollars.

    ̂TestScore = 649.6 − 0.29 × STR + 3.87 × Expn − 0.656 × PctEL
                 (15.5)   (0.48)       (1.59)        (0.032)

    The magnitude of the coefficient on STR decreases from 1.10 to 0.29 after Expn is added.
    The standard error of the coefficient on STR increases from 0.43 to 0.48 after Expn is added.
    Consequently, in the new model, the t-statistic for the coefficient becomes t = −0.29/0.48 = −0.60 > −1.96, so we cannot reject the null hypothesis at the 5% level (nor at the 10% level).

  • Hypothesis Tests and Confidence Intervals For a Single Coefficient

    How can we interpret such changes?

    The decrease in the magnitude of the coefficient reflects that expenditure per pupil is an important factor that picks up some of the influence of the student-teacher ratio on test scores.

    In other words, holding expenditure per pupil and the percentage of English learners constant, reducing class sizes by hiring more teachers has only a small effect on test scores.

    The increase in the standard error reflects that Expn and STR are correlated, so there is imperfect multicollinearity in this model. In fact, the correlation coefficient between the two variables is 0.48, which is relatively high.


  • Tests of Joint Hypotheses

    The unrestricted model

    Consider the following multiple regression model

    Y = β0ι + β1X1 + β2X2 + · · · + βkXk + u    (9)

    We call Equation (9) the full model or the unrestricted model because β0 through βk can take any values without restriction.

  • Tests of Joint Hypotheses

    Joint hypothesis: a case of two zero restrictions

    Question: Are the coefficients on the first two regressors zero?

    Joint hypotheses

    H0 : β1 = 0, β2 = 0 vs. H1 : either β1 ≠ 0 or β2 ≠ 0 (or both)

    This is a joint hypothesis because β1 = 0 and β2 = 0 must hold at the same time. If either of them is invalid, the null hypothesis is rejected as a whole.

  • Tests of Joint Hypotheses

    The restricted model with two zero restrictions

    If the null hypothesis is true, we have

    Y = β0ι + β3X3 + β4X4 + · · · + βkXk + u    (10)

    We call Equation (10) the restricted model because we impose two restrictions, β1 = 0 and β2 = 0.
    To test these two restrictions jointly, we need a single statistic that tests them simultaneously. That statistic is the F-statistic.

  • Tests of Joint Hypotheses

    Why not use t-statistics and test individual coefficients one at a time?

    Let us test the null hypothesis above using separate t-statistics for β1 and β2: compute t1, the t-statistic for β1 = 0, and t2, the t-statistic for β2 = 0.

    We call this the "one-at-a-time" testing procedure.

  • Tests of Joint Hypotheses

    What's the problem with the one-at-a-time procedure?

    How can we reject the null hypothesis with this procedure?

    Using the one-at-a-time procedure, at the 5% significance level, we reject the null hypothesis H0 : β1 = 0 and β2 = 0 when either |t1| > 1.96 or |t2| > 1.96 (or both). In other words, the null is not rejected only when both |t1| ≤ 1.96 and |t2| ≤ 1.96.

    What is the probability of committing Type I error?

    Assume t1 and t2 to be independent. Then,

    Pr(|t1| ≤ 1.96 and |t2| ≤ 1.96) = Pr(|t1| ≤ 1.96) Pr(|t2| ≤ 1.96) = 0.95² = 90.25%

    So the probability of rejecting the null when it is true is 1 − 90.25% = 9.75%. We may reject the null hypothesis with a higher probability than the significance level we pre-specified.
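    This size distortion is easy to check by simulation; below is a minimal sketch under the null with independent regressors, where the sample size, replication count, and seed are arbitrary choices.

```python
import numpy as np

def one_at_a_time_size(n=100, reps=5000, seed=1):
    """Rejection rate of the one-at-a-time procedure when the null is true."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        x1, x2 = rng.normal(size=n), rng.normal(size=n)
        y = rng.normal(size=n)                   # the null holds: beta1 = beta2 = 0
        Z = np.column_stack([np.ones(n), x1, x2])
        b = np.linalg.solve(Z.T @ Z, Z.T @ y)
        resid = y - Z @ b
        s2 = resid @ resid / (n - 3)             # homoskedasticity-only s^2
        se = np.sqrt(s2 * np.diag(np.linalg.inv(Z.T @ Z)))
        rejections += (abs(b[1] / se[1]) > 1.96) or (abs(b[2] / se[2]) > 1.96)
    return rejections / reps

print(one_at_a_time_size())   # roughly 0.0975, not the nominal 0.05
```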

  • Tests of Joint Hypotheses

    Joint hypothesis involving one coefficient for each restriction

    q restrictions

    H0 : β1 = β1,0, β2 = β2,0, . . . , βq = βq,0 vs. H1 : at least one restriction does not hold

    The restricted model
    Suppose that we are testing q zero restrictions, β1 = β2 = · · · = βq = 0. The restricted model is

    Y = β0ι + βq+1Xq+1 + βq+2Xq+2 + · · · + βkXk + u    (11)

  • Tests of Joint Hypotheses

    Joint linear hypotheses

    Joint hypotheses include linear hypotheses like the following:

    1 H0 : β1 = β2 vs. H1 : β1 ≠ β2

    2 H0 : β1 + β2 = 1 vs. H1 : β1 + β2 ≠ 1

    3 H0 : β1 + β2 = 0, 2β2 + 4β3 + β4 = 3 vs. H1 : at least one restriction does not hold

  • Tests of Joint Hypotheses

    A general form of joint hypotheses

    We can use a matrix form to represent all linear hypotheses regarding the coefficients in Equation (9) as follows

    H0 : Rβ = r vs. H1 : Rβ ≠ r    (12)

    where R is a q × (k + 1) matrix with full row rank, β is the (k + 1) × 1 vector of coefficients, including the intercept, and r is a q × 1 vector of real numbers.

  • Tests of Joint Hypotheses

    Examples of Rβ = r

    For H0 : β1 = 0, β2 = 0:

          β0  β1  β2  β3  · · ·  βk
    R = (  0   1   0   0  · · ·   0 )    and    r = ( 0 )
        (  0   0   1   0  · · ·   0 )                ( 0 )

    For H0 : β1 + β2 = 0, 2β2 + 4β3 + β4 = 3, β1 = 2β3 + 1:

          β0  β1  β2  β3  β4  · · ·  βk
    R = (  0   1   1   0   0  · · ·   0 )    and    r = ( 0 )
        (  0   0   2   4   1  · · ·   0 )                ( 3 )
        (  0   1   0  −2   0  · · ·   0 )                ( 1 )
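    The same matrices can be written down directly in code; here is a sketch for the second example, assuming k = 5 so that β has six entries (the value of k is our choice for illustration).

```python
import numpy as np

k = 5   # assume k = 5, so beta = (beta0, beta1, ..., beta5)' has k + 1 entries

# H0: beta1 + beta2 = 0,  2*beta2 + 4*beta3 + beta4 = 3,  beta1 - 2*beta3 = 1
R = np.zeros((3, k + 1))
R[0, [1, 2]] = [1, 1]
R[1, [2, 3, 4]] = [2, 4, 1]
R[2, [1, 3]] = [1, -2]
r = np.array([0.0, 3.0, 1.0])
```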

  • Tests of Joint Hypotheses

    The general form of the F-statistic

    To test the null hypothesis H0 : Rβ = r, we compute the F-statistic

    F = (1/q) (Rβ̂ − r)′ [R V̂ar(β̂) R′]⁻¹ (Rβ̂ − r)    (13)

    β̂ is the vector of OLS coefficient estimates and V̂ar(β̂) is the estimated covariance matrix.

    For homoskedastic errors, we can compute V̂ar(β̂) as in Equation (6).
    For heteroskedastic errors, we can compute V̂arh(β̂) as in Equation (8).
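    With the running numpy sketch from earlier, Equation (13) takes a few lines. Here the hypothesis is the two zero restrictions H0 : β1 = 0, β2 = 0, and the robust covariance estimate var_robust is reused; the asymptotic p-value comes from the chi-squared equivalence, since F(q, ∞) = χ²q/q.

```python
from scipy import stats

# H0: beta1 = 0, beta2 = 0 in the k = 2 model fitted earlier (q = 2 restrictions)
R = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
r = np.zeros(2)
q = R.shape[0]

diff = R @ beta_hat.ravel() - r
F = diff @ np.linalg.inv(R @ var_robust @ R.T) @ diff / q   # Equation (13)
p_value = stats.chi2.sf(q * F, q)   # F(q, inf) is chi2(q)/q, so qF ~ chi2(q)
```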

  • Tests of Joint Hypotheses

    The F distribution, the critical value, and the p-value

    The F distribution
    If the least squares assumptions hold, then under the null hypothesis the F-statistic is asymptotically distributed as an F distribution with degrees of freedom (q, ∞). That is,

    F ~a F(q, ∞)

    The critical value and the p-value of the F test

    The 5% critical value of the F test is cα such that

    Pr(F < cα) = 0.95

    And the p-value of the F test can be computed as

    p-value = Pr(F > F^act)

  • Tests of Joint Hypotheses

    The F-statistic when q = 2

    The F-statistic for testing the null hypothesis H0 : β1 = 0, β2 = 0 can be shown to take the following form,

    F = (1/2) × (t1² + t2² − 2ρ̂t1,t2 t1t2) / (1 − ρ̂²t1,t2)    (14)

    For simplicity, suppose t1 and t2 are independent, so that ρ̂t1,t2 = 0. Then F = (1/2)(t1² + t2²).

    Under the null hypothesis, both t1 and t2 have standard normal distributions asymptotically. Then t1² + t2² has a chi-squared distribution with 2 degrees of freedom.
    It follows that F = (1/2)(t1² + t2²) is asymptotically distributed as F(2, ∞).

    The discussion of the F-statistic in Equation (14) becomes complicated when ρ̂t1,t2 ≠ 0.

  • Tests of Joint Hypotheses

    The homoskedasticity-only F-statistic

    Assume that Var(ui|Xi) = σ²u for i = 1, . . . , n, i.e., ui has a constant conditional variance for all i.

    Test hypotheses

    H0 : the restricted model with q restrictions, Rβ = r
    H1 : the unrestricted model, Y = β0ι + β1X1 + · · · + βkXk + u

    The homoskedasticity-only F-statistic can be computed as

    F = ((RSSR − USSR)/q) / (USSR/(n − k − 1))    (15)

    RSSR is the sum of squared residuals of the restricted model.
    USSR is the sum of squared residuals of the unrestricted model.

  • Tests of Joint Hypotheses

    The homoskedasticity-only F-statistic (cont’d)

    Dividing the numerator and the denominator of the F-statistic by TSS, we get another expression of the homoskedasticity-only F-statistic in terms of R² as

    F = ((R²unrestrict − R²restrict)/q) / ((1 − R²unrestrict)/(n − k − 1))    (16)

    Suppose that all the least squares assumptions and the homoskedasticity assumption hold. Then

    F ~ F(q, n − k − 1)

    So we can get the p-value and the critical value from the F(q, n − k − 1) distribution.
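    Continuing the running sketch, both versions of the homoskedasticity-only F-statistic for H0 : β1 = 0, β2 = 0 agree; under that null the restricted model is intercept-only, so its SSR equals the TSS.

```python
# Unrestricted: Y on [1, X1, X2]; under H0: beta1 = beta2 = 0 the restricted
# model is intercept-only, so its SSR equals the TSS.
ussr = float(u_hat.T @ u_hat)                    # unrestricted SSR
rssr = float(((Y - Y.mean()) ** 2).sum())        # restricted SSR = TSS here
q = 2
dof = X.shape[0] - X.shape[1]                    # n - k - 1 in the unrestricted model

F_ssr = ((rssr - ussr) / q) / (ussr / dof)       # Equation (15)

r2_u, r2_r = 1 - ussr / rssr, 0.0                # R^2 of each model (TSS = rssr)
F_r2 = ((r2_u - r2_r) / q) / ((1 - r2_u) / dof)  # Equation (16), the same value
```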

  • Tests of Joint Hypotheses

    Understanding the homoskedasticity-only F-statistic

    RSSR ≥ USSR and R²unrestrict ≥ R²restrict always hold. Why?
    The unrestricted model has more regressors than the restricted model.

    SSR decreases whenever an additional regressor is included in the model and the estimated coefficient on the new regressor is not zero.

    In other words, R² in the unrestricted model increases when a new regressor is added with a nonzero coefficient.

  • Tests of Joint Hypotheses

    Understanding the homoskedasticity-only F-statistic (cont’d)

    Suppose that the null hypothesis is true, that is, the true model is really the restricted one. What would this imply?

    The explanatory power of the additional regressors in the unrestricted model should be very small.

    That means that USSR cannot be much smaller than RSSR, and R²unrestrict cannot be much larger than R²restrict, if the null hypothesis is true.

    That means F should not be a large positive number under the null hypothesis.

    If we compute an F-statistic that is large enough compared with the critical value at some significance level, then we can reject the null hypothesis.

  • Tests of Joint Hypotheses

    Transformation of joint hypothesis testing to single hypothesis testing

    The goal: to convert a joint hypothesis into a single hypothesis.
    Consider the following model

    Y = β0 + β1X1 + β2X2 + u

    The null hypothesis is H0 : β1 = β2.
    How can we convert H0 into a single hypothesis test?

  • Tests of Joint Hypotheses

    Transformation of joint hypothesis testing to single hypothesis testing (cont'd)

    Rewrite the model

    Y = β0 + (β1 − β2)X1 + β2(X1 + X2) + u

    Define γ = β1 − β2 and W = X1 + X2. Then the original model becomes

    Y = β0 + γX1 + β2W + u

    Instead of testing β1 − β2 = 0, we test H0 : γ = 0 using the t-statistic computed from the transformed model.
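    A sketch of the trick as a reusable helper; the function name and data are hypothetical, and the returned t-statistic is compared with ±1.96 as on the earlier slides.

```python
import numpy as np

def test_beta1_equals_beta2(y, x1, x2):
    """t-statistic for H0: gamma = 0 in y = b0 + gamma*x1 + b2*(x1 + x2) + v."""
    n = len(y)
    w = x1 + x2                                # W = X1 + X2
    Z = np.column_stack([np.ones(n), x1, w])   # transformed regressors
    b = np.linalg.solve(Z.T @ Z, Z.T @ y)      # b[1] estimates gamma = beta1 - beta2
    resid = y - Z @ b
    s2 = resid @ resid / (n - 3)
    se = np.sqrt(s2 * np.diag(np.linalg.inv(Z.T @ Z)))
    return b[1] / se[1]                        # compare with +/- 1.96 at the 5% level
```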

  • Tests of Joint Hypotheses

    Application to the California school districts

    The estimated model

    ̂TestScore = 649.6 − 0.29 × STR + 3.87 × Expn − 0.656 × PctEL,   R² = 0.4366
                 (15.5)   (0.48)       (1.59)        (0.032)

    The joint hypothesis

    H0 : β1 = 0 and β2 = 0 vs. H1 : β1 ≠ 0 or β2 ≠ 0

    The heteroskedasticity-robust F-statistic is 5.43. The critical value of the F(2, ∞) distribution at the 5% significance level is 3.00. So we reject the null hypothesis.

  • Tests of Joint Hypotheses

    Application to test scores and the student-teacher ratio (cont'd)

    The homoskedasticity-only F-statistic. The restricted model is

    ̂TestScore = 664.7 − 0.671 × PctEL,   R² = 0.4149
                 (1.0)    (0.032)

    Now we have R²unrestricted = 0.4366, R²restricted = 0.4149, q = 2, n = 420, and k = 3. The homoskedasticity-only F-statistic is computed as

    F = ((0.4366 − 0.4149)/2) / ((1 − 0.4366)/(420 − 3 − 1)) = 8.01

    Because 8.01 exceeds the 1% critical value of 4.61 from the F(2, ∞) distribution, the null hypothesis is rejected.
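    The arithmetic can be verified directly:

```python
R2_u, R2_r = 0.4366, 0.4149
q, n_obs, k_reg = 2, 420, 3
F = ((R2_u - R2_r) / q) / ((1 - R2_u) / (n_obs - k_reg - 1))
print(round(F, 2))   # 8.01
```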

  • Tests of Joint Hypotheses

    Check the critical value using the F table

    Figure: The table of critical values of the F(q, ∞) distribution

    Figure: The table of critical values at the 5% level of the F(m, n) distribution


  • Confidence Sets for multiple coefficients

    Confidence set: definition

    A 95% confidence set for two or more coefficients is

    a set that contains the true population values of these coefficients in 95% of randomly drawn samples;
    equivalently, the set of coefficient values that cannot be rejected at the 5% significance level.

  • Confidence Sets for multiple coefficients

    How to construct a confidence set

    Suppose that we construct the confidence set for (β1, β2) from the hypothesis β1 = β1,0, β2 = β2,0.

    Let Fβ1,β2 be the heteroskedasticity-robust or homoskedasticity-only F-statistic for this hypothesis.
    A 95% confidence set is

    {β1, β2 : Fβ1,β2 < cF}

    where cF is the 5% critical value of the F(2, ∞) distribution, which is close to 3 in this case.
    This set has a coverage rate of 95% because the test on which it is based has a size of 5%.
    Therefore the confidence set constructed from the non-rejected values contains the true values 95% of the time.
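    A sketch of the membership check behind this definition, reusing beta_hat and the covariance estimate from the earlier numpy example; the cutoff 3.00 is the 5% critical value of F(2, ∞).

```python
import numpy as np

def in_confidence_set(b10, b20, beta_hat, var_hat, c=3.00):
    """Is the pair (b10, b20) inside the 95% confidence set for (beta1, beta2)?"""
    R = np.array([[0.0, 1.0, 0.0],    # picks beta1 and beta2 out of a 3-coefficient model
                  [0.0, 0.0, 1.0]])
    diff = R @ np.ravel(beta_hat) - np.array([b10, b20])
    F = diff @ np.linalg.inv(R @ var_hat @ R.T) @ diff / 2
    return F < c                      # c = 3.00: the 5% critical value of F(2, inf)

# e.g., is (0, 0) in the set? Equivalent to not rejecting H0: beta1 = beta2 = 0.
print(in_confidence_set(0.0, 0.0, beta_hat, var_robust))
```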

  • Confidence Sets for multiple coefficients

    The confidence set based on the F-statistic is an ellipse

    According to Equation (14), the confidence set for (β1, β2) is

    {β1, β2 : F = (1/2) × (t1² + t2² − 2ρ̂t1,t2 t1t2) / (1 − ρ̂²t1,t2) ≤ 3}

    Plugging in t1 = (β̂1 − β1,0)/SE(β̂1) and t2 = (β̂2 − β2,0)/SE(β̂2), the set becomes all pairs (β1,0, β2,0) such that

    (1/2) × [((β̂1 − β1,0)/SE(β̂1))² + ((β̂2 − β2,0)/SE(β̂2))² − 2ρ̂t1,t2 ((β̂1 − β1,0)/SE(β̂1))((β̂2 − β2,0)/SE(β̂2))] / (1 − ρ̂²t1,t2) ≤ 3

    The boundary of this set is an ellipse in the (β1, β2) plane.

  • Confidence Sets for multiple coefficients

    The confidence set: an illustration

    Figure: 95% Confidence Set for Coefficients on STR and Expn


  • Warm-up Exercises

    Question 1

    The following linear hypotheses can be tested using the F-test, with the exception of

    A) β2 = 1 and β3 = β4/β5
    B) β2 = 0
    C) β1 + β2 = 1 and β3 = −2β4
    D) β0 = β1 and β1 = 0

    Answer: A



  • Warm-up Exercises

    Question 2

    To test joint linear hypotheses in the multiple regression model, you need to

    A) compare the sums of squared residuals from the restricted and unrestricted model.
    B) use the heteroskedasticity-robust F-statistic.
    C) use several t-statistics and perform tests using the standard normal distribution.
    D) compare the adjusted R² for the model which imposes the restrictions and the unrestricted model.

    Answer: B


  • Warm-up Exercises

    Question 3

    Let R²unrestricted and R²restricted be 0.4366 and 0.4149, respectively. The difference between the unrestricted and the restricted model is that you have imposed two restrictions. There are 420 observations and 3 regressors excluding the intercept. The homoskedasticity-only F-statistic in this case is

    A) 4.61
    B) 8.01
    C) 10.34
    D) 7.71

    Answer: B


  • Warm-up Exercises

    Question 4

    The formulation Rβ = r to test a hypothesis

    A) allows for restrictions involving both multiple regression coefficients and single regression coefficients.
    B) is F-distributed in large samples.
    C) allows only for restrictions involving multiple regression coefficients.
    D) allows for testing linear as well as nonlinear hypotheses.

    Answer: A



  • Model Specification for Multiple Regression

    Omitted variable bias in multiple regression

    We should always be alert to omitted variable bias in OLS estimation. For omitted variable bias to arise, two things must be true:

    1 At least one of the included regressors must be correlated with the omitted variable.
    2 The omitted variable must be a determinant of the dependent variable, Y.

    With omitted variable bias, the least squares assumption E(u|X) = 0 no longer holds.

  • Model Specification for Multiple Regression

    The problem with the assumption E(u|X) = 0 and control variables

    The assumption E(u|X) = 0 is essential to ensure unbiasedness and consistency. However, it is too strong to be fully realized in practice. What we can do is divide all regressors into two groups:

    1 One group consists of regressors whose causal effects on Y are our research interest, so we want unbiased estimates of their coefficients.
    2 The other group consists of regressors whose causal effects on Y are not our focus. But if we omitted them, we would risk omitted variable bias in the coefficients that we do care about.

    The regressors in the latter group are called control variables.
    We use an assumption that is weaker than E(u|X) = 0 to ensure that the estimated coefficients on the regressors in the first group are unbiased, maintaining the causal interpretation that we want.

  • Model Specification for Multiple Regression

    Control variables: definition

    Definition
    A control variable W is a variable that is correlated with, and controls for, an omitted causal factor in the regression of Y on X, but which itself does not necessarily have a causal effect on Y.

    The role of control variables
    A control variable is not the object of interest in the study; rather, it is a regressor included to hold constant factors that, if neglected, could lead the estimated causal effect of interest to suffer from omitted variable bias.

  • Model Specification for Multiple Regression

    The test score example

    ̂TestScore = 700.2 − 1.00 × STR − 0.122 × PctEL − 0.547 × LchPct,   R̄² = 0.773
                 (5.6)    (0.27)       (0.033)        (0.024)

    where PctEL = percent of English learners in the school district and LchPct = percent of students receiving a free/subsidized lunch.

    Which variable is the variable of interest? STR
    Which variables are control variables? Do they have causal implications? What do they control for?

  • Model Specification for Multiple Regression

    What makes an effective control variable?

    Three interchangeable statements about what makes an effective control variable:

    An effective control variable is one which, when included in the regression, makes the error term uncorrelated with the variable of interest.
    Holding constant the control variable(s), the variable of interest is "as if" randomly assigned.
    Among individuals (entities) with the same value of the control variable(s), the variable of interest is uncorrelated with the omitted determinants of Y.

    Control variables need not be causal, and their coefficients generally do not have a causal interpretation.

  • Model Specification for Multiple Regression

    Conditional mean independence: definition

    Conditional mean independence: given the control variable(s), the mean of ui does not depend on the variable of interest.

    Let Xi denote the variable of interest and Wi denote the control variable(s). W is an effective control variable if conditional mean independence holds:

    E(ui|Xi, Wi) = E(ui|Wi)

    Conditional mean independence replaces the first least squares assumption, which would require E(ui|Xi, Wi) = 0.

  • Model Specification for Multiple Regression

    Implications of conditional mean independence

    Consider the regression model

    Y = β0 + β1X + β2W + u

    where X is the variable of interest and W is an effective control variable, so that conditional mean independence holds. In addition, suppose that the other least squares assumptions hold. Then,

    β1 has a causal interpretation.
    β̂1 is unbiased.
    The coefficient on the control variable, β̂2, is in general biased.

  • Model Specification for Multiple Regression

    β1 has a causal interpretation

    The expected change in Y resulting from a change in X, holding W constant, is:

    E(Y|X = x + ∆x, W = w) − E(Y|X = x, W = w)
      = [β0 + β1(x + ∆x) + β2w + E(u|X = x + ∆x, W = w)] − [β0 + β1x + β2w + E(u|X = x, W = w)]
      = β1∆x + (E(u|W = w) − E(u|W = w))
      = β1∆x

    In the second equality, we use conditional mean independence: E(u|X = x + ∆x, W = w) = E(u|X = x, W = w) = E(u|W = w).

  • Model Specification for Multiple Regression

    β̂1 is unbiased and β̂2 is biased

    For convenience, suppose that E(u|W) = γ0 + γ1W. Thus, under conditional mean independence, we have

    E(u|X, W) = E(u|W) = γ0 + γ1W

    Let v = u − E(u|W), so that

    E(v|X, W) = E(u|X, W) − E(u|W) = 0

    Then, it follows that

    u = E(u|X, W) + v = γ0 + γ1W + v

  • Model Specification for Multiple Regression

    β̂1 is unbiased and β̂2 is biased (cont’d)

    Then the original model Y = β0 + β1X + β2W + u becomes

    Y = β0 + β1X + β2W + γ0 + γ1W + v
      = (β0 + γ0) + β1X + (β2 + γ1)W + v
      = δ0 + β1X + δ2W + v

    where δ0 = β0 + γ0 and δ2 = β2 + γ1.

    What can we conclude from the new model?

    The new model satisfies E(v|X, W) = 0, so the OLS estimators of δ0, β1, and δ2 are unbiased.
    The estimated coefficients in the original model are actually β̂1 and δ̂2, for which E(β̂1) = β1 and E(β̂2) = δ2 ≠ β2 in general.
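    Both claims can be checked by simulation; here is a sketch where the error is built as u = γ0 + γ1W + v, so that conditional mean independence holds by construction. All parameter values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n_sim, reps = 500, 2000
b0, b1, b2, g0, g1 = 1.0, 2.0, 0.5, 0.3, 1.0     # assumed true parameters
b1_hats, b2_hats = [], []

for _ in range(reps):
    w = rng.normal(size=n_sim)
    x = 0.8 * w + rng.normal(size=n_sim)          # X correlated with W
    v = rng.normal(size=n_sim)
    u = g0 + g1 * w + v                           # E(u|X, W) = E(u|W): CMI holds
    y = b0 + b1 * x + b2 * w + u
    Z = np.column_stack([np.ones(n_sim), x, w])
    est = np.linalg.solve(Z.T @ Z, Z.T @ y)
    b1_hats.append(est[1]); b2_hats.append(est[2])

print(np.mean(b1_hats))   # approx b1 = 2.0: beta1_hat is unbiased
print(np.mean(b2_hats))   # approx delta2 = b2 + g1 = 1.5, not b2 = 0.5: biased
```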

  • Model Specification for Multiple Regression

    Model specification in theory and in practice

    The following steps are advocated to set up a regression model:

    Step 1: Set up a base specification

    A core or base set of regressors should be chosen using a combination of expert judgment, economic theory, and knowledge of how the data were collected. The regression using this base set of regressors is referred to as the base specification. This step involves the following considerations:

    1 identifying the variable of interest;
    2 thinking of the omitted causal effects that could result in omitted variable bias;
    3 including those omitted causal effects if you can; if you cannot, include variables correlated with them to serve as control variables.

  • Model Specification for Multiple Regression

    Model specification in theory and in practice (cont’d)

    Step 2: Set up alternative specifications

    Also specify a range of plausible alternative model specifications, which include additional candidate variables.

    1 If the estimates of the coefficients of interest are numerically similar across the alternative specifications, this provides evidence that the estimates from your base specification are reliable.
    2 If the estimates of the coefficients of interest change substantially across specifications, this often provides evidence that the original specification had omitted variable bias.

  • Model Specification for Multiple Regression

    Model specification in theory and in practice (cont’d)

    Step 3: Use test statistics to judge a model specification

    1 Use R² and R̄² to see the overall goodness of fit of a model specification.
      Caution: a high R² or R̄² does not mean that you have eliminated omitted variable bias, nor that the included variables and the model as a whole are statistically significant.

    2 Use t-statistics to check the significance of individual coefficients, and use the F-statistic to check the overall significance of the model as a whole. That is, use the F test for

    H0 : β1 = β2 = · · · = βk = 0
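    In practice these diagnostics are one call away; here is a minimal statsmodels sketch on simulated data, where the formula, column names, and seed are our own choices.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
df = pd.DataFrame({"x1": rng.normal(size=100), "x2": rng.normal(size=100)})
df["y"] = 1 + 0.5 * df["x1"] + rng.normal(size=100)

res = smf.ols("y ~ x1 + x2", data=df).fit()
print(res.rsquared, res.rsquared_adj)   # R^2 and adjusted R^2
print(res.tvalues)                      # t-statistics for individual coefficients
print(res.fvalue, res.f_pvalue)         # overall F test of H0: beta1 = beta2 = 0
```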

  • Model Specification for Multiple Regression

    Analysis of the test score data set
