Transcript
Page 1: Multiple Regression Analysis: Estimation

ECONOMETRICS (ECON 360)

BEN VAN KAMMEN, PHD

Page 2: Outline

Motivation.

Mechanics and Interpretation.

Expected Values and Variances of the Estimators.

Page 3: Motivation for multiple regression

Consider the following results of a regression of the number of crimes reported in Milwaukee on the search volume (on Google) for the term "ice cream",
◦ which I'm using as a proxy for ice cream sales.
◦ A couple of caveats about these data.

Dep. Var.: Crimes    Coefficient ($\hat\beta$)    Std. Err.    t
Ice Cream            1.5437                       0.4306       3.58

$N = 82$; $R^2 = 0.1384$

Page 4: Motivation for multiple regression (continued)

Ice cream is somehow an input into, or a reason for, committing crimes?
◦ Evidenced by its positive association with crime.

Common sense and past experiences with ice cream, however, argue that there is probably no causal connection between the two.
◦ A spurious correlation that disappears when we control for a confounding variable that is related to both $x$ and $y$.

In this case the confounder is summarized as "good weather":
◦ i.e., people eat more ice cream when it is warm and also go outside more when it is warm (up to a point, at least).
◦ The increase in social interaction occasioned by warm weather, then, creates more opportunities for conflicts and crimes, according to this informal theory, while coincidentally leading to more ice cream sales as well.

Page 5: Motivation for multiple regression (concluded)

To use regression analysis to disconfirm the theory that ice cream causes more crime, perform a regression that controls for the effect of weather in some way. Either:
◦ Examine sub-samples of days in which the weather is (roughly) the same but ice cream consumption varies, or
◦ Explicitly control for the weather by including it in a multiple regression model.

Page 6: Multiple regression defined

Multiple regression expands the regression model using more than 1 regressor / explanatory variable / "independent variable".

For 2 regressors, we would model the following relationship:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u,$

and estimate separate effects ($\beta_1$ and $\beta_2$) for each explanatory variable ($x_1$ and $x_2$).

Assuming enough data (and reasons for adding additional variables), the model with $k > 1$ regressors:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \ldots + \beta_k x_k + u,$

would not be anything fundamentally different from a simple regression.
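
As a minimal sketch of what estimating such a model looks like in practice (simulated data with made-up coefficients, not the Milwaukee data; any numerical OLS routine would do):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)  # regressors deliberately correlated
u = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.5 * x2 + u   # hypothetical beta0 = 1, beta1 = 2, beta2 = -1.5

X = np.column_stack([np.ones(n), x1, x2])        # design matrix with intercept column
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None) # least squares fit
print(beta_hat)                                  # close to [1.0, 2.0, -1.5]
```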

Page 7: Multiple OLS and simple OLS

Multiple regression relaxes the assumption that all other factors are held fixed.

Instead of $E(\text{error} \mid x_1) = 0$, one needs only assume that $E(\text{error} \mid x_1, x_2, \ldots, x_k) = 0$.
◦ A more realistic assumption when dealing with non-experimental data.

One other way of thinking about multiple regression:
◦ simple regression is a special case, in which $\{x_2, \ldots, x_k\}$ have been relegated to the error term and treated as mean independent of $x_1$.
◦ I.e., simple regression has you estimating:

$y = \beta_0 + \beta_1 x_1 + \varepsilon; \quad \varepsilon \equiv \beta_2 x_2 + \beta_3 x_3 + \ldots + \beta_k x_k + u.$

Page 8: Multiple OLS and simple OLS (continued)

To demonstrate why this could be a bad strategy, consider the ice cream-crime problem again, assuming that the true relationship follows:

$\text{crime} = \beta_0 + \beta_1\,\text{icecream} + \beta_2\,\text{temperature} + u; \quad E(u \mid \text{icecream}, \text{temperature}) = 0.$

My informal theory states that $\beta_1 = 0$ and $\beta_2 > 0$.

Page 9: Multiple OLS and simple OLS (continued)

Estimating the simple regression between ice cream and crime was as if the model were transformed:

$\text{crime} = \beta_0 + \beta_1\,\text{icecream} + \varepsilon; \quad \varepsilon = \beta_2\,\text{temperature} + u,$

which I estimated under the assumption that $E(\varepsilon \mid \text{icecream}) = 0$.

However, this is likely a bad assumption, since it is probable that $E(\text{temperature} \mid \text{icecream}) \neq 0$.

Page 10: Multiple OLS and simple OLS (concluded)

High (low) values of ice cream correspond with high (low) values of temperature.

The assumptions underlying simple regression analysis state that when conditional mean independence is violated, bias is introduced into OLS.
◦ This bias would explain the positive estimate for $\beta_1$ shown above.

We will examine the source of the bias more closely and how to estimate its direction later in this chapter.

First we turn our attention back to the technical aspects of estimating the OLS parameters with multiple regressors.

Page 11: Results, controlling for temperature

Now the coefficient on ice cream is much smaller (closer to 0) and the effect of temperature is large and significant.
◦ It's not precisely zero, but we can no longer reject the null that it has zero effect on crime.

Dep. Var.: Crimes    Coefficient ($\hat\beta$)    Std. Err.    t
Ice Cream            0.4760                       0.4477       1.06
Temperature          1.9492                       0.4198       4.64

$N = 82$; $R^2 = 0.3231$

Page 12: Mechanics and interpretation of OLS

Even with $k > 1$ regressors, OLS minimizes the sum of the squares of the residuals.

With $k = 2$ regressors:

$\hat{u}_i = y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} \;\rightarrow\; \hat{u}_i^2 = (y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2})^2,$

$\text{Sum of Squared Residuals } (SSR) = \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2})^2.$

Note the change in subscripting: each observation is indexed by $i$ as before, but the two $x$ variables are now distinguished from one another by a "1" and a "2" in the subscript.
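
As a sketch, the objective OLS minimizes can be written directly as a function of candidate coefficients (continuing the simulated X and y from the earlier sketch):

```python
def ssr(b, X, y):
    """Sum of squared residuals at the candidate coefficient vector b."""
    resid = y - X @ b
    return resid @ resid

# OLS chooses b to minimize this; any other b gives a larger value
print(ssr(beta_hat, X, y) <= ssr(beta_hat + 0.1, X, y))  # True
```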

Page 13: Minimizing SSR

For $k > 1$ regressors:

$SSR = \sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} - \ldots - \hat\beta_k x_{ik})^2.$

Once again, minimization is accomplished by differentiating the above line with respect to each of the $k + 1$ statistics:
◦ $\hat\beta_0$ and
◦ all the ($k$) slope parameters.

The first order conditions for the minimum are:

$\frac{\partial SSR}{\partial \hat\beta_0} = 0 \quad \text{and} \quad \frac{\partial SSR}{\partial \hat\beta_j} = 0 \;\; \forall\, j \in \{1, 2, \ldots, k\}.$

Page 14: Minimizing SSR (continued)

"Simultaneously choose all the 'betas' to make the regression model fit as closely as possible to all the data points."

The partial derivatives for the problem look like this:

(1) $\frac{\partial SSR}{\partial \hat\beta_0} = -2\sum_{i=1}^{n} (y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} - \ldots - \hat\beta_k x_{ik})$

(2) $\frac{\partial SSR}{\partial \hat\beta_1} = -2\sum_{i=1}^{n} x_{i1}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} - \ldots - \hat\beta_k x_{ik})$

. . .

($k+1$) $\frac{\partial SSR}{\partial \hat\beta_k} = -2\sum_{i=1}^{n} x_{ik}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} - \ldots - \hat\beta_k x_{ik}).$

Page 15: Example with $k = 2$

As a representative (and comparatively easy to solve) example, consider the solution when $k = 2$.

Setting (1) equal to zero (FOC) and solving for $\hat\beta_0$:

$2n\hat\beta_0 - 2\sum_{i=1}^{n} y_i + 2\hat\beta_1 \sum_{i=1}^{n} x_{i1} + 2\hat\beta_2 \sum_{i=1}^{n} x_{i2} = 0$

(4) $\hat\beta_0 = \frac{1}{n}\left(\sum_{i=1}^{n} y_i - \hat\beta_1 \sum_{i=1}^{n} x_{i1} - \hat\beta_2 \sum_{i=1}^{n} x_{i2}\right) = \bar{y} - \hat\beta_1 \bar{x}_1 - \hat\beta_2 \bar{x}_2$

Page 16: Example with $k = 2$ (continued)

Doing the same thing with (2) and (3) gives you:

(2) $\frac{\partial SSR}{\partial \hat\beta_1} = 0 \;\rightarrow\; \hat\beta_1 \sum_{i=1}^{n} x_{i1}^2 = \sum_{i=1}^{n} x_{i1} y_i - \hat\beta_0 \sum_{i=1}^{n} x_{i1} - \hat\beta_2 \sum_{i=1}^{n} x_{i1} x_{i2}$

(3) $\frac{\partial SSR}{\partial \hat\beta_2} = 0 \;\rightarrow\; \hat\beta_2 \sum_{i=1}^{n} x_{i2}^2 = \sum_{i=1}^{n} x_{i2} y_i - \hat\beta_0 \sum_{i=1}^{n} x_{i2} - \hat\beta_1 \sum_{i=1}^{n} x_{i1} x_{i2}$

Page 17: Example with $k = 2$ (continued)

Substituting (4):

(2) $\hat\beta_1 \sum_{i=1}^{n} x_{i1}^2 = \sum_{i=1}^{n} x_{i1} y_i - (\bar{y} - \hat\beta_1 \bar{x}_1 - \hat\beta_2 \bar{x}_2)\, n\bar{x}_1 - \hat\beta_2 \sum_{i=1}^{n} x_{i1} x_{i2}$

(3) $\hat\beta_2 \sum_{i=1}^{n} x_{i2}^2 = \sum_{i=1}^{n} x_{i2} y_i - (\bar{y} - \hat\beta_1 \bar{x}_1 - \hat\beta_2 \bar{x}_2)\, n\bar{x}_2 - \hat\beta_1 \sum_{i=1}^{n} x_{i1} x_{i2}$

Page 18: Example with $k = 2$ (continued)

Solving (3) for $\hat\beta_2$:

(3) $\rightarrow \hat\beta_2 \left(\sum_{i=1}^{n} x_{i2}^2 - n\bar{x}_2^2\right) = \sum_{i=1}^{n} x_{i2} y_i - n\bar{x}_2 \bar{y} + n\bar{x}_2 \hat\beta_1 \bar{x}_1 - \hat\beta_1 \sum_{i=1}^{n} x_{i1} x_{i2}$

$\Leftrightarrow \hat\beta_2 = \left(\sum_{i=1}^{n} x_{i2}^2 - n\bar{x}_2^2\right)^{-1}\left(\sum_{i=1}^{n} x_{i2} y_i - n\bar{x}_2 \bar{y} + n\bar{x}_2 \hat\beta_1 \bar{x}_1 - \hat\beta_1 \sum_{i=1}^{n} x_{i1} x_{i2}\right)$

$\Leftrightarrow \hat\beta_2 = \mathrm{Var}(x_2)^{-1}\left[\mathrm{Cov}(x_2, y) - \hat\beta_1\, \mathrm{Cov}(x_1, x_2)\right],$

where $\mathrm{Var}$ and $\mathrm{Cov}$ here denote the sums of squared deviations and cross-products; any common scaling cancels in the ratios below.

Page 19: Example with $k = 2$ (continued)

Solve it simultaneously with (2) by substituting it in:

(2) $\rightarrow \hat\beta_1 \sum_{i=1}^{n} x_{i1}^2 = \mathrm{Cov}(x_1, y) + \hat\beta_1 n\bar{x}_1^2 - \hat\beta_2\, \mathrm{Cov}(x_1, x_2)$; now

$\hat\beta_1 \sum_{i=1}^{n} x_{i1}^2 = \mathrm{Cov}(x_1, y) + \hat\beta_1 n\bar{x}_1^2 - \frac{\mathrm{Cov}(x_2, y)\,\mathrm{Cov}(x_1, x_2) - \hat\beta_1\, \mathrm{Cov}(x_1, x_2)^2}{\mathrm{Var}(x_2)}$

Page 20: Example with $k = 2$ (continued)

Collect all the $\hat\beta_1$ terms.

$\hat\beta_1 \left(\sum_{i=1}^{n} x_{i1}^2 - n\bar{x}_1^2 - \frac{\mathrm{Cov}(x_1, x_2)^2}{\mathrm{Var}(x_2)}\right) = \mathrm{Cov}(x_1, y) - \frac{\mathrm{Cov}(x_2, y)\,\mathrm{Cov}(x_1, x_2)}{\mathrm{Var}(x_2)}$

$\Leftrightarrow \hat\beta_1 \left(\mathrm{Var}(x_1) - \frac{\mathrm{Cov}(x_1, x_2)^2}{\mathrm{Var}(x_2)}\right) = \mathrm{Cov}(x_1, y) - \frac{\mathrm{Cov}(x_2, y)\,\mathrm{Cov}(x_1, x_2)}{\mathrm{Var}(x_2)}$

$\Leftrightarrow \hat\beta_1\, \frac{\mathrm{Var}(x_1)\,\mathrm{Var}(x_2) - \mathrm{Cov}(x_1, x_2)^2}{\mathrm{Var}(x_2)} = \frac{\mathrm{Cov}(x_1, y)\,\mathrm{Var}(x_2) - \mathrm{Cov}(x_2, y)\,\mathrm{Cov}(x_1, x_2)}{\mathrm{Var}(x_2)}$

Page 21: Example with $k = 2$ (concluded)

Simplifying gives you:

$\hat\beta_1 = \frac{\mathrm{Cov}(x_1, y)\,\mathrm{Var}(x_2) - \mathrm{Cov}(x_2, y)\,\mathrm{Cov}(x_1, x_2)}{\mathrm{Var}(x_1)\,\mathrm{Var}(x_2) - \mathrm{Cov}(x_1, x_2)^2}$

Then you can solve for $\hat\beta_2$ which, after a lot of simplification, is:

$\hat\beta_2 = \frac{\mathrm{Cov}(x_2, y)\,\mathrm{Var}(x_1) - \mathrm{Cov}(x_1, y)\,\mathrm{Cov}(x_1, x_2)}{\mathrm{Var}(x_1)\,\mathrm{Var}(x_2) - \mathrm{Cov}(x_1, x_2)^2}.$

Lastly, substitute the above expressions into the solution for the intercept:

$\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}_1 - \hat\beta_2 \bar{x}_2.$
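
A sketch verifying these closed forms on the simulated data from before, writing Var and Cov as sums of deviations from means (any common scaling cancels in the ratios):

```python
def s(a, b):
    """Sum of cross-products of deviations from means."""
    return np.sum((a - a.mean()) * (b - b.mean()))

den = s(x1, x1) * s(x2, x2) - s(x1, x2) ** 2
b1 = (s(x1, y) * s(x2, x2) - s(x2, y) * s(x1, x2)) / den
b2 = (s(x2, y) * s(x1, x1) - s(x1, y) * s(x1, x2)) / den
b0 = y.mean() - b1 * x1.mean() - b2 * x2.mean()
print(b0, b1, b2)   # matches beta_hat from the numerical solver
```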

Page 22: Multiple OLS and variance/covariance

Examine the solutions closely.

They depend, as with simple regression, only on the variances and covariances of the regressors and their covariances with $y$.
◦ This is true generally of OLS, even when there are many explanatory variables in the regression.

As this number grows, even to 2, the solution becomes difficult to work out with algebra, but software (like STATA) is very good at performing the calculations and solving for the $k + 1$ estimates, even when $k$ is quite large.
◦ How software works out the solution: matrix algebra.

Page 23: Multiple regression and "partialling out"

The "partialling out" interpretation of multiple regression is revealed by the matrix and non-matrix estimate of $\hat\beta_1$.
◦ What goes into $\hat\beta_1$ in a multiple regression is the variation in $x_1$ that cannot be "explained" by its relation to the other $x$ variables.
◦ The covariance between this residual variation in $x_1$, not explained by other regressors, and $y$ is what matters.

It also provides a good way to think of $\hat\beta_1$ as the partial effect of $x_1$, holding other factors fixed.

It is estimated using variation in $x_1$ that is independent of variation in other regressors, as the following sketch demonstrates.
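
A sketch of that two-step recipe on the simulated data: regress $x_1$ on the other regressors, keep the residuals, then regress $y$ on those residuals alone.

```python
# Step 1: the variation in x1 not explained by the other regressors
Z = np.column_stack([np.ones(n), x2])
gamma, *_ = np.linalg.lstsq(Z, x1, rcond=None)
r1 = x1 - Z @ gamma                   # residual variation in x1

# Step 2: a simple regression of y on r1 reproduces the multiple-regression slope
b1_partial = (r1 @ y) / (r1 @ r1)
print(b1_partial)                     # equals beta_hat[1] from the full regression
```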

More.

Page 24: Expected value of the OLS estimators

Now we resume the discussion of a misspecified OLS regression, e.g., crime on ice cream, by considering the assumptions under which OLS yields unbiased estimates and how misspecifying a regression can produce biased estimates.
◦ Also: what is the direction of the bias?

There is a population model that is linear in parameters.

Assumption MLR.1 states this model, which represents the true relationship between $y$ and all the $x$.

I.e.,

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \ldots + \beta_k x_k + u.$

Page 25: OLS under assumptions MLR 1-4

The sample of $n$ observations is assumed random and indexed by $i$ (MLR.2).

From simple regression, we know that there must be variation in $x$ for an estimate to exist.

With multiple regression, each regressor must have (at least some) variation that is not explained by the other regressors.
◦ The other regressors cannot "partial out" all of the variation in, say, $x_1$ if $\beta_1$ is still to be estimated.
◦ Perfect multicollinearity is ruled out (by Assumption MLR.3).

When you have a set of regressors that violates this assumption, one of them must be removed from the regression.
◦ Software will usually do this automatically for you.

Page 26: OLS under assumptions MLR 1-4 (continued)

Finally, multiple regression assumes a less restrictive version of mean independence between regressors and error term.

Assumption MLR.4 states that the error term has zero conditional mean, i.e., conditional on all regressors:

$E(u \mid x_1, x_2, \ldots, x_k) = 0.$

This can fail if the estimated regression is misspecified, in terms of the functional form of one of the variables or the omission of a relevant regressor.

The model contains exogenous regressors, with the alternative being an endogenous regressor (explanatory variable).
◦ This is when some $x_j$ is correlated with an omitted variable in the error term.

Page 27: Unbiasedness

Under the four assumptions above, the OLS estimates are unbiased, i.e.,

$E(\hat\beta_j) = \beta_j \;\; \forall\, j.$

This includes cases in which variables have no effect on $y$, i.e., in which $\beta_j = 0$.
◦ This is the "overspecified" case in the text, in which an irrelevant variable is included in the regression.
◦ Though this does not affect the unbiasedness of OLS, it does impact the variance of the estimates, sometimes in undesirable ways we will study later.

Page 28: Expected value and variance of OLS estimators under MLR assumptions 1-4

Unbiasedness does not apply when you exclude a variable from the population model when estimating it.
◦ This is called underspecifying the model.
◦ It occurs when, because of an empiricist's mistake or failure to observe a variable in data, a relevant regressor is excluded from the estimation.
◦ For the sake of concreteness, consider an example of each.

1. An empiricist regresses crime on ice cream because he has an axe to grind about ice cream or something like that, and he wants to show that it causes crime.

2. An empiricist regresses earnings on education because he cannot observe other determinants of workers' productivity ("ability") in data.

Page 29: Bias from under-specification

So in both cases, there is a population ("true") model that includes two variables:

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u,$

but the empiricist estimates a version that excludes $x_2$, generating estimates different from unbiased OLS, which are denoted with the "tilde" on the next line.

$\tilde{y} = \tilde\beta_0 + \tilde\beta_1 x_1$; $x_2$ is omitted from the estimation. So,

$\tilde\beta_1 = \frac{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)(y_i - \bar{y})}{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2} = \frac{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)\, y_i}{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2}.$

Page 30: Omitted variable bias

If the true relationship includes $x_2$, however, the bias in this estimate is revealed by substituting the population model for $y_i$:

$\tilde\beta_1 = \frac{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)(\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + u_i)}{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2}$

Simplify the numerator:

$\beta_0 \sum_{i=1}^{n} (x_{i1} - \bar{x}_1) + \beta_1 \sum_{i=1}^{n} x_{i1}(x_{i1} - \bar{x}_1) + \beta_2 \sum_{i=1}^{n} x_{i2}(x_{i1} - \bar{x}_1) + \sum_{i=1}^{n} (x_{i1} - \bar{x}_1)\, u_i.$

Then take expectations:

$\Leftrightarrow E(\tilde\beta_1) = \frac{0 + \beta_1 \sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2 + \beta_2\, \mathrm{Cov}(x_1, x_2) + 0}{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2},$

since the regressors are uncorrelated with the error from the correctly specified model.

Page 31: Omitted variable bias (concluded)

A little more simplifying gives:

$\Leftrightarrow E(\tilde\beta_1) = \beta_1 + \beta_2\, \frac{\mathrm{Cov}(x_1, x_2)}{\sum_{i=1}^{n} (x_{i1} - \bar{x}_1)^2}.$

The estimator using the misspecified model is equal to the true parameter ($\beta_1$) plus a second term that captures the bias.

This bias is non-zero unless one of two things is true:

1. $\beta_2$ is zero. There is no bias from excluding an irrelevant variable.

2. $\mathrm{Cov}(x_1, x_2)$ is zero. This means that $x_1$ is independent of the error term; varying it does not induce variation in an omitted variable.

Page 32: Direction of omitted variable bias

The textbook calls the fraction in the bias term $\delta_1$,
◦ the regression coefficient you would get if you regressed $x_2$ on $x_1$:

$\delta_1 \equiv \frac{\mathrm{Cov}(x_1, x_2)}{\mathrm{Var}(x_1)}$

This term, along with $\beta_2$, helps predict the direction of the bias.

Specifically, if $\delta_1$ and $\beta_2$ are the same sign (+/-), the bias is positive, and if they are opposite signs, the bias is negative.

1. $\delta_1, \beta_2$ same sign $\rightarrow E(\tilde\beta_1) > \beta_1$: "upward bias"

2. $\delta_1, \beta_2$ opposite sign $\rightarrow E(\tilde\beta_1) < \beta_1$: "downward bias"

Page 33: Omitted variable bias (concluded)

One more way to think about bias: the ceteris paribus effect of $x_1$ on $y$.

$\tilde\beta_1$ estimates this effect, but it does so by lumping the true effect in with an unknown quantity of misleading effects; e.g., you take the population model,

$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$, and estimate $y = \beta_0 + \beta_1 x_1 + v; \quad v \equiv \beta_2 x_2 + u$. So,

$\frac{\partial E(y \mid x)}{\partial x_1} = \beta_1 + \frac{\partial E(v \mid x_1)}{\partial x_1} = \beta_1 + \beta_2\, \frac{\partial E(x_2 \mid x_1)}{\partial x_1}.$

The partial derivative of $x_2$ with respect to $x_1$ is the output you would get if you estimated a regression of $x_2$ on $x_1$, i.e., $\delta_1$.

Page 34: Standard error of the OLS estimator

Why is the variance of an estimator ("standard error") so important?
◦ Inference: next chapter.

Generating and reporting only point estimates is irresponsible.
◦ It leads your (especially an uninitiated) audience to an overly specific conclusion about your results.
◦ This is bad for you (the empiricist) and them (policy makers, customers, other scholars) because they may base decisions on your findings; e.g., choosing a marginal income tax rate depends crucially on labor supply elasticity.

A point estimate outside the context of its standard error may prompt your audience to rash decisions that they would not make if they knew your estimates were not "pinpoint" accurate.
◦ This is bad for you because if they do the "wrong" thing with your results, they will blame you for the policy's failure.
◦ And empiricists, as a group, will have their credibility diminished a little, as well.

Page 35: Variance of OLS estimators

Here we show that the standard error of the $j$th estimate, $\hat\beta_j$, is:

$se(\hat\beta_j) \equiv \widehat{\mathrm{Var}}(\hat\beta_j)^{1/2} = \left[\frac{\sum_{i=1}^{n} \hat{u}_i^2}{(n - k - 1)\,(1 - R_j^2)\, \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2}\right]^{1/2},$ where

◦ $\hat{u}_i$ are the residuals from the regression,
◦ $k$ is the number of regressors, and
◦ $R_j^2$ is the proportion of $x_j$'s variance explained by the other $x$ variables.

Once again the estimate of the variance assumes homoskedasticity, i.e.,

$\mathrm{Var}(u \mid \text{all } x_j) = \mathrm{Var}(u) = \sigma^2.$

Page 36: Variance of OLS estimators (continued)

Most of the components of the variance are familiar from simple regression.
◦ Subscripts ($j$) and the degrees of freedom ($n - k - 1$ instead of $n - 1 - 1$) have been generalized for $k$ regressors.

The term $(1 - R_j^2)$ is new, reflecting the consequence of (imperfect) multicollinearity.

Perfect multicollinearity has already been ruled out by Assumption MLR.3, but the standard error of a multicollinear variable ($R_j^2 \to 1$) is very large because the denominator of the expression above will be very small ($(1 - R_j^2) \to 0$).

Page 37: Variance of OLS estimators (continued)

Once again the variance of the errors is estimated without bias, i.e., $E(\hat\sigma^2) = \sigma^2$, by the (degrees of freedom-adjusted) mean squared error:

$SSR \equiv \sum_{i=1}^{n} \hat{u}_i^2; \quad \hat\sigma^2 = \frac{SSR}{n - k - 1}, \quad \text{so} \quad \widehat{\mathrm{Var}}(\hat\beta_j) = \frac{\hat\sigma^2}{(1 - R_j^2) \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2},$

where the residuals are: $\hat{u}_i \equiv y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \ldots - \hat\beta_k x_{ik}.$
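
A sketch computing these pieces directly for the simulated two-regressor fit, where each $R_j^2$ comes from regressing column $j$ of the design matrix on the remaining columns:

```python
def ols_slope_se(X, y):
    """Homoskedasticity-based standard errors for the slope coefficients."""
    n_obs, kp1 = X.shape                          # kp1 = k + 1 (incl. intercept)
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    u_hat = y - X @ b
    sigma2 = (u_hat @ u_hat) / (n_obs - kp1)      # SSR / (n - k - 1)
    se = []
    for j in range(1, kp1):                       # slopes only
        xj = X[:, j]
        others = np.delete(X, j, axis=1)
        g, *_ = np.linalg.lstsq(others, xj, rcond=None)
        r = xj - others @ g
        sst_j = np.sum((xj - xj.mean()) ** 2)
        Rj2 = 1 - (r @ r) / sst_j                 # x_j's R^2 on the other x's
        se.append(np.sqrt(sigma2 / ((1 - Rj2) * sst_j)))
    return se

print(ols_slope_se(X, y))
```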

Page 38: Variance of OLS estimators (continued)

Observations about the standard error.
◦ When adding a (relevant) regressor to a model, the standard error of an existing regressor can increase or decrease.
◦ Generally the sum of squares of residuals decreases, since more information is being used to fit the model. But the degrees of freedom decreases as well.
◦ The standard error depends on Assumption MLR.5 (homoskedasticity).
◦ If it is violated, $\hat\beta_j$ is still unbiased; however, the above expression becomes a biased estimate of the variance of $\hat\beta_j$.
◦ The 8th chapter in the text is devoted to dealing with the problem of heteroskedasticity in regression.

Page 39: Gauss-Markov Theorem and BLUE

If, however, Assumptions MLR 1-5 hold, the OLS estimators ($\hat\beta_0$ through $\hat\beta_k$) have minimal variance among the class of linear unbiased estimators,
◦ i.e., OLS is the "best" in the sense of minimal variance.

OLS is "BLUE", which stands for Best Linear Unbiased Estimator.

The 5 assumptions that lead to this conclusion (MLR 1-5) are collectively known as the Gauss-Markov assumptions.

The BLUE-ness of the OLS estimators under those assumptions is known as the Gauss-Markov Theorem.

Page 40: Conclusion

Multiple regression solves the problem of omitted variables that are correlated with both the dependent variable and a regressor.

It is estimated in a manner analogous to simple OLS and has similar properties:
◦ Unbiasedness under MLR 1-4,
◦ BLUE under MLR 1-5.

Its estimates are interpreted as partial effects of changing one regressor and holding the others constant.

The estimates have standard errors that can be used to generate confidence intervals and test hypotheses about the parametersโ€™ values.

Page 41: Data on crime and ice cream

The unit of observation is daily, i.e., these are 82 days on which ice cream search volume and crime reports have been counted in the City of Milwaukee.

All the variables have had the "day of the week"-specific mean subtracted from them, because all of them fluctuate across the calendar week, and this avoids creating another spurious correlation through the days of the week.
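
A sketch of that demeaning step, assuming a hypothetical pandas DataFrame df with columns crimes, icecream, and temperature plus a day-of-week column day:

```python
import pandas as pd

# Subtract each variable's day-of-week mean, keeping only
# the variation within days of the week
cols = ["crimes", "icecream", "temperature"]
df[cols] = df[cols] - df.groupby("day")[cols].transform("mean")
```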

Back.

Page 42: OLS estimates using matrices

Understanding what software is doing when it puts out regression estimates is one reason for the following exercise.

Another reason is to demonstrate the treatment of OLS you will experience in graduate-level econometrics classes, in which everything is espoused using matrix algebra.

Imagine the variables in your regression as a spreadsheet, i.e., with $y$ in column 1, $x_1$ in column 2, and each row a different observation.

With this image in mind, divide the spreadsheet into two matrices,* on which operations may be performed as they are on numbers ("scalars" in matrix jargon).

*"A rectangular array of numbers enclosed in parentheses. It is conventionally denoted by a capital letter."

Page 43: OLS estimates using matrices (continued)

The dimensions of a matrix are expressed: rows $\times$ columns.
◦ I.e., $n$ observations (rows) of a variable $y$ would be an $n \times 1$ matrix ("Y"); $n$ observations (rows) of $k$ explanatory variables $x$ would be an $n \times k$ matrix ("X").

One operation that can be performed on a matrix is called transposition, denoted by an apostrophe behind the matrix's name.
◦ Transposing a matrix just means exchanging its rows and columns. E.g.,

$X \equiv \begin{pmatrix} 5 & 2 & 1 \\ 3 & 0 & 1 \end{pmatrix} \;\rightarrow\; X' = \begin{pmatrix} 5 & 3 \\ 2 & 0 \\ 1 & 1 \end{pmatrix}.$

Page 44: Matrix multiplication

Another operation that can be performed on some pairs of matrices is multiplication, but only if they are conformable.

"Two matrices A and B of dimensions $m \times n$ and $n \times p$ respectively are conformable to form the product matrix AB, since the number of columns in A is equal to the number of rows in B. The product matrix AB is of dimension $m \times p$, and its $ij$th element, $c_{ij}$, is obtained by multiplying the elements of the $i$th row of A by the corresponding elements of the $j$th column of B and adding the resulting products."
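
In numpy, as a sketch, transposition is .T, matrix multiplication is @, and the conformability rule is exactly what the shapes report:

```python
A = np.array([[5, 2, 1],
              [3, 0, 1]])      # the 2 x 3 matrix from the transposition example
print(A.T.shape)               # (3, 2)
print((A @ A.T).shape)         # (2,3) @ (3,2) -> (2,2): conformable
# A @ A would raise ValueError: (2,3) @ (2,3) is not conformable
```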

Page 45: OLS estimates using matrices (continued)

Transposition and multiplication are useful in econometrics for obtaining a matrix of covariances between $y$ and all the $x$, as well as among the $x$ variables.

Specifically, when you multiply Y by the transpose of X, you get a ($k \times 1$) matrix with elements,

$X'Y = \begin{pmatrix} \sum_{i=1}^{n} x_{i1} y_i \\ \sum_{i=1}^{n} x_{i2} y_i \\ \vdots \\ \sum_{i=1}^{n} x_{ik} y_i \end{pmatrix} = \begin{pmatrix} s_{y1} \\ s_{y2} \\ \vdots \\ s_{yk} \end{pmatrix};$

$x$ and $y$ are expressed as deviations from their means.

Page 46: OLS estimates using matrices (continued)

Similarly, multiplying X by its own transpose gives you a ($k \times k$) matrix with elements,

$X'X = \begin{pmatrix} \sum_{i=1}^{n} x_{i1}^2 & \ldots & \sum_{i=1}^{n} x_{i1} x_{ik} \\ \vdots & \ddots & \vdots \\ \sum_{i=1}^{n} x_{i1} x_{ik} & \ldots & \sum_{i=1}^{n} x_{ik}^2 \end{pmatrix} = \begin{pmatrix} s_{11} & \ldots & s_{1k} \\ \vdots & \ddots & \vdots \\ s_{k1} & \ldots & s_{kk} \end{pmatrix}.$

Page 47: OLS estimates using matrices (continued)

$X'X$ and $X'Y$ are matrices of the variances of the explanatory variables and covariances among all variables. They can be combined to obtain the ($k \times 1$) matrix of coefficient estimates.
◦ This is the part for which computers are really indispensable.

First of all, you invert the matrix $X'X$. This means solving for the values that make the following hold:

$(X'X)(X'X)^{-1} = I \;\Leftrightarrow\; \begin{pmatrix} s_{11} & \ldots & s_{1k} \\ \vdots & \ddots & \vdots \\ s_{k1} & \ldots & s_{kk} \end{pmatrix} \begin{pmatrix} a_{11} & \ldots & a_{1k} \\ \vdots & \ddots & \vdots \\ a_{k1} & \ldots & a_{kk} \end{pmatrix} = \begin{pmatrix} 1 & \ldots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \ldots & 1 \end{pmatrix}.$

Page 48: OLS estimates using matrices (continued)

To illustrate using the case of $k = 2$, you have to solve for all the "a" terms using simultaneous equations.

$\begin{pmatrix} s_{11} & s_{12} \\ s_{21} & s_{22} \end{pmatrix} \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},$ so

$a_{11} s_{11} + a_{21} s_{12} = 1,$
$a_{11} s_{21} + a_{21} s_{22} = 0,$
$a_{12} s_{11} + a_{22} s_{12} = 0,$ and
$a_{12} s_{21} + a_{22} s_{22} = 1.$

Page 49: OLS estimates using matrices (continued)

Solving the first two simultaneously gives you:

$a_{11} = \frac{s_{22}}{s_{11} s_{22} - s_{12} s_{21}} \quad \text{and} \quad a_{21} = \frac{s_{21}}{s_{12} s_{21} - s_{11} s_{22}}.$

Similarly, the last two give you:

$a_{12} = \frac{s_{12}}{s_{12} s_{21} - s_{11} s_{22}} \quad \text{and} \quad a_{22} = \frac{s_{11}}{s_{11} s_{22} - s_{12} s_{21}}.$

To get the coefficient estimates, you do the matrix equivalent of dividing covariance by variance, i.e., you solve for the vector B:

$B = (X'X)^{-1}(X'Y).$

Page 50: OLS estimates using matrices (continued)

For the $k = 2$ case,

$B = \begin{pmatrix} \dfrac{s_{22}}{s_{11} s_{22} - s_{12} s_{21}} & \dfrac{s_{12}}{s_{12} s_{21} - s_{11} s_{22}} \\ \dfrac{s_{21}}{s_{12} s_{21} - s_{11} s_{22}} & \dfrac{s_{11}}{s_{11} s_{22} - s_{12} s_{21}} \end{pmatrix} \begin{pmatrix} s_{y1} \\ s_{y2} \end{pmatrix}.$

Multiplying this out gives the same solution as minimizing the Sum of Squared Residuals:

$B = \begin{pmatrix} \dfrac{s_{22} s_{y1} - s_{12} s_{y2}}{s_{11} s_{22} - s_{12} s_{21}} \\ \dfrac{s_{11} s_{y2} - s_{21} s_{y1}}{s_{11} s_{22} - s_{12} s_{21}} \end{pmatrix} = \begin{pmatrix} \dfrac{\mathrm{Var}(x_2)\,\mathrm{Cov}(x_1, y) - \mathrm{Cov}(x_1, x_2)\,\mathrm{Cov}(x_2, y)}{\mathrm{Var}(x_1)\,\mathrm{Var}(x_2) - \mathrm{Cov}(x_1, x_2)^2} \\ \dfrac{\mathrm{Var}(x_1)\,\mathrm{Cov}(x_2, y) - \mathrm{Cov}(x_1, x_2)\,\mathrm{Cov}(x_1, y)}{\mathrm{Var}(x_1)\,\mathrm{Var}(x_2) - \mathrm{Cov}(x_1, x_2)^2} \end{pmatrix}.$

Page 51: OLS estimates using matrices (concluded)

Solving OLS for $k > 2$ is not fundamentally different, and the solution will always be a function of the variances and covariances of the variables in the model.
◦ It just gets more difficult (at an accelerating rate) to write down solutions for the individual coefficients.
◦ It is left as an exercise to show that the "partialling out" interpretation of OLS holds:

$B = (X'X)^{-1}(X'Y) = \begin{pmatrix} \dfrac{\sum_{i=1}^{n} \hat{r}_{i1} y_i}{\sum_{i=1}^{n} \hat{r}_{i1}^2} \\ \vdots \\ \dfrac{\sum_{i=1}^{n} \hat{r}_{ik} y_i}{\sum_{i=1}^{n} \hat{r}_{ik}^2} \end{pmatrix},$

in which $\hat{r}_{ij}$ is the residual for observation $i$ from a regression of $x_j$ on all the other regressors.
◦ $\hat{r}_{i1} \equiv x_{i1} - \hat{x}_{i1}$; $\hat{x}_{i1}$ is the fitted value.

Back.

Page 52: Partialling out interpretation of OLS

The expression for $\hat\beta_1$ can be derived from the first order condition for $x_1$ from the least squares minimization.

This states that:

$\frac{\partial SSR}{\partial \hat\beta_1} = -2\sum_{i=1}^{n} x_{i1}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} - \ldots - \hat\beta_k x_{ik}) = 0.$

Substituting in the definition of the residuals from the previous page (i.e., $x_{i1} = \hat{x}_{i1} + \hat{r}_{i1}$), you have:

$\sum_{i=1}^{n} (\hat{x}_{i1} + \hat{r}_{i1})(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} - \ldots - \hat\beta_k x_{ik}) = 0$

Page 53: Partialling out interpretation of OLS (continued)

$\Leftrightarrow \sum_{i=1}^{n} \hat{x}_{i1}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} - \ldots - \hat\beta_k x_{ik}) + \sum_{i=1}^{n} \hat{r}_{i1}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} - \ldots - \hat\beta_k x_{ik}) = 0.$

The term in brackets is the OLS residual from regressing $y$ on $x_1 \ldots x_k$, "$\hat{u}_i$", and $\hat{x}_1$ is a linear function of the other $x$ variables, i.e.,

$\hat{x}_{i1} = \hat\gamma_0 + \hat\gamma_1 x_{i2} + \ldots + \hat\gamma_k x_{ik},$

. . .

Page 54: Partialling out interpretation of OLS (continued)

. . . so one may write:

$\sum_{i=1}^{n} \hat{x}_{i1}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} - \ldots - \hat\beta_k x_{ik}) = \sum_{i=1}^{n} (\hat\gamma_0 + \hat\gamma_1 x_{i2} + \ldots + \hat\gamma_k x_{ik})\, \hat{u}_i.$

Since the OLS residuals are uncorrelated with each $x_j$, and the sum of the residuals is zero, this entire line equals zero and drops out.

Page 55: Partialling out interpretation of OLS (continued)

Then you have:

$\sum_{i=1}^{n} \hat{r}_{i1}(y_i - \hat\beta_0 - \hat\beta_1 x_{i1} - \hat\beta_2 x_{i2} - \ldots - \hat\beta_k x_{ik}) = 0,$ and

the residuals ($\hat{r}_1$) are uncorrelated with the other regressors (2 through $k$) and sum to zero, as well. So,

$\sum_{i=1}^{n} \hat{r}_{i1} y_i = \sum_{i=1}^{n} \hat\beta_1 (\hat{r}_{i1}^2 + \hat{r}_{i1} \hat{x}_{i1}) \;\Leftrightarrow\; \sum_{i=1}^{n} \hat{r}_{i1} y_i = \hat\beta_1 \sum_{i=1}^{n} \hat{r}_{i1}^2.$

Page 56: Partialling out interpretation of OLS (concluded)

Then you get the "partialling out" expression:

$\Leftrightarrow \hat\beta_1 = \frac{\sum_{i=1}^{n} \hat{r}_{i1} y_i}{\sum_{i=1}^{n} \hat{r}_{i1}^2}.$

Back.

