Download - Ie Slide07

Introductory Econometrics

ECON2206/ECON3209

Slides07

Lecturer: Minxian Yang

ie_Slides07 my, School of Economics, UNSW 1

7. Multiple Regression Model: Qualitative Variables (Ch7)

7. Multiple Regression Model: Binary Variables

• Lecture plan

– Qualitative information and dummy (binary) variables

– Regression with dummy regressors

– Interactions with dummy regressors

– Binary dependent variable: linear probability model



• Qualitative information

– Many variables in social sciences are qualitative

(non-numerical) factors that takes two values.

eg. female, married, insured, etc.

– They can be defined as binary (0-1) valued variables,

known as dummy variables.

eg. 1 for female, married, insured, etc.

0 for male, unmarried, uninsured, etc.

– The assignment of values (0,1) is often determined by

interpretation convenience.



• Dummy explanatory variables

– We use dummy variables to incorporate qualitative

information into regression models.

eg. Wage model

wage = β0 + δ0 female + β1 educ + u,

where δ0 characterise

the gender difference in wage.

Under the ZCM assumption,

E(wage | female=1, educ)

= β0 + δ0 + β1 educ,

E(wage | female=0, educ)

= β0 + β1 educ,

δ0 represents an intercept shift.




– Interpretation of dummy

eg. Wage model (continued)

wage = β0 + δ0 female + β1 educ + u.

• Would you add the male dummy in the model?

No, doing so leads to perfect collinearity (violation of MLR3).

• Males are treated as the base group (against which

comparisons are made).

• We could regress wage on male and educ, where females

would be base group and coefficient interpretation would be

different.



• Dummy explanatory variableseg. Wage model (continued): WAGE1.RAW

wage = −1.57 − 1.81female + 0.572educ

(0.72) (0.26) (0.049)

+ 0.025exper + 0.141tenure

(0.012) (0.021)

n = 526, R2 = 0.364

• Negative intercept is not meaningful here.

• A female worker is predicted to earn $1.81 less than a male

worker at the same level of educ, exper and tenure.

Compare the above with the simple regression

wage = 7.10 − 2.51female, n = 526, R2 = 0.116

(0.21) (0.30)

But the coefficient of female here is not at all ceteris paribus.




– log dependent variable

eg. Wage model (continued): what if y = log(wage)?

log(wage) = .501 − .301female + .087educ

(.102) (.037) (.007)

+ .005 exper + .017tenure

(.002) (.003)

n = 526, R2 = 0.392

• A female worker is predicted to earn 30.1% less than a male

worker at the same level of educ, exper and tenure.

• A more accurate calculation:

log(wagef/wagem) = −.301

(wagef − wagem)/wagem = exp(−.301) − 1 = −.260 = −26%


1)Δˆexp(ˆ

ˆΔ xβ

y

y



– Program evaluation

• To evaluate a social economic programs (eg. IT

training), individuals are divided into two groups:

a control or base group (nonparticipants) and a

treatment group (participants) .

• The outcomes (eg. wage) of the two groups can be

used to evaluate the program, using a multiple

regression model.

• A dummy variable represents the treatment group,

while other factors (eg. educ, exper) should be included

as controls.

• The coefficient of the dummy is the causal or ceteris

paribus effect of the program.



• Dummy variables for multiple categories

– What we do if individuals are from more than two

categories? (eg. gender-marriage, etc)

– In general, for g groups, we need g−1 dummy

variables, with the intercept for the base group.

– The coefficient on the dummy of a group is the

difference in the intercepts between that group and

the base group.

eg. (Wage model again) The base group is singmale :

wage = β0 + δ1 singfemale + δ2 marrfemale + δ3 marrmale

+ β1 educ + ... + u.


many controls here


• Dummy variables for ordinal information

– Consider a variable that takes multiple values, where

the order matters but the scale is not meaningful.

eg. A borrower’s credit rating is on the scale of 0-4 with

0=very risky, 1=risky, 2=neutral, 3=safe, 4=very safe.

– Use separate dummies for the multiple values.

eg. Credit rating dummies: with “very risky” as the base

group, cr1 = 1 if risky, cr2 = 1 if neutral, cr3 = 1 if safe,

cr4 = 1 if very safe.

– If an ordinal variable takes too many values, group

them into a small number of categories.

eg. Law school rankings: not sensible to use a dummy for

each value. Rather, use 5 dummies to indicate if the

rank is in the top 10, 11-25, 26-40, 41-60, rest.ie_Slides07 my, School of Economics, UNSW 10


• Dummy variables for ordinal information

– Example 7.7: effects of attractiveness on wage

• BEAUTY.RAW: the attractiveness of each person in the

sample was ranked as “below average”, “average”, or

“above average”.

log(wagem) = −0.164 belavg + 0.016 abvavg + ...

(0.046) (0.033)

n = 700, adj.R2 = 0.403

• M: insignificant abvavg, but 16.4% less wage when belavg.

log(wagef) = −0.124 belavg + 0.035 abvavg + ...

(0.066) (0.049)

n = 409, adj.R2 = 0.330

• F: insignificant abvavg, but 12.4% less wage when belavg.


t-stat= −.164/.064= −2.56

t-stat= −.124/.066= −1.88


• Interactions among dummy variables

– Dummy variables can be interacted in the same way

as quantitative variables.

eg. (Wage model again) The base group is singmale :

wage = β0 + δ1 singfemale + δ2 marrfemale + δ3 marrmale

+ β1 educ + ... + u.

The dummies here are the interactions of two dummy

variables: female and married.



• Interactions among dummy variables

– Example 7.9 (Krueger,1993): the effect of computer

use on wage

log(wage) = .177 cwork + .070 chome + .017 cwork∙chome + ...

(.009) (.019) (.023)

• base group = those who do no use computers;

• 17.7% wage differential (over base group) for using a

computer at work (but not at home);

• 7% wage differential (over base group) for using a computer

at home (but not at work);

• 26.4% wage differential (over base group) for using a

computer at both home and work (exp(.177+.070+.017)-1 = .302);

• Interaction term is insignificant (t-stat = .017/.023 <1), meaning

that the effect of cwork is not linearly related to chome.


other

factors


• Allowing for different slopes

– Interacting a dummy with a quantitative variable

allows for different slope parameters.

eg. log(wage) = (β0 + δ0 female) + (β1 + δ1 female) educ + u

• female = 0: intercept and

slope are β0 and β1.

• female = 1: intercept and

slope are (β0+δ0)

and (β1 + δ1).

• Differences in intercept

and slope are measured

by δ0 and δ1 respectively.



• Allowing for different slopes

– Interacting a dummy with a quantitative variable.

• (continued) To estimate, we use OLS for

log(wage) = β0 + δ0 female + β1 educ + δ1 female∙educ + u

where δ1 is the effect of the interaction of female and educ.

• A number of hypotheses of interest can be tested in this model.

a) the return to education is the same for men and women

(H0: δ1 = 0).

b) expected wages are the same for men and women

who have the same level of education

(H0: δ1 = 0 and δ0 = 0).



• Testing for differences in regression functions

across groups

– Different groups may have different coefficients. For

two groups (say f and m), the unrestricted model

involves 2(k+1) coefficients.

yf = βf,0 + βf,1 x1 + ... + βf,k xk + uf , and

ym = βm,0 + βm,1 x1 + ... + βm,k xk + um . (ur)

– Under the null hypothesis that there is no difference in

β coefficients across groups,

H0: βf,j = βm,j, j = 0, 1, ..., k,

the restricted model involves k+1 coefficientsis

y = β0 + β1 x1 + ... + βk xk + u . (r)



• Testing for differences in regression functions

– Chow test

• Estimate the (ur) separately to obtain SSRf , SSRm and

SSRur = SSRf + SSRm .

• Estimate (r) to obtain SSRr.

• Test statistic is the F-stat

F = [(SSRr -SSRur)/(k+1)]/[SSRur/(n-2(k+1))],

which follows the Fk+1,n-2(k+1) distribution under the null.

• Reject H0 when F > c (the Fk+1,n-2(k+1) critical value).

– Chow test is for the “all-same” null. What about

“H0: βf,j = βm,j, j = 1, ..., k” for instance?

• With properly defined the restricted and unrestricted

models, the usual F-test procedure readily applies.



• Binary response and linear probability model

– Consider the case where the dependent variable

(response) is binary: y = 0 or 1.

eg. y represents whether or not: a person had a hospital

insurance last year; a person was employed last week;

a household purchased a car last year; etc.

– When the response (y) is influenced by a number of

explanatory variables (x’s), we may write

y = β0 + β1 x1 + ... + βk xk + u .

– But how do we interpret the β coefficients?

Because y takes on only two values, we must be

careful in interpreting βj.




– Notice that for binary response

P(y=1|x) = E(y|x) = β0 + β1 x1 + ... + βk xk ,

when ZCM assumption holds. The PRF is the

probability of “success” for given x’s.

– P(y=1|x) is known as the response probability, and

the regression model with a binary dependent

variable is called the linear probability model (LPM).

– The parameter βj is interpreted as the change in the

probability of success caused by a one-unit

increase in xj :

ΔP(y=1|x) = βj Δxj .


E(y|x) = P(y=1|x)1 + P(y=0|x)0



– The SRF

is the estimate of PRF

P(y=1|x) = E(y|x) = β0 + β1 x1 + ... + βk xk .

– The interpretation of the predicted value is the

predicted probability of success.

eg. Married women labour force participation

inlf = .586 + .038 educ − .262 kidslt6 + ....

(.154) (.007) (.034)

where inlf = 1 when the woman is in the labour force.

1 more year of education, holding everything else fixed,

increases the probability of in-labour-force by 0.038.


kkxxy ˆˆˆˆ 110

# kids less than 6 years old



– Shortcomings of LPM

• The predicted probability can be either less than 0 or

greater than 1. (eg. the in-labour-force probability for

those with kidslt6 ≥ 4 is predicted to be negative.)

– Linear function is not suitable for modelling probabilities.

– Logit model: P(y=1|x) = {1+exp[-(β0 + β1 x1 + ... + βk xk)]}-1

– Probit model: P(y=1|x) = Φ(β0 + β1 x1 + ... + βk xk)

• For LPM, it can easily be shown that

Var(u|x) = Var(y|x) = P(y=1|x)[1 − P(y=1|x)],

ie, MLR5 does not hold as the conditional variance

depends on x’s (heteroskedasticity). It does not cause

estimation bias but does invalidate the standard errors.



• Summary

– Dummy variables are useful to measure the ceteris

paribus differences among different groups in the

sample.

– Dummies are also useful to incorporate ordinal

information.

– Dummies can be interacted with other variables to

allow for different slopes for different groups, and

tests for various hypotheses of interest.

– Chow test and F-test are used to test for differences

in the regressions across groups.

– Binary response leads to the LPM, where the fitted

values are interpreted as probabilities of success.


Download - Ie Slide07

Top Related