econometrics final exam 2012


University of Cape Town

School of Economics

Eco4016F

Honours Econometrics

Midterm Examination

April 2012

Time: 3 hours Marks: 100

Instructions:

• The examination consists of 4 questions.
• Answer all 4 questions.
• Full marks are only awarded for questions where all working is shown.
• Provide enough mathematical detail (assumptions, calculations etc.) so that the logical progression of your answer is clear. If you get stuck on the algebra, part marks will be awarded if you can demonstrate that you know how to approach the problem.
• Only non-programmable scientific calculators are allowed.
• Total number of pages (including cover page): 13



1. Consider the simple regression model $y = \beta_0 + \beta_1 x^* + u$, where we have m measures on $x^*$. Write these as $z_h = x^* + e_h$ for all $h = 1, \ldots, m$. Make the following assumptions:

• $\operatorname{Cov}(x^*, u) = 0$ (i.e., $x^*$ would be exogenous if it could be observed)
• $\operatorname{Cov}(x^*, e_h) = 0$ (i.e., the CEV assumption holds)
• the errors are pairwise uncorrelated.
• $\operatorname{Var}(e_1) = \operatorname{Var}(e_2) = \cdots = \operatorname{Var}(e_m) = \sigma^2_e$.

Let $w = (z_1 + \ldots + z_m)/m$ be the average of the measures on $x^*$, so that for each observation i, $w_i = (z_{i1} + \ldots + z_{im})/m$ is the average of the m measures. Let $\bar{\beta}_1$ be the OLS estimator from the simple regression of $y_i$ on 1, $w_i$, for $i = 1, \ldots, n$, using a random sample of data.

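(a) Show that

$$\operatorname{plim}(\bar{\beta}_1) = \beta_1 \left\{ \frac{\sigma^2_{x^*}}{\sigma^2_{x^*} + (\sigma^2_e/m)} \right\}$$

Hint: the plim of $\bar{\beta}_1$ is $\operatorname{Cov}(w, y)/\operatorname{Var}(w)$.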

(b) With reference to your answer to question 1(a), explain why using the average of all m measures of $x^*$ is better than using any single measure $z_h$.

(20)
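As a quick numerical illustration of the expression in 1(a), the ratio plim(β̄1)/β1 can be tabulated for a few values of m. This is only a sketch: the function name `attenuation_factor` and the variance values 1.0 and 0.5 are illustrative assumptions, not numbers from the exam.

```python
def attenuation_factor(var_x: float, var_e: float, m: int) -> float:
    """Ratio plim(beta1_bar)/beta1 = var_x / (var_x + var_e/m) from question 1(a)."""
    return var_x / (var_x + var_e / m)

# Hypothetical variances, chosen only for illustration.
var_x, var_e = 1.0, 0.5

# m = 1 corresponds to using a single measure z_h instead of the average w.
factors = {m: attenuation_factor(var_x, var_e, m) for m in (1, 2, 5, 50)}
for m, f in factors.items():
    print(f"m = {m:2d}: plim(beta1_bar)/beta1 = {f:.4f}")
```

Under these assumed variances the ratio rises towards 1 as m grows, which is the pattern part (b) asks you to explain.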

2. Consider a simple regression model of the return to schooling

$$\ln(\text{wage}) = \beta_0 + \beta_1\,\text{education} + u.$$

(a) Suppose that you know the birth dates of the individuals in your sample. Suppose you also knew that children have to stay in school till the age of 16, and cannot begin school till the age of 7. Explain how you might construct a plausible binary instrumental variable with this information.

(b) Now consider the following regression output, where y = log(wage), x = education, and z is your binary instrument.

. su y x z

Variable | Obs Mean Std. Dev. Min Max

-------------+--------------------------------------------------------

y | 3010 6.261832 .4437976 4.60517 7.784889

x | 3010 13.26346 2.676913 1 18

z | 3010 .6820598 .4657535 0 1

. ivregress 2sls y (x = z)


Instrumental variables (2SLS) regression Number of obs = 3010

Wald chi2(1) = 51.20

Prob > chi2 = 0.0000

R-squared = .

Root MSE = .55667

------------------------------------------------------------------------------

y | Coef. Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

x | .1880626 .0262826 7.16 0.000 .1365497 .2395756

_cons | 3.767472 .3487458 10.80 0.000 3.083942 4.451001

------------------------------------------------------------------------------

Instrumented: x

Instruments: z

. bysort z: su y x

------------------------------------------------------------------------

-> z = 0

Variable | Obs Mean Std. Dev. Min Max

-------------+--------------------------------------------------------

y | 957 6.155494 .4328417 4.60517 7.474772

x | 957 12.69801 2.791523 1 18

------------------------------------------------------------------------

-> z = 1

Variable | Obs Mean Std. Dev. Min Max

-------------+--------------------------------------------------------

y | 2053 6.311401 .4402214 4.60517 7.784889

x | 2053 13.52703 2.580455 2 18

(i) Verify that the IV estimate $\hat{\beta}_1 = 0.1880626$ can be written as

$$\hat{\beta}_1 = \frac{\bar{y}_1 - \bar{y}_0}{\bar{x}_1 - \bar{x}_0},$$

where $\bar{y}_0$ and $\bar{x}_0$ are the sample averages of y and x over the part of the sample with z = 0, and $\bar{y}_1$ and $\bar{x}_1$ are the sample averages of y and x over the part of the sample with z = 1.
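The arithmetic for part (i) can be checked directly from the group means printed by `bysort z: su y x`. The snippet below is a minimal sketch using only those four printed means; the variable names are illustrative, and small differences from the 2SLS coefficient are expected because the means are rounded in the output.

```python
# Group means copied from the `bysort z: su y x` output.
ybar0, xbar0 = 6.155494, 12.69801   # subsample with z = 0
ybar1, xbar1 = 6.311401, 13.52703   # subsample with z = 1

# Ratio of the difference in mean y to the difference in mean x.
wald = (ybar1 - ybar0) / (xbar1 - xbar0)
print(f"Ratio of mean differences: {wald:.7f}")
print("2SLS coefficient on x:      0.1880626")
```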

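(ii) Now prove this result mathematically; i.e., show that

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (z_i - \bar{z})(y_i - \bar{y})}{\sum_{i=1}^{n} (z_i - \bar{z})(x_i - \bar{x})} = \frac{\bar{y}_1 - \bar{y}_0}{\bar{x}_1 - \bar{x}_0}$$

Hint: you might find the following results useful:

$$\sum_{i=1}^{n} (y_i - \bar{y}) = 0; \quad \bar{z} = \sum_{i=1}^{n} z_i / n; \quad n\bar{z} = \sum_{i=1}^{n} z_i$$

$$n = n_0 + n_1; \quad \bar{y} = \frac{n_0}{n}\,\bar{y}_0 + \frac{n_1}{n}\,\bar{y}_1$$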

(30)
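Before attempting the proof in 2(b)(ii), it can help to see the identity hold on a small data set. The sketch below uses five made-up observations (the numbers in `z`, `x` and `y` are hypothetical, not exam data) and compares the covariance-ratio form of the IV estimator with the ratio of group-mean differences. It is a numerical check, not a substitute for the algebra.

```python
def iv_slope(z, x, y):
    """sum (z_i - zbar)(y_i - ybar) / sum (z_i - zbar)(x_i - xbar)."""
    n = len(z)
    zbar, xbar, ybar = sum(z) / n, sum(x) / n, sum(y) / n
    num = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y))
    den = sum((zi - zbar) * (xi - xbar) for zi, xi in zip(z, x))
    return num / den

def group_mean(values, z, group):
    """Average of `values` over observations whose instrument equals `group`."""
    selected = [v for v, zi in zip(values, z) if zi == group]
    return sum(selected) / len(selected)

def wald_slope(z, x, y):
    """(ybar_1 - ybar_0) / (xbar_1 - xbar_0) for a binary instrument z."""
    dy = group_mean(y, z, 1) - group_mean(y, z, 0)
    dx = group_mean(x, z, 1) - group_mean(x, z, 0)
    return dy / dx

# Hypothetical toy data with a binary instrument.
z = [0, 0, 1, 1, 1]
x = [10.0, 12.0, 13.0, 15.0, 16.0]
y = [5.0, 5.5, 6.0, 6.4, 6.9]

b_iv, b_wald = iv_slope(z, x, y), wald_slope(z, x, y)
print(b_iv, b_wald)
```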


3. You and Ms. Analyst work for the Treasury of the Government of South Africa. Ms. Analyst has been tasked by her boss, Mr. Bigshot, to analyze data from a skills training experiment targeted to a sample of young people. Ms. Analyst has invited you (an intern) to shadow her during the project. Participants were randomly assigned to a treatment group and a control group. The control group did not get any training. Those assigned to the treatment group did receive training, and could enter the programme from 1 January 2010. Some people in the treatment group only entered in mid-2011. The programme ended on 31 December 2011. Ms. Analyst has been asked to investigate whether participation in the experiment had any effect on the participant’s unemployment probability in 2012. The variables she has in her dataset are as follows:

storage display value

variable name type format label variable label

----------------------------------------------------------------------------------------------

train byte %9.0g =1 if assigned to treatment group

age byte %9.0g age in 2011

educ byte %9.0g years of education

black byte %9.0g =1 if black

coloured byte %9.0g = 1 if Coloured

married byte %9.0g =1 if married

nodegree byte %9.0g no tertiary qualification

mosinex byte %9.0g months prior to 1/2012 in experiment

unem08 byte %9.0g =1 if unemployed all of 2008



lwage08 float %9.0g Log of real wage in 2008; zero if wage is 0



agesq int %9.0g age^2

mostrn byte %9.0g months in training

---------------------------------------------------------------------------------------------

The descriptive statistics for these variables are as follows:

. su

Variable | Obs Mean Std. Dev. Min Max

-------------+--------------------------------------------------------

train | 445 .4157303 .4934022 0 1

age | 445 25.37079 7.100282 17 55

educ | 445 10.19551 1.792119 3 16

black | 445 .8337079 .3727617 0 1

coloured | 445 .0876404 .2830895 0 1

-------------+--------------------------------------------------------

married | 445 .1685393 .3747658 0 1

nodegree | 445 .7820225 .4133367 0 1

mosinex | 445 18.1236 5.311937 5 24

unem08 | 445 .7325843 .4431092 0 1

unem09 | 445 .6494382 .4776829 0 1

-------------+--------------------------------------------------------

unem12 | 445 .3078652 .46213 0 1

lwage08 | 445 .4198245 .8862537 -.809299 3.678089

lwage09 | 445 .2771078 .7967834 -2.599059 3.224548

lwage12 | 445 1.135802 1.136259 -3.106541 4.099463

agesq | 445 693.9775 429.7818 289 3025

-------------+--------------------------------------------------------

mostrn | 445 7.68764 9.656205 0 24


For this question, note that when G specializes to the logistic distribution, we have

$$G(x\beta) = \Lambda(x\beta) = \frac{1}{1 + e^{-x\beta}} = \frac{e^{x\beta}}{1 + e^{x\beta}}$$

The associated density function for the logistic CDF is

$$g(x\beta) \equiv \Lambda'(x\beta) = \frac{e^{x\beta}}{(1 + e^{x\beta})^2}$$

(a) Ms. Analyst starts by asking some basic questions: how many young people participated in the job training programme? Is there any reason to suspect, just by looking at the descriptive statistics, that the programme had an effect on unemployment? Help her answer these questions.
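For the head-count part of 3(a), the mean of a 0/1 variable is the share of ones, so the number assigned to the treatment group follows from the `su` output. The sketch below assumes that "participated" is read as "assigned to the treatment group" (train = 1); that reading is an assumption, and the variable names are illustrative.

```python
n_obs = 445
share_treated = 0.4157303          # mean of train from the `su` output

# Mean of a dummy times the sample size gives the count of ones.
n_treated = round(share_treated * n_obs)
n_control = n_obs - n_treated
print(n_treated, n_control)
```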

(b) She then runs the regression below. What do you think she hopes to find out by running such a regression? Do you think that after looking at the results, her hopes would be dashed? Why/why not?

. global x unem08 unem09 age educ black coloured married

. logit train $x

Iteration 0: log likelihood = -302.1




Logistic regression Number of obs = 445

LR chi2(7) = 10.14

Prob > chi2 = 0.1809

Log likelihood = -297.03096 Pseudo R2 = 0.0168

------------------------------------------------------------------------------

train | Coef. Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

unem08 | .0818541 .3193627 0.26 0.798 -.5440852 .7077934

unem09 | -.3939164 .2966978 -1.33 0.184 -.9754334 .1876006

age | .0134344 .0141347 0.95 0.342 -.0142691 .0411379

educ | .0514515 .0560396 0.92 0.359 -.0583842 .1612872

black | -.3318542 .359652 -0.92 0.356 -1.036759 .3730508

coloured | -.868989 .5029685 -1.73 0.084 -1.854789 .1168112

married | .1536043 .2656568 0.58 0.563 -.3670735 .6742822

_cons | -.6917486 .788755 -0.88 0.380 -2.23768 .8541828

------------------------------------------------------------------------------

(c) She then proceeds to the main business at hand: investigating the effects of the programme on the probability of unemployment. She runs the two regressions shown in the abbreviated STATA output below. On the basis of these results, she claims that the training programme reduces the probability of being unemployed in 2012 to approximately 0.24 and that this holds whether one estimates a linear probability model or a Probit model. Is she correct? Why/why not? (Hint: start by transforming the logit coefficients so that they are comparable to probit coefficients.)


. reg unem12 train

Source | SS df MS Number of obs = 445

-------------+------------------------------ F( 1, 443) = 6.26

Model | 1.32226401 1 1.32226401 Prob > F = 0.0127

Residual | 93.5002079 443 .211061417 R-squared = 0.0139

-------------+------------------------------ Adj R-squared = 0.0117

Total | 94.8224719 444 .213564126 Root MSE = .45941

------------------------------------------------------------------------------

unem12 | Coef. Std. Err. t P>|t| [95% Conf. Interval]

-------------+----------------------------------------------------------------

train | -.1106029 .0441888 -2.50 0.013 -.1974486 -.0237572

_cons | .3538462 .0284917 12.42 0.000 .2978505 .4098418

------------------------------------------------------------------------------

. logit unem12 train


LR chi2(1) = 6.30

Prob > chi2 = 0.0120


------------------------------------------------------------------------------

unem12 | Coef. Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

train | -.5328045 .2149117 -2.48 0.013 -.9540237 -.1115854

_cons | -.6021754 .1296994 -4.64 0.000 -.8563816 -.3479692

------------------------------------------------------------------------------
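One way to compare the two outputs in 3(c) is to turn each set of coefficients into predicted unemployment probabilities for train = 0 and train = 1. The sketch below does this with the printed coefficients; the helper `logistic` and the dictionary names are illustrative. The divisor 1.6 used for the hint's rescaling is a commonly quoted rule of thumb and is an assumption here, not something stated in the exam.

```python
import math

def logistic(v: float) -> float:
    """Logistic CDF: 1 / (1 + exp(-v))."""
    return 1.0 / (1.0 + math.exp(-v))

# Coefficients copied from the `reg` and `logit` outputs above.
lpm_cons, lpm_train = 0.3538462, -0.1106029
logit_cons, logit_train = -0.6021754, -0.5328045

# Predicted Pr(unem12 = 1) by treatment status under each model.
p_lpm = {0: lpm_cons, 1: lpm_cons + lpm_train}
p_logit = {0: logistic(logit_cons), 1: logistic(logit_cons + logit_train)}

# Assumed rule of thumb: logit coefficient / 1.6 is roughly on the probit scale.
approx_probit_train = logit_train / 1.6

for g in (0, 1):
    print(f"train = {g}: LPM {p_lpm[g]:.4f}, logit {p_logit[g]:.4f}")
print(f"logit coefficient on probit scale (approx.): {approx_probit_train:.3f}")
```

With a single binary regressor both models reproduce the two group means, so the predicted probabilities agree up to rounding of the printed coefficients.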

(d) She then runs the following Logit model where she controls for other factors. Verify that the marginal effect of educ = −.0003291 that STATA computes is indeed correct.

. logit unem12 train $x


LR chi2(8) = 22.63

Prob > chi2 = 0.0039


------------------------------------------------------------------------------

unem12 | Coef. Std. Err. z P>|z| [95% Conf. Interval]

-------------+----------------------------------------------------------------

train | -.5531597 .2198136 -2.52 0.012 -.9839865 -.122333

unem08 | .1958456 .3522676 0.56 0.578 -.4945862 .8862775

unem09 | .0826864 .3245389 0.25 0.799 -.5533982 .718771

age | .0003619 .0151608 0.02 0.981 -.0293527 .0300766

educ | -.0015829 .0605203 -0.03 0.979 -.1202005 .1170346

black | 1.102427 .5009618 2.20 0.028 .1205603 2.084294

coloured | -.2436418 .6937825 -0.35 0.725 -1.60343 1.116147

married | -.1358157 .29638 -0.46 0.647 -.7167097 .4450783

_cons | -1.707333 .9105037 -1.88 0.061 -3.491887 .0772213

------------------------------------------------------------------------------


. predict xbhat3, index

. su xbhat3

Variable | Obs Mean Std. Dev. Min Max

-------------+--------------------------------------------------------

xbhat3 | 445 -.8722221 .5529723 -2.650983 -.3112156

. mfx

Marginal effects after logit

y = Pr(unem12) (predict)

= .29479214

------------------------------------------------------------------------------

variable | dy/dx Std. Err. z P>|z| [ 95% C.I. ] X

---------+--------------------------------------------------------------------

train*| -.112445 .04331 -2.60 0.009 -.197337 -.027552 .41573

unem08*| .0399288 .07035 0.57 0.570 -.097947 .177805 .732584

unem09*| .017101 .06676 0.26 0.798 -.113751 .147953 .649438

age | .0000752 .00315 0.02 0.981 -.006102 .006253 25.3708

educ | -.0003291 .01258 -0.03 0.979 -.024988 .02433 10.1955

black*| .191368 .06889 2.78 0.005 .056337 .326399 .833708

coloured*| -.0484808 .13163 -0.37 0.713 -.306464 .209503 .08764

married*| -.0277014 .05925 -0.47 0.640 -.143828 .088425 .168539

------------------------------------------------------------------------------

(*) dy/dx is for discrete change of dummy variable from 0 to 1
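A sketch of the check asked for in 3(d): for a continuous regressor the logit marginal effect is the coefficient times the logistic density, and the logistic density satisfies g = Λ(1 − Λ). The snippet assumes, as the mfx header suggests, that the effect is evaluated at the point where the predicted probability is .29479214; the variable names are illustrative.

```python
beta_educ = -0.0015829            # logit coefficient on educ
beta_age = 0.0003619              # logit coefficient on age
p_at_means = 0.29479214           # "y = Pr(unem12) (predict)" in the mfx header

# Logistic density written in terms of the CDF: g = Lambda * (1 - Lambda).
density = p_at_means * (1.0 - p_at_means)

me_educ = beta_educ * density
me_age = beta_age * density
print(f"dy/dx for educ: {me_educ:.7f}  (mfx reports -.0003291)")
print(f"dy/dx for age:  {me_age:.7f}  (mfx reports .0000752)")
```

The same product does not apply to the starred dummy variables, for which mfx reports a discrete change from 0 to 1.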

(e) How should Ms. Analyst interpret the coefficient on unem08?

(f) On the basis of this evidence, Ms. Analyst believes that “skills training” could be the magic bullet in combating unemployment in South Africa, and that the take-home message of the experiment is to offer the training to all young people in South Africa (i.e., to scale up the training programme). Mr. Bigshot, however, is not convinced because he believes that the programme has a stronger benefit to people who don’t have a long history of being unemployed. Is he correct? Does it follow that the programme should not be scaled up? Why/why not?

(20)

4. Consider the following labour supply model

$$\text{whrs} = \beta_0 + \beta_1\,\text{kl6} + \beta_2\,\text{k618} + \beta_3\,\text{nwifeinc} + \beta_4\,\text{wa} + \beta_5\,\text{wa}^2 + \beta_6\,\text{we} + \beta_7\,\text{we}^2 + \beta_8(\text{wa} \times \text{we}) + \beta_9\,\text{lww2} + u$$

The STATA output given below shows the variable definitions, as well as the regression output where the given labour supply model has been estimated using the Tobit approach. Study the output and then answer the questions that follow.


---------------------------------------------------------------------------------------

storage display

variable name type format variable label

---------------------------------------------------------------------------------------

lfp float %9.0g A dummy variable = 1 if woman worked in 2011, else 0

whrs float %9.0g Number of hours the woman worked in 2011

kl6 float %9.0g Number of children less than 6 years old in household

k618 float %9.0g Number of children between ages 6 and 18 in household

wa float %9.0g Woman’s age

we float %9.0g Woman’s educational attainment, in years

lww float %9.0g Log of woman’s hourly earnings (defined only for lfp = 1)

lww2 float %9.0g Log of woman’s hourly earnings (imputed when lfp = 0)

ax float %9.0g Actual years of woman’s previous labor market experience

prin float %9.0g Woman’s Property Income in rands

nwifeinc float %9.0g Prin/1000

we2 float %9.0g Square of Education

wa2 float %9.0g Square of Age

wawe float %9.0g Age times Education

-----------------------------------------------------------------------------------------

. tobit whrs $W $C $I lww2, ll(0) robust

Tobit regression Number of obs = 753

F( 6, 747) = 20.78

Prob > F = 0.0000

Log pseudolikelihood = -3891.0413 Pseudo R2 = 0.0161

------------------------------------------------------------------------------

| Robust

whrs | Coef. Std. Err. t P>|t| [95% Conf. Interval]

-------------+----------------------------------------------------------------

wa | -36.49461 7.613938 -4.79 0.000 -51.44187 -21.54735

we | 104.5203 26.88408 3.89 0.000 51.74297 157.2977

kl6 | -1044.926 133.85 -7.81 0.000 -1307.693 -782.1587

k618 | -100.1299 41.95978 -2.39 0.017 -182.503 -17.75679

nwifeinc | -22.20082 5.068873 -4.38 0.000 -32.15175 -12.24988

lww2 | 202.4707 113.6298 1.78 0.075 -20.60111 425.5425

_cons | 1172.027 474.3755 2.47 0.014 240.7594 2103.295

-------------+----------------------------------------------------------------

/sigma | 1258.636 48.07551 1164.257 1353.015

------------------------------------------------------------------------------

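Obs. summary: 325 left-censored observations at whrs<=0

428 uncensored observations

0 right-censored observations

. mfx compute, predict(ystar(0,.))

Marginal effects after tobit

y = E(whrs*|whrs>0) (predict, ystar(0,.))

= 663.35746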

------------------------------------------------------------------------------

variable | dy/dx Std. Err. z P>|z| [ 95% C.I. ] X

---------+--------------------------------------------------------------------

wa | -21.62883 4.47256 -4.84 0.000 -30.3949 -12.8628 42.5378

we | 61.94481 15.632 3.96 0.000 31.3063 92.5833 12.2869

kl6 | -619.2837 78.297 -7.91 0.000 -772.744 -465.824 .237716

k618 | -59.3428 24.951 -2.38 0.017 -108.245 -10.4402 1.35325

nwifeinc | -13.15749 2.96863 -4.43 0.000 -18.9759 -7.33908 20.129

lww2 | 119.9959 67.414 1.78 0.075 -12.1337 252.125 1.09613

------------------------------------------------------------------------------



(a) Verify that the marginal effect

$$\frac{\partial E(\text{whrs} \mid x)}{\partial\, \text{lww2}} = 119.9959$$

that STATA computes is indeed correct. Is this partial effect of any economic importance? Why/why not?
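One route to the verification in 4(a) uses the standard Tobit result that the partial effect on E(whrs|x) is the coefficient times Φ(xβ/σ). The sketch below evaluates the index at the covariate means from the X column of the marginal-effects table, which is an assumption about how the reported figure was computed. The dictionary names are illustrative, and the inputs are rounded, so agreement is only approximate.

```python
from statistics import NormalDist

# Tobit coefficients and /sigma from the regression output.
coef = {"wa": -36.49461, "we": 104.5203, "kl6": -1044.926,
        "k618": -100.1299, "nwifeinc": -22.20082, "lww2": 202.4707}
cons, sigma = 1172.027, 1258.636

# Covariate means from the X column of the marginal-effects table.
means = {"wa": 42.5378, "we": 12.2869, "kl6": 0.237716,
         "k618": 1.35325, "nwifeinc": 20.129, "lww2": 1.09613}

std_normal = NormalDist()
xb = cons + sum(coef[k] * means[k] for k in coef)   # index x*beta at the means
z = xb / sigma
Phi = std_normal.cdf(z)

me_lww2 = coef["lww2"] * Phi                  # dE(whrs|x)/dlww2 = beta * Phi(z)
e_whrs = Phi * xb + sigma * std_normal.pdf(z) # E(whrs|x) at the means

print(f"x*beta = {xb:.3f}, z = {z:.4f}, Phi(z) = {Phi:.4f}")
print(f"marginal effect of lww2: {me_lww2:.4f}  (reported 119.9959)")
print(f"E(whrs|x): {e_whrs:.3f}  (reported 663.35746)")
```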

(b) Study carefully the marginal effects for the conditional mean function E(whrs|whrs > 0, x) given in the STATA output below. Then answer the questions that follow.

. mfx compute, predict(e(0,.))

Marginal effects after tobit

y = E(whrs|whrs>0) (predict, e(0,.))

= 1119.292

------------------------------------------------------------------------------

variable | dy/dx Std. Err. z P>|z| [ 95% C.I. ] X

---------+--------------------------------------------------------------------

wa | -15.24023 3.15081 -4.84 0.000 -21.4157 -9.06477 42.5378

we | 43.64793 11.029 3.96 0.000 22.0316 65.2643 12.2869

kl6 | -436.3634 55.076 -7.92 0.000 -544.31 -328.417 .237716

k618 | -41.81448 17.57 -2.38 0.017 -76.2506 -7.37833 1.35325

nwifeinc | -9.271113 2.0924 -4.43 0.000 -13.3721 -5.17009 20.129

lww2 | 84.55224 47.489 1.78 0.075 -8.52464 177.629 1.09613

------------------------------------------------------------------------------

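(i) Letting $z \equiv x\beta/\sigma$, show that

$$\frac{\partial E(\text{whrs} \mid \text{whrs} > 0, x)}{\partial\, \text{lww}} = 202.4707 \left\{ 1 - z\,\frac{\phi(z)}{\Phi(z)} - \frac{\phi(z)^2}{\Phi(z)^2} \right\}$$

In answering this question, the following results might come in handy:

$$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}; \quad E(\text{whrs} \mid \text{whrs} > 0, x) = x\beta + \sigma\,\frac{\phi(z)}{\Phi(z)}$$

$$\frac{\partial \phi}{\partial z} = -z\phi(z); \quad \frac{\partial \Phi}{\partial z} = \phi(z)$$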

(ii) Verify that the marginal effect

$$\frac{\partial E(\text{whrs} \mid \text{whrs} > 0, x)}{\partial\, \text{lww2}} = 84.55224$$

that STATA computes is indeed correct.
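The expression in 4(b)(i) can be evaluated numerically once z is known. The sketch below backs z out of the two printed numbers for lww2, using the standard Tobit relation that the unconditional effect reported in 4(a) equals the coefficient times Φ(z). Treating that relation as the source of the reported 119.9959 is an assumption, and the variable names are illustrative.

```python
from statistics import NormalDist

beta_lww2 = 202.4707      # Tobit coefficient on lww2
me_uncond = 119.9959      # reported dE(whrs|x)/dlww2 from question 4(a)

std_normal = NormalDist()
Phi = me_uncond / beta_lww2          # implied Phi(z), assuming me_uncond = beta * Phi(z)
z = std_normal.inv_cdf(Phi)          # implied z = x*beta / sigma
lam = std_normal.pdf(z) / Phi        # phi(z) / Phi(z)

# Formula from 4(b)(i): beta * {1 - z*phi/Phi - (phi/Phi)^2}.
me_cond = beta_lww2 * (1.0 - z * lam - lam ** 2)
print(f"z = {z:.4f}, phi/Phi = {lam:.4f}")
print(f"conditional marginal effect: {me_cond:.4f}  (reported 84.55224)")
```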

(iii) Do the signs on lww and nwifeinc make economic sense? Explain.

(iv) With reference to your answer to questions 4(a) and 4(b)(ii), what do you make of the fact that the marginal effect for the sample of working women is smaller than for the combined sample of working and non-working women?


(c) Calculate the elasticity of labour supply with respect to the hourly wage rate, for women that choose whrs > 0. Hint: the log wage elasticity formula when dealing with the conditional mean function E(whrs|whrs > 0, x) is

$$\frac{\partial E(\text{whrs} \mid \text{whrs} > 0, x)}{\partial\, \text{lww2}} \times \frac{\text{lww2}}{E(\text{whrs} \mid \text{whrs} > 0, x)}$$

Note also that the header to the marginal effects reported by STATA in question 4(b) tells you that E(whrs|whrs > 0, x) = 1119.292.

(d) Calculate the elasticity of labour supply with respect to the wife’s property income variable nwifeinc.

(e) In what follows, wherever the word “income” is used, take it to mean income as defined in this data set; i.e., “property income”. You may interpret “property income” as any income earned from the sale of property that would carry a tax obligation known as a capital gains tax.

A policy maker is contemplating dropping the capital gains tax rate. He asks his research office to estimate the income elasticity of labour supply. The staff at the research office have the same dataset you have been working with in this question. They produce the following STATA output.

. su lww2 nwifeinc whrs if lfp == 1

Variable | Obs Mean Std. Dev. Min Max

-------------+--------------------------------------------------------

lww2 | 428 1.190173 .7231978 -2.054164 3.218876

nwifeinc | 428 18.93748 10.59135 -.0290575 91

whrs | 428 1302.93 776.2744 12 4950

. reg whrs $W $C $I lww2, robust

Linear regression Number of obs = 753

F( 6, 746) = 19.77

Prob > F = 0.0000

R-squared = 0.1226

Root MSE = 819.43

------------------------------------------------------------------------------

| Robust

whrs | Coef. Std. Err. t P>|t| [95% Conf. Interval]

-------------+----------------------------------------------------------------

wa | -20.3053 4.687209 -4.33 0.000 -29.50699 -11.10361

we | 43.42601 17.05139 2.55 0.011 9.951603 76.90043

kl6 | -497.8102 62.25698 -8.00 0.000 -620.0299 -375.5905

k618 | -79.02626 23.68683 -3.34 0.001 -125.527 -32.52548

nwifeinc | -10.47326 2.383235 -4.39 0.000 -15.15191 -5.79462

lww2 | 112.8796 90.96162 1.24 0.215 -65.69165 291.4508

_cons | 1383.116 293.9374 4.71 0.000 806.0734 1960.159

------------------------------------------------------------------------------

(i) On the basis of these results, the staff at the research office argue to the policy maker that at higher levels of work hours, labour supply is much less income elastic. Are they correct?


(ii) The policy maker, accepting the finding of question 4(e)(i), remarks that this implies that a 1% increase in income is much more likely to lead to a negative “employment effect” than a negative “earnings effect” (i.e., the policy shift is more likely to induce a worker to switch from positive to 0 work hours than it is likely to induce a worker to work fewer hours). What is the underlying economic rationale behind this remark?

(iii) Based on the findings of questions 4(e)(i) and 4(e)(ii) above, the policy maker proposes to the Treasury that the proposed capital gains tax cut should not be implemented because the resulting disincentive to enter into employment would exceed the resulting disincentive to work longer hours. Assess this claim in the context of the Tobit model. Hint: there are two issues to consider here: (i) is the negative income elasticity large enough to warrant concern when the Tobit model is used, and (ii) is it necessarily the case that the employment effect claimed in question 4(e)(ii) follows?

(30)
