econometrics final exam 2012
8/19/2019 Econometrics Final Exam 2012
University of Cape Town
School of Economics
Eco4016F
Honours Econometrics
Midterm Examination
April 2012
Time: 3 hours Marks: 100
Instructions:
• The examination consists of 4 questions.
• Answer all 4 questions.
• Full marks are only awarded for questions where all working is shown.
• Provide enough mathematical detail (assumptions, calculations etc.) so that the logical progression of your answer is clear. If you get stuck on the algebra, part marks will be awarded if you can demonstrate that you know how to approach the problem.
• Only non-programmable scientific calculators are allowed.
• Total number of pages (including cover page): 13
1. Consider the simple regression model, $y = \beta_0 + \beta_1 x^* + u$, where we have $m$ measures on $x^*$. Write these as $z_h = x^* + e_h$ for all $h = 1, \ldots, m$. Make the following assumptions:
• $\mathrm{Cov}(x^*, u) = 0$ (i.e., $x^*$ would be exogenous if it could be observed)
• $\mathrm{Cov}(x^*, e_h) = 0$ (i.e., the CEV assumption holds)
• the errors are pairwise uncorrelated.
• $\mathrm{Var}(e_1) = \mathrm{Var}(e_2) = \ldots = \mathrm{Var}(e_m) = \sigma^2_e$.
Let $w = (z_1 + \ldots + z_m)/m$ be the average of the measures on $x^*$, so that for each observation $i$, $w_i = (z_{i1} + \ldots + z_{im})/m$ is the average of the $m$ measures. Let $\bar\beta_1$ be the OLS estimator from the simple regression $y_i$ on $1, w_i$, for $i = 1, \ldots, n$ using a random sample of data.
(a) Show that
$$\mathrm{plim}(\bar\beta_1) = \beta_1 \left[ \frac{\sigma^2_{x^*}}{\sigma^2_{x^*} + (\sigma^2_e/m)} \right]$$
Hint: the plim of $\bar\beta_1$ is $\mathrm{Cov}(w, y)/\mathrm{Var}(w)$.
(b) With reference to your answer to question 1a, explain why using the average of all $m$ measures of $x^*$ is better than using any single measure $z_h$.
(20)
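A sketch of how the hint in part (a) might be unpacked. It assumes that "the errors" in the third assumption include $u$ as well as $e_1, \ldots, e_m$, which the question does not state explicitly. The symbol $\bar e = (e_1 + \ldots + e_m)/m$ is introduced here for illustration, so that $w = x^* + \bar e$:

```latex
\begin{align*}
\mathrm{Cov}(w, y) &= \mathrm{Cov}(x^* + \bar e,\; \beta_0 + \beta_1 x^* + u)
                    = \beta_1 \sigma^2_{x^*}, \\
\mathrm{Var}(w)    &= \mathrm{Var}(x^*) + \mathrm{Var}(\bar e)
                    = \sigma^2_{x^*} + \frac{1}{m^2} \sum_{h=1}^{m} \sigma^2_e
                    = \sigma^2_{x^*} + \frac{\sigma^2_e}{m}.
\end{align*}
```

The ratio of the two lines has the form required in part (a). The cross terms drop out only under the covariance assumptions listed in the question.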
2. Consider a simple regression model of the return to schooling
$\ln(\text{wage}) = \beta_0 + \beta_1\,\text{education} + u.$
(a) Suppose that you know the birth dates of the individuals in your sample. Suppose you also knew that children have to stay in school till the age of 16, and cannot begin school till the age of 7. Explain how you might construct a plausible binary instrumental variable with this information.
(b) Now consider the following regression output, where y = log(wage), x = education, and z is your binary instrument.
. su y x z
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
y | 3010 6.261832 .4437976 4.60517 7.784889
x | 3010 13.26346 2.676913 1 18
z | 3010 .6820598 .4657535 0 1
. ivregress 2sls y (x = z)
Instrumental variables (2SLS) regression Number of obs = 3010
Wald chi2(1) = 51.20
Prob > chi2 = 0.0000
R-squared = .
Root MSE = .55667
------------------------------------------------------------------------------
y | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
x | .1880626 .0262826 7.16 0.000 .1365497 .2395756
_cons | 3.767472 .3487458 10.80 0.000 3.083942 4.451001
------------------------------------------------------------------------------
Instrumented: x
Instruments: z
. bysort z: su y x
------------------------------------------------------------------------
-> z = 0
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
y | 957 6.155494 .4328417 4.60517 7.474772
x | 957 12.69801 2.791523 1 18
------------------------------------------------------------------------
-> z = 1
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
y | 2053 6.311401 .4402214 4.60517 7.784889
x | 2053 13.52703 2.580455 2 18
(i) Verify that the IV estimate $\hat\beta_1 = 0.1880626$ can be written as
$$\hat\beta_1 = \frac{\bar y_1 - \bar y_0}{\bar x_1 - \bar x_0},$$
where $\bar y_0$ and $\bar x_0$ are the sample averages of y and x over the part of the sample with z = 0, and $\bar y_1$ and $\bar x_1$ are the sample averages of y and x over the part of the sample with z = 1.
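(ii) Now prove this result mathematically; i.e., show that
$$\hat\beta_1 = \frac{\sum_{i=1}^{n} (z_i - \bar z)(y_i - \bar y)}{\sum_{i=1}^{n} (z_i - \bar z)(x_i - \bar x)} = \frac{\bar y_1 - \bar y_0}{\bar x_1 - \bar x_0}$$
Hint: you might find the following results useful:
$$\sum_{i=1}^{n} (y_i - \bar y) = 0; \quad \bar z = \sum_{i=1}^{n} z_i / n; \quad n\bar z = \sum_{i=1}^{n} z_i; \quad n = n_0 + n_1; \quad \bar y = \frac{n_0}{n}\,\bar y_0 + \frac{n_1}{n}\,\bar y_1$$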
(30)
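The numerical check asked for in question 2(b)(i) can be sketched as follows. The group means are copied from the `bysort z: su y x` output in the question. The function name `wald_estimate` is a hypothetical label chosen for this illustration and does not come from the exam.

```python
# Group means copied from the "bysort z: su y x" output in question 2.
y_bar0, x_bar0 = 6.155494, 12.69801   # subsample with z = 0 (957 obs)
y_bar1, x_bar1 = 6.311401, 13.52703   # subsample with z = 1 (2053 obs)


def wald_estimate(y1, y0, x1, x0):
    """Difference in mean outcomes divided by difference in mean regressors.

    Hypothetical helper for this sketch: it implements the ratio
    (ybar1 - ybar0) / (xbar1 - xbar0) stated in part (i).
    """
    return (y1 - y0) / (x1 - x0)


beta1_wald = wald_estimate(y_bar1, y_bar0, x_bar1, x_bar0)
beta1_2sls = 0.1880626  # coefficient on x reported by ivregress 2sls

print(f"Ratio of mean differences: {beta1_wald:.6f}")
print(f"Reported 2SLS coefficient: {beta1_2sls:.6f}")
```

The two numbers should agree up to the rounding of the printed group means. The algebraic proof in part (ii) is still required to explain why.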
3. You and Ms. Analyst work for the Treasury of the Government of South Africa. Ms. Analyst has been tasked by her Boss, Mr. Bigshot, to analyze data from a skills training experiment targeted to a sample of young people. Ms. Analyst has invited you (an intern) to shadow her during the project. Participants were randomly assigned to a treatment group and a control group. The control group did not get any training. Those assigned to the treatment group did receive training, and could enter the programme from 1 January 2010. Some people in the treatment group only entered in mid-2011. The programme ended on 31 December 2011. Ms. Analyst has been asked to investigate whether participation in the experiment had any effect on the participant’s unemployment probability in 2012. The variables she has in her dataset are as follows:
storage display value
variable name type format label variable label
----------------------------------------------------------------------------------------------
train byte %9.0g =1 if assigned to treatment group
age byte %9.0g age in 2011
educ byte %9.0g years of education
black byte %9.0g =1 if black
coloured byte %9.0g = 1 if Coloured
married byte %9.0g =1 if married
nodegree byte %9.0g no tertiary qualification
mosinex byte %9.0g months prior to 1/2012 in experiment
unem08 byte %9.0g =1 if unemployed all of 2008
unem09 byte %9.0g =1 if unemployed all of 2009
unem12 byte %9.0g =1 if unemployed all of 2012
lwage08 float %9.0g Log of real wage in 2008; zero if wage is 0
lwage09 float %9.0g Log of real wage in 2009; zero if wage is 0
lwage12 float %9.0g Log of real wage in 2012; zero if wage is 0
agesq int %9.0g age^2
mostrn byte %9.0g months in training
---------------------------------------------------------------------------------------------
The descriptive statistics for these variables are as follows:
. su
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
train | 445 .4157303 .4934022 0 1
age | 445 25.37079 7.100282 17 55
educ | 445 10.19551 1.792119 3 16
black | 445 .8337079 .3727617 0 1
coloured | 445 .0876404 .2830895 0 1
-------------+--------------------------------------------------------
married | 445 .1685393 .3747658 0 1
nodegree | 445 .7820225 .4133367 0 1
mosinex | 445 18.1236 5.311937 5 24
unem08 | 445 .7325843 .4431092 0 1
unem09 | 445 .6494382 .4776829 0 1
-------------+--------------------------------------------------------
unem12 | 445 .3078652 .46213 0 1
lwage08 | 445 .4198245 .8862537 -.809299 3.678089
lwage09 | 445 .2771078 .7967834 -2.599059 3.224548
lwage12 | 445 1.135802 1.136259 -3.106541 4.099463
agesq | 445 693.9775 429.7818 289 3025
-------------+--------------------------------------------------------
mostrn | 445 7.68764 9.656205 0 24
For this question, note that when G specializes to the logistic distribution, we have
$$G(x\beta) = \Lambda(x\beta) = \frac{1}{1 + e^{-x\beta}} = \frac{e^{x\beta}}{1 + e^{x\beta}}$$
The associated density function for the logistic CDF is
$$g(x\beta) \equiv \Lambda'(x\beta) = \frac{e^{x\beta}}{(1 + e^{x\beta})^2}$$
(a) Ms. Analyst starts by asking some basic questions: how many young people participated in the job training programme? Is there any reason to suspect, just by looking at the descriptive statistics, that the programme had an effect on unemployment? Help her answer these questions.
(b) She then runs the regression below. What do you think she hopes to find out by running such a regression? Do you think that after looking at the results, her hopes would be dashed? Why/why not?
. global x unem08 unem09 age educ black coloured married
. logit train $x
Iteration 0: log likelihood = -302.1
Iteration 1: log likelihood = -297.04498
Iteration 2: log likelihood = -297.03096
Iteration 3: log likelihood = -297.03096
Logistic regression Number of obs = 445
LR chi2(7) = 10.14
Prob > chi2 = 0.1809
Log likelihood = -297.03096 Pseudo R2 = 0.0168
------------------------------------------------------------------------------
train | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
unem08 | .0818541 .3193627 0.26 0.798 -.5440852 .7077934
unem09 | -.3939164 .2966978 -1.33 0.184 -.9754334 .1876006
age | .0134344 .0141347 0.95 0.342 -.0142691 .0411379
educ | .0514515 .0560396 0.92 0.359 -.0583842 .1612872
black | -.3318542 .359652 -0.92 0.356 -1.036759 .3730508
coloured | -.868989 .5029685 -1.73 0.084 -1.854789 .1168112
married | .1536043 .2656568 0.58 0.563 -.3670735 .6742822
_cons | -.6917486 .788755 -0.88 0.380 -2.23768 .8541828
------------------------------------------------------------------------------
(c) She then proceeds to the main business at hand: investigating the effects of the programme on the probability of unemployment. She runs the two regressions shown in the abbreviated STATA output below. On the basis of these results, she claims that the training program reduces the probability of being unemployed in 2012 to approximately 0.24 and this holds whether one estimates a linear probability model or a Probit model. Is she correct? Why/why not? (Hint: start by transforming the logit coefficients so that they are comparable to probit coefficients.)
. reg unem12 train
Source | SS df MS Number of obs = 445
-------------+------------------------------ F( 1, 443) = 6.26
Model | 1.32226401 1 1.32226401 Prob > F = 0.0127
Residual | 93.5002079 443 .211061417 R-squared = 0.0139
-------------+------------------------------ Adj R-squared = 0.0117
Total | 94.8224719 444 .213564126 Root MSE = .45941
------------------------------------------------------------------------------
unem12 | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
train | -.1106029 .0441888 -2.50 0.013 -.1974486 -.0237572
_cons | .3538462 .0284917 12.42 0.000 .2978505 .4098418
------------------------------------------------------------------------------
. logit unem12 train
Logistic regression Number of obs = 445
LR chi2(1) = 6.30
Prob > chi2 = 0.0120
Log likelihood = -271.5828 Pseudo R2 = 0.0115
------------------------------------------------------------------------------
unem12 | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
train | -.5328045 .2149117 -2.48 0.013 -.9540237 -.1115854
_cons | -.6021754 .1296994 -4.64 0.000 -.8563816 -.3479692
------------------------------------------------------------------------------
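One way to sanity-check the numbers behind question 3(c) is to compute the fitted unemployment probabilities implied by each output. The sketch below does only that arithmetic, using the coefficients printed above and the logistic CDF given at the start of the question. The variable names are illustrative, and it does not settle whether Ms. Analyst's wording is correct.

```python
import math


def logistic_cdf(index):
    """Lambda(v) = 1 / (1 + exp(-v)), the logistic CDF stated in the question."""
    return 1.0 / (1.0 + math.exp(-index))


# Coefficients copied from ". reg unem12 train" (linear probability model).
lpm_cons, lpm_train = 0.3538462, -0.1106029
# Coefficients copied from ". logit unem12 train".
logit_cons, logit_train = -0.6021754, -0.5328045

# Fitted probabilities of unemployment in 2012 for each group.
p_lpm_control = lpm_cons
p_lpm_treated = lpm_cons + lpm_train
p_logit_control = logistic_cdf(logit_cons)
p_logit_treated = logistic_cdf(logit_cons + logit_train)

print(f"LPM:   control {p_lpm_control:.4f}, treated {p_lpm_treated:.4f}")
print(f"Logit: control {p_logit_control:.4f}, treated {p_logit_treated:.4f}")
print(f"Change in probability (LPM):   {p_lpm_treated - p_lpm_control:.4f}")
print(f"Change in probability (logit): {p_logit_treated - p_logit_control:.4f}")
```

Because `train` is the only regressor and it is binary, both models are saturated. Their fitted probabilities should therefore coincide with the group sample proportions, up to rounding.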
(d) She then runs the following Logit model where she controls for other factors. Verify that the marginal effect of educ = −.0003291 that STATA computes is indeed correct.
. logit unem12 train $x
Logistic regression Number of obs = 445
LR chi2(8) = 22.63
Prob > chi2 = 0.0039
Log likelihood = -263.42168 Pseudo R2 = 0.0412
------------------------------------------------------------------------------
unem12 | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
train | -.5531597 .2198136 -2.52 0.012 -.9839865 -.122333
unem08 | .1958456 .3522676 0.56 0.578 -.4945862 .8862775
unem09 | .0826864 .3245389 0.25 0.799 -.5533982 .718771
age | .0003619 .0151608 0.02 0.981 -.0293527 .0300766
educ | -.0015829 .0605203 -0.03 0.979 -.1202005 .1170346
black | 1.102427 .5009618 2.20 0.028 .1205603 2.084294
coloured | -.2436418 .6937825 -0.35 0.725 -1.60343 1.116147
married | -.1358157 .29638 -0.46 0.647 -.7167097 .4450783
_cons | -1.707333 .9105037 -1.88 0.061 -3.491887 .0772213
------------------------------------------------------------------------------
. predict xbhat3, index
. su xbhat3
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
xbhat3 | 445 -.8722221 .5529723 -2.650983 -.3112156
. mfx
Marginal effects after logit
y = Pr(unem12) (predict)
= .29479214
------------------------------------------------------------------------------
variable | dy/dx Std. Err. z P>|z| [ 95% C.I. ] X
---------+--------------------------------------------------------------------
train*| -.112445 .04331 -2.60 0.009 -.197337 -.027552 .41573
unem08*| .0399288 .07035 0.57 0.570 -.097947 .177805 .732584
unem09*| .017101 .06676 0.26 0.798 -.113751 .147953 .649438
age | .0000752 .00315 0.02 0.981 -.006102 .006253 25.3708
educ | -.0003291 .01258 -0.03 0.979 -.024988 .02433 10.1955
black*| .191368 .06889 2.78 0.005 .056337 .326399 .833708
coloured*| -.0484808 .13163 -0.37 0.713 -.306464 .209503 .08764
married*| -.0277014 .05925 -0.47 0.640 -.143828 .088425 .168539
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1
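The verification in question 3(d) can be sketched numerically. The sketch assumes that `mfx` evaluates the marginal effect at the sample means shown in its X column. Under that assumption the index at the means equals the mean of the fitted index `xbhat3`, because the index is linear in x. The variable names are illustrative.

```python
import math


def logistic_cdf(index):
    """Lambda(v) = exp(v) / (1 + exp(v))."""
    return 1.0 / (1.0 + math.exp(-index))


def logistic_pdf(index):
    """Logistic density exp(v) / (1 + exp(v))^2, which equals Lambda * (1 - Lambda)."""
    p = logistic_cdf(index)
    return p * (1.0 - p)


beta_educ = -0.0015829      # logit coefficient on educ
index_at_means = -0.8722221  # mean of xbhat3 (assumed evaluation point)

p_at_means = logistic_cdf(index_at_means)
me_educ = logistic_pdf(index_at_means) * beta_educ

print(f"Pr(unem12) at the means:      {p_at_means:.8f}")  # mfx header reports .29479214
print(f"Marginal effect of educ:      {me_educ:.7f}")      # mfx reports -.0003291
```

The marginal effect for a continuous regressor is the density at the index times the coefficient. The starred dummy variables in the `mfx` table use a discrete change instead and are not covered by this sketch.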
(e) How should Ms. Analyst interpret the coefficient on unem08?
(f) On the basis of this evidence, Ms. Analyst believes that “skills training” could be the magic bullet in combating unemployment in South Africa, and that the take-home message of the experiment is to offer the training to all young people in South Africa (i.e., to scale up the training programme). Mr. Bigshot however, is not convinced because he believes that the programme has a stronger benefit to people who don’t have a long history of being unemployed. Is he correct? Does it follow that the programme should not be scaled up? Why/why not?
(20)
4. Consider the following labour supply model
$$\text{whrs} = \beta_0 + \beta_1\,\text{kl6} + \beta_2\,\text{k618} + \beta_3\,\text{nwifeinc} + \beta_4\,\text{wa} + \beta_5\,\text{wa2} + \beta_6\,\text{we} + \beta_7\,\text{we2} + \beta_8\,(\text{wa} \times \text{we}) + \beta_9\,\text{lww2} + u$$
The STATA output given below shows the variable definitions, as well as the regression output where the given labour supply model has been estimated using the Tobit approach. Study the output and then answer the questions that follow.
---------------------------------------------------------------------------------------
storage display
variable name type format variable label
---------------------------------------------------------------------------------------
lfp float %9.0g A dummy variable = 1 if woman worked in 2011, else 0
whrs float %9.0g Number of hours the woman worked in 2011
kl6 float %9.0g Number of children less than 6 years old in household
k618 float %9.0g Number of children between ages 6 and 18 in household
wa float %9.0g Woman’s age
we float %9.0g Woman’s educational attainment, in years
lww float %9.0g Log of woman’s hourly earnings (defined only for lfp = 1)
lww2 float %9.0g Log of woman’s hourly earnings (imputed when lfp = 0)
ax float %9.0g Actual years of woman’s previous labor market experience
prin float %9.0g Woman’s Property Income in rands
nwifeinc float %9.0g Prin/1000
we2 float %9.0g Square of Education
wa2 float %9.0g Square of Age
wawe float %9.0g Age times Education
-----------------------------------------------------------------------------------------
. tobit whrs $W $C $I lww2, ll(0) robust
Tobit regression Number of obs = 753
F( 6, 747) = 20.78
Prob > F = 0.0000
Log pseudolikelihood = -3891.0413 Pseudo R2 = 0.0161
------------------------------------------------------------------------------
| Robust
whrs | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
wa | -36.49461 7.613938 -4.79 0.000 -51.44187 -21.54735
we | 104.5203 26.88408 3.89 0.000 51.74297 157.2977
kl6 | -1044.926 133.85 -7.81 0.000 -1307.693 -782.1587
k618 | -100.1299 41.95978 -2.39 0.017 -182.503 -17.75679
nwifeinc | -22.20082 5.068873 -4.38 0.000 -32.15175 -12.24988
lww2 | 202.4707 113.6298 1.78 0.075 -20.60111 425.5425
_cons | 1172.027 474.3755 2.47 0.014 240.7594 2103.295
-------------+----------------------------------------------------------------
/sigma | 1258.636 48.07551 1164.257 1353.015
------------------------------------------------------------------------------
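Obs. summary: 325 left-censored observations at whrs<=0
428 uncensored observations
. mfx compute, predict(ystar(0,.))
Marginal effects after tobit
y = E(whrs*|whrs>0) (predict, ystar(0,.))
= 663.35746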
------------------------------------------------------------------------------
variable | dy/dx Std. Err. z P>|z| [ 95% C.I. ] X
---------+--------------------------------------------------------------------
wa | -21.62883 4.47256 -4.84 0.000 -30.3949 -12.8628 42.5378
we | 61.94481 15.632 3.96 0.000 31.3063 92.5833 12.2869
kl6 | -619.2837 78.297 -7.91 0.000 -772.744 -465.824 .237716
k618 | -59.3428 24.951 -2.38 0.017 -108.245 -10.4402 1.35325
nwifeinc | -13.15749 2.96863 -4.43 0.000 -18.9759 -7.33908 20.129
lww2 | 119.9959 67.414 1.78 0.075 -12.1337 252.125 1.09613
------------------------------------------------------------------------------
(a) Verify that the marginal effect
$$\frac{\partial E(\text{whrs} \mid x)}{\partial\,\text{lww2}} = 119.9959$$
that STATA computes is indeed correct. Is this partial effect of any economic importance? Why/why not?
(b) Study carefully the marginal effects for the conditional mean function $E(\text{whrs} \mid \text{whrs} > 0, x)$ given in the STATA output below. Then answer the questions that follow.
. mfx compute, predict(e(0,.))
Marginal effects after tobit
y = E(whrs|whrs>0) (predict, e(0,.))
= 1119.292
------------------------------------------------------------------------------
variable | dy/dx Std. Err. z P>|z| [ 95% C.I. ] X
---------+--------------------------------------------------------------------
wa | -15.24023 3.15081 -4.84 0.000 -21.4157 -9.06477 42.5378
we | 43.64793 11.029 3.96 0.000 22.0316 65.2643 12.2869
kl6 | -436.3634 55.076 -7.92 0.000 -544.31 -328.417 .237716
k618 | -41.81448 17.57 -2.38 0.017 -76.2506 -7.37833 1.35325
nwifeinc | -9.271113 2.0924 -4.43 0.000 -13.3721 -5.17009 20.129
lww2 | 84.55224 47.489 1.78 0.075 -8.52464 177.629 1.09613
------------------------------------------------------------------------------
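(i) Letting $z \equiv x\beta/\sigma$, show that
$$\frac{\partial E(\text{whrs} \mid \text{whrs} > 0, x)}{\partial\,\text{lww2}} = 202.4707 \left[ 1 - z\,\frac{\phi(z)}{\Phi(z)} - \frac{\phi(z)^2}{\Phi(z)^2} \right]$$
In answering this question, the following results might come in handy:
$$\phi(z) = \frac{1}{\sqrt{2\pi}}\, e^{-z^2/2}; \quad E(\text{whrs} \mid \text{whrs} > 0, x) = x\beta + \sigma\,\frac{\phi(z)}{\Phi(z)}; \quad \frac{\partial \phi}{\partial z} = -z\,\phi(z); \quad \frac{\partial \Phi}{\partial z} = \phi(z)$$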
(ii) Verify that the marginal effect
$$\frac{\partial E(\text{whrs} \mid \text{whrs} > 0, x)}{\partial\,\text{lww2}} = 84.55224$$
that STATA computes is indeed correct.
(iii) Do the signs on lww2 and nwifeinc make economic sense? Explain.
(iv) With reference to your answer to questions 4a and 4(b)ii, what do you make of the fact that the marginal effect for the sample of working women is smaller than for the combined sample of working and non-working women?
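The two verifications in questions 4(a) and 4(b)(ii) can be sketched numerically. The sketch rests on three assumptions: `mfx` evaluates at the sample means shown in its X column, only the six regressors printed in the Tobit output enter the index, and the unconditional effect takes the standard Tobit form $\beta\,\Phi(z)$. The conditional effect uses the expression given in 4(b)(i). Dictionary and variable names are illustrative.

```python
import math


def normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal CDF Phi(z), computed via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Tobit coefficients and sigma copied from the output in question 4.
coefs = {"wa": -36.49461, "we": 104.5203, "kl6": -1044.926,
         "k618": -100.1299, "nwifeinc": -22.20082, "lww2": 202.4707}
cons, sigma = 1172.027, 1258.636
# Sample means copied from the X column of the mfx tables.
means = {"wa": 42.5378, "we": 12.2869, "kl6": 0.237716,
         "k618": 1.35325, "nwifeinc": 20.129, "lww2": 1.09613}

xb = cons + sum(coefs[name] * means[name] for name in coefs)
z = xb / sigma
lam = normal_pdf(z) / normal_cdf(z)  # inverse Mills ratio phi(z)/Phi(z)

# Predicted values reported in the two mfx headers.
e_unconditional = normal_cdf(z) * xb + sigma * normal_pdf(z)  # reported 663.35746
e_conditional = xb + sigma * lam                              # reported 1119.292

# Marginal effects of lww2.
me_unconditional = coefs["lww2"] * normal_cdf(z)              # reported 119.9959
me_conditional = coefs["lww2"] * (1.0 - z * lam - lam ** 2)   # reported 84.55224

print(f"z = {z:.4f}, Phi(z) = {normal_cdf(z):.4f}, lambda = {lam:.4f}")
print(f"E(whrs|x) = {e_unconditional:.2f}, E(whrs|whrs>0,x) = {e_conditional:.2f}")
print(f"dE(whrs|x)/dlww2 = {me_unconditional:.3f}")
print(f"dE(whrs|whrs>0,x)/dlww2 = {me_conditional:.3f}")
```

Small discrepancies against the Stata figures are expected, because the means and coefficients are entered to the printed precision only.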
(c) Calculate the elasticity of labour supply with respect to the hourly wage rate, for women that choose whrs > 0. Hint: the log wage elasticity formula when dealing with the conditional mean function $E(\text{whrs} \mid \text{whrs} > 0, x)$ is
$$\frac{\partial E(\text{whrs} \mid \text{whrs} > 0, x)}{\partial\,\text{lww2}} \times \frac{\text{lww2}}{E(\text{whrs} \mid \text{whrs} > 0, x)}$$
Note also that the header to the marginal effects reported by STATA in question 4b tells you that $E(\text{whrs} \mid \text{whrs} > 0, x) = 1119.292$.
(d) Calculate the elasticity of labour supply with respect to the wife’s property income variable nwifeinc.
(e) In what follows, wherever the word “income” is used, take it to mean income as defined in this data set; i.e., “property income”. You may interpret “property income” as any income earned from the sale of property that would carry a tax obligation known as a capital gains tax.
A policy maker is contemplating dropping the capital gains tax rate. He asks his research office to estimate the income elasticity of labour supply. The staff at the research office have the same dataset you have been working with in this question. They produce the following STATA output.
. su lww2 nwifeinc whrs if lfp == 1
Variable | Obs Mean Std. Dev. Min Max
-------------+--------------------------------------------------------
lww2 | 428 1.190173 .7231978 -2.054164 3.218876
nwifeinc | 428 18.93748 10.59135 -.0290575 91
whrs | 428 1302.93 776.2744 12 4950
. reg whrs $W $C $I lww2, robust
Linear regression Number of obs = 753
F( 6, 746) = 19.77
Prob > F = 0.0000
R-squared = 0.1226
Root MSE = 819.43
------------------------------------------------------------------------------
| Robust
whrs | Coef. Std. Err. t P>|t| [95% Conf. Interval]
-------------+----------------------------------------------------------------
wa | -20.3053 4.687209 -4.33 0.000 -29.50699 -11.10361
we | 43.42601 17.05139 2.55 0.011 9.951603 76.90043
kl6 | -497.8102 62.25698 -8.00 0.000 -620.0299 -375.5905
k618 | -79.02626 23.68683 -3.34 0.001 -125.527 -32.52548
nwifeinc | -10.47326 2.383235 -4.39 0.000 -15.15191 -5.79462
lww2 | 112.8796 90.96162 1.24 0.215 -65.69165 291.4508
_cons | 1383.116 293.9374 4.71 0.000 806.0734 1960.159
------------------------------------------------------------------------------
(i) On the basis of these results, the staff at the research office argue to the policy maker that at higher levels of work hours, labour supply is much less income elastic. Are they correct?
(ii) The policy maker, accepting the finding of question 4(e)i, remarks that this implies that a 1% increase in income is much more likely to lead to a negative “employment effect” than a negative “earnings effect” (i.e., the policy shift is more likely to induce a worker to switch from positive to 0 work hours than it is likely to induce a worker to work fewer hours). What is the underlying economic rationale behind this remark?
(iii) Based on the findings of questions 4(e)i and 4(e)ii above, the policy maker proposes to the Treasury that the proposed capital gains tax cut should not be implemented because the resulting disincentive to enter into employment would exceed the resulting disincentive to work longer hours. Assess this claim in the context of the Tobit model. Hint: there are two issues to consider here: (i) is the negative income elasticity large enough to warrant concern when the Tobit model is used, and (ii) is it necessarily the case that the employment effect claimed in question 4(e)ii follows?
(30)