Lecture 1 Lecture 2 Lecture 3
Quantile Regression for Program Evaluation:
Some (Introductory) Examples
CESifo Lecture Series
Margherita Fort
University of Bologna, CESifo, IZA
Last updated: May 20, 2014
M.Fort () Quantile Regression Last updated: May 20, 2014 1
Some Examples To Get Started 1/3
Case 1 Evaluation of the impact of welfare reforms on family earnings, income
and labour supply responses: e.g. AFDC, Jobs First in the US
Theory predicts heterogeneous responses in the sign and magnitude of these
effects: e.g. no change in income at the bottom of the distribution;
fall or no change in income at the top
Mean impacts average together positive and negative responses,
possibly obscuring the welfare reform effect
Using experimental data, Bitler, Gelbach, Hoynes (AER, 2006) find
• evidence of heterogeneity in the effects consistent with the theory
• that the intra-group variation in the effects at different points of the income
distribution exceeds the inter-group variation in mean impacts
Other recent examples: Maynard et al. (JAE, 2009); Ozkan et al. (JAE, 2014)
Jobs First Features
from Bitler et al. (AER, 2006), p. 990
TABLE 1: KEY DIFFERENCES IN JOBS FIRST AND AFDC PROGRAMS
Earnings disregard
  Jobs First: All earned income disregarded up to poverty line (policy also applied to food stamps)
  AFDC: Months 1-3: $120 + 1/3; Months 4-12: $120; Months > 12: $90
Time limit
  Jobs First: 21 months (6-month extension if in compliance and nontransfer income less than maximum benefit)
  AFDC: None
Work requirements
  Jobs First: Mandatory work first, exempt if child < 1
  AFDC: Education/training, exempt if child < 2
Sanctions
  Jobs First: 1st violation: 20-percent cut for 3 months; 2nd violation: 35-percent cut for 3 months; 3rd violation: grant cancelled for 3 months
  AFDC: (Rarely enforced) 1st: adult removed from grant until compliant; 2nd: adult removed - 3 months; 3rd: adult removed - 6 months
Other policies
  Jobs First: Asset limit $3,000; partial family cap (50 percent); two years transitional Medicaid; child care assistance; child support: $100 disregard, full pass-through
  AFDC: Asset limit $1,000; 100-hour rule and work history requirement for two-parent families; one-year transitional Medicaid; child support: $50 disregard, $50 maximum pass-through
Source: Bloom et al. (2002).
to much recent discussion among policymakers and researchers, our results suggest the possibility that Connecticut's welfare reform reduced income for a nontrivial share of the income distribution after time limits took effect. Fourth, we find that the essential features of our empirical findings could not have been revealed using mean impact analysis on typically defined subgroups: the intra-group variation in QTE greatly exceeds the inter-group variation in mean impacts.
The remainder of the paper is organized as follows. In Section I, we provide an overview of the Jobs First program and its predicted effects. We then discuss our data in Section II. In Section III, we present empirical evidence that strongly suggests the time limit was an important program feature, and we present mean treatment effects in Section IV. Our main QTE results appear in Section V. We discuss extensions and sensitivity tests in Section VI, and we conclude in Section VII.
I. The Jobs First Program and Its Economic Implications
Below we compare the earnings, transfer, and income distributions between a randomly assigned treatment group, whose members face the Jobs First eligibility and program rules, and a randomly assigned control group, whose members face the AFDC eligibility and program rules. We begin by outlining the two programs and use labor supply theory to generate predictions about earnings, transfers, and income under Jobs First compared to AFDC.
Table 1 summarizes the major features of Connecticut's Jobs First waiver program and the existing AFDC program. The Jobs First waiver contained each of the key elements in PRWORA: time limits, work requirements, and financial sanctions. Jobs First's earnings disregard policy is quite simple: every dollar of earnings below the federal poverty line (FPL) is disregarded for purposes of benefit determination. This leads to an implicit tax rate of 0 percent for all earnings up to the poverty line, which is a very generous policy by comparison to AFDC's. The statutory AFDC policy disregarded the first $120 of monthly earnings during a woman's first 12 months on aid, and $90 thereafter. In the first four months, benefits were reduced by two dollars for every three dollars earned, and starting with the fifth month on aid, benefits were reduced dollar for dollar, so that the long-run statutory implicit tax rate on earnings above the disregard was 100 percent.[3]
3 In practice, AFDC effective tax rates were less than the 100-percent statutory rate. First, there were work expense and
Predictions From Static Labour Supply Theory
from Bitler et al. (AER, 2006), p. 991
As shown in Table 1, the Jobs First time limit is 21 months, which is currently the shortest in the United States (Office of Family Assistance, 2003, Table 12:10). By contrast, there were no time limits in the AFDC program. In addition, work requirements and financial sanctions were strengthened in the Jobs First program relative to AFDC. For example, the Jobs First work requirements moved away from general education and training, focusing instead on "work first" training programs. Further, Jobs First exempts from work requirements only women with children under the age of one, and financial sanctions are supposed to be levied on parents who do not comply with work requirements. While Jobs First's sanctions are more stringent than AFDC's, the available evidence suggests that they were rarely used. For more information on these and other features of the Jobs First program, see our earlier working paper (Bitler et al., 2003b) and MDRC's final report on the Jobs First evaluation (Diana Adams-Ciardullo et al., 2002, henceforth the "final report").
Basic labor supply theory makes strong and heterogeneous predictions concerning welfare reforms like those in Jobs First. In the rest of this section, we discuss the economic impacts of Jobs First on the earnings, transfers, and income distributions. We focus on earnings disregards and time limits, since they are the salient features for examining heterogeneous treatment effects.
A. Economic Impacts of Earnings Disregards
To begin, Figure 1 shows a stylized budget constraint in income-leisure space before and after Jobs First. The AFDC program is represented by line segment AB while Jobs First is represented by AF. The Jobs First program dramatically affects the budget constraint faced by welfare recipients, lowering the benefit reduction rate to 0 percent and raising the breakeven earnings level to the FPL.[4] The effective AFDC benefit reduction rate in this figure is below the statutory long-run rate of 100 percent (see footnote 3 for a discussion).
What is the impact of this transformation of the on-welfare budget segment from AFDC's AB to Jobs First's AF? To begin, we make the usual static labor supply model assumptions: the woman can freely choose hours of work at the given offered wage, and offered wages are constant. In particular, we ignore any human capital, search-theoretic, or related issues. We
[Figure 1. Stylized Connecticut budget constraint under AFDC and Jobs First. Vertical axis: monthly income, with the FPL marked; horizontal axis: monthly work hours; labelled points A, B, C, D, E, F, G, H]
3 (continued) child care disregards. Second, AFDC eligibility redetermination occurred less frequently than monthly, so there could be a lag between the month when an AFDC participant earned income and the date when benefits were reduced. Third, the Earned Income Tax Credit (EITC) provides a 40-percent wage subsidy in its phase-in region, which generally ended above Connecticut's maximum benefit level. (The EITC is available to both experimental groups in our data, so it raised the net wage above its before-tax level for both groups.) In Bitler et al. (2003b), we present local nonparametric regressions of transfer payments on earnings and find that the control group members receiving AFDC in our sample faced an effective benefit reduction rate of about one-third, similar to earlier studies of the national caseload in Terra McKinnish et al. (1999) and Thomas Fraker et al. (1985). Also, statutory rules for both AFDC and Jobs First tax away nonlabor income other than child support dollar for dollar; we discuss child support interactions in Section VIC.
4 Under AFDC rules, eligibility for AFDC conferred categorical eligibility for food stamps, with a 30-percent benefit reduction rate applied to non-food stamps income. Under Jobs First, food stamps rules mirror those for cash assistance: food stamps benefits are determined after disregarding all earnings up to the poverty line (though this food stamps disregard expansion operates only while a woman assigned to Jobs First is receiving cash welfare payments). However, losing eligibility for welfare benefits under Jobs First assignment (e.g., through time limits) need not eliminate food stamps eligibility, since one could still satisfy the food stamps need standard.
Predictions From Static Labour Supply Theory
from Bitler et al. (AER, 2006), p. 992
TABLE 2: PRE-TIME LIMIT PREDICTED EFFECTS OF JOBS FIRST ASSIGNMENT, BY OPTIMAL CHOICE GIVEN AFDC ASSIGNMENT
Columns 2-3: compared to the AFDC location, does Jobs First assignment change the after-tax wage / nonlabor income? Columns 5-7: effect on the distribution of hours/earnings, transfers, income.
Location     After-tax  Nonlabor  Location on Jobs   Hours/
under AFDC   wage?      income?   First budget set   earnings  Transfers  Income
A            Yes        No        A                  0         0          0
A            Yes        No        On AF, left of A   +         0          +
C            Yes        No        On AF, left of C   +         +          +
D            No         Yes       On AF, right of D  -         +          +
E            No         Yes       On AF, left of A   -         +          +
H            No         Yes       On AF, left of A   -         +          -
H            No         No        H                  0         0          0
Notes: Table contains predictions of static labor supply model for women facing AFDC and counterfactual Jobs First disregard rules (assuming all other rules are the same). Points are those labeled in Figure 1. There are two predictions for women at points A and H depending on those women's preferences.
also assume that there is no time limit. Later we relax these assumptions.
Consider first the case in which an AFDC-assigned woman locates at point A, working zero hours and receiving the maximum benefit payment G. Depending on the woman's preferences (e.g., the steepness of her indifference curves), assignment to Jobs First could lead to either of two outcomes. First, she might continue to work zero hours and receive the maximum benefit with no change in income. Second, she might enter the labor market, moving from A to some point on AF; transfer income remains at the maximum benefit level, while total income rises. This labor supply prediction, together with others discussed below, is summarized in Table 2, which indicates whether Jobs First changes the after-tax wage (in this case, yes) and nonlabor income (in this case, no). Table 2 then indicates the predicted location on the Jobs First budget set and the expected impact of Jobs First assignment on earnings, transfers, and income.
We next consider points such as C, where women work positive hours and receive welfare when they are assigned to AFDC. For such women, assignment to Jobs First has only a price effect: the benefit reduction rate is lower, but there is no change in nonlabor income at zero hours of work. As long as substitution effects dominate income effects when only the net wage changes, Jobs First will cause an increase in hours, earnings, transfers, and income.
Now imagine that a woman's preferences are such that she would not participate in welfare if assigned to AFDC, instead locating at a point like D. At this point, her earnings would be between the maximum benefit amount and the FPL. Assignment to Jobs First would make this woman income-eligible for welfare even if she did not change her behavior; this is the case of Orley Ashenfelter's (1983) "mechanical" induced eligibility effect leading to an increase in transfers. If we assume that both leisure and consumption are normal goods, then we will expect the increase in nonlabor income accompanying Jobs First assignment to reduce hours of work and increase total income. That is, we expect women who would locate at point D to move to a point on AF that is both right of and above D.
Next consider a woman who would locate at a point like E if assigned to AFDC. At E, earnings are between the poverty line and the sum of the maximum benefit and the poverty line. Such points are clearly dominated under Jobs First assignment: the woman can increase income by reducing hours of work and claiming welfare (an example of Ashenfelter's behavioral induced eligibility effect). If both leisure and consumption are normal goods, we expect this woman to locate on AF at a point higher than E, so that hours worked decrease, while transfers and income both increase.
5 Note that labor supply theory makes predictions about hours worked. Assuming no change in offered wages, this implies a prediction about earnings. Thus the table includes a single prediction for hours/earnings, which is important, since we observe earnings but not hours in our data.
Some Examples To Get Started 2/3
Case 2 Evaluation of the impact of risk or protective factors on weight/BMI
High costs and long-term effects (both medical and economic) motivate an
interest in factors associated with particularly low birthweight
A BMI outside the 18-25 range is not ideal for health: a variable that has a
positive effect only at the bottom of the BMI distribution can be considered a
protective factor because it is negatively associated with underweight;
conversely, the same positive effect above the median or at the top of the BMI
distribution may lead one to consider the variable a risk factor
Mean impacts are less interesting for policy makers than impacts
on too low or too high BMI
Related papers: Abrevaya (EE, 2001); Brunello et al. (2009b); Stifel (EHB, 2009)
Some Examples To Get Started 3/3
Case 3 Research on income or wage inequality, wage structure
“The school is a promising place to increase the skills and incomes of individuals. As a result,
educational policies have the potential to decrease existing, and growing, inequalities in income”
Ashenfelter et al. (2001)
What is the impact of education on (within-levels) wage inequality?
To answer this question we need to assess whether returns to education are
heterogeneous over the wage distribution
Do public sector firm ownership and lack of competition matter for wages?
Shifting workers from the public sector to the private sector has an ambiguous
theoretical impact on wages, given the interplay of ownership and
competition. Because issues such as pay equity and fairness are encountered in
political discussion, privatization is likely to affect not only the average wage
but also the distribution of wages.
Related papers: Martins et al. (LE, 2004); Brunello et al. (EJ, 2009); Melly et al.
(JEEA, forth.)
Remarks
Theoretical models may predict heterogeneous impacts of “treatments”
Mean impacts may miss distributional effects of a policy or a treatment
Policy makers may be intrinsically interested in
• the impact of a policy on extreme values of the distribution of the outcomes
• assessing effects on inequality
There are many reasons to go beyond the average
Quantile regression is an appropriate tool
to examine heterogeneous effects on
the distribution of a continuous outcome
Quantile regression coefficients may not have a causal interpretation
No multivariate quantile definition (yet)
Conditional and unconditional quantiles are two different objects (!)
Quantile treatment effects do not speak about quantiles of the treatment
effect distribution without further assumptions (!)
Are the examples mentioned related to
your research?
Can you think about research questions
in your field for which heterogeneity matters?
(Share them!)
What we cover in these lectures (introductory level)
Quantile Regression (QR) with Exogenous Regressors
Theory (fundamentals) and Applications (some examples)
QR with Endogenous Regressors, Instrumental Variable Approaches:
• IV-QTE
• LATE-QTE
QR with Endogenous Regressors: The Causal Chain Model
Topics related to identification & estimation of effects of covariates on
conditional quantiles
What we do not cover (but may be relevant for your research)
Unconditional Quantile Regression
Quantile Regression for Time Series, Panel Data and Discrete Data
Censored Quantile Regression and Quantile Regression for Duration Analysis
Decomposition Analysis with QR or Unconditional Quantile Regression
Identification Strategies for QTE that rely on approaches other than IVs
Nonlinear Quantile Regression
. . .
Step 1: We Go Through the Fundamentals
Define quantile (percentile)
The simplest quantile regression (QR) model: the two-sample
treatment-control model
QR interpretation: basics and examples
Estimation (intuition only) & testing (a little bit)
Key properties of the QR estimator
Disclaimer . . .
Going through the fundamentals may not be fun
but . . . no free lunches
Distributions . . .
Y random variable with cumulative distribution function (c.d.f.) $F_Y(\cdot)$
$F_Y(y) \equiv \Pr[Y \le y]$, $y \in \mathcal{Y}$, $0 \le F_Y(y) \le 1$
[Figure: empirical c.d.f. (ecdf, vertical axis from 0 to 1) plotted against x (horizontal axis from -4 to 4)]
. . . Quantiles
Quantile function of Y: for any $0 < \tau < 1$, $\mathrm{Quant}_Y(\tau) \equiv q_Y(\tau) \equiv F_Y^{-1}(\tau)$
$\tau$-th quantile $\equiv Q_Y(\tau) = \inf\{y : F_Y(y) \ge \tau\}$
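A minimal Python sketch of the two definitions, assuming a small made-up sample. The data and the function names `ecdf` and `quantile` are illustrative and not part of the lecture. The quantile is computed literally as the smallest observed value at which the empirical c.d.f. reaches τ.

```python
def ecdf(sample, y):
    """Empirical c.d.f.: share of observations that are <= y."""
    return sum(1 for v in sample if v <= y) / len(sample)


def quantile(sample, tau):
    """tau-th sample quantile: inf{y : ecdf(y) >= tau}, for 0 < tau < 1."""
    if not 0 < tau < 1:
        raise ValueError("tau must lie strictly between 0 and 1")
    ordered = sorted(sample)
    n = len(ordered)
    for k, value in enumerate(ordered, start=1):
        if k / n >= tau:  # ecdf at the k-th order statistic is at least k/n
            return value
    return ordered[-1]


# Hypothetical data, chosen only for illustration.
sample = [2.0, -1.0, 0.5, 3.5, 0.0, 1.5, -2.5, 1.0]
median = quantile(sample, 0.5)
lower_quartile = quantile(sample, 0.25)
print(median, lower_quartile, ecdf(sample, median))
```

With an even number of observations this convention returns the lower of the two middle values as the median, which is what the inf in the definition implies.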
[Figure: quantile function, x (vertical axis from -4 to 4) plotted against the ecdf level (horizontal axis from 0 to 1)]
[Figure, two panels. Left, "E.c.d.f. F(y)": ecdf (0 to 1) against x (-4 to 4). Right, "Quantile F^{-1}(τ)": x (-4 to 4) against the ecdf level (0 to 1)]
Recalling two known results
1. For any known c.d.f. $F(\cdot)$, take $U \sim U(0,1)$ and $Y = F^{-1}(U)$; then $F_Y(y) = F(y)$ $\forall y \in \mathbb{R}$
2. For any continuous r.v. $Y$ with c.d.f. $F_Y(\cdot)$, take $U = F_Y(Y)$; then $U \sim U(0,1)$
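Both results can be checked by simulation. The sketch below assumes an exponential distribution with rate 2 purely as a convenient example with a closed-form inverse c.d.f.; the distribution, the seed and the sample size are choices made here, not taken from the lecture.

```python
import math
import random

RATE = 2.0  # assumed rate parameter of the exponential example


def cdf(y):
    """Exponential c.d.f. F(y) = 1 - exp(-RATE * y) for y >= 0."""
    return 1.0 - math.exp(-RATE * y) if y >= 0 else 0.0


def inverse_cdf(u):
    """Quantile function F^{-1}(u) = -log(1 - u) / RATE, for 0 <= u < 1."""
    return -math.log(1.0 - u) / RATE


rng = random.Random(0)
n = 20000

# Result 1: Y = F^{-1}(U) with U ~ U(0, 1) has c.d.f. F,
# so about half of the draws should fall below the median F^{-1}(0.5).
y_from_u = [inverse_cdf(rng.random()) for _ in range(n)]
share_below_median = sum(v <= inverse_cdf(0.5) for v in y_from_u) / n

# Result 2: U = F_Y(Y) is uniform when Y is continuous with c.d.f. F_Y,
# so its mean should be near 0.5 and a quarter of it should lie below 0.25.
y_direct = [rng.expovariate(RATE) for _ in range(n)]
u_from_y = [cdf(v) for v in y_direct]
mean_u = sum(u_from_y) / n
share_u_below_quarter = sum(u <= 0.25 for u in u_from_y) / n
print(share_below_median, mean_u, share_u_below_quarter)
```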
A simple (historical) example : food expenditure & income
In 1857, Engel highlighted that the income elasticity of demand for food is $\leq 1$:
when a household's income X increases, the proportion of income spent on
food decreases, even if actual expenditure on food Y rises.
[Figure: scatter plot of food expenditure (vertical axis, 0 to 2000) against income (horizontal axis, 500 to 3000), with the fitted mean regression line ("Predict. Food exp. (mean)")]
$E[Y|X] = \beta_0 + \beta_1 X$
$(\beta_0, \beta_1): \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i)^2 = \min$
Boxplot: Conditional distribution of food expenditure by income levels
[Figure: box plots of food expenditure (vertical axis, 0 to 2,000) for two groups: low income (below median) and high income (above median)]
Limits of the boxes: 1st and 3rd quartiles of food expenditure (Y) in each class.
Median: horizontal line in the middle of the box.
Mean food expenditure increases across groups, and the dispersion changes as well
You could “draw lines” connecting conditional percentiles: the slopes of these
lines will not be constant across percentiles
Scatter plot, E [Y |X ] & QR
[Figure: scatter plot of food expenditure (0 to 2000) against income (500 to 3000), with fitted lines for the conditional mean, the conditional median and the conditional 25th percentile]
$\mathrm{Quant}_Y(0.5|X) = \beta_{0,0.5} + \beta_{1,0.5} X$
$(\beta_{0,0.5}, \beta_{1,0.5}): \sum_{i=1}^{n} |y_i - \beta_{0,0.5} - \beta_{1,0.5} x_i| = \min$
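The least-absolute-deviations problem above can be sketched in a few lines of Python. This is not how the lecture's estimates were produced: it is a brute-force illustration that relies on the linear-programming fact that, with one regressor, a minimiser can be found among the lines passing through two observations. The data are made up, with one deliberate outlier, and `fit_median_line` and `fit_ols_line` are hypothetical helper names.

```python
from itertools import combinations


def sum_abs_residuals(x, y, intercept, slope):
    """Objective of median regression: sum of absolute residuals."""
    return sum(abs(yk - intercept - slope * xk) for xk, yk in zip(x, y))


def fit_median_line(x, y):
    """Search all lines through two observations for the smallest objective."""
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line is not a regression line
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        loss = sum_abs_residuals(x, y, intercept, slope)
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]


def fit_ols_line(x, y):
    """Closed-form least squares fit, for comparison with the median fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical data: roughly y = x, except for the last observation.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 40.0]
median_intercept, median_slope = fit_median_line(x, y)
ols_intercept, ols_slope = fit_ols_line(x, y)
print(median_slope, ols_slope)
```

The median line stays close to the bulk of the data, while the least squares slope is pulled up by the single outlier. The search costs O(n^3) operations, so it is only meant for toy samples.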
The Two-Sample Treatment-Control Model . . .
To interpret the meaning of $\beta_{1,0.5}$ we consider the case with only two levels of income
(the treatment), $X_0$ (low income) and $X_1$ (high income)
[Figure, two panels. Left, "C.d.f.": e.c.d.f. of food expenditure (0 to 2000) for low income and high income households. Right, "Quantiles": food expenditure (0 to 2000) against the quantile level (0 to 1) for the two groups]
Providing households with additional income may increase their food expenditure
differently depending on their propensity to consume or on their love for food
QR: interpretation (preview)
[Figure, two panels. Left, "P.d.f.": conditional distribution of food expenditure given income (percent, 0 to 30, against food expenditure, 0 to 2000) for high income and low income households. Right, "C.d.f.": e.c.d.f. of food expenditure (0 to 2000) for low income and high income households]
The quantile treatment effect (QTE) is the change in food expenditure required
to keep an individual at the same quantile when moving from the low income
(control) distribution $F(\cdot)$ to the high income (treated) distribution $G(\cdot)$,
i.e. the horizontal distance $\delta(x)$ between F and G: $F(x) = G(x + \delta(x))$
Doksum’s (1974) treatment effect function δ(τ)
$F(y)$: food expenditure c.d.f. when income is low
$G(y)$: food expenditure c.d.f. when income is high
$\delta(\tau) = G^{-1}(\tau) - F^{-1}(\tau)$, $0 < \tau < 1$
Taking $\tau = F(x)$ and changing variables: $\delta(x) = G^{-1}(F(x)) - x$
An (analog) estimate of $\delta(\tau)$ can be obtained
• by taking the difference of the sample quantiles; or
• through (parametric) quantile regression
This does not say that, for instance, an individual
who is at the τ-th quantile on the low income distribution
will be at the τ-th quantile on the high income distribution
after an increase in income
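The caveat can be made concrete with a deliberately extreme simulation. The sketch below assumes hypothetical potential outcomes for the same individuals (never jointly observed in real data) in which treatment reverses everybody's rank. The difference of sample quantiles is then close to zero at every τ, although almost every individual effect is far from zero. The names `quantile` and `qte` and all numbers are illustrative assumptions.

```python
import random


def quantile(sample, tau):
    """tau-th sample quantile: smallest value whose ecdf is >= tau."""
    ordered = sorted(sample)
    n = len(ordered)
    for k, value in enumerate(ordered, start=1):
        if k / n >= tau:
            return value
    return ordered[-1]


def qte(control, treated, tau):
    """Analog estimate of delta(tau): difference of sample quantiles."""
    return quantile(treated, tau) - quantile(control, tau)


rng = random.Random(1)
# Outcome without treatment, and outcome with treatment for the SAME person.
# y1 = -y0 keeps the (symmetric) distribution but reverses all ranks.
y0 = [rng.gauss(0.0, 1.0) for _ in range(5000)]
y1 = [-v for v in y0]

qte_by_tau = {tau: qte(y0, y1, tau) for tau in (0.25, 0.5, 0.75)}
individual_effects = [b - a for a, b in zip(y0, y1)]
largest_gain = max(individual_effects)
largest_loss = min(individual_effects)
print(qte_by_tau, largest_gain, largest_loss)
```

The QTE compares two marginal distributions. Reading it as the effect on a given individual requires an extra assumption such as rank preservation, which this example violates by construction.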
When Shall One Use Quantile Regression?
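The regression model is $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$, with iid errors $\varepsilon \sim F_\varepsilon(\cdot)$
Then $F_Y(y|x) = F_\varepsilon(y - \beta_0 - \beta_1 x)$ and $\mathrm{Quant}_Y(\tau|x) = \beta_0 + \beta_1 x + \mathrm{Quant}_\varepsilon(\tau)$
quantiles of food expenditure when income is low ($x = 0$):
$\mathrm{Quant}_Y(\tau|x=0) = \beta_0 + \mathrm{Quant}_\varepsilon(\tau) = F^{-1}(\tau)$ (the QR intercept)
quantiles of food expenditure when income is high ($x = 1$):
$\mathrm{Quant}_Y(\tau|x=1) = \beta_0 + \mathrm{Quant}_\varepsilon(\tau) + \beta_1 = G^{-1}(\tau)$
In both cases $\mathrm{Quant}_Y(\tau|x_i) = x_i'\beta(\tau)$, with
$x_i' \equiv [1 \; x_i]$, $\beta(\tau) \equiv [\beta_0 + \mathrm{Quant}_\varepsilon(\tau) \;\; \beta_1]'$
The regression model with iid errors is nested in the QR model $\mathrm{Quant}_Y(\tau|x_i) = x_i'\beta(\tau)$: it
restricts the effect of X to be constant at all quantiles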
Presenting QR results
Plot graphs of the coefficient estimates with confidence bounds:
y-axis: $\beta(\tau)$; x-axis: quantile $\tau$
Show the corresponding OLS coefficient estimate on the graph
Interpret the meaning (! need some caution)
Koenker & Hallock adapted from Tukey
“Never estimate intercepts, always estimate centercepts!”
Interpret the pattern (take location, location/scale models as reference)
Plot the estimated conditional quantile functions at
$x = \bar{x}$ to check for crossings
Presenting QR results: intercept & income coefficients . . .
[Figure, two panels: estimated intercept (vertical axis, 0 to 200) and estimated income coefficient (vertical axis, 0.40 to 0.70), each plotted against the quantile (.25, .5, .75)]
Presenting QR results: centercept & income coefficients
[Figure, two panels: panel labelled "Intercept" (vertical axis, 500 to 800) and panel labelled "res_income" (vertical axis, 0.30 to 0.70), each plotted against the quantile (.1, .25, .5, .75, .9)]
Some useful benchmark examples
to learn how to
interpret the pattern of
quantile regression coefficients
Treatment Effect is Location Shift 1/3
[Figure, location model, two panels. Left, "P.d.f.": conditional distribution of y (percent, 0 to 8, against y from 0 to 10) for control and treatment. Right, "C.d.f.": e.c.d.f. of y (0 to 10) for control and treatment]
Treatment Effect is Location Shift 2/3
[Figure, two panels. Left, "C.d.f.": e.c.d.f. of y (0 to 10) for control and treatment. Right, "Quantiles": quantile function $Q_y(\tau)$ (0 to 10) against $\tau$ (0 to 1) for control and treatment]
Treatment Effect is Location Shift 3/3
[Figure, two panels. Left, "Quantiles": quantile function $Q_y(\tau)$ (0 to 10) against $\tau$ (0 to 1) for control and treatment. Right, "QTE": estimated treatment coefficient (vertical axis, 3.30 to 3.70) against the quantile (.1, .25, .5, .75, .9)]
The regressor X only affects the location of the distribution of Y
Focusing on the impact of X on the conditional average is enough
The impact of X is the same across quantiles
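A small simulation, under assumed numbers, of what a pure location shift implies. The control group is drawn from a normal distribution and the treated group from the same distribution shifted by 3.5; the shift size, the normal shape, the seed and the sample sizes are all choices made for this sketch. The difference of sample quantiles should then be roughly the same at every τ and roughly equal to the difference in means.

```python
import random


def quantile(sample, tau):
    """tau-th sample quantile: smallest value whose ecdf is >= tau."""
    ordered = sorted(sample)
    n = len(ordered)
    for k, value in enumerate(ordered, start=1):
        if k / n >= tau:
            return value
    return ordered[-1]


rng = random.Random(2)
SHIFT = 3.5  # assumed size of the location shift

control = [rng.gauss(2.0, 1.0) for _ in range(4000)]
treated = [rng.gauss(2.0, 1.0) + SHIFT for _ in range(4000)]

taus = (0.1, 0.25, 0.5, 0.75, 0.9)
qte_by_tau = {tau: quantile(treated, tau) - quantile(control, tau) for tau in taus}
mean_effect = sum(treated) / len(treated) - sum(control) / len(control)
print(qte_by_tau, mean_effect)
```

Up to sampling noise the QTE profile is flat, which is the pattern shown in the QTE panel of the location shift example.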
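Case A: Income Affects only the Location of the Food Expenditure Distribution
[Figure "Location Effect Example": scatter plot of y (0 to 20) against x (0 to 10) with fitted conditional quantile lines for τ = 0.10, 0.50 and 0.90]
$y_i = \beta_0 + \beta_1 x_i + u_i$
$\mathrm{Quant}_Y(\tau|X) = \beta_\tau + \beta_1 X = \beta_0 + F_u^{-1}(\tau) + \beta_1 X$
$(\beta_\tau, \beta_1): \sum_{y_i \ge \beta_\tau + \beta_1 x_i} \tau |y_i - \beta_\tau - \beta_1 x_i| + \sum_{y_i < \beta_\tau + \beta_1 x_i} (1-\tau) |y_i - \beta_\tau - \beta_1 x_i| = \min$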
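The weighted absolute loss in the last formula can be tried out in the simplest case, a model with an intercept only, where its minimiser should be the τ-th sample quantile. The sketch below uses a made-up sample with an odd number of observations so that the minimiser is unique; `check_loss`, `objective` and `argmin_over_sample` are illustrative names.

```python
def check_loss(residual, tau):
    """tau * |r| for r >= 0 and (1 - tau) * |r| for r < 0."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def objective(sample, q, tau):
    """Sum of check losses when every observation is predicted by q."""
    return sum(check_loss(y - q, tau) for y in sample)


def argmin_over_sample(sample, tau):
    """The objective is piecewise linear, so a minimiser is an observed value."""
    return min(sorted(sample), key=lambda q: objective(sample, q, tau))


# Hypothetical data, chosen only for illustration (9 observations).
sample = [2.0, -1.0, 0.5, 3.5, 0.0, 1.5, -2.5, 1.0, 4.0]
best_median = argmin_over_sample(sample, 0.5)
best_lower_quartile = argmin_over_sample(sample, 0.25)
print(best_median, best_lower_quartile)
```

Here the minimisers are the 5th and the 3rd smallest observations, i.e. the sample median and the sample 25th percentile.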
Treatment Effect is a Location & Scale Shift 1/3
[Figure, location and scale model, two panels. Left, "P.d.f.": conditional distribution of y (percent, 0 to 8, against y from -10 to 20) for treatment and control. Right, "C.d.f.": e.c.d.f. of y (-10 to 20) for control and treatment]
Treatment Effect is a Location & Scale Shift 2/3
[Figure, two panels. Left, "C.d.f.": e.c.d.f. of y (-10 to 20) for control and treatment. Right, "Quantiles": quantile function of y (-10 to 20) against $\tau$ (0 to 1) for control and treatment]
Treatment Effect is a Location & Scale Shift 3/3
[Figure, two panels. Left, "Quantiles": quantile function of y (-10 to 20) against $\tau$ (0 to 1) for control and treatment. Right, "QTE": estimated treatment coefficient (vertical axis, -2.00 to 8.00) against the quantile (.1, .25, .5, .75, .9)]
The regressor X only affects the location & scale of the distribution of Y
Focusing on the impact of X on the conditional average is not enough
The impact of X differs across quantiles
Case B: Income Affects Location & Scale of the Food Exp. Distribution
[Figure "Location-Scale Effect Example": scatter plot of y (0 to 800) against x (0 to 10) with fitted conditional quantile lines for τ = 0.10, 0.50 and 0.90]
$y_i = \beta_0 + \beta_1 x_i + \beta_2(x_i) u_i$, with $\beta_2(x) > 0$
$\mathrm{Quant}_Y(\tau|X = x) = \beta_0 + \beta_1 x + \beta_2(x) F_u^{-1}(\tau)$
intuition: the rescaled r.v. $u = \dfrac{Y - \beta_0 - \beta_1 x}{\beta_2(x)}$
is distributed independently of X
Case B: Income Affects Location & Scale of the Food Exp. Distribution
[Figure "Location-Scale Effect Example": scatter plot of y (0 to 800) against x (0 to 10) with fitted conditional quantile lines for τ = 0.10, 0.50, 0.70 and 0.90]
$y_i = \beta_0 + \beta_1 x_i + \beta_2(x_i) u_i$
$\mathrm{Quant}_Y(\tau|X = x) = \beta_{0,\tau} + \beta_{1,\tau} x = x'\beta(\tau)$
$\beta(\tau): \sum_{y_i - x_i'\beta(\tau) \ge 0} \tau |y_i - x_i'\beta(\tau)| + \sum_{y_i - x_i'\beta(\tau) < 0} (1-\tau) |y_i - x_i'\beta(\tau)| = \min$
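A short worked derivation of why the conditional quantiles remain linear in x in this model. It rests on an additional assumption that is not stated on the slide: the scale function is linear and positive, $\beta_2(x) = \gamma_0 + \gamma_1 x > 0$, and u is independent of x. The symbols $\gamma_0$ and $\gamma_1$ are introduced here only for illustration.

```latex
\begin{align*}
y_i &= \beta_0 + \beta_1 x_i + (\gamma_0 + \gamma_1 x_i)\,u_i,
      \qquad u_i \text{ iid with c.d.f. } F_u,\\
\mathrm{Quant}_Y(\tau \mid x)
    &= \beta_0 + \beta_1 x + (\gamma_0 + \gamma_1 x)\,F_u^{-1}(\tau)\\
    &= \underbrace{\beta_0 + \gamma_0 F_u^{-1}(\tau)}_{\beta_{0,\tau}}
     + \underbrace{\bigl(\beta_1 + \gamma_1 F_u^{-1}(\tau)\bigr)}_{\beta_{1,\tau}}\,x .
\end{align*}
```

Both coefficients now vary with τ, and the slope increases with τ whenever $\gamma_1 > 0$. Setting $\gamma_1 = 0$ gives back the pure location model, where only the intercept depends on τ.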
(No) Crossings
$\mathrm{Quant}_Y(\tau|x_i) = \beta_0(\tau) + \beta_1(\tau) x_i = x_i'\beta(\tau)$
with $F^{-1}(\tau) = \beta_0(\tau)$ at $x = 0$ and $G^{-1}(\tau) = \beta_0(\tau) + \beta_1(\tau)$ at $x = 1$
$x_i' \equiv [1 \; x_i]$, $\beta(\tau) \equiv [\beta_0(\tau) \;\; \beta_1(\tau)]'$
!! $Y|X = x$ can be simulated by setting $y = x'\beta(U)$, $U \sim U(0,1)$
Crucial: all the coordinates in $\beta(U)$ are determined by a single draw from $U(0,1)$
Percentiles are ordered. Implicit in the formulation $\mathrm{Quant}_Y(\tau|x_i) = x_i'\beta(\tau)$ is the
requirement that $\mathrm{Quant}_Y(\tau|x)$ is monotone increasing in $\tau$, $\forall x$.
Crossings may be observed for extreme values of x.
At $x = \bar{x}$, quantiles should not cross. If conditional quantile functions cross at a
significant number of points, this can be interpreted as evidence of model
misspecification.
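The simulation recipe and the crossing check can be sketched as follows. The coefficient functions in `beta` are hypothetical (a location-scale model with standard normal errors, chosen so that the linear specification breaks down for x below -2); they are not estimates from the lecture. `simulate` uses one uniform draw per observation to fix both coordinates of β(U), and `crossings` reports adjacent quantile levels whose fitted values are out of order at a given x.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def beta(tau):
    """Hypothetical coefficient functions (intercept, slope) at level tau."""
    z = STD_NORMAL.inv_cdf(tau)
    return 1.0 + 1.0 * z, 2.0 + 0.5 * z


def cond_quantile(x, tau):
    """Fitted conditional quantile x'beta(tau) for a scalar regressor x."""
    b0, b1 = beta(tau)
    return b0 + b1 * x


def simulate(x, n, rng):
    """Draw from Y | X = x: a single uniform draw fixes all of beta(U)."""
    draws = []
    for _ in range(n):
        u = rng.random()
        while u == 0.0:  # inv_cdf needs 0 < u < 1
            u = rng.random()
        draws.append(cond_quantile(x, u))
    return draws


def crossings(x, taus):
    """Adjacent pairs of levels whose fitted quantiles decrease at x."""
    values = [cond_quantile(x, t) for t in taus]
    return [(taus[k], taus[k + 1]) for k in range(len(taus) - 1)
            if values[k] > values[k + 1]]


taus = [0.1, 0.25, 0.5, 0.75, 0.9]
no_crossing = crossings(1.0, taus)   # central value of x
extreme_x = crossings(-4.0, taus)    # far outside the admissible range
sample_mean = sum(simulate(1.0, 5000, random.Random(3))) / 5000
print(no_crossing, extreme_x, sample_mean)
```

With these assumed coefficients the fitted quantile at x is 1 + 2x + (1 + 0.5x) z, so the ordering holds for x > -2 and is reversed at x = -4, which mirrors the remark that crossings show up at extreme values of x.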
Treatment Effect is a Shape Change 1/3
[Figure, change in shape, two panels. Left, "P.d.f.": conditional distribution of y (percent, 0 to 50, against y from -20 to 20) for treatment and control. Right, "C.d.f.": e.c.d.f. of y (-20 to 20) for control and treatment]
Treatment Effect is a Shape Change 2/3
[Figure, two panels. Left, "C.d.f.": e.c.d.f. of y (-20 to 20) for control and treatment. Right, "Quantiles": quantile function of y (-20 to 20) against $\tau$ (0 to 1) for control and treatment]
Treatment Effect is a Shape Change 3/3
[Figure, two panels. Left, "Quantiles": quantile function of y (-20 to 20) against $\tau$ (0 to 1) for control and treatment. Right, "QTE": estimated treatment coefficient (vertical axis, 0.50 to 3.00) against the quantile (.1, .25, .5, .75, .9)]
The regressor X only affects the shape of the distribution of Y
Focusing on the impact of X on the conditional average is not enough
The impact of X differs across quantiles
Narrower spacing of the lower quantiles indicates higher density and short lower tail
Wider spacing of the upper quantiles indicates lower density and long upper tail
Summary: Interpretation of the Pattern of QR coefficients
[Figure, three panels, each showing the estimated treatment coefficient against the quantile (.1, .25, .5, .75, .9). "Location Shift": vertical axis 3.30 to 3.70. "Location-Scale Shift": vertical axis -2.00 to 8.00. "Shape Change": vertical axis 0.50 to 3.00]
See also Fig. 2.9 in Koenker (2005)
Intuition on the Estimation of QR coefficients
To estimate the parameters for the τ-th quantile, we seek the values of the parameters such that we have a fraction τ of negative and a fraction (1 − τ) of positive residuals; e.g. for the median, we need to have as many positive as negative residuals
Robustness: an outlier observation matters for estimation only if it changes the sign of the residual
[Figure: left panel (Median), symmetric loss function plotted against the residual; right panel (25th percentile), asymmetric loss function plotted against the residual]
→ The minimization problem can be solved by linear programming
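The intuition above can be checked on simulated data. The sketch below does not use a linear programming solver: for a single sample (intercept-only model) it simply evaluates the summed check loss at every data point and takes the minimizer, which is enough to see the sign-counting property and the robustness to outliers. Function names and the simulated sample are assumptions for the example.

```python
import numpy as np

def check_loss(r, tau):
    """rho_tau(r) = r * (tau - 1{r < 0}); tau = 0.5 gives half the absolute value."""
    r = np.asarray(r, dtype=float)
    return r * (tau - (r < 0))

def sample_quantile(y, tau):
    """Minimise the summed check loss over candidate values taken from the data."""
    losses = [check_loss(y - q, tau).sum() for q in y]
    return y[int(np.argmin(losses))]

rng = np.random.default_rng(2)
y = rng.normal(size=1001)
tau = 0.25

q_hat = sample_quantile(y, tau)
share_negative = np.mean(y - q_hat < 0)    # close to tau
share_positive = np.mean(y - q_hat > 0)    # close to 1 - tau

# Robustness: pushing the largest observation further out keeps its residual
# positive, so the estimate does not move
y_outlier = y.copy()
y_outlier[np.argmax(y)] = 1e6
q_hat_outlier = sample_quantile(y_outlier, tau)
print(q_hat, share_negative, share_positive, q_hat_outlier)
```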
Properties of QR estimator: Equivariance
It guarantees a coherent interpretation of the results when the data or the
underlying model are modified in a non-essential way.
Scale equivariance
For any a > 0, β̂(τ; ay, X) = a β̂(τ; y, X) and
β̂(τ; −ay, X) = −a β̂(1 − τ; y, X)
Regression shift
For any γ ∈ R^p, β̂(τ; y + Xγ, X) = β̂(τ; y, X) + γ
Reparametrization of design
For any nonsingular p × p matrix A (|A| ≠ 0), β̂(τ; y, XA) = A⁻¹ β̂(τ; y, X)
The analogous equivariance properties hold in mean regression
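These properties can be verified numerically on a tiny sample. The solver below is a deliberately naive stand-in, not a production routine: it exploits the fact that a QR solution interpolates p observations and simply tries every subset of p points. The data, the constants `a` and `gamma`, and the matrix `A` are assumptions made for the example.

```python
import itertools
import numpy as np

def qr_fit(y, X, tau):
    """Exact QR fit by brute force over all subsets of p points (tiny samples only)."""
    n, p = X.shape
    best_loss, best_b = np.inf, None
    for idx in itertools.combinations(range(n), p):
        rows = list(idx)
        if abs(np.linalg.det(X[rows])) < 1e-10:
            continue                          # these points do not pin down a unique fit
        b = np.linalg.solve(X[rows], y[rows])
        r = y - X @ b
        loss = np.sum(r * (tau - (r < 0)))    # summed check loss
        if loss < best_loss:
            best_loss, best_b = loss, b
    return best_b

rng = np.random.default_rng(3)
n, tau, a = 25, 0.3, 2.5
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)
gamma = np.array([0.5, -1.0])
A = np.array([[1.0, 0.5], [0.0, 2.0]])        # nonsingular

b = qr_fit(y, X, tau)
scale_ok = np.allclose(qr_fit(a * y, X, tau), a * b)
flip_ok = np.allclose(qr_fit(-a * y, X, tau), -a * qr_fit(y, X, 1 - tau))
shift_ok = np.allclose(qr_fit(y + X @ gamma, X, tau), b + gamma)
design_ok = np.allclose(qr_fit(y, X @ A, tau), np.linalg.solve(A, b))
print(scale_ok, flip_ok, shift_ok, design_ok)
```

Note the sign in the second scale property: multiplying y by a negative constant maps the τ-th fit into minus the (1 − τ)-th fit.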
Equivariance to Monotone Transformations
For any monotone (nondecreasing) function h(·), conditional quantile functions are equivariant
Quant_{h(Y)}(τ | x) = h(Quant_Y(τ | x))
i.e. the quantiles of the transformed variable are simply the transformed quantiles
of the original variable
The analogous property does not hold in mean regression: in general E[h(Y) | x] ≠ h(E[Y | x])
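A quick numerical check, with h = log and a simulated positive outcome (both assumptions for the example). The sample quantile is taken as an order statistic, without interpolation, so that the equality holds exactly in the sample.

```python
import numpy as np

def order_stat_quantile(y, tau):
    """tau-th sample quantile as an order statistic (no interpolation)."""
    y_sorted = np.sort(y)
    k = int(np.ceil(tau * len(y))) - 1
    return y_sorted[max(k, 0)]

rng = np.random.default_rng(4)
y = np.exp(rng.normal(size=5001))     # positive, skewed outcome
tau = 0.75

q_of_log = order_stat_quantile(np.log(y), tau)
log_of_q = np.log(order_stat_quantile(y, tau))    # same number as q_of_log

mean_of_log = np.mean(np.log(y))
log_of_mean = np.log(np.mean(y))                  # larger, by Jensen's inequality
print(q_of_log, log_of_q, mean_of_log, log_of_mean)
```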
Every Serious Estimate Deserves a Standard Error
1 Finite-sample distribution of the estimator: limited applicability in practice
2 Asymptotic distribution of the estimator:
$\sqrt{n}\,\big(\hat\beta(\tau) - \beta(\tau)\big) \sim N\big(0,\ \tau(1-\tau)\, H_n^{-1} J_n H_n^{-1}\big)$
$J_n = n^{-1} \sum_{i=1}^{n} x_i x_i' = n^{-1} X'X$, $\quad H_n(\tau) = \lim_{n\to\infty} n^{-1} \sum_{i=1}^{n} x_i x_i'\, f_i(\xi_i(\tau))$
$f_i(\xi_i(\tau))$ is the conditional density of the response evaluated at the τ-th conditional quantile
remark 1: the factor τ(1 − τ) tends to make quantiles more precise in the tails;
remark 2: the factor H_n(τ) tends to make quantiles less precise in regions of low density; this effect typically dominates;
remark 3: nonparametric estimation of the density (the sparsity parameter or quantile-density function) is required
3 (Some form of) resampling (bootstrap)
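For the simplest case (intercept only, so that x_i = 1, J_n = 1 and H_n = f(ξ(τ))), the asymptotic variance reduces to τ(1 − τ)/f(ξ(τ))². The sketch below evaluates it for a standard normal response, an assumption made for the example, and compares the implied standard error of the median with a plain nonparametric bootstrap.

```python
from statistics import NormalDist
import numpy as np

std_normal = NormalDist()

def asymptotic_variance(tau):
    """tau(1 - tau) / f(xi_tau)^2 for the sample quantile of a standard normal."""
    xi = std_normal.inv_cdf(tau)
    return tau * (1.0 - tau) / std_normal.pdf(xi) ** 2

avar_median = asymptotic_variance(0.50)
avar_tail = asymptotic_variance(0.95)     # larger: the low density dominates tau(1 - tau)

# Bootstrap standard error of the sample median
rng = np.random.default_rng(5)
n, n_boot = 2000, 1000
y = rng.normal(size=n)
draws = rng.integers(0, n, size=(n_boot, n))    # resample indices with replacement
boot_medians = np.median(y[draws], axis=1)
se_boot = boot_medians.std(ddof=1)
se_asym = np.sqrt(avar_median / n)
print(avar_median, avar_tail, se_boot, se_asym)
```

Here the true density is known; in applications it has to be estimated (remark 3), which is one reason why resampling is popular.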
Few Words on Testing
The results on the distribution of a single vector of QR parameters can be
extended to derive the asymptotic covariance matrix for distinct quantiles, which
makes it possible to compare estimates of the slope coefficients across quantiles
A test of the location-shift model or of symmetry can be described as a
test of linear restrictions between the coefficients of a regressor at different
quantiles → Wald tests can be constructed
Other approaches: quantile likelihood ratio tests, rank-based inference, . . .
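A minimal sketch of such a test for a binary regressor, where the QR slope at τ equals the difference between the group quantiles. Note the substitution: instead of the asymptotic covariance matrix mentioned above, the variance of the contrast is estimated by a pairs bootstrap. Function names, the simulated designs and the number of replications are assumptions for the example.

```python
import numpy as np

def qte(y, x, tau):
    """QR slope on a binary regressor = difference of group quantiles."""
    return np.quantile(y[x == 1], tau) - np.quantile(y[x == 0], tau)

def wald_equal_slopes(y, x, tau_lo, tau_hi, n_boot=500, seed=0):
    """Wald statistic for H0: beta(tau_lo) = beta(tau_hi), with bootstrap variance."""
    rng = np.random.default_rng(seed)
    diff = qte(y, x, tau_hi) - qte(y, x, tau_lo)
    n = len(y)
    boot = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)     # resample (y, x) pairs
        boot[b] = qte(y[idx], x[idx], tau_hi) - qte(y[idx], x[idx], tau_lo)
    return diff ** 2 / boot.var(ddof=1)

rng = np.random.default_rng(6)
n = 4000
x = rng.integers(0, 2, size=n)
noise = rng.normal(size=n)
y_location = 1.0 * x + noise                 # pure location shift: H0 holds
y_scale = 1.0 * x + (1.0 + x) * noise        # location-scale shift: H0 fails

w_location = wald_equal_slopes(y_location, x, 0.25, 0.75)
w_scale = wald_equal_slopes(y_scale, x, 0.25, 0.75)
chi2_1_crit_95 = 3.84                        # 95% critical value, chi-square with 1 d.o.f.
print(w_location, w_scale)
```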
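Discrete or Continuous Treatment
Binary case: $y_i = y_{iC}\,(1 - x_i) + y_{iT}\, x_i = y_{iC} + (y_{iT} - y_{iC})\, x_i$
$Q_Y(\tau) \equiv F_{y_C}^{-1}(\tau)\,(1 - x) + F_{y_T}^{-1}(\tau)\, x$
$= F_{y_C}^{-1}(\tau) + \big(F_{y_T}^{-1}(\tau) - F_{y_C}^{-1}(\tau)\big)\, x$
$= \alpha(\tau) + \beta(\tau)\, x$
Discrete case, p treatments: $y_i = y_{iC} + \sum_{j=1}^{p} (y_{ij} - y_{iC})\, x_{ij} = y_{iC} + \sum_{j=1}^{p} \delta_{ij}\, x_{ij}$
$Q_Y(\tau) \equiv F_{y_C}^{-1}(\tau) + \sum_{j} \big(F_{y_j}^{-1}(\tau) - F_{y_C}^{-1}(\tau)\big)\, x_j$
$= \alpha(\tau) + \sum_{j} \delta_j(\tau)\, x_j$
Continuous case: $Q_Y(\tau) = \alpha(\tau) + \gamma(\tau)\, x$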
Stata Example: Food Expenditure (Y) and Income (X)
QTE of Jobs First on Earnings
from Bitler et al. (AER, 2006)
1000 THE AMERICAN ECONOMIC REVIEW SEPTEMBER 2006
[Figure: quarterly impact (vertical axis, from -600 to 1,000) plotted against the quantile (horizontal axis, 10 to 90)]
FIGURE 3. QUANTILE TREATMENT EFFECTS ON THE DISTRIBUTION OF EARNINGS, QUARTERS 1-7
Notes: Solid line is QTE; dotted lines provide bootstrapped 90-percent confidence intervals; dashed line is mean impact; all statistics computed using inverse propensity-score weighting. See text for more details.
group over the first seven quarters and 55 percent of corresponding AFDC group person-quarters. For quantiles 49-82, Jobs First group earnings are greater than control group earnings, yielding positive QTE estimates. Between quantiles 83 and 87, earnings are again equal (though non-zero). Finally, for quantiles 88-97, AFDC group earnings exceed Jobs First group earnings, yielding negative QTE estimates. The only quantile having a statistically significant QTE based on a two-sided test is the ninety-second; for all other quantiles between 89 and 96, the two-sided QTE confidence intervals include zero in the confidence interval. On the other hand, one-sided tests yield p-values of 0.10 or lower for all QTE in the 90-95 quantile range.[16] These results are what basic labor supply theory, discussed above, predicts. That is, the QTE at the low end are zero, they rise, and then they eventually become negative (if imprecisely estimated). The negative effects at the top of the earnings distribution are particularly interesting given that they have typically not been found in other programs (e.g., Nada Eissa and Jeffrey B. Liebman's, 1996, study of the EITC).
The variation in Jobs First's impact across the quantiles of the distributions appears unmistakably significant, both statistically and substantively; these results suggest that the mean treatment effect is far from sufficient to characterize Jobs First's effects on earnings.[17]
Figure 4 plots the earnings QTE results in quarters 8-16, after the time limit takes effect for at least some women. For the first 76 quantiles, these results are broadly similar to those for the pre-time limit period (though they have a somewhat wider range and become positive slightly earlier). For quantiles 77-97, we again find negative treatment effects (with a few being zero), but none of them is individually signifi-
[16] To test whether these QTE estimates are jointly significantly negative, we carry out two sets of tests. Details are somewhat complicated, so we relegate them to Appendix B. Our basic conclusion, however, is that there is some marginal evidence that these QTE are jointly different from zero.
[17] Under the null of constant treatment effects, all QTE must equal the mean treatment effect. This null can be rejected decisively simply by noting the large fraction of the treatment group earnings distribution having zero earnings (Heckman et al., 1997, make a similar point regarding treatment effects of job training). We did conduct more formal tests for the null that the $800 (= $500 - (-$300)) range of the estimated QTE could have been generated under the null that all quantiles of the Jobs First distribution equal the mean treatment effect plus the corresponding quantiles of the AFDC distribution. These tests, which impose the null by using paired bootstrap sample draws from the AFDC group sample and then adding the mean treatment effect to each sample quantile in one of the pairs, soundly reject the equality of the QTEs.
References (those that fit in the slide . . .)
Abrevaya (2001) ‘The Effects of Demographics and Maternal Behavior on the Distribution of Birth Outcomes’, Empirical Economics, pp. 247-257
Brunello, Fort, Weber (2009) ‘Changes in Compulsory Schooling, Education and the Distribution of Wages in Europe’, Economic Journal 110, pp. 516-539
Bitler, Gelbach, Hoynes (2006) ‘What Mean Impacts Miss: Distributional Effects of Welfare Reform Experiments’, The American Economic Review 96(4), pp. 988-1012
Buchinsky (1998) ‘Recent Advances in Quantile Regression Models: A Practical Guideline for Empirical Research’, The Journal of Human Resources 33(1), pp. 88-126
Koenker & Hallock (2001) ‘Quantile Regression’, Journal of Economic Perspectives 15(4), pp. 143-156
Maynard, Qiu (2009) ‘Public Insurance and Private Savings: Who Is Affected and By How Much?’, Journal of Applied Econometrics 24, pp. 282-308
Martins et al. (2004) ‘Does Education Reduce Wage Inequality? Quantile Regression Evidence from 16 Countries’, Labour Economics 11, pp. 355-371
Melly, Puhani (forth.) ‘Do Public Ownership and Lack of Competition Matter for Wages and Employment? Evidence from Personnel Records of a Privatized Firm’, Journal of the European Economic Association
Koenker (2005) Quantile Regression, Chapters 1, 2 and 6 (covered partially)
Ozkan, Ozbeklik (2014) ‘Who Benefits From Job Corps? A Distributional Analysis of An Active Labour Market Program’, Journal of Applied Econometrics 24, pp. 282-308
Wrap Up on the
Fundamentals of Quantile Regression (QR)
before Moving to the Discussion on
Identification Strategies in QR
that Exploit Instrumental Variation
Today’s Running Example (Keep It in Mind To Avoid Getting Lost)
“The school is a promising place to increase the skills and incomes of individuals. As a result,
educational policies have the potential to decrease existing, and growing, inequalities in income”
Ashenfelter et al. (2001)
Does education reduce wage inequality? I.e., are the returns to education
homogeneous over the wage distribution?
Policy relevance:
schooling can be a powerful tool to combat inequality
schooling can reduce differences due to genetic & environmental factors
Similar questions may be asked for training programs
Related papers:
Martins et al. (LE, 2004): assumes exogeneity
Chernozhukov et al. (2006): IVQTE model; addresses endogeneity issues √
Abadie et al. (2002) (training): LATE-QTE model; addresses endogeneity issues √
Brunello et al. (EJ, 2009): causal chain model; addresses endogeneity issues √
Interpretation of QTE
x: treatment (education); y: outcome (wages)
Q_Y(τ) = α(τ) + δ(τ)x
We may interpret τ as a latent characteristic
E.g.: τ is an unobserved factor that determines wages (“ability”)
the QTE can be interpreted as “an interaction effect” between the treatment and unobserved
“ability”/“propensity to earn a high wage”
Without further assumptions, “there is no way of knowing whether the
treatment actually operates in the manner described by δ(τ). In fact, the
treatment may miraculously make weak subjects especially robust and turn
the strong into a jello. All we can observe from experimental evidence
however is the difference in the two marginal(s)(. . .). This is what the
quantile treatment effect does” (Koenker, 2005)
With Randomization, We Identify Marginal cdf (& QTE)
Randomization Does Not Help to Learn the Distribution of the Impact
We Cannot Retrieve Joint Distributions From Marginals w/o Additional Assumptions
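The point can be illustrated with two hypothetical couplings of the same pair of marginal distributions. In the sketch below (simulated normal outcomes, an assumption for the example), potential outcomes are paired either rank by rank or in reverse rank order: the marginals, the mean impact and the QTE are identical in the two cases, but the distribution of the individual impacts is very different.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5000
y0 = np.sort(rng.normal(0.0, 1.0, n))    # control outcomes, sorted
y1 = np.sort(rng.normal(1.0, 1.0, n))    # treated outcomes, sorted

# Two hypothetical couplings of the SAME marginals
impact_rank_preserving = y1 - y0          # best control outcome paired with best treated outcome
impact_rank_reversing = y1[::-1] - y0     # best control outcome paired with worst treated outcome

taus = np.linspace(0.1, 0.9, 9)
qte = np.quantile(y1, taus) - np.quantile(y0, taus)   # depends on the marginals only

mean_preserving = impact_rank_preserving.mean()
mean_reversing = impact_rank_reversing.mean()
spread_preserving = impact_rank_preserving.std()
spread_reversing = impact_rank_reversing.std()
print(qte.round(2), spread_preserving, spread_reversing)
```

Experimental data cannot tell the two couplings apart; choosing between them requires an assumption such as rank invariance.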
Martins and Pereira (2004)
Table 1
Data-sets description, descriptive statistics and inequality measures
Country | Data set | Year | No. of observations | Educ. | Exp. | Log wage: Mean | Log wage: C.V. | Wage: 10th percentile | Wage: 50th percentile | Wage: 90th percentile | Wage ratios (1): 9/1 | 9/5 | 5/1
Austria | Mikrozensus | 1993 | 7175 | 10.1 | 21.3 | 4.57 | 0.077 | 65.8 | 93.8 | 150 | 2.28 | 1.6 | 1.43
Denmark | Long. Lab. Market Reg. | 1995 | 4416 | 12 | 19.4 | 4.97 | 0.072 | 96.5 | 138.4 | 230.4 | 2.39 | 1.67 | 1.43
Finland | Labour Force Survey | 1993 | 1175 | 11.4 | 19.5 | 4.16 | 0.091 | 41.9 | 62.1 | 106.1 | 2.53 | 1.71 | 1.48
France | Training Qualif. + Employment Survey | 1993 | 4606 | 11.4 | 21.9 | 10.92 | 0.036 | 19.8 | 29.8 | 54.1 | 2.73 | 1.81 | 1.5
Germany | Socio-Economic Panel | 1995 | 1070 | 11.9 | 24.7 | 3.4 | 0.103 | 2.64 | 2.92 | 3.01 | 1.45 | 1.09 | 1.33
Greece | Household Budget Survey | 1994 | 2096 | 10.1 | 21.9 | 6.93 | 0.092 | 527 | 1103 | 1907 | 3.62 | 1.73 | 2.09
Ireland | ESRI Household Survey | 1994 | 1903 | 12.4 | 23.8 | 1.74 | 0.351 | 2.5 | 5.9 | 11.9 | 4.74 | 2.01 | 2.36
Italy | Survey of Household Income and Wealth | 1995 | 3441 | 10.1 | 22.9 | 2.52 | 0.163 | 7.8 | 12.5 | 20.8 | 2.67 | 1.67 | 1.6
Netherlands | Structure of Earnings Survey | 1996 | 49805 | 12.5 | 20 | 3.23 | 0.142 | 15.5 | 24.9 | 43.8 | 2.83 | 1.75 | 1.61
Norway | Level of Living Survey | 1995 | 870 | 12.2 | 20.9 | 4.65 | 0.071 | 71.4 | 101.1 | 158 | 2.21 | 1.56 | 1.42
Portugal | Personnel Records | 1995 | 28055 | 6.5 | 24.5 | 6.42 | 0.095 | 318 | 531 | 1456 | 4.58 | 2.74 | 1.67
Spain | Wage Structure Survey | 1995 | 118005 | 8.8 | 26 | 7.3 | 0.071 | 761 | 1410 | 2999 | 3.94 | 2.13 | 1.85
Sweden | Level of Living Surveys | 1991 | 1508 | 11.8 | 21.5 | 4.45 | 0.070 | 61 | 81 | 127 | 2.08 | 1.57 | 1.33
Switzerland | Labour Force Survey | 1995 | 6334 | 13.2 | 19.8 | 3.6 | 0.111 | 23.9 | 35.9 | 60.3 | 2.53 | 1.68 | 1.51
UK | Family Expenditures Survey | 1995 | 2183 | 12.3 | 22.6 | 2 | 0.245 | 4.1 | 7.3 | 13.5 | 3.33 | 1.85 | 1.8
USA | Current Population Survey | 1995 | 42347 | 12.6 | 18.5 | 2.33 | 0.202 | 5.5 | 10 | 19 | 3.45 | 1.82 | 1.9
See Appendix A for a more detailed characterisation of the data sets.
Results for France and Spain refer to yearly earnings. Hourly wages for France and Spain were computed assuming 1760 h/year. (1) Inequality figures (1, 5, 9) refer to 10th, 50th and 90th percentiles.
Source: P.S. Martins, P.T. Pereira / Labour Economics 11 (2004) 355-371, p. 358
Data on (gross) hourly wages of full-time male workers; net wages for Austria, Greece, Italy
Martins and Pereira (2004): Evidence
[Figure: Fig. 2. Returns to education, QR and OLS. Source: P.S. Martins, P.T. Pereira / Labour Economics 11 (2004) 355-371, p. 362]
Martins and Pereira (2004): Evidence
distribution (ninth-fifth deciles), the exceptions being Germany, Greece, Ireland and the US.[6]
4. Empirical results
The empirical results were obtained by regressing the following version of the Mincer (1974) equation, under Becker’s (1975) framework:
$\log y_i = \alpha_\theta + \beta_\theta\, educ_i + \delta_{\theta 1}\, exp_i + \delta_{\theta 2}\, exp_i^2 + u_i,$
where i = 1, . . ., N (N being the number of observations for each year), θ = 0.1, 0.2, . . ., 0.9 is the quantile being analysed, y is the hourly wage, educ is the number of schooling
[Figure: Fig. 2 (continued)]
[6] These results are generally in accordance with those presented at Gottschalk and Smeeding (1997). However, a thorough comparison is impossible as both the time period and the earnings measure covered there are different.
Source: P.S. Martins, P.T. Pereira / Labour Economics 11 (2004) 355-371, p. 363
Martins and Pereira (2004): Discussion
Evidence of increasing returns over the conditional wage distribution
Interpretation: more skilled workers receive higher returns to education;
additional schooling may increase within-group wage inequality (!)
Why?
Over-education: extension of the lower tail of the wage
distribution of the highly educated
Non-trivial interaction between schooling & ability: differences in ability are relevant for the
highly educated but less so for the low educated (less dispersion)
Endogeneity: unobserved factors that impact upon pay differentials and
are heterogeneous across workers with any given skill level
Chernozhukov and Hansen (2006): Evidence
provided by formal education.[23] Interpreting the quantile index τ as indexing ability, these results are also consistent with a simple model in which individuals acquire education up to the point where the cost equals the rate of return and cost depends negatively on ability.[24] In this case, we would expect the returns to schooling to be
[Figure: left panel, IV-QR: Schooling; right panel, QR: Schooling; coefficient estimates plotted against the quantile index]
Fig. 1. The sample size is 329,509. Coefficient estimates are on the vertical axis, while the quantile index is on the horizontal axis. The shaded region is the 95% confidence band estimated using robust standard errors. The left panel contains estimates of the returns to schooling obtained through instrumental variables quantile regression, and the right panel presents estimates of the effect of years of schooling on earnings obtained through standard quantile regression. For comparison, the dashed line in the first panel plots the schooling coefficient estimated through standard quantile regression. All estimates were computed at 0.05 unit intervals for τ ∈ [0.05, 0.95].
Table 1
Process tests for the earning equation. Subsample size = 5n^{2/5}
Null hypothesis | Kolmogorov-Smirnov statistic | 90% Critical value | 95% Critical value
No effect: α(·) = 0 | 4.563 | 2.572 | 2.935
Constant effect: α(·) = α | 2.630 | 2.442 | 2.658
Dominance: α(·) ≥ 0 | 0.000 | 2.185 | 2.549
Exogeneity: α(·) = α_QR(·) | 2.510 | 2.465 | 2.721
[23] The term “ability” is used to characterize the unobserved component of earnings, which likely captures elements of ability and motivation as well as noise.
[24] See, for example, Card (1999).
Source: V. Chernozhukov, C. Hansen / Journal of Econometrics 132 (2006) 491-525, p. 512
Education affects wages
Is it a location-shift effect? No (QTE heterogeneity)
Is the effect unambiguously beneficial? No
Can we reject exogeneity? Yes
How do C & H get at the (causal) QTE for the US?
They use an instrumental variable
How? We get there now
[Figure: left panel, E.c.d.f. F(y); right panel, quantile function q(τ) ≡ F⁻¹(τ)]
Recalling two known results
1. For any known c.d.f. F(·), let U ~ U(0, 1) and Y = F⁻¹(U); then F_Y(y) = F(y) ∀y ∈ R
2. For any continuous r.v. Y with c.d.f. F_Y(·), let U = F_Y(Y); then U ~ U(0, 1)
Y = q(D, U_D), U_D | D ~ U(0, 1)
τ → q(d, τ) is the (structural) conditional quantile function
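Both results are easy to check by simulation. The unit exponential distribution used below is an assumption chosen because its c.d.f. and quantile function have closed forms.

```python
import numpy as np

def F(y):
    """c.d.f. of the unit exponential (chosen only for illustration)."""
    return 1.0 - np.exp(-y)

def F_inv(u):
    """Quantile function of the unit exponential."""
    return -np.log1p(-u)

rng = np.random.default_rng(8)
n = 20_000
grid = np.linspace(0.05, 0.95, 19)

# Result 1: Y = F^{-1}(U) has c.d.f. F
u = rng.uniform(size=n)
y = F_inv(u)
ecdf_at_quantiles = np.array([np.mean(y <= F_inv(t)) for t in grid])
gap_1 = np.max(np.abs(ecdf_at_quantiles - grid))

# Result 2: U = F_Y(Y) is uniform on (0, 1)
y_new = rng.exponential(scale=1.0, size=n)
v = F(y_new)
gap_2 = np.max(np.abs(np.array([np.mean(v <= t) for t in grid]) - grid))
print(gap_1, gap_2)    # both small
```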
Identification
Exogenous treatment D:
Prob[Y ≤ q(D, τ) | D] = Prob[U_D ≤ τ | D] = τ ∀τ ∈ (0, 1)
IVQTE by Chernozhukov and Hansen (2005), Z instrumental variable
Prob[Y ≤ q(D, τ) | Z] = Prob[U_D ≤ τ | Z] = τ ∀τ ∈ (0, 1)
(with an endogenous D, in general Prob[U_D ≤ τ | D, Z] ≠ τ)
A crucial assumption in the IVQTE model is
The rank variable U is made invariant to D via Z
Rank invariance can be relaxed to rank similarity
Still, it rules out any systematic variation
of the rank across treatment states
IVQTE Model: Representation & Assumptions
A1. Potential outcomes: Y_d = q(d, x, U_d) is the potential outcome, U_d ~ U(0, 1), q(d, x, τ) strictly increasing in τ
A2. Independence: conditional on X, U_d ⊥ Z
A3. Selection: conditional on X = x, Z = z, D = δ(x, z, ν), ν random
ν is responsible for different choices of D by observationally identical individuals
E.g.: ν is an unobserved information component correlated with U that includes factors relevant in making the education decision
A4. Rank invariance (a) or rank similarity (b): (a) U_d = U_d′ or (b) U_d ~ U_d′ (equal in distribution)
E.g.: U is determined by ability and factors that do not vary with d
(a) makes the joint distribution of potential outcomes {Y_d} degenerate
A5. Observed variables: Y = q(D, X, U_D), D = δ(X, Z, ν), X, Z
IVQTE Model: Testable Implications
Testable implication of A1-A5
Prob[Y ≤ q(D, X, τ) | X, Z] = Prob[U_D ≤ τ | X, Z] = τ ∀τ ∈ (0, 1)
U_D ⊥ (Z, X)
equivalently
Prob[Y − q(D, X, τ) ≤ 0 | X, Z] = τ ∀τ ∈ (0, 1)
i.e.
Q_{Y − q(D,X,τ)}(τ | X, Z) = 0 ∀τ ∈ (0, 1)
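The implication can be seen in a simulated example without covariates. Everything in the sketch is an assumption made for illustration: the structural function `q`, the selection rule for D and the binary instrument. Under rank invariance the event {Y ≤ q(D, τ)} has probability τ given Z, but not given the endogenous treatment D.

```python
import numpy as np

def q(d, tau):
    """Hypothetical structural quantile function, strictly increasing in tau."""
    return tau + d * (1.0 + tau)

rng = np.random.default_rng(9)
n = 50_000
u = rng.uniform(size=n)                    # rank variable (rank invariance: same U for d = 0, 1)
z = rng.integers(0, 2, size=n)             # instrument, independent of U
nu = u + rng.normal(0.0, 0.1, size=n)      # selection unobservable, correlated with U
d = (nu + 0.5 * z > 0.8).astype(float)     # D = delta(z, nu): endogenous treatment
y = q(d, u)                                # observed outcome Y = q(D, U_D)

tau = 0.5
below = y <= q(d, tau)                     # event {Y <= q(D, tau)}

share_given_z = [below[z == k].mean() for k in (0, 1)]   # both close to tau
share_given_d = [below[d == k].mean() for k in (0, 1)]   # far from tau
print(share_given_z, share_given_d)
```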
IVQTE Model: In Practice
We are interested in Q_Y(τ | X, D) = α(τ)D + x′β(τ)
1. Run the “usual” first stage regression: D = x′δ1 + z′δ2 + ε and predict D
2. Run the quantile regression Q_Y(τ | X, D) = α(τ)D + x′β(τ)
3. For a grid of values α_j around the estimate of α(τ) from step 2,
run the quantile regression
Q_{Y − α_j D}(τ | X, Z) = x′β(τ) + γ(τ)Z
4. Choose as the estimate α̂(τ) the grid value α_j for which |γ̂(α_j, τ)| is closest to zero
5. (α̂(τ), β̂(α̂(τ), τ))
is the estimator of the parameters of the conditional quantile function we aim at
Routines developed by Hansen are available in Ox/MATLAB
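The grid-search logic of steps 2 to 5 can be sketched in a stripped-down setting. This is not Hansen's routine: it assumes a binary instrument and no covariates, so that the quantile regression of Y − αD on (1, Z) is saturated and its slope is just a difference of group quantiles, and it uses Z directly rather than the first-stage prediction. The data generating process is an assumption, with true effect α(τ) = 1 + τ.

```python
import numpy as np

def ivqr_grid(y, d, z, tau, grid):
    """Inverse quantile regression with a binary instrument and no covariates.

    For each alpha on the grid, the QR of y - alpha * d on (1, z) has slope
    gamma(alpha) = Q_tau(y - alpha d | z = 1) - Q_tau(y - alpha d | z = 0);
    the estimate is the alpha whose |gamma| is closest to zero.
    """
    gammas = np.array([
        np.quantile((y - a * d)[z == 1], tau) - np.quantile((y - a * d)[z == 0], tau)
        for a in grid
    ])
    return grid[np.argmin(np.abs(gammas))], gammas

# Simulated data (assumed design): structural effect alpha(tau) = 1 + tau
rng = np.random.default_rng(10)
n = 20_000
u = rng.uniform(size=n)
z = rng.integers(0, 2, size=n)
d = (u + rng.normal(0.0, 0.1, size=n) + 0.5 * z > 0.8).astype(float)
y = u + d * (1.0 + u)

tau = 0.5
alpha_qr = np.quantile(y[d == 1], tau) - np.quantile(y[d == 0], tau)  # step 2: ignores endogeneity
grid = np.arange(alpha_qr - 1.5, alpha_qr + 1.5, 0.01)                 # step 3: grid around it
alpha_iv, gammas = ivqr_grid(y, d, z, tau, grid)                       # step 4
beta_iv = np.quantile(y - alpha_iv * d, tau)                           # step 5: intercept given alpha
print(alpha_qr, alpha_iv, beta_iv)
```

Plotting `np.abs(gammas)` against `grid` gives the kind of objective-function picture that is worth inspecting for flat regions.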
IVQTE Model, In Practice: Remarks
Rank invariance makes the joint distribution of potential outcomes not truly
multivariate: this does not restrict QTE but affects interpretation
under the IVQTE model assumptions, the QTE is the effect for an individual
that is in the same quantile τ of the treated and control distribution
Rank invariance may be more plausible if the X set is large
Check the pattern of the objective function over the grid search:
a flat pattern may suggest identification problems
C & H (2008) propose an alternative estimation method (dual inference
procedure) that is robust to weak instruments; this alternative method also
involves a grid-search step; direct and dual inference procedures will deliver
similar results when the correlation between the instrument Z and the treatment is strong
IVQTE Model, In Practice: Another Example
Figure 4. Treatment effect when the treatment (additional schooling) is assumed to be exogenous. Females only.
[Figure: left panel, QR coefficient of yedu with 95% CI at quantiles .1 to .9, FEMALES -7/+7 window; right panel, QR intercept with 95% CI at quantiles .1 to .9, conditional distribution of BMI, FEMALES -7/+7 window]
Conditional quantiles for the baseline country (U.K.) at the average value of the covariates & zero education
From Brunello, Fabbri, Fort, IZA WP 2009; revised in JOLE 2013
Years of Schooling and BMI of European Females
Figure 5. Treatment effect when the treatment (years of schooling) is assumed to be exogenous and when it is treated as endogenous and instrumented with years of compulsory education (ycomp). Females only.
[Figure: two panels, effect of years of schooling on the conditional distribution of BMI, FEMALES -7/+7 window; QR and IVQR coefficients with 95% CIs at quantiles .1 to .9]
Education is protective
There is heterogeneity but the impact is not monotone (as in standard QR)
Imprecise estimates (as standard QR would show)
From Brunello, Fabbri, Fort, IZA WP 2009; revised in JOLE 2013
Years of Schooling and BMI of European Females: Testing
Null hypothesis | K-S statistic | 90% crit. value | 95% crit. value
No effect: α(·) = 0 | 2.799 | 2.739 | 2.978
Constant effect: α(·) = α(0.5) | 1.086 | 2.748 | 3.066
Dominance: α(·) ≤ 0 | 0 | 2.371 | 2.644
Exogeneity: α(·) = α_QR(·) | 1.542 | 2.713 | 2.994
Years of Schooling and BMI of European Females: Grid-search patterns
Figure 6: Objective function at selected quantiles. Quantile: 0.35
[Figure: four panels plotting norm(gamma(alpha)) against alpha: Matlab, variable search grid; Matlab, fixed search grid step 0.025; Matlab, fixed search grid step 0.05; Ox, fixed search grid step 0.025]
Years of Schooling and BMI of European Females: Grid-search patterns
Figure 9: Objective function at selected quantiles. Quantile: 0.50
[Figure: four panels plotting norm(gamma(alpha)) against alpha: Matlab, variable search grid; Matlab, fixed search grid step 0.025; Matlab, fixed search grid step 0.05; Ox, fixed search grid step 0.025]
Years of Schooling and BMI of European Females: Grid-search patterns
Figure 13: Objective function at selected quantiles. Quantile: 0.70
[Figure: four panels plotting norm(gamma(alpha)) against alpha: Matlab, variable search grid; Matlab, fixed search grid step 0.025; Matlab, fixed search grid step 0.05; Ox, fixed search grid step 0.025]
JTPA evaluation within the C&H IVQTE model
Unlike the case considered above, we do not find large differences between the direct and dual inference procedures for IVQR in this case. The similarity between the two approaches is not unexpected due to the strong correlation between the instrument and endogenous regressor. The close agreement here further suggests that not much is lost by considering the dual procedure in cases where identification is strong. It also provides further support for the argument that the differences detected in the previous section are due to weak identification. Given the robustness of the dual procedure to the presence of weak instruments and its simple computation, it seems that this inference procedure will be preferable to the standard procedure in many cases.
The dual confidence bounds are further illustrated in Fig. 5, which plots the IVQR objective function W_n(α) over the parameter space A. α is plotted on the horizontal axis, and the vertical axis shows W_n(α). The horizontal line in each graph is the 95% critical value for the dual inference procedure, so all points lying below the horizontal line belong to the confidence region for α(τ). The graphs in Fig. 3 differ markedly from those in Figs. 1 and 2. In particular, all of the objective functions, and hence confidence regions, in Fig. 3 look
[Figure: four panels plotting the training effect against τ (0.2 to 0.8): QR: Training Effect; QR: Percentage Impact of Training; IVQR: Training Effect; IVQR: Percentage Impact of Training]
Fig. 4. Estimates of the training impact by QR and by IVQR. Notes: Left column. QR and IVQR estimates of the impact of a job training program on earnings for τ = 0.15, 0.25, 0.50, 0.75, and 0.85. The top panel reports the QR estimate of the training impact, and the bottom panel reports the IVQR results. In each figure, the solid line represents the point estimates, and the dashed (- -) line represents the 95% confidence interval formed using the direct inference approach. For the IVQR results, the dash-dot (-.) line represents the 95% confidence bound constructed using the dual inference procedure described in the text. In both figures, the horizontal axis measures the quantile index τ, and the vertical axis is the impact of training on earning quantiles measured in dollars. Models include covariates as specified in the text, and the sample size is 5102. Right column. QR and IVQR estimates of the percentage impact of training for τ = 0.15, 0.25, 0.50, 0.75, and 0.85. The top panel reports the QR estimate of the training impact, and the bottom panel reports the IVQR results. Percentage impacts are for moving from non-training to training and all other covariates are evaluated at their sample mean. In both figures, the horizontal axis measures the quantile index τ, and the vertical axis is the percentage impact of training.
Source: V. Chernozhukov, C. Hansen / Journal of Econometrics 142 (2008) 379-398, p. 393
JTPA evaluation within the LATE-QTE model
TABLE III. Quantile Treatment Effects and 2SLS Estimates
Dependent Variable: 30-month Earnings

                                     2SLS     Quantile
                                              0.15      0.25      0.50      0.75      0.85
A. Men
Training                             1,593    121       702       1,544     3,131     3,378
                                     (895)    (475)     (670)     (1,073)   (1,376)   (1,811)
% Impact of Training                 8.55     5.19      12.0      9.64      10.7      9.02
High school or GED                   4,075    714       1,752     4,024     5,392     5,954
                                     (573)    (429)     (644)     (940)     (1,441)   (1,783)
Black                                −2,349   −171      −377      −2,656    −4,182    −3,523
                                     (625)    (439)     (626)     (1,136)   (1,587)   (1,867)
Hispanic                             335      328       1,476     1,499     379       1,023
                                     (888)    (757)     (1,128)   (1,390)   (2,294)   (2,427)
Married                              6,647    1,564     3,190     7,683     9,509     10,185
                                     (627)    (596)     (865)     (1,202)   (1,430)   (1,525)
Worked less than 13 weeks            −6,575   −1,932    −4,195    −7,009    −9,289    −9,078
  in past year                       (567)    (442)     (664)     (1,040)   (1,420)   (1,596)
Constant                             10,641   −134      1,049     7,689     14,901    22,412
                                     (1,569)  (1,116)   (1,655)   (2,361)   (3,292)   (7,655)
B. Women
Training                             1,780    324       680       1,742     1,984     1,900
                                     (532)    (175)     (282)     (645)     (945)     (997)
% Impact of Training                 14.6     35.5      23.1      18.4      10.1      7.39
High school or GED                   3,470    262       768       2,955     5,518     5,905
                                     (342)    (178)     (274)     (643)     (930)     (1,026)
Black                                −554     0         −123      −401      −1,423    −2,119
                                     (397)    (204)     (318)     (724)     (949)     (1,196)
Hispanic                             −1,145   −73       −138      −1,256    −1,762    −1,707
                                     (488)    (217)     (315)     (854)     (1,188)   (1,172)
Married                              −652     −233      −532      −796      38        −109
                                     (437)    (221)     (352)     (846)     (1,069)   (1,147)
Worked less than 13 weeks            −5,329   −1,320    −3,516    −6,524    −6,608    −5,698
  in past year                       (370)    (254)     (430)     (781)     (931)     (969)
AFDC                                 −2,997   −406      −1,240    −3,298    −3,790    −2,888
                                     (378)    (189)     (301)     (743)     (1,014)   (1,083)
Constant                             10,538   984       3,541     9,928     15,345    20,520
                                     (828)    (547)     (837)     (1,696)   (2,387)   (1,687)

Note: The table reports 2SLS and QTE estimates of the effect of training on earnings. Assignment status is used as an instrument for training. The specification also includes indicators for service strategy recommended, age group, and second follow-up survey. Robust standard errors are reported in parentheses.
quantile. The estimates at low quantiles are substantially smaller than the corresponding quantile regression estimates, and they are small in absolute terms. For example, the QTE estimate (standard error) of the effect on the .15 quantile for men is $121 (475), while the corresponding quantile regression estimate is $1,187 (205). Similarly, the QTE estimate (standard error) of the effect on the .25 quantile for men is $702 (670), while the corresponding quantile regression estimate is
How do AAI get at the (causal) effect of JTPA on earnings?
They use an instrumental variable,
extending the IV-LATE identification approach to the identification of QTE
Imbens & Rubin, Review of Economic Studies 1997
Abadie, Journal of the American Statistical Association 2002
Abadie, Angrist, Imbens, Econometrica 2002
Remarks
No rank invariance/rank similarity assumption
QTE for compliers only, not for the average individual in the population
LATE-QTE can accommodate only a binary treatment and a binary instrument
Identification: Imbens, Rubin (1997); Abadie, Angrist, Imbens (2002) (AAI02)
Z binary instrumental variable, e.g. random assignment to JTPA
D binary treatment variable, e.g. receiving training under JTPA
D_z is the potential treatment under assignment z
Y outcome variable, e.g. earnings
Y_d is the potential outcome under treatment d
Key assumptions:
comparisons by Z identify the effect of Z (independence)
Z does not directly affect outcomes (exclusion)
almost surely, individuals do not do the opposite of their assignment (monotonicity)
Identification: Imbens, Rubin (1997); Abadie, Angrist, Imbens (2002) (AAI02)
Compliance Types
                    D_j(Z_j = 0) = 0               D_j(Z_j = 0) = 1
D_j(Z_j = 1) = 0    never-taker: D(Z_j) = 0        defier: D(Z_j) = 1 − Z_j
D_j(Z_j = 1) = 1    complier: D(Z_j) = Z_j         always-taker: D(Z_j) = 1
The potential treatment status D = h_D(Z, ε) can be seen as a type indicator
D varies with Z for compliers and defiers
The presence of defiers in the population is ruled out by assumption
Identification: Imbens, Rubin (1997); Abadie, Angrist, Imbens (2002) (AAI02)
Compliance Types by Observed Treatment Status and Assignment to the Treatment, given Monotonicity
             Z_j = 0                       Z_j = 1
D_j = 0      never-taker or complier       never-taker
D_j = 1      always-taker                  always-taker or complier
Under the model assumptions, the type indicator defines a partition
The observed outcome distributions Y | D are mixtures of the potential outcome distributions of the population types, with identified proportions
Individual compliers cannot be identified from the observed data
Inference: Abadie, Angrist, Imbens (2002) (AAI02)
We cannot condition on the subsample of compliers to get QTEs
Estimation involves running a ‘weighted’ quantile regression that makes it possible to estimate the marginal quantiles of the potential outcome distributions for compliers
Estimation requires an auxiliary first-step estimation of the weights
Weights are a function of E[Z | X, D, Y]
In practice: Stata code by Frölich and Melly
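A minimal sketch of the weighting idea behind this slide, under simplifying assumptions that are not in the lecture: no covariates X, a randomized binary instrument, and simulated data in which compliers have a constant effect of 2. It uses the unprojected Abadie-type weights κ = 1 − D(1 − Z)/(1 − p) − (1 − D)Z/p, with p = Pr(Z = 1), and inverts weighted empirical CDFs instead of running the weighted quantile regression of AAI02. All function names are illustrative.

```python
import random


def weighted_quantile(values, weights, tau):
    """Smallest value at which the normalized cumulative weight reaches tau."""
    pairs = sorted(zip(values, weights))
    total = sum(w for _, w in pairs)
    running = 0.0
    for v, w in pairs:
        running += w
        if running / total >= tau:
            return v
    return pairs[-1][0]


def complier_qte(y, d, z, tau):
    """QTE for compliers: binary D, randomized binary Z, no covariates."""
    p = sum(z) / len(z)  # estimate of Pr(Z = 1)
    # Weights among the treated: always-takers cancel in expectation.
    y1 = [yi for yi, di in zip(y, d) if di == 1]
    w1 = [(zi - p) / (1 - p) for di, zi in zip(d, z) if di == 1]
    # Weights among the untreated: never-takers cancel in expectation.
    y0 = [yi for yi, di in zip(y, d) if di == 0]
    w0 = [(p - zi) / p for di, zi in zip(d, z) if di == 0]
    return weighted_quantile(y1, w1, tau) - weighted_quantile(y0, w0, tau)


def simulate(n, seed=0):
    """Hypothetical population: 60% compliers, 20% always-, 20% never-takers."""
    rng = random.Random(seed)
    y, d, z = [], [], []
    for _ in range(n):
        zi = 1 if rng.random() < 0.5 else 0
        u = rng.random()
        if u < 0.6:      # complier: D = Z, Y1 = Y0 + 2
            di = zi
            yi = rng.gauss(0.0, 1.0) + 2.0 * di
        elif u < 0.8:    # always-taker
            di = 1
            yi = rng.gauss(5.0, 1.0)
        else:            # never-taker
            di = 0
            yi = rng.gauss(-3.0, 1.0)
        y.append(yi)
        d.append(di)
        z.append(zi)
    return y, d, z
```

Calling `complier_qte(*simulate(100_000), tau=0.5)` should return a value close to the simulated complier effect of 2, whereas a raw comparison of treated and untreated quantiles mixes in always-takers and never-takers.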
Framework
Heckman et al. (1997)
Y_i1, Y_i0 are the potential outcomes
D_i is the (binary) treatment indicator
β_i = Y_i1 − Y_i0 is the impact of the treatment
$F_{Y_0}(y)$, $F_{Y_1}(y)$, $F_{Y_1,Y_0}(y_1, y_0)$: marginal & joint distributions of potential outcomes
Doksum’s quantile treatment effect
$\delta(\tau) = F_{Y_1}^{-1}(\tau) - F_{Y_0}^{-1}(\tau), \quad 0 < \tau < 1$
Taking $\tau = F_{Y_0}(y_0)$ and changing variables
$\delta(y_0) = F_{Y_1}^{-1}(F_{Y_0}(y_0)) - y_0$
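As an illustration of Doksum's definition, the sketch below estimates δ(τ) as the difference between empirical quantiles of a treated and a control sample. It assumes randomized treatment, so that each sample identifies one marginal distribution; the simulated location-scale design and the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def empirical_quantile(sample, tau):
    """Left-continuous inverse of the empirical CDF: smallest x with F_n(x) >= tau."""
    xs = sorted(sample)
    k = max(0, math.ceil(tau * len(xs)) - 1)
    return xs[k]


def qte(treated, control, tau):
    """Doksum's delta(tau): difference between the two marginal quantiles."""
    return empirical_quantile(treated, tau) - empirical_quantile(control, tau)


def simulate(n, seed=0):
    """Hypothetical design: Y0 ~ N(0, 1) and Y1 ~ N(1, 2^2), separate samples."""
    rng = random.Random(seed)
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    treated = [rng.gauss(1.0, 2.0) for _ in range(n)]
    return treated, control


def true_qte(tau):
    """In this design delta(tau) = 1 + (2 - 1) * Phi^{-1}(tau)."""
    return 1.0 + NormalDist().inv_cdf(tau)
```

In this design the effect is negative at low quantiles and positive at high ones, which a mean comparison (equal to 1) would hide.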
Exploring reasonable restrictions
1. Marginals can be identified under randomization
2. The joint distribution can be identified if it is degenerate, i.e. under a constant treatment effect β:
$F_{Y_0}(y_0) = F_{Y_1}(y_0 + \beta)$
(not very interesting, though: why?)
3. Under perfect rank dependence, the constant treatment effect assumption in 2. can be relaxed and replaced by
$\beta(y_0) = F_{Y_1}^{-1}(F_{Y_0}(y_0)) - y_0$
4. Bounds for the joint distribution (Hoeffding (1940) & Fréchet (1951))
$\max[F_{Y_1}(y_1 \mid D = 1) + F_{Y_0}(y_0 \mid D = 1) - 1,\ 0] \le F_{Y_1,Y_0}(y_1, y_0 \mid D = 1) \le \min[F_{Y_1}(y_1 \mid D = 1),\ F_{Y_0}(y_0 \mid D = 1)]$
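The bounds in item 4 only need the two marginal CDF values at the point of interest. A small sketch (the function name and the numbers are illustrative, not from the lecture):

```python
def frechet_hoeffding_bounds(f1, f0):
    """Bounds on the joint CDF F(y1, y0) given the marginals F1(y1) and F0(y0)."""
    lower = max(f1 + f0 - 1.0, 0.0)
    upper = min(f1, f0)
    return lower, upper


# Example: F1(y1) = 0.7 and F0(y0) = 0.6.
lower, upper = frechet_hoeffding_bounds(0.7, 0.6)
independent = 0.7 * 0.6          # joint CDF if Y1 and Y0 were independent
comonotone = min(0.7, 0.6)       # joint CDF under perfect rank dependence
```

Perfect positive rank dependence attains the upper bound, while the independent case lies strictly inside the interval.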
Identification exploiting information from revealed preferences 1/3
Heckman & Honoré (HH) (1990), Econometrica
Consider a (Roy) model to explain occupational choice and its consequences for the distribution of earnings when individuals differ in their occupation-specific skill endowments.
(One could think about education/training choices instead of occupation)
HH(90) show that in the model self-selection leads to reduced inequality in earnings compared to an economy with random assignment to jobs.
HH(90) show under which assumptions it is possible to determine the correlation of latent skills or potential wages of persons even if one observes only one skill for any person.
Identification exploiting information from revealed preferences: the model 2/3
Income-maximizing agents with 2 skills S1 > 0, S0 > 0
Skill prices π1, π0
Individuals differ in endowments; agents know their own endowments
F(s1, s0) population distribution of skills
Skill i is useful only in sector i
Agent chooses sector 1 if W1 ≡ π1S1 > π0S0 ≡ W0
D = 1(W1 ≥ W0)
from Heckman & Honoré (HH) (1990), Econometrica
Identification exploiting information from revealed preferences: the model 3/3
Heckman & Honoré (HH) (1990), Econometrica, Theorem 9
Under the previous assumptions, if we only observe Z = max(S1, π0S0) and π0 takes all values in (0, ∞), F is identifiable.
Pr(max(S1, π0S0) ≤ x) is known for all x and π0
Pr(max(S1, π0S0) ≤ x) = Pr(S1 ≤ x, S0 ≤ x/π0)
thus, taking x = s1 and π0 = s1/s0 (so that s0 = s1/π0),
F(s1, s0) = Pr(S1 ≤ s1, S0 ≤ s1/π0)
F(s1, s0) can be determined along the ray s1 = π0s0 as π0 varies
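The argument can be checked numerically: the distribution of the observed maximum, evaluated at x = s1 with π0 = s1/s0, reproduces the joint CDF of the two skills. The sketch below uses a hypothetical lognormal skill distribution; in the identification argument only the maximum would be observed, and the latent pair is used here only to verify the equality.

```python
import math
import random


def simulate_skills(n, seed=0):
    """Hypothetical correlated lognormal skills (S1, S0)."""
    rng = random.Random(seed)
    skills = []
    for _ in range(n):
        common = rng.gauss(0.0, 1.0)
        s1 = math.exp(0.5 * common + 0.5 * rng.gauss(0.0, 1.0))
        s0 = math.exp(0.5 * common + 0.5 * rng.gauss(0.0, 1.0))
        skills.append((s1, s0))
    return skills


def cdf_of_observed_max(skills, pi0, x):
    """Pr(max(S1, pi0 * S0) <= x), a function of the observed maximum only."""
    return sum(max(s1, pi0 * s0) <= x for s1, s0 in skills) / len(skills)


def joint_cdf(skills, s1, s0):
    """F(s1, s0) = Pr(S1 <= s1, S0 <= s0), computed from the latent pair."""
    return sum(a <= s1 and b <= s0 for a, b in skills) / len(skills)


def identified_joint_cdf(skills, s1, s0):
    """Recover F(s1, s0) by setting x = s1 and pi0 = s1 / s0."""
    return cdf_of_observed_max(skills, s1 / s0, s1)
```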
Comments
The above model assumes that only ‘gains’ determine participation in one sector and that agents maximize income
The participation rule implies a tight link between outcomes:
in the participating population, the mass of W1 conditional on W0 = w0 lies to the right of w0
To sum up: to solve the identification problem we may
(a) assume some (in)dependence
(b) exploit dependence induced by choices
References (those that fit on the slide . . .)
Abadie, A. et al. (2002) ‘Instrumental Variable Estimates of the Effect of Subsidized Training on the Quantiles of Trainee Earnings’, Econometrica, Vol. 70 (1), pp. 91-117
Brunello et al. (2009) ‘Years of Schooling, Human Capital and the Body Mass Index of European Females’, IZA DP 4667
Brunello et al. (2009) ‘Changes in Compulsory Schooling, Education and the Distribution of Wages in Europe’, Economic Journal, Vol. 119, pp. 516-539
Chernozhukov, V. et al. (2005) ‘An IV Model of Quantile Treatment Effects’, Econometrica, Vol. 73 (1), pp. 245-261
Chernozhukov, V. et al. (2006) ‘Instrumental Quantile Regression Inference for Structural and Treatment Effect Models’, Journal of Econometrics, pp. 491-525
Chernozhukov, V. et al. (2008) ‘Instrumental Variable Quantile Regression: A Robust Inference Approach’, Journal of Econometrics, pp. 379-398
Heckman et al. (1990) ‘The Empirical Content of the Roy Model’, Econometrica, Vol. 58 (5), pp. 1121-1149
Heckman et al. (1997) ‘Making The Most Out of Programme Evaluations and Social Experiments: Accounting for Heterogeneity in Programme Impacts’, Review of Economic Studies, Vol. 64 (4), pp. 487-535
Imbens et al. (1997) ‘Estimating Outcome Distributions for Compliers in Instrumental Variable Models’, Review of Economic Studies, pp. 555-574
Martins et al. (2004) ‘Does Education Reduce Wage Inequality? Quantile Regression Evidence from 16 Countries’, Labour Economics, Vol. 11, pp. 355-371
So far . . .
QRs are useful to characterize the dependence between an outcome Y and a treatment D in the presence of heterogeneity of the treatment impact among observationally equivalent individuals
Quantile treatment effects (QTEs) describe the difference, at a given quantile, between the outcome distributions under different levels of the treatment.
Identification of QTEs requires identification of the marginal distributions of the potential outcomes Y_d
Under rank invariance or rank similarity, the identification of the marginal distributions allows one to identify the impact distribution
Today
We move a step further by recognizing that in principle both Y and D can be decomposed into two parts (a deterministic one and a stochastic one)
The two parts need not be additively separable
Thus,
We need to define a treatment parameter that allows for ‘more sources of stochastic variation’
Economic models typically place restrictions
on the number of independent sources of stochastic variation in the model
and on the distribution of these stochastic components and X
The exogenous impact function (EIF)
QTEs describe and detect the heterogeneity in the relationship of Y and D over the distribution of Y
This may be restrictive if there are reasons to believe that the dependence between the two variables also varies over the distribution of D
Chesher (2003) suggests looking at the exogenous impact function. The EIF describes the rate at which Y varies as the value of D is marginally increased at specific quantiles of the distribution of the stochastic components determining both Y and D (and possibly at specific values of the exogenous covariates).
The EIF has a causal interpretation if one is able to shift D without affecting the other elements.
Usual example: education (D) and wages (Y)
Some studies find that the returns to education increase across deciles of the conditional distribution of earnings without addressing endogeneity issues, e.g. Martins & Pereira (2004)
U.S. data: mixed findings. Ability & education are complements (Arias et al., 2001) or substitutes (Chernozhukov et al., 2006)
U.K. data: substitutability between education and cognitive and non-cognitive ability (Denny et al., 2007)
Data on Europe: substitutability between education & ability (Brunello et al., 2009)
⇒ allow for two unobserved sources of stochastic variation
Brunello, Fort, Weber, Economic Journal, 2009
BFW study the causal effects of education on the distribution of earnings using data from 12 European countries, exploiting changes in the minimum school leaving age (MSLA) for identification
BFW find that
conditional wage inequality is reduced by marginal increases in education
education & ability act as substitutes in the earnings function
Policy implications: investing in the less fortunate (because of poor labour market fortune or poor talent) could pay off on both efficiency and equity grounds
The Causal Chain Model in Steps
1. We start from a simple linear simultaneous (recursive) equation model with 2 equations and we consider the identification & estimation of the treatment parameter in such a model
2. We recall the control function version of the estimator
3. We relax the additive separability assumption on the stochastic components in the model and we introduce a generalization of the structural model that makes it possible to detect and describe heterogeneous structural parameters (causal chain model)
4. We discuss the assumptions under which the parameters (of interest) in such a model are identified
5. We discuss briefly how the (identified) parameters can be estimated
Notation in Our Running Example
Two endogenous variables: Y, wages, and D, years of education (denoted s in the equations below)
Both scalar, (approximately) continuous variables.
A matrix of exogenous regressors X (gender, age, country fixed effects, trends, . . .)
The instrument Z, mandatory schooling years
The usual recursive model
$y = \theta s + X\beta_1 + u$
$s = X\beta_2 + z\delta + \nu$
$Cov(X, u) = 0$, $Cov(X, \nu) = 0$, $Cov(z, \nu) = 0$, $Cov(u, \nu) \neq 0$
Crucial feature: exclusion restriction
Identification: rank & order conditions
The Usual Recursive Model
Recursive model: y, s, z scalars; u, ν are correlated
$y = \theta s + X\beta_1 + u$
$s = X\beta_2 + \delta z + \nu$
Reduced form: $y = X\underbrace{(\theta\beta_2 + \beta_1)}_{\alpha_1} + z\underbrace{(\theta\delta)}_{\alpha_2} + \underbrace{(u + \theta\nu)}_{\tilde{u}}$
“Control function approach”, $W = [X\ z]$, $\beta = [\beta_2\ \delta]$:
$y = \theta s + X\beta_1 + \gamma\nu + \varepsilon$
$y = \theta s + X\beta_1 + \gamma[W(\hat{\beta} - \beta) + \hat{\nu}] + \varepsilon$
The Usual Recursive Model: Remarks
The model specifies the relationship between the stochastic components and X, Z
There is an exclusion restriction & no feedback: the model has a triangular structure
The endogeneity is driven by the correlation between u and ν (latent ν)
u and ν enter additively
The number of “independent” sources of stochastic variation in the model equals the number of observed endogenous variables
The treatment parameter θ is invariant with respect to the stochastic components of the model: s exerts a location shift on the distribution of y
The Usual Recursive Model: Remarks (continued)
There is a one-to-one correspondence between the quantiles of u and ν and the conditional quantiles of y and s
To come up with an estimator, we describe the links between observed quantities and unknown quantities (parameters)
The control function approach & the 2-stage approach deliver the same estimator for θ
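The last remark can be verified on simulated data. The sketch below is a stripped-down version of the recursive model with an intercept, one instrument and no X (a simplification made here, with hypothetical parameter values); it computes the IV/2SLS estimate and the control function estimate of θ and shows that they coincide up to rounding.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def iv_estimate(y, s, z):
    """2SLS with one instrument and an intercept: cov(z, y) / cov(z, s)."""
    return cov(z, y) / cov(z, s)


def control_function_estimate(y, s, z):
    """OLS of y on s and the first-stage residual; returns (theta, gamma)."""
    delta = cov(z, s) / cov(z, z)
    intercept = mean(s) - delta * mean(z)
    v_hat = [si - intercept - delta * zi for si, zi in zip(s, z)]
    # 2x2 normal equations for the demeaned regressors (s, v_hat).
    ss, sv, vv = cov(s, s), cov(s, v_hat), cov(v_hat, v_hat)
    sy, vy = cov(s, y), cov(v_hat, y)
    det = ss * vv - sv * sv
    theta = (sy * vv - vy * sv) / det
    gamma = (vy * ss - sy * sv) / det
    return theta, gamma


def simulate(n, theta=1.0, seed=0):
    """y = 2 + theta*s + u, s = 1 + 0.5*z + nu, with u correlated with nu."""
    rng = random.Random(seed)
    y, s, z = [], [], []
    for _ in range(n):
        zi = rng.gauss(0.0, 1.0)
        nu = rng.gauss(0.0, 1.0)
        u = 0.8 * nu + rng.gauss(0.0, 1.0)
        si = 1.0 + 0.5 * zi + nu
        y.append(2.0 + theta * si + u)
        s.append(si)
        z.append(zi)
    return y, s, z
```

The coefficient γ on the first-stage residual captures the correlation between u and ν, so its estimate gives a simple check on the endogeneity of s.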
Properties of QR estimator: Equivariance
Equivariance guarantees a coherent interpretation of the results when the data or the underlying model are modified in a non-essential way.
Scale equivariance
For any a > 0, $\hat{\beta}(\tau; ay, X) = a\hat{\beta}(\tau; y, X)$ and $\hat{\beta}(\tau; -ay, X) = -a\hat{\beta}(1 - \tau; y, X)$
Regression shift
For any $\gamma \in R^p$, $\hat{\beta}(\tau; y + X\gamma, X) = \hat{\beta}(\tau; y, X) + \gamma$
Reparametrization of design
For any nonsingular p × p matrix A ($|A| \neq 0$), $\hat{\beta}(\tau; y, XA) = A^{-1}\hat{\beta}(\tau; y, X)$
The analogous equivariance properties hold in mean regression
Equivariance to Monotone Transformations
For any monotone (nondecreasing) function h(·), conditional quantile functions are equivariant
$Quant_{h(Y)}(\tau \mid x) = h(Quant_Y(\tau \mid x))$
i.e. the quantiles of the transformed variable are simply the transformed quantiles of the original variable
The analogous property does not hold in mean regression: in general E[h(Y) | x] ≠ h(E[Y | x])
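A quick numerical illustration with h = exp, on simulated log wages (hypothetical numbers, unconditional case). With an odd sample size the sample median is an actual observation, so the median commutes with the transformation exactly, while the mean does not (Jensen's inequality).

```python
import math
import random
import statistics


def simulate_log_wages(n=1001, seed=0):
    """Hypothetical log wages; n is odd so the median is one of the observations."""
    rng = random.Random(seed)
    return [rng.gauss(2.0, 0.5) for _ in range(n)]


log_wage = simulate_log_wages()
wage = [math.exp(v) for v in log_wage]

# Quantiles commute with the monotone transformation h = exp ...
median_of_transformed = statistics.median(wage)
transformed_median = math.exp(statistics.median(log_wage))

# ... while means do not: E[exp(Y)] > exp(E[Y]) for a non-degenerate Y.
mean_of_transformed = statistics.fmean(wage)
transformed_mean = math.exp(statistics.fmean(log_wage))
```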
Simple Causal Chain Model: Quantiles of the Endogenous Variables
$y = \theta s + X'\beta_1 + \varepsilon + \lambda\nu$
$s = X'\beta_2 + Z\delta + \nu$
$Q_{S|X,Z}(\tau_2 \mid \tau_\nu, x, z) = x'\beta_2 + z\delta + Q_\nu(\tau_\nu)$
Reduced form:
$Q_{Y|S,X,Z}(\tau_1 \mid \tau_\varepsilon, s, x, z) = \theta s + x'\beta_1 + Q_\varepsilon(\tau_\varepsilon) + \lambda(s - x'\beta_2 - z\delta)$
Partial derivatives, with s evaluated at $s = Q_{S|X,Z}(\tau_2 \mid \tau_\nu, x, z)$:
$\nabla_z Q_{S|X,Z}(\tau_2 \mid \tau_\nu, x, z) = \delta$
$\nabla_s Q_{Y|S,X,Z}(\tau_1 \mid \tau_\varepsilon, s, x, z) = \theta + \lambda$
$\nabla_z Q_{Y|S,X,Z}(\tau_1 \mid \tau_\varepsilon, s, x, z) = -\lambda\delta$
Simple Causal Chain Model: Quantiles of the Endogenous Variables
$Q_{Y|S,X,Z}(\tau_1 \mid \tau_\varepsilon, s, x, z) = \theta s + x'\beta_1 + Q_\varepsilon(\tau_\varepsilon) + \lambda(s - x'\beta_2 - z\delta)$
$\nabla_s Q_{Y|S,X}(\tau_1 \mid \tau_\varepsilon, s, x) = \theta$
We can retrieve θ from the identifying equation (s evaluated at $Q_{S|X,Z}(\tau_2 \mid \tau_\nu, x, z)$)
$\theta = \nabla_s Q_{Y|S,X,Z}(\tau_1 \mid \tau_\varepsilon, s, x, z) + \dfrac{\nabla_z Q_{Y|S,X,Z}(\tau_1 \mid \tau_\varepsilon, s, x, z)}{\nabla_z Q_{S|X,Z}(\tau_2 \mid \tau_\nu, x, z)} = (\theta + \lambda) + \dfrac{-\lambda\delta}{\delta}$
The equation above links unknown quantities with observed ones
under some assumptions (not yet stated explicitly)
A Causal Chain Model with Random Coefficients
A Causal Chain Model with Random Coefficients: Quantiles of Y and S
$y = s(\theta + \lambda\nu) + x'\beta_1 + \varepsilon$
$s = x'\beta_2 + z\delta + \nu$, so that $\nu = s - x'\beta_2 - \delta z$
$Q_{Y|S,X,Z}(\tau_1 \mid \tau_\varepsilon, s, x, z) = x'\beta_1 + Q_\varepsilon(\tau_\varepsilon) + [\theta + \lambda(s - x'\beta_2 - z\delta)]s$
$\nabla_s Q_{Y|S,X}(\tau_1 \mid \tau_\varepsilon, s, x) = \theta + \lambda Q_\nu(\tau_\nu)$
We can retrieve $\nabla_s Q_{Y|S,X}(\tau_1 \mid \tau_\varepsilon, s, x) = \theta(\nu)$ from the identifying equation (s evaluated at $Q_{S|X,Z}(\tau_2 \mid \tau_\nu, x, z)$)
$\nabla_s Q_{Y|S,X,Z}(\tau_1 \mid \tau_\varepsilon, s, x, z) + \dfrac{\nabla_z Q_{Y|S,X,Z}(\tau_1 \mid \tau_\varepsilon, s, x, z)}{\nabla_z Q_{S|X,Z}(\tau_2 \mid \tau_\nu, x, z)} = (\theta + \lambda\nu + \lambda s) + \dfrac{-\lambda\delta s}{\delta} = \theta + \lambda\nu$
The equation above links unknown quantities with observed quantities
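The identifying equation can be checked numerically on the population quantile functions of the random-coefficient model. The sketch below uses scalar x and hypothetical parameter values, and replaces the analytic derivatives with central finite differences; it illustrates the algebra only and is not an estimator.

```python
# Hypothetical parameter values for the random-coefficient causal chain model.
THETA, LAM, DELTA = 0.08, -0.02, 0.5
BETA1, BETA2 = 1.0, 0.3
Q_EPS = 0.1  # Q_eps(tau_eps) at the chosen tau_eps (assumed value)


def q_y(s, x, z):
    """Conditional quantile of Y given S = s, X = x, Z = z."""
    nu = s - x * BETA2 - z * DELTA
    return x * BETA1 + Q_EPS + (THETA + LAM * nu) * s


def q_s(x, z, q_nu):
    """Conditional quantile of S given X = x, Z = z, at Q_nu(tau_nu) = q_nu."""
    return x * BETA2 + z * DELTA + q_nu


def central_diff(f, t, h=1e-4):
    return (f(t + h) - f(t - h)) / (2.0 * h)


def identified_effect(x, z, q_nu):
    """grad_s Q_Y + grad_z Q_Y / grad_z Q_S, evaluated at s = Q_S."""
    s = q_s(x, z, q_nu)
    dy_ds = central_diff(lambda t: q_y(t, x, z), s)
    dy_dz = central_diff(lambda t: q_y(s, x, t), z)
    ds_dz = central_diff(lambda t: q_s(x, t, q_nu), z)
    return dy_ds + dy_dz / ds_dz
```

For any (x, z), `identified_effect(x, z, q_nu)` should equal `THETA + LAM * q_nu` up to rounding, so the recovered effect varies with the quantile of ν but not with the covariates.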
A Causal Chain Model with Random Coefficients: Comments
To write the reduced form (hybrid) model
We write ν = f(s, x, z): the first stage needs to be monotonic in ν
We exploit the recursive structure in observed and latent variables
In a recursive model with monotonicity
there is a one-to-one mapping between the quantiles of ν and the quantiles of s and y (equivariance of the QR estimator)
One could allow more heterogeneity
$y = s\underbrace{(\theta + \lambda\nu + \gamma\varepsilon)}_{\theta(\varepsilon,\nu)} + x'\beta_1$
or leave θ(ε, ν) unrestricted
Chesher (2003) Causal Chain Model in a Nutshell
Recursive model with nonlinear nonadditive equations
y = h_y(s, X, ε, ν), y continuous
s = h_s(X, Z, ν), s, z continuous
a. a triangular structure in both endogenous and latent variables
b. h_y(·) differentiable wrt s and ν; strictly monotonic in ε
c. h_s(·) differentiable wrt z and ν; strictly monotonic in ν
d. the conditional τε-quantile of ε given ν, X, Z is independent of ν, X, Z
e. the conditional τν-quantile of ν given X, Z is independent of X, Z
The exclusion restriction on the observed variables is required for identification of the derivatives
ceteris paribus variations can be identified under weaker assumptions
Chesher Causal Chain Model: Estimation
1. Weighted average derivative estimator
2. Control variate estimator proposed by Ma et al. (2006)
step 1: estimate ν at a specific quantile τ of s, for q2 quantiles
step 2: add the generated regressor ν̂ to the structural quantile regression equation for y, for q1 quantiles
3. Estimation as in 1. and 2. outperforms alternative approaches
4. The output is a q1 × q2 matrix or a multidimensional graph: the (estimated) exogenous impact function
5. One could obtain QTEs and average treatment effects
No routines available.
Following 2.: simply run a series of QRs. Get standard errors with an appropriate bootstrap design (or follow the asymptotic results in Ma et al., 2006)
Empirical Setup I
$\ln(W) = \alpha + S\underbrace{(\beta + \lambda A + \phi U)}_{\Pi(A,U)} + X'\gamma_W + A + U$   (1)
$S = \alpha + X'\gamma_S + \pi Z + \xi A$   (2)
where
• W is wage; S is the quantity of schooling
• A is ability; U is labour market fortune, orthogonal to A
• Z (years of compulsory schooling) is an instrumental variable
• X is a vector of covariates
Ex-ante, individuals do not have information on U
About Ability A
Talent has: (i) an absolute effect (on earnings); (ii) a comparative effect (on returns)
Ashenfelter & Rouse (1998)
Ability and schooling are complements if returns increase with ability (λ > 0), substitutes if returns decrease with ability (λ < 0)
We assume that more able individuals get more schooling, consistent with the signalling model and a variant of the human capital model, see Blackburn and Neumark (1993)
About Labour Market Fortune U
The unobservable U captures the fact that ex-ante identical individuals end up with different wages in the random matching process after school completion
Hornstein et al. (2006)
We assume that “luckier” individuals find a better match, i.e. end up with a higher wage
Alternatively, it may refer to
a zero mean demand shock which affects the relative productivity of jobs and skills
Gosling et al. (2000), Machin et al. (1998)
the ability which is productive only at work, in contrast with cognitive ability
Lang (1993)
Empirical Setup II
$\ln(W) = \alpha + S(\beta + \lambda A + \phi U) + X'\gamma_W + A + U$   (1)
$S = \alpha + X'\gamma_S + \pi Z + \xi A$   (2)
Let $Q_Y(\tau \mid \cdot)$ denote the τ-th conditional quantile of the random variable Y.
The conditional quantile model corresponding to eq. (1) and (2) is
$Q_S(\tau_A \mid X, Z) = \alpha + \gamma_S X + \pi Z + \xi Q_A(\tau_A)$
$Q_W(\tau_U \mid Q_S(\tau_A \mid X, Z), X, Z) = \alpha + Q_S(\tau_A \mid X, Z)\,\pi(\tau_A, \tau_U) + \gamma_W X + Q_U(\tau_U) + Q_A(\tau_A)$
The Parameter of Interest: The Impact Function
Doksum (1974); Chesher (2003); Ma & Koenker (2006)
$\pi(\tau_A, \tau_U) \equiv \beta + \lambda Q_A(\tau_A) + \phi Q_U(\tau_U)$
represents the rate at which wages increase as schooling is exogenously increased for a person with ability equal to $Q_A(\tau_A)$ and labour market fortune equal to $Q_U(\tau_U)$
can be summarized as a matrix of quantile treatment effects, $\pi(\tau_A, \tau_U)$, which describe how returns vary over the distribution of wages for a given level of ability
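To make the matrix concrete, the sketch below tabulates the impact function on a grid of (τA, τU). The values of β, λ and φ and the standard normal distributions for A and U are assumptions chosen only for illustration; they are not estimates from the paper.

```python
from statistics import NormalDist


def impact_matrix(beta, lam, phi, taus_a, taus_u,
                  dist_a=NormalDist(), dist_u=NormalDist()):
    """Rows index tau_A, columns index tau_U: beta + lam*Q_A(tau_A) + phi*Q_U(tau_U)."""
    return [
        [beta + lam * dist_a.inv_cdf(ta) + phi * dist_u.inv_cdf(tu) for tu in taus_u]
        for ta in taus_a
    ]


TAUS = (0.1, 0.3, 0.5, 0.7, 0.9)
# Hypothetical coefficients: lam < 0 means education and ability are substitutes.
matrix = impact_matrix(beta=0.08, lam=-0.02, phi=-0.01, taus_a=TAUS, taus_u=TAUS)
```

With λ < 0 the entries fall moving down a column (higher ability, lower return), and with φ < 0 they fall moving along a row.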
Output “Preview”
π(τA, τU)     τU = .10   . . .   τU = .50   . . .   τU = .90
τA = .10         ↔         ↔        ↔         ↔        ↔
. . .                               ↕
τA = .90                            ↕
by rows (↔): the table shows how returns vary over the distribution of wages for a given level of ability/education (τA)
by columns (↕): the table shows how returns vary across different ability levels for a given level of labour market fortune/wages (τU)
In Practice
Chesher (2003); Ma & Koenker (2006)
1. We estimate conditional quantile models for a set of values of τA,
$Q_S(\tau_A \mid X, Z) = \alpha + \gamma_S X + \pi Z + Q_A(\tau_A)$,
and we compute the (first stage) residuals
$\hat{Q}_A(\tau_A) \equiv S - \hat{Q}_S(\tau_A \mid X, Z) \equiv S - (\hat{\alpha} + \hat{\gamma}_S X + \hat{\pi} Z)$
2. We estimate the (hybrid) conditional quantile model for a set of values of τU for each τA using a control variate approach
$Q_W(\tau_U \mid Q_S(\tau_A \mid X, Z), X, Z) = \alpha + S\,\pi(\tau_A, \tau_U) + \gamma_W X + Q_U(\tau_U) + \varphi_1 \hat{Q}_A(\tau_A) + \varphi_2 \hat{Q}_A(\tau_A) S$
$\varphi_1$ and $\varphi_2$ may be interpreted as the degree of endogeneity and exploited to test (local) endogeneity
Benchmark: from Ma & Koenker, 2006
Recalling that integrating the quantile function $F_X^{-1}(t)$ of a random variable, X, over the domain [0, 1], yields its expectation, that is,
$EX = \int_0^1 F_X^{-1}(t)\,dt$,
we can define a mean quantile treatment effect by integrating out $\tau_2$, and denoting $\mu_i = E\nu_i$,
$\pi_1(\tau_1) = \int_0^1 (\alpha_1 + \delta(F_1^{-1}(\tau_1) + \lambda F_2^{-1}(\tau_2)))\,d\tau_2 \equiv \alpha_1 + \delta F_1^{-1}(\tau_1) + \delta\lambda\mu_2$.
Averaging again, this time with respect to $\tau_1$ yields the mean treatment effect
$\pi_1 = \int_0^1 (\alpha_1 + \delta F_1^{-1}(\tau_1) + \delta\lambda\mu_2)\,d\tau_1 \equiv \alpha_1 + \delta\mu_1 + \delta\lambda\mu_2$.
This mean treatment effect would be what is estimated by the two-stage least-squares estimator in the pure location shift ($\delta = 0$) version of the model, but when the effects are more heterogeneous as in this location-scale shift model the structural quantile treatment effect $\pi_1(\tau_1, \tau_2)$ represents a deconstruction of the mean effect into its elementary components. Fig. 1 illustrates the three versions of the treatment effect $\pi_1(\tau_1, \tau_2)$, $\pi_1(\tau_1)$, and $\pi_1$ for a particular parametric instance of model (2.6)–(2.7).
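The two averaging steps can be reproduced numerically with a midpoint rule. The sketch below uses the parameter values quoted in the excerpt's Fig. 1 caption, (α1, δ, λ) = (10, 3, 2) with ν1 ∼ N(0, 1) and ν2 ∼ N(0, 0.5); reading 0.5 as a standard deviation is an assumption made here, and it does not affect the averages because both means are zero.

```python
from statistics import NormalDist

ALPHA1, DELTA, LAM = 10.0, 3.0, 2.0
NU1 = NormalDist(0.0, 1.0)
NU2 = NormalDist(0.0, 0.5)  # 0.5 treated as a standard deviation (assumption)


def qte(t1, t2):
    """Structural quantile treatment effect pi_1(tau_1, tau_2)."""
    return ALPHA1 + DELTA * (NU1.inv_cdf(t1) + LAM * NU2.inv_cdf(t2))


def midpoints(m):
    """Midpoints of m equal cells on (0, 1)."""
    return [(k + 0.5) / m for k in range(m)]


def mean_qte(t1, m=200):
    """Integrate tau_2 out of pi_1(tau_1, tau_2) with a midpoint rule."""
    return sum(qte(t1, t2) for t2 in midpoints(m)) / m


def mean_te(m=200):
    """Integrate tau_1 out of the mean quantile treatment effect."""
    return sum(mean_qte(t1, m) for t1 in midpoints(m)) / m
```

Because both disturbances have mean zero, `mean_qte(t1)` should reproduce 10 + 3·F1⁻¹(τ1) and `mean_te()` should return approximately 10, the values stated in the figure caption.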
2.2. Estimation of structural quantile treatment effects
In this section, we will describe two general classes of estimators for the parametricrecursive structural model
Y i1 ¼ j1ðY i2;xi; ni1; ni2; aÞ, (2.10)
Y i2 ¼ j2ðzi;xi; ni2; bÞ. (2.11)
We will maintain our assumptions on the nij’s and the functions j1 and j2 and wewill explicitly assume that the functions j1 and j2 are known up to the finite-
[Figure 1: three surface plots over (tau1, tau2). Left panel: Mean Treatment Effect. Middle panel: Mean Quantile Treatment Effect. Right panel: Quantile Treatment Effect.]
Fig. 1. Quantile treatment effects for the structural model: the figure illustrates three different notions of the structural treatment effect for the linear location-scale structural equation model (2.6)–(2.7) with $(\alpha_1, \alpha_2, \delta, \lambda) = (10, 4, 3, 2)$, $(\beta_1, \beta_2, \gamma) = (1, 2, 3)$, $\nu_1 \sim N(0, 1)$, $\nu_2 \sim N(0, 0.5)$. The left figure depicts $\pi_1 = 10$, the mean treatment effect; the middle figure shows $\pi_1(\tau_1) = 10 + 3F_1^{-1}(\tau_1)$, the mean quantile treatment effect; the right figure shows $\pi_1(\tau_1, \tau_2) = 10 + 3(F_1^{-1}(\tau_1) + 2F_2^{-1}(\tau_2))$, the general quantile treatment effect.
L. Ma, R. Koenker / Journal of Econometrics 134 (2006) 471–506, p. 477
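As a numerical check on the caption's formulas, the following sketch integrates the quantile treatment effect of Fig. 1 over τ2, and then over τ1, with a midpoint rule. The parameter values come from the caption; the function names, the grid sizes and the reading of N(0, 0.5) as a variance of 0.5 are assumptions (the averaged effects do not depend on that last choice, because ν2 has mean zero either way).

```python
from statistics import NormalDist

# Values from the Fig. 1 caption: alpha_1 = 10, delta = 3, lambda = 2
ALPHA1, DELTA, LAM = 10.0, 3.0, 2.0
F1 = NormalDist(mu=0.0, sigma=1.0)
F2 = NormalDist(mu=0.0, sigma=0.5 ** 0.5)   # assumption: 0.5 is the variance


def qte(tau1, tau2):
    """Structural quantile treatment effect pi_1(tau1, tau2)."""
    return ALPHA1 + DELTA * (F1.inv_cdf(tau1) + LAM * F2.inv_cdf(tau2))


def midpoints(n):
    """Midpoint-rule nodes on (0, 1); they avoid the endpoints 0 and 1."""
    return [(k + 0.5) / n for k in range(n)]


def mean_qte(tau1, n=2000):
    """Mean quantile treatment effect pi_1(tau1): integrate out tau2."""
    return sum(qte(tau1, t2) for t2 in midpoints(n)) / n


def mean_effect(n=200):
    """Mean treatment effect pi_1: integrate out tau2, then tau1."""
    return sum(mean_qte(t1, n) for t1 in midpoints(n)) / n


for tau1 in (0.1, 0.5, 0.9):
    closed_form = ALPHA1 + DELTA * F1.inv_cdf(tau1)   # 10 + 3 F_1^{-1}(tau1)
    print(f"tau1={tau1}: numeric {mean_qte(tau1):.4f}, closed form {closed_form:.4f}")
print(f"mean treatment effect: {mean_effect():.4f}")
```

Because both disturbances have mean zero in this parametric instance, the δλμ2 and δμ1 terms vanish and the two averages reduce to 10 + 3F1⁻¹(τ1) and 10, as stated in the caption.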
Effect of the Changes in MSLA on the Distribution of Years of Education
Males      τA = 0.10   τA = 0.30   τA = 0.50   τA = 0.70   τA = 0.90
Coeff.     .354∗∗∗     .056∗∗∗     .120∗∗∗     .078∗∗      .026
F-test     2146.6      19.1        307.6       4.86        0.13
(p-val.)   (.000)      (.000)      (.000)      (.027)      (.714)
Females    τA = 0.10   τA = 0.30   τA = 0.50   τA = 0.70   τA = 0.90
Coeff.     .416∗∗∗     .284∗∗∗     .072∗∗∗     .219∗∗∗     .135∗∗∗
F-test     643.8       195.4       88.7        57.4        4.26
(p-val.)   (.000)      (.000)      (.000)      (.000)      (.039)
τA denotes quantiles of the years of schooling distribution.
Three stars and two stars denote coefficients statistically significant at the 1% and 5% level, respectively.
π(τA, τU), τA = 0.3, Males
[Figure: marginal return to schooling with approx. 95% CI, plotted against the quantile of labour market fortune/wages (.1 to .9); vertical axis from .03 to .08.]
Evidence of heterogeneity in returns, which tend to decrease as one moves from the bottom to the top deciles of the distribution of ln W. Similar pattern for other values of τA.
π(τA, τU), τA = 0.3, Females
[Figure: marginal return to schooling with approx. 95% CI, plotted against the quantile of labour market fortune/wages (.1 to .9); vertical axis from .06 to .1.]
Evidence of heterogeneity in returns, which tend to decrease as one moves from the bottom to the top deciles of the distribution of ln W. Similar pattern for other values of τA.
π(τA, τU), τU = 0.5, Males
[Figure: marginal return to schooling with approx. 95% CI, plotted against the quantile of ability/schooling (.1 to .9); vertical axis from .03 to .07.]
Evidence of heterogeneity in returns, which tend to decrease as one moves from the lower to the higher levels of A. Similar pattern for other values of τU.
π(τA, τU), τU = 0.5, Females
[Figure: marginal return to schooling with approx. 95% CI, plotted against the quantile of ability/schooling (.1 to .9); vertical axis from .05 to .09.]
Evidence of heterogeneity in returns, which tend to decrease as one moves from the lower to the higher levels of A. Similar pattern for other values of τU.
π(τA, τU) = β + λ QA(τA) + φ QU(τU)
             Males                       Females
             (1)          (2)            (3)          (4)
β            0.051∗∗∗     0.050∗∗∗       0.070∗∗∗     0.072∗∗∗
             (.0015)      (.0026)        (.0009)      (.0013)
λ            −0.0021∗∗∗   −0.0022∗∗∗     −0.0025∗∗∗   −0.0021∗∗∗
             (.0004)      (.0008)        (.0003)      (.0005)
φ            −0.0089∗∗∗   −0.013∗∗∗      −0.0091∗∗∗   −0.0119∗∗∗
             (.003)       (.0032)        (.0017)      (.0016)
R squared    0.680        0.692          0.856        0.836
Col. (1) and (3) are estimates based on the 25 estimated returns π(τA, τU), τA, τU ∈ {0.1, 0.3, 0.5, 0.7, 0.9}. Col. (2) and (4) are based on excluding τA, τU ∈ {0.7, 0.9} and retaining 15 estimated returns. The regressors $Q_A(\tau_A) \equiv G_A^{-1}(\tau_A)$ and $Q_U(\tau_U) \equiv G_U^{-1}(\tau_U)$ are computed using the deciles of the ecdf of the 1st stage and 2nd stage residuals.
Implications for Conditional Wage Inequality
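$\delta_{\tau_2 - \tau_1} \equiv \frac{\partial Q_Y(\tau_2, X, S, A)}{\partial S} - \frac{\partial Q_Y(\tau_1, X, S, A)}{\partial S} \equiv \pi(\tau_A, \tau_2) - \pi(\tau_A, \tau_1)$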
Males δ30−10 δ50−10 δ70−10 δ90−10
τA = 0.10 −0.0165 −0.0193 −0.0198∗∗ −0.0150
τA = 0.30 −0.0149∗∗ −0.0163∗∗ −0.0206∗∗ −0.0122
τA = 0.50 −0.0172∗∗ −0.0187∗∗∗ −0.0233∗∗∗ −0.0196∗∗
Three stars, two stars and one star for statistically significant coefficients at
the 1%, 5%, and 10% confidence level (bootstrap; 100 replications).
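The differences δ can be recomputed directly from point estimates of π(τA, τU). The sketch below uses the male estimates reported in the π(τA, τU) table of this deck for τA = 0.10, 0.30, 0.50; the dictionary layout and the function name are illustrative assumptions, and a recomputed value can differ from the slide in the last digit because the inputs are already rounded to four decimals.

```python
# Male point estimates of pi(tau_A, tau_U), copied from the deck's table;
# columns are tau_U = 0.10, 0.30, 0.50, 0.70, 0.90.
TAU_U = (0.10, 0.30, 0.50, 0.70, 0.90)
PI_MALES = {
    0.10: (0.0748, 0.0583, 0.0555, 0.0550, 0.0598),
    0.30: (0.0625, 0.0476, 0.0462, 0.0420, 0.0503),
    0.50: (0.0665, 0.0492, 0.0478, 0.0432, 0.0469),
}


def inequality_effects(pi_row, base=0.10):
    """delta_{tau2 - base} = pi(tau_A, tau2) - pi(tau_A, base) for every tau2 above base."""
    ref = pi_row[TAU_U.index(base)]
    return {t2: round(p - ref, 4) for t2, p in zip(TAU_U, pi_row) if t2 > base}


for tau_a, row in PI_MALES.items():
    deltas = inequality_effects(row)
    print(f"tau_A={tau_a:.2f}:", "  ".join(f"{d:+.4f}" for d in deltas.values()))
```

A negative δ means the return to schooling is lower at the higher wage quantile, so an extra year of schooling compresses the conditional wage distribution between the two quantiles.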
Implications for Conditional Wage Inequality
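$\delta_{\tau_2 - \tau_1} \equiv \frac{\partial Q_Y(\tau_2, X, S, A)}{\partial S} - \frac{\partial Q_Y(\tau_1, X, S, A)}{\partial S} \equiv \pi(\tau_A, \tau_2) - \pi(\tau_A, \tau_1)$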
Females δ30−10 δ50−10 δ70−10 δ90−10
τA = 0.10 −0.0172∗∗∗ −0.0164∗∗∗ −0.0132∗ −0.0193∗∗
τA = 0.30 −0.0137∗∗ −0.0125∗∗ −0.0108 −0.0136
τA = 0.50 −0.0168∗∗∗ −0.0158∗∗ −0.0140∗∗ −0.0201∗∗
τA = 0.70 −0.0116∗∗ −0.0101∗∗ −0.0074 −0.0077
Three stars, two stars and one star for statistically significant coefficients at
the 1%, 5%, and 10% confidence level (bootstrap; 100 replications).
Country    logW    S        YCOMP   Age      %Males   Nobs
Austria    2.220   12.181   8.767   50.900   0.492    920
Belgium    2.470   14.887   9.782   33.125   0.465    853
Denmark    2.798   13.667   8.030   44.186   0.477    2235
Finland    2.366   15.153   7.511   37.151   0.496    1409
France     2.399   13.410   9.017   47.074   0.525    1293
Germany    2.439   12.127   8.620   45.649   0.590    1690
Greece     2.005   12.929   7.509   38.270   0.562    984
Ireland    2.265   12.356   8.534   39.331   0.574    1260
Italy      2.367   12.556   7.097   49.066   0.590    1762
Netherl.   2.574   14.166   9.445   37.702   0.592    1294
Spain      2.116   11.049   7.099   43.136   0.626    2284
Sweden     2.328   12.197   8.465   50.410   0.480    2344
Effect of the Changes in MSLA on the Distribution of Years of Education
[Figure: Synthetic Europe-12, Females. Horizontal axis: years of schooling estimated cdf, ‘1st stage’ τ (0.2 to 0.8); vertical axis: quantile (16 to 24); lines for low ycomp (6 years) and high ycomp (8 years).]
Blue solid line: 8 years of compulsory schooling; red dashed line: 6 years of compulsory schooling.
Effect of the Changes in MSLA on the Distribution of Years of Education
[Figure: Synthetic Europe-12, Males. Horizontal axis: years of schooling estimated cdf, ‘1st stage’ τ (0.2 to 0.8); vertical axis: quantile (20 to 35); lines for low ycomp (6 years) and high ycomp (8 years).]
Blue solid line: 8 years of compulsory schooling; red dashed line: 6 years of compulsory schooling.
Association between Education and Wages over the distribution of wages
Coef. (se)   τW = 0.10   τW = 0.30   τW = 0.50   τW = 0.70   τW = 0.90
Males        .019∗∗∗     .026∗∗∗     .033∗∗∗     .035∗∗∗     .039∗∗∗
             (.002)      (.001)      (.001)      (.001)      (.002)
Females      .027∗∗∗     .037∗∗∗     .043∗∗∗     .050∗∗∗     .051∗∗∗
             (.003)      (.001)      (.001)      (.001)      (.002)
τW denotes quantiles of the log wage distribution.
Three stars, two stars and one star denote coefficients statistically significant at the 1%, 5%, and 10% level, respectively.
π(τA, τU) Males
Males        τU = 0.10      τU = 0.30      τU = 0.50      τU = 0.70      τU = 0.90
τA = 0.10    .0748 (.004)   .0583 (.004)   .0555 (.003)   .0550 (.004)   .0598 (.006)
τA = 0.30    .0625 (.007)   .0476 (.004)   .0462 (.003)   .0420 (.005)   .0503 (.006)
τA = 0.50    .0665 (.006)   .0492 (.004)   .0478 (.004)   .0432 (.004)   .0469 (.006)
τA = 0.70    .0486 (.006)   .0396 (.004)   .0448 (.004)   .0411 (.004)   .0471 (.005)
τA = 0.90    .0468 (.006)   .0329 (.004)   .0384 (.003)   .0332 (.004)   .0452 (.006)
τU denotes quantiles of the labour market fortune/wages distribution.
τA denotes quantiles of the ability/years of schooling distribution.
Bootstrapped standard errors (100 replications) in parentheses.
All statistically significant at the 1% level.
π(τA, τU) Females
Females      τU = 0.10      τU = 0.30      τU = 0.50      τU = 0.70      τU = 0.90
τA = 0.10    .0952 (.007)   .0780 (.004)   .0788 (.004)   .0820 (.005)   .0759 (.007)
τA = 0.30    .0838 (.007)   .0701 (.003)   .0713 (.003)   .0730 (.004)   .0702 (.006)
τA = 0.50    .0847 (.006)   .0679 (.004)   .0690 (.003)   .0707 (.004)   .0646 (.006)
τA = 0.70    .0689 (.005)   .0573 (.003)   .0588 (.003)   .0615 (.003)   .0612 (.005)
τA = 0.90    .0631 (.006)   .0502 (.003)   .0527 (.003)   .0555 (.004)   .0567 (.006)
τU denotes quantiles of the labour market fortune/wages distribution.
τA denotes quantiles of the ability/years of schooling distribution.
Bootstrapped standard errors (100 replications) in parentheses.
All statistically significant at the 1% level.
Another example: class size (D) and students’ achievement (Y)
Conventional wisdom: “class size reduction is a viable means to increase
scholastic achievement”
Issue: is there evidence supporting the claim?
Coleman Report (1966): schooling inputs have negligible effects on students’
achievement
Hanushek (1986): mixed findings; modest positive effects of class size
reduction
Krueger (1997): different effects on boys & girls, blacks & whites, inner-city
& out-of-city students
Lazear (2001): theoretical framework in which optimal class size wrt
scholastic achievement differs btw students that behave well or not
Recent research stresses the role of cheating and class composition to explain
mixed evidence and small class size effects
Example: class size (D) and students’ achievement (Y ) cont’d
Levin’s (2001) findings: mixed, no strong evidence of heterogeneity wrt
students’ achievement (standard QR); positive effects at
low achievement levels (2SLAD) → ‘peer effects’,
‘targeted instruction’; evidence of non random selection
of less (more) able students to larger (smaller) classes;
composition of the class (IQ) matters more than size, the effect
is smaller as one moves up in the achievement distribution
Ma & Koenker’s (2006) findings: negligible effects for the average student;
positive effects on language performance and negative
effects for math perf. for lower attainment students;
negative effects on language performance for high attainment
students, negligible for math (causal chain model)
Remark: Levin and Ma & Koenker use the same data but different models.
Levin and Ma & Koenker Study: Some Details
Data: 1st wave from a longitudinal survey with info on Dutch
pupils enrolled in grades 2, 4 (aged 7-8), 6 (aged 9-10), 8 (aged
11-12) in 1994-1995. Variables: test scores of students
wrt intelligence, reading abilities, language (Dutch), mathematics;
background data (parents & teachers); detailed (administrative)
school level data. Sample size: 57,000 pupils; 700 schools
Instrument: weighted school enrollment (WSE), the parameter
according to which the Ministry allocates funding to schools.
The funding determines the number of teachers hired.
WSE is a weighted average of total school enrollment, weighted by the socio-economic
status of the students enrolled in the school.
Levin and Ma & Koenker Study: the IV
$z_i \equiv WSE_i = 1.03 \max\left\{\sum_{j=1}^{n_i} s_{ij} - 0.9\,n_i,\; n_i\right\}$
n_i: enrollment; i: school; j: student; s_ij ∈ {1 (ref.), 1.25, 1.4, 1.7, 1.9 (worst)}
Z varies between schools, not within
Z is distinct from school size (Ma & Koenker, 2006)
Class size has more variability in bigger schools but does not increase with
school size (Ma & Koenker, 2006)
Levin, 2001 Empirical Economics, Math
[Table from Levin (2001): estimates for math, reported by grade; table body not recoverable.]
Levin, 2001 Empirical Economics, Language
[Table from Levin (2001): estimates for language, reported by grade; table body not recoverable.]
Ma & Koenker, 2006 Journal of Econometrics
language and math scores is provided in Figs. 5 and 6, respectively. In the left panel we depict the conventional two-stage least-square estimate of the mean shift effect of class size viewed as a constant function of τ1 and τ2. In the middle panel we show what we have called the mean quantile treatment effect obtained by integrating out the τ2 effect from the WAD estimate, δ(τ1, τ2), of the structural class-size effect. In the right panel we present δ(τ1, τ2).
The two-stage least-square estimate of the class-size effect is −0.07 with a standard error of 0.20, a finding consistent with many other unsuccessful attempts to discern a significant effect of class size. However, our estimates of the mean quantile treatment effect of class size in the middle panel reveals a somewhat more nuanced view. Both math and language plots show a positive effect of around 0.7 at low quantiles and falling gradually to about −0.5 at the upper quantiles, suggesting that poorer students benefit from larger classes, while better students do better in smaller classes. Further disaggregating, the plots in the right panel indicate dispersion in the class-size effect in both the τ1 and τ2 directions, but the picture is roughly similar: positive effects at the lower quantiles of test scores, and negative effects at the upper
[Figs. 5 and 6: each figure has three surface plots of delta over (tau1, tau2). Left panel: Mean Treatment Effect. Middle panel: Mean Quantile Treatment Effect, delta(tau1). Right panel: Quantile Treatment Effect, delta(tau1, tau2).]
Fig. 5. Structural class-size effects for language: τ1-students achievement, τ2-class size.
Fig. 6. Structural class-size effects for math: τ1-students achievement, τ2-class size.
L. Ma, R. Koenker / Journal of Econometrics 134 (2006) 471–506, p. 498
For low achievers, larger classes improve language performance; smaller classes improve math performance
For average students, no class size effects
For high achievers, smaller classes are better for language; no effects on math
Back to Theory (with this Example in Mind)
In the example, Y (test scores, achievement), D (class size), Z (WSE) are
continuous variables
The LATE approach does not apply
The IV-QTE model requires that, conditional on Z, X, the relative position
of an individual in the achievement distribution is not affected by class size:
the ‘best’ student in a small class is the ‘best’ student in a big class.
LATE and IV-QTE focus on the QTEs and at most reveal the
heterogeneity of the impact of class size at different levels of achievement.
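The rank condition can be illustrated with a toy simulation. In the sketch below each student has one latent rank U, and the potential test scores under a small and a big class are both increasing functions of that same U, so the ordering of students is identical in the two states while the effect of class size still varies along the distribution. The location-scale form, the numbers and all names are hypothetical; this illustrates the assumption and is not a test of it.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()


def potential_score(class_size, u):
    """Hypothetical potential outcome q(d, u), strictly increasing in the rank u."""
    location = 50.0 - 0.2 * class_size      # assumed mean shift from class size
    scale = 10.0 + 0.1 * class_size         # assumed spread, positive for any size
    return location + scale * STD_NORMAL.inv_cdf(u)


def ranks(values):
    """Rank of each element (0 = lowest), assuming no ties."""
    order = sorted(range(len(values)), key=values.__getitem__)
    out = [0] * len(values)
    for rank, idx in enumerate(order):
        out[idx] = rank
    return out


random.seed(1)
u = [random.uniform(0.001, 0.999) for _ in range(500)]     # latent ranks
y_small = [potential_score(15, ui) for ui in u]            # class of 15 pupils
y_big = [potential_score(30, ui) for ui in u]              # class of 30 pupils

same_ranks = ranks(y_small) == ranks(y_big)
# Quantile treatment effect of moving from 15 to 30 pupils at three quantiles
effects = {q: potential_score(30, q) - potential_score(15, q) for q in (0.1, 0.5, 0.9)}
print("ranks preserved:", same_ranks)
print({q: round(e, 2) for q, e in effects.items()})
```

If the scale term made class size reorder students (for example through a second, independent disturbance), the two rank vectors would differ and the quantile effects could no longer be read as effects for a given individual.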
Levin vs Ma & Koenker Research Questions
Levin: ‘how does mean class size affect the distribution of academic outcomes?’
Addressing Levin’s question may reveal heterogeneity of class size
effects over the distribution of students’ achievement but
cannot reveal heterogeneity wrt the distribution of class size
Ma & Koenker: ‘Is there any heterogeneity of the class size effect over the
distribution of academic outcomes and the distribution of class
sizes?’
In any case we look at effects on the conditional distribution of the outcome
Sometimes you may need controls for identification but the research question is on
the marginal distribution . . .
This can be done, but we did not cover the tools to address those issues here
References
Brunello et al. (2009) ‘Changes in Compulsory Schooling, Education and the Distribution of Wages in Europe’, Economic Journal, Vol. 119, pp. 516-539
Chesher, A. (2003) ‘Identification in Nonseparable Models’, Econometrica, Vol. 71, pp. 1405-1441 and the 2001 WP version of the paper!
Chesher, A. (2005) ‘Nonparametric Identification under Discrete Variation’, Econometrica, Vol. 73 (5), pp. 1525-1550
Koenker, R. (2005) Chapter 8 (Section 8.8)
Levin (2001) ‘For Whom the Reductions Count: A Quantile Regression Analysis of Class Size and Peer Effects on Scholastic Achievement’, Empirical Economics, Vol. 26, pp. 221-246
Ma & Koenker (2006) ‘Quantile Regression Methods for Recursive Structural Equation Models’, Journal of Econometrics, Vol. 134 (2), pp. 471-506
Materials for this lecture are also based on lectures by R. Spady at EUI (2006) and the talk by A. Chesher at the 9th World Congress of the Econometric Society