regression discontinuity: advanced topicsusers.nber.org/~rdehejia/!@$aem/topic 07 rd... ·...

Regression Discontinuity: Advanced Topics NYU Wagner Rajeev Dehejia


Post on 26-Dec-2019


Regression Discontinuity: Advanced Topics

NYU Wagner
Rajeev Dehejia

Summary of RD assumptions

• The treatment is determined at least in part by the assignment variable

• There is a discontinuity in the level of treatment at some cutoff value of the assignment variable (selection on observables at the cutpoint)

• Units cannot precisely manipulate the assignment variable to influence whether they receive the treatment or not

• Other variables that affect the treatment do not change discontinuously at the cutoff

Assessing the validity of an RD

• It is impossible to test the continuity assumption directly, but we can test some implications of it

• Namely, all observed predetermined characteristics should have identical distributions on either side of the cutoff, in the limit, as we approach smaller and smaller bandwidths. That is, there should be no discontinuities in the observables.

• Again there is an analogy to an experiment: we cannot test whether unobserved characteristics are balanced, but we can test the observables. Rejection calls the randomization into question.

Internal & external validity

• The strength of the RD design is its internal validity, arguably the strongest of any quasi-experimental design

• External validity may be limited

• Sharp RD (SRD) provides estimates for the subpopulation with X = c, that is, those right at the cutoff of the assignment variable.

– The discontinuity is a weighted average treatment effect where weights are proportional to the ex ante likelihood that an individual’s realization of X will be close to the threshold.

• Fuzzy RD (FRD) restricts the estimates further to compliers at the cutoff (more on this below)

• You need to justify extrapolation to other subpopulations (e.g., treatment homogeneity)

Threats to an RD analysis

There are three general types of threats to an RD design:

1. Other variables change discontinuously at the cutoff
– Test for jumps in covariates, including pretreatment values of the outcome and the treatment

2. There are discontinuities at other values of the assignment variable

3. Manipulation of the assignment variable
– Test for continuity in the density of the assignment variable at the cutoff

Specification checks

A. Discontinuities in Average Covariates

B. A Discontinuity in the Distribution of the Forcing Variable

C. Discontinuities in Average Outcomes at Other Values

D. Sensitivity to Bandwidth Choice

E. Fuzzy RD Design

F. Extension: Regression Kink Design

A. Discontinuities in average covariates

Test the null hypothesis of a zero average effect on pseudo-outcomes known not to be affected by the treatment.

Such variables include covariates that are by definition not affected by the treatment. Such tests are familiar from settings with identification based on unconfoundedness assumptions.

Although not required for the validity of the design, in most cases, the reason for the discontinuity in the probability of the treatment does not suggest a discontinuity in the average value of covariates. If we find such a discontinuity, it typically casts doubt on the assumptions underlying the RD design.
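A covariate balance check of this kind can be sketched as a local linear regression of a predetermined covariate on the forcing variable, with a separate slope on each side of the cutoff. The sketch below is illustrative only: the data are simulated, the bandwidth of 0.5 is picked by hand, and the function name `rd_jump` is hypothetical rather than taken from any package.

```python
import numpy as np

def rd_jump(x, y, c, h):
    """Local linear estimate of the jump in E[y | x] at c (triangular kernel).

    Returns (jump, heteroskedasticity-robust standard error).
    """
    u = np.asarray(x, float) - c
    y = np.asarray(y, float)
    w = np.clip(1.0 - np.abs(u) / h, 0.0, None)    # triangular kernel weights
    keep = w > 0
    u, y, w = u[keep], y[keep], w[keep]
    z = (u >= 0).astype(float)                      # 1 at or above the cutoff
    X = np.column_stack([np.ones_like(u), z, u, z * u])
    bread = np.linalg.inv(X.T @ (X * w[:, None]))
    beta = bread @ (X.T @ (w * y))                  # weighted least squares
    resid = y - X @ beta
    meat = X.T @ (X * ((w * resid) ** 2)[:, None])
    cov = bread @ meat @ bread                      # sandwich covariance
    return beta[1], np.sqrt(cov[1, 1])

# Simulated example (assumption): one covariate that is smooth through the
# cutoff and one that jumps by 2 at the cutoff.
rng = np.random.default_rng(0)
n = 4000
x = rng.uniform(-1, 1, n)
balanced = 1.0 + 0.8 * x + rng.normal(0, 1, n)
imbalanced = balanced + 2.0 * (x >= 0)

jump_b, se_b = rd_jump(x, balanced, c=0.0, h=0.5)
jump_i, se_i = rd_jump(x, imbalanced, c=0.0, h=0.5)
print(f"balanced covariate:   jump = {jump_b:.3f}, t = {jump_b / se_b:.2f}")
print(f"imbalanced covariate: jump = {jump_i:.3f}, t = {jump_i / se_i:.2f}")
```

A large t-statistic on a covariate that cannot be affected by the treatment is the warning sign described above. In practice the same regression would be repeated for every available predetermined covariate.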

A. Balance checks

• Lee (2008) uses the regression discontinuity design to estimate party incumbency advantage in U.S. House elections.

[Figure III, from "Do Voters Affect or Elect Policies?", p. 835: Similarity of Constituents' Characteristics in Bare Democrat and Republican Districts, Part 1. Four panels, each plotted against Democrat Vote Share at time t.]

Panels refer to (from top left to bottom right) the following district characteristics: real income, percentage with high-school degree, percentage black, percentage eligible to vote. Circles represent the average characteristic within intervals of 0.01 in Democrat vote share. The continuous line represents the predicted values from a fourth-order polynomial in vote share fitted separately for points above and below the 50 percent threshold. The dotted line represents the 95 percent confidence interval.

... share. The coefficient reported in column (6) is the predicted difference at 50 percent. The table confirms that, for many observable characteristics, there is no significant difference in a close neighborhood of 50 percent. One important exception is the percentage black, for which the magnitude of the discontinuity is statistically significant.23 As a consequence, estimates of the coefficients in Table I from regressions that include these covariates would be expected to produce similar results, as in a randomized experiment, since ...

23. This is due to few outliers in the outer part of the vote share range. When the polynomial is estimated including only districts with vote share between 25 percent and 75 percent, the coefficients becomes insignificant. The gap for percent urban and open seats, while not statistically significant at the 5 percent level, is significant at the 10 percent level.

A. Balance checks

• Lee (2008) uses the regression discontinuity design to estimate party incumbency advantage in U.S. House elections.

[Figure IV, from the Quarterly Journal of Economics, p. 836: Similarity of Constituents' Characteristics in Bare Democrat and Republican Districts, Part 2. Four panels, each plotted against Democrat Vote Share at time t.]

Panels refer to (from top left to bottom right) the following district characteristics: voting population, North, South, West. Circles represent the average characteristic within intervals of 0.01 in Democrat vote share. The continuous line represents the predicted values from a fourth-order polynomial in vote share fitted separately for points above and below the 50 percent threshold. The dotted line represents the 95 percent confidence interval.

... all predetermined characteristics appear to be orthogonal to Dt. We have reestimated all the models in Table I conditioning on all of the district characteristics in Table II, and found estimates that are virtually identical to the ones in Table I.

As a similar empirical test of our identifying assumption, in Figure V we plot the ADA scores from the Congressional sessions that preceded the determination of the Democratic two-party vote share in election t. Since these past scores have already been determined by the time of the election, it is yet another predetermined characteristic (just like demographic composition, income levels, etc.). If the RD design is valid, then we should observe no discontinuity in these lagged ADA scores, just as we would expect, in a randomized experiment, to see no systematic differences in any variables determined prior to the experiment. The ...

A. Balance checks

• Lee (2008) uses the regression discontinuity design to estimate party incumbency advantage in U.S. House elections.

• However … more qualifications soon....

B. Sorting/bunching/manipulation

• Subjects or program administrators may invalidate the continuity assumption if they strategically manipulate X to be just above or below the cutoff.

• This is a concern especially if the exact value of the cutoff is known to the subjects in advance.

• This type of behavior, if it exists, may create a discontinuity in the distribution of X at the cutoff (i.e., “bunching” to the right or to the left of the cutoff)

B. Manipulation

• If individuals have control over the assignment variable, then we should expect them to sort into (out of) treatment if treatment is desirable (undesirable)
– Think of a means-tested income support program, or an election
– Those just above the threshold will be a mixture of those who would have passed and those who barely failed without manipulation.

• If individuals have precise control over the assignment variable, we would expect the density of X to be zero just below the threshold but positive just above the threshold (assuming the treatment is desirable).
– McCrary (2008) provides a formal test for manipulation of the assignment variable in an RD. The idea is that the marginal density of X should be continuous without manipulation and hence we look for discontinuities in the density around the threshold.
– How precise must the manipulation be in order to threaten the RD design? See Lee and Lemieux (2010).

• This means that when you run an RD you must know something about the mechanism generating the assignment variable and how susceptible it could be to manipulation

B. A discontinuity in the distribution of the forcing variable

McCrary (2007) suggests testing the null hypothesis of the continuity of the density of the covariate that underlies the assignment at the discontinuity point, against the alternative of a jump in the density function at that point.

Again, in principle, the design does not require continuity of the density of X at c, but a discontinuity is suggestive of violations of the no-manipulation assumption.

If in fact individuals partly manage to manipulate the value of X in order to be on one side of the boundary rather than the other, one might expect to see a discontinuity in this density at the discontinuity point.

B. Example of manipulation

• An income support program in which those earning under $14,000 qualify for support
• Simulated data from McCrary 2008

B. Discontinuity of the forcing variable (cont’d)

• The doctor randomly assigns patients to two different waiting rooms, A and B, and plans to give those in A the statin and those in B the placebo. If some of the patients learn of the planned treatment assignment mechanism, we would expect them to proceed to waiting room A.

• We would expect waiting room A to become crowded.

• In the regression discontinuity context, this is analogous to expecting the running variable to be discontinuous at the cutoff, with surprisingly many individuals just barely qualifying for a desirable treatment assignment and surprisingly few failing to qualify.

B. Discontinuity of the forcing variable (cont’d)

• Partial manipulation occurs when the running variable is under the agent’s control, but also has an idiosyncratic element (e.g., can manipulate test score, but only imperfectly).

• Typically, partial manipulation of the running variable does not lead to identification problems (analogous to fuzzy RD).

• Complete manipulation occurs when the running variable is entirely under the agent’s control.

• Typically, complete manipulation of the running variable does lead to identification problems.

B. Test of discontinuity of the forcing variable (cont’d)

• The density test may not be informative unless the existence of the program induces agents to adjust the running variable in one direction only.
– Intuition: If you have sorting in both directions, the effects could cancel out.

• The density test could also fail, even when there is no failure of identification, but it is still often a useful test.

The test:

1. Estimate a very under-smoothed histogram. The bins for the histogram are defined carefully enough that no one histogram bin includes points both to the left and right of the point of discontinuity.

2. Estimate a local linear smoothing of the histogram. The midpoints of the histogram bins are treated as a regressor, and the normalized counts of the number of observations falling into the bins are treated as an outcome variable.

3. Use local linear regression estimates from step 2 to test for discontinuity.
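The three steps above can be sketched in a few lines. This is a simplified illustration, not McCrary's implementation: the bin width and bandwidth are chosen by hand rather than by his data-driven rule, the function name `density_jump` is hypothetical, and the standard error uses the asymptotic approximation for a triangular kernel as reported in McCrary (2008), which should be checked against the paper before any serious use.

```python
import numpy as np

def density_jump(x, c, bin_width, h):
    """Log difference in the density of x just right vs. just left of c."""
    x = np.asarray(x, float)
    n = x.size

    # Step 1: under-smoothed histogram whose bins never straddle the cutoff
    k_lo = int(np.floor((x.min() - c) / bin_width))
    k_hi = int(np.floor((x.max() - c) / bin_width)) + 1
    edges = c + bin_width * np.arange(k_lo, k_hi + 1)   # c itself is an edge
    counts, _ = np.histogram(x, bins=edges)
    mids = edges[:-1] + bin_width / 2 - c                # midpoints, centred at c
    dens = counts / (n * bin_width)                      # normalized counts

    # Step 2: local linear smoothing of the histogram, separately on each side
    def boundary_fit(side):
        m, f = mids[side], dens[side]
        w = np.clip(1.0 - np.abs(m) / h, 0.0, None)      # triangular kernel
        X = np.column_stack([np.ones_like(m), m])
        sw = np.sqrt(w)
        beta, *_ = np.linalg.lstsq(X * sw[:, None], f * sw, rcond=None)
        return beta[0]                                   # fitted density at c

    f_right = boundary_fit(mids > 0)
    f_left = boundary_fit(mids < 0)

    # Step 3: test statistic based on the log difference in heights
    theta = np.log(f_right) - np.log(f_left)
    se = np.sqrt((24 / 5) * (1 / f_right + 1 / f_left) / (n * h))
    return theta, se

# Simulated example (assumption): half of the units just below the cutoff
# move themselves to just above it.
rng = np.random.default_rng(1)
x_clean = rng.uniform(-1, 1, 20000)
x_manip = x_clean.copy()
near = (x_manip > -0.1) & (x_manip < 0) & (rng.random(x_manip.size) < 0.5)
x_manip[near] = -x_manip[near]

theta_clean, se_clean = density_jump(x_clean, c=0.0, bin_width=0.02, h=0.3)
theta_manip, se_manip = density_jump(x_manip, c=0.0, bin_width=0.02, h=0.3)
print(f"no manipulation: theta = {theta_clean:.3f} (se {se_clean:.3f})")
print(f"manipulation:    theta = {theta_manip:.3f} (se {se_manip:.3f})")
```

Because the simulated sorting runs in one direction only, the jump in the density is easy to detect. Sorting in both directions would pull the estimate back toward zero, as noted in the intuition above.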

B. Test of discontinuity of the forcing variable (cont’d)

[Figure 4. Democratic Vote Share Relative to Cutoff: Popular Elections to the House of Representatives, 1900-1990. Horizontal axis: Democratic Margin (-1 to 1). Vertical axes: Frequency Count and Density Estimate.]

[Figure 5. Percent Voting Yeay: Roll Call Votes, U.S. House of Representatives, 1857-2004. Horizontal axis: Percent Voting in Favor of Proposed Bill (0 to 1). Vertical axes: Frequency Count and Density Estimate.]


RDD: Sorting/bunching (Camacho & Conover, 2010)

Example: Manipulation of a poverty index in Colombia. A poverty index is used to decide eligibility for social programs. The algorithm to create the poverty index becomes public during the second half of 1997.

C. Placebo tests

• Almond et al. (2010) use a medical definition of “very low birth weight”, <1500 grams, to estimate the effect of additional medical care on newborns.

• They find that newborns just below the 1500-gram cutoff receive additional treatment and survive with higher probability than newborns just above the cutoff.

• However, Barreca et al. (2010) find evidence of non-random rounding at 100-gram multiples of birth weight.

• Newborns of low socioeconomic status, who tend to be less healthy, are disproportionately represented at 100-gram multiples (balance check).

• As a result, newborns with birth weights just below or above each 100-gram multiple have more favorable mortality outcomes than newborns with birth weights at the cutoffs.

C. Placebo tests (Barreca et al., 2010)

[Figure I, Quarterly Journal of Economics, p. 2118: Means of Mortality Rates. Estimates are based on Vital Statistics Linked Birth and Infant Death Data, United States, 1983–2002 (not including 1992–1994). The lower panels of this figure (C, D) are disaggregated versions of ADKW’s Figure II.]

... numbers for convenience. In an effort to argue that the heaping around the 1,500-g threshold is “not irregular” and hence not of concern, they argue that similar heaps are found around 1,400 g and 1,600 g where individuals would have no incentive to act in a strategic manner. Using McCrary’s (2008) estimation strategy, they also appeal to the lack of a statistically significant estimate of the discontinuity in the distribution.

Nevertheless, it turns out that the 1,500-g heap is irregular in a critical fashion. In particular, those at this data heap have substantially higher mortality rates than surrounding observations on either side of the VLBW threshold. This feature of the data is demonstrated in Figure I, in which we illustrate unadjusted mean mortality rates across the distribution of birth weights around 1,500 g.1 In each of the four panels, documenting 24-hour through ...

1. Note that our Figure I is a disaggregated version of Figure II in ADKW.

C. Discontinuities in average outcomes at other values

Taking the subsample with $X_i < c$ we can test for a jump in the conditional mean of the outcome at the median of the forcing variable.

To implement the test, use the same method for selecting the bandwidth as before. Also estimate the standard errors of the jump and use this to test the hypothesis of a zero jump.

Repeat this using the subsample to the right of the cutoff point with $X_i \ge c$. Now estimate the jump in the regression function at $q_{X,1/2,r}$, the median of the forcing variable in this subsample, and test whether it is equal to zero.
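As a rough illustration of this placebo test, the sketch below splits simulated data at the true cutoff, takes the median of the forcing variable within each half, and estimates a local linear jump at each median. The data, the hand-picked bandwidth, and the helper name `rd_jump` are all assumptions made for the example.

```python
import numpy as np

def rd_jump(x, y, c, h):
    """Local linear jump in E[y | x] at c with a triangular kernel.

    Returns (jump, heteroskedasticity-robust standard error).
    """
    u = np.asarray(x, float) - c
    y = np.asarray(y, float)
    w = np.clip(1.0 - np.abs(u) / h, 0.0, None)
    keep = w > 0
    u, y, w = u[keep], y[keep], w[keep]
    z = (u >= 0).astype(float)
    X = np.column_stack([np.ones_like(u), z, u, z * u])
    bread = np.linalg.inv(X.T @ (X * w[:, None]))
    beta = bread @ (X.T @ (w * y))
    resid = y - X @ beta
    cov = bread @ (X.T @ (X * ((w * resid) ** 2)[:, None])) @ bread
    return beta[1], np.sqrt(cov[1, 1])

# Simulated data (assumption): a true jump of 1 at c = 0 and none elsewhere.
rng = np.random.default_rng(2)
n = 4000
x = rng.uniform(-1, 1, n)
y = 0.5 * x + 1.0 * (x >= 0) + rng.normal(0, 0.5, n)
c, h = 0.0, 0.25

left, right = x < c, x >= c
q_left = np.median(x[left])      # placebo cutoff inside the left subsample
q_right = np.median(x[right])    # placebo cutoff inside the right subsample

placebo_left = rd_jump(x[left], y[left], q_left, h)
placebo_right = rd_jump(x[right], y[right], q_right, h)
true_cut = rd_jump(x, y, c, h)

for name, (est, se) in [("left median", placebo_left),
                        ("right median", placebo_right),
                        ("true cutoff", true_cut)]:
    print(f"{name:12s}: jump = {est:.3f}, t = {est / se:.2f}")
```

Splitting at the true cutoff first keeps the real discontinuity out of the placebo regressions, so a significant placebo jump points to a problem with the specification rather than to the treatment.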

C. Discontinuities in average outcomes at other values

Example from DiNardo and Lee (2008)

[Figure IIIa: Recognition, Subsequent Certification or Decertification, by Union Vote Share.]

[Figure IIIb: Contract Expiration Notice Filed, Prior to and Postcertification or Decertification Election, by Union Vote Share.]

Note: Figure IIIa: Initial elections that take place between 1984–1995, 21405 observations. Point estimates and standard errors (in parentheses) are from a regression of the dependent variable on a fourth-order polynomial and a certification status dummy variable. Figure IIIb: Post-: Elections take place (1984–1995), 21405 and 3785 for certification and decertification elections, respectively. Prior: Elections take place (1987–1999), 21457 and 3445 observations.

(Source page: 1407, "Economic Impacts of New Unionization".)

C. Discontinuities in average outcomes at other values

Example from DiNardo and Lee (2008)

... if the sampling process follows the familiar form of incidental censoring as in

$$y^* = X\gamma + D\beta + \varepsilon, \qquad y = y^* \cdot 1[X\pi + D\delta + v > 0], \tag{4}$$

where the outcome $y^*$ is only observed if the employer remains in business. If $(\varepsilon, v, X)$ is independent of $D$, as in a randomized experiment, and if there is no impact of unionization on survival ($\delta = 0$), then there will be no sample selection bias.28 As argued above, unionization could be thought of as being randomly assigned (among close elections), and Figure IV is consistent with a zero impact on survival. In order to evaluate the plausibility of a zero impact, compared with, for example, a -0.04 effect that cannot be ruled out due to sampling error. Below, we present ...

28. As long as the impact of certification on survival is “monotonic,” the extent of the bias induced by analyzing a sample comprised solely of survivors is related to the extent of the differential survivor probability of near winners and near losers. Since in our application, this difference is small, the extent of the bias is also necessarily small. See Lee [2002], for example.

[Figure V: Log(Sales) and Log(Sales/worker), by Union Vote Share.]

(Source page: 1411, "Economic Impacts of New Unionization".)

D. Bandwidth selection and sensitivity

• There are two general methods for selecting the bandwidth
– Ad hoc, or substantively derived (e.g., elections between 48-52% are “close”)
– Data driven

• We discuss below

D. Bandwidth selection and sensitivity (cont’d)

For Polynomial Regression
– Choosing the order of the polynomial is analogous to the choice of bandwidth
– Informal approach: pick a reasonable number (e.g., 4th order)
– Two approaches

• Use the Akaike information criterion (AIC) for model selection: $\mathrm{AIC} = N \ln(\hat{\sigma}^2) + 2p$, where $\hat{\sigma}^2$ is the mean squared error of the regression, $p$ is the number of model parameters, and $N$ is the number of observations

• Select a natural set of bins (as you would for an RD graph) and add bin dummies to the model and test their joint significance. Add higher order terms to the polynomial until the bin dummies are no longer jointly significant.
– This also turns out to be a test for the presence of discontinuities in the regression function at points other than the cutoff, which you’ll want to do anyway.
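The AIC approach can be illustrated with a short sketch that fits polynomials of increasing order, with separate coefficients on each side of the cutoff, and records the criterion for each. It uses the common form AIC = N ln(σ̂²) + 2p, where p counts every regression coefficient. The data are simulated and the function name `aic_by_order` is hypothetical.

```python
import numpy as np

def aic_by_order(x, y, c, max_order=6):
    """AIC = N * ln(sigma2_hat) + 2p for polynomial RD fits of each order."""
    u = np.asarray(x, float) - c
    y = np.asarray(y, float)
    z = (u >= 0).astype(float)                 # treatment side indicator
    n = y.size
    aic = {}
    for order in range(1, max_order + 1):
        powers = np.column_stack([u ** k for k in range(1, order + 1)])
        # intercept, jump, polynomial terms, and their interactions with z
        X = np.column_stack([np.ones(n), z, powers, powers * z[:, None]])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        sigma2 = np.mean((y - X @ beta) ** 2)  # mean squared error of the fit
        aic[order] = n * np.log(sigma2) + 2 * X.shape[1]
    return aic

# Simulated example (assumption): the true regression function is quadratic.
rng = np.random.default_rng(3)
n = 2000
x = rng.uniform(-1, 1, n)
y = 1.0 * (x >= 0) + 3.0 * x ** 2 + rng.normal(0, 0.3, n)

aic = aic_by_order(x, y, c=0.0)
best = min(aic, key=aic.get)
for order, value in aic.items():
    print(f"order {order}: AIC = {value:.1f}")
print("order with the lowest AIC:", best)
```

The linear specification is clearly rejected here because it cannot capture the curvature. Beyond the true order the differences in AIC tend to be small, which is one reason to report estimates for several orders rather than only the selected one.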

D. Bandwidth selection and sensitivity (cont’d)

For Local Linear Regression, bandwidth selection represents the familiar tradeoff between bias and precision

• When the local regression function is more or less linear, there isn’t much of a tradeoff so bandwidth can be larger

• So pick a “reasonable h”.

• Optimal bandwidth choice:
– The intuition is that you set up the objective of minimizing the (mean squared) error between the estimated treatment effect and actual treatment effect.
– This gets very technical.
– But the good news is that two Stata commands do it for you.
– Imbens’ code at http://users.nber.org/~rdehejia/!@$AEM/Topic%2007%20RD%20Advanced/rdob.ado gives you bandwidth choice using rdob.
– Also the program rd (ssc install rd) automatically uses the IK (Imbens-Kalyanaraman) optimal bandwidth.

D. Bandwidth selection and sensitivity (cont’d)

In both cases

• In practice, you may want to focus on results for the “optimal” bandwidth, but it’s important to test for lots of different bandwidths. Think of the optimal bandwidth only as a starting point.

• If results critically depend on a particular bandwidth, they are less credible and choice of bandwidth requires a substantive justification.

• In principle, the optimal bandwidth for testing discontinuities in covariates may not be the same as the optimal bandwidth for the treatment. Again, follow the practice of testing robustness to variations in bandwidth.
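A bandwidth sensitivity check amounts to re-estimating the same local linear regression over a grid of bandwidths and comparing the results. The sketch below does this on simulated data. The grid of bandwidths is arbitrary, no optimal-bandwidth rule is implemented, and the helper name `rd_jump` is hypothetical.

```python
import numpy as np

def rd_jump(x, y, c, h):
    """Local linear jump at c (triangular kernel); returns (jump, robust se)."""
    u = np.asarray(x, float) - c
    y = np.asarray(y, float)
    w = np.clip(1.0 - np.abs(u) / h, 0.0, None)
    keep = w > 0
    u, y, w = u[keep], y[keep], w[keep]
    z = (u >= 0).astype(float)
    X = np.column_stack([np.ones_like(u), z, u, z * u])
    bread = np.linalg.inv(X.T @ (X * w[:, None]))
    beta = bread @ (X.T @ (w * y))
    resid = y - X @ beta
    cov = bread @ (X.T @ (X * ((w * resid) ** 2)[:, None])) @ bread
    return beta[1], np.sqrt(cov[1, 1])

# Simulated data (assumption): true jump of 1 at the cutoff c = 0.
rng = np.random.default_rng(4)
n = 8000
x = rng.uniform(-1, 1, n)
y = 0.5 * x + 1.0 * (x >= 0) + rng.normal(0, 0.5, n)

bandwidths = [0.1, 0.2, 0.4, 0.8]
results = {h: rd_jump(x, y, c=0.0, h=h) for h in bandwidths}
for h, (est, se) in results.items():
    print(f"h = {h:.1f}: jump = {est:.3f} (se {se:.3f})")
```

Narrow bandwidths give noisier estimates and wide ones give tighter but potentially more biased estimates. An estimate that changes sign or loses significance across a reasonable grid would call for the substantive justification mentioned above.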

E. Fuzzy RD design

• Cutoff does not perfectly determine treatment but creates a discontinuity in the probability of receiving the treatment

• For example:
  • The probability of being offered a scholarship may jump at a certain SAT score (above which the applications are given “special consideration”)
  • Incentives to participate in a program may change discontinuously at a threshold, but the change is not powerful enough to move all units from nonparticipation to participation

• For units close to the cutoff we can use

$$Z_i = \begin{cases} 1 & \text{if } X_i \ge c \\ 0 & \text{if } X_i < c \end{cases}$$

as an instrument for $D_i$.

• We estimate the effect of the treatment for compliers: those units (close to the discontinuity, $X_i \cong c$) whose treatment status, $D_i$, depends on $Z_i$.

E. Fuzzy RD design

• The idea is that for units that are very close to the discontinuity $Z_i$ can act as an instrument

• The LATE parameter is:

$$\lim_{\substack{c - \varepsilon \le X \le c + \varepsilon \\ \varepsilon \to 0}} \left( \frac{E[Y \mid Z = 1] - E[Y \mid Z = 0]}{E[D \mid Z = 1] - E[D \mid Z = 0]} \right)$$

or

$$\frac{\lim_{x \downarrow c} E[Y \mid X = x] - \lim_{x \uparrow c} E[Y \mid X = x]}{\lim_{x \downarrow c} E[D \mid X = x] - \lim_{x \uparrow c} E[D \mid X = x]}$$

• This suggests:
1. Run a sharp RDD for Y
2. Run a sharp RDD for D
3. Divide your estimate in step 1 by your estimate in step 2

• Alternatively, run instrumental variables for those units with $X \cong c$
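The three-step recipe can be sketched directly: estimate the jump in the outcome, estimate the jump in treatment take-up, and take the ratio. The example below uses simulated data in which crossing the cutoff raises the probability of treatment from 0.2 to 0.8 and the treatment effect is 2. The bandwidth is picked by hand and the helper name `rd_jump` is hypothetical.

```python
import numpy as np

def rd_jump(x, y, c, h):
    """Sharp-RD style local linear jump in E[y | x] at c (triangular kernel)."""
    u = np.asarray(x, float) - c
    y = np.asarray(y, float)
    w = np.clip(1.0 - np.abs(u) / h, 0.0, None)
    keep = w > 0
    u, y, w = u[keep], y[keep], w[keep]
    z = (u >= 0).astype(float)
    X = np.column_stack([np.ones_like(u), z, u, z * u])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta[1]

# Simulated fuzzy design (assumption): take-up jumps from 0.2 to 0.8 at c = 0.
rng = np.random.default_rng(5)
n = 8000
x = rng.uniform(-1, 1, n)
z = x >= 0
d = (rng.random(n) < np.where(z, 0.8, 0.2)).astype(float)
y = 2.0 * d + 0.5 * x + rng.normal(0, 0.5, n)

reduced_form = rd_jump(x, y, c=0.0, h=0.5)   # step 1: sharp RD for Y
first_stage = rd_jump(x, d, c=0.0, h=0.5)    # step 2: sharp RD for D
late = reduced_form / first_stage            # step 3: ratio of the two jumps

print(f"jump in Y: {reduced_form:.3f}")
print(f"jump in D: {first_stage:.3f}")
print(f"fuzzy RD (LATE) estimate: {late:.3f}")
```

Using the same kernel and bandwidth in both steps makes the ratio numerically equal to a weighted instrumental variables estimate with $Z_i$ as the instrument, which is the alternative mentioned in the last bullet.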

F. Regression kink design

• In some situations at the cutoff it is the slope of the treatment intensity that changes, not the level of treatment assignment (0 to 1).

• Classic example is unemployment benefits where your benefit is a function of prior earnings.

F. Regression kink design

• But then you expect that time to next job varies with base year earnings continuously (no jump), but with a change in slope.
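A minimal sketch of the kink logic: estimate the change in the slope of the outcome at the kink point and divide it by the change in the slope of the benefit schedule. The benefit formula, the simulated data, and the function name `slope_change` are assumptions made purely for illustration.

```python
import numpy as np

def slope_change(x, y, c, h):
    """Change in the slope of E[y | x] at c from a local linear fit."""
    u = np.asarray(x, float) - c
    y = np.asarray(y, float)
    w = np.clip(1.0 - np.abs(u) / h, 0.0, None)   # triangular kernel
    keep = w > 0
    u, y, w = u[keep], y[keep], w[keep]
    z = (u >= 0).astype(float)
    X = np.column_stack([np.ones_like(u), z, u, z * u])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta[3]                                # coefficient on z * u

# Simulated example (assumption): benefits rise with prior earnings at a
# rate of 0.5 until a cap at x = 0, after which they are flat.
rng = np.random.default_rng(6)
n = 8000
x = rng.uniform(-1, 1, n)                         # prior earnings, centred at the kink
benefit = 0.5 * np.minimum(x, 0.0)
y = 2.0 * benefit + 0.3 * x + rng.normal(0, 0.2, n)   # e.g. time to next job

dy = slope_change(x, y, c=0.0, h=1.0)             # kink in the outcome
db = slope_change(x, benefit, c=0.0, h=1.0)       # kink in the benefit schedule
rk_estimate = dy / db

print(f"slope change in outcome: {dy:.3f}")
print(f"slope change in benefit: {db:.3f}")
print(f"regression kink estimate: {rk_estimate:.3f}")
```

The structure mirrors the fuzzy RD ratio, with slope changes taking the place of level jumps. Slope estimates are noisier than level estimates, so kink designs typically need wider bandwidths or larger samples.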

Early release program (Marie, 2009)

• Prison systems in many countries suffer from overcrowding and high recidivism rates after release

• Some countries use early discharge of prisoners on electronic monitoring

• Difficult to estimate impact of early release program on future criminal behavior: best behaved inmates are usually the ones to be released early

• Marie (2008) considers the Home Detention Curfew (HDC) program in England and Wales

• This is a fuzzy RDD: Only offenders sentenced to more than three months (88 days) in prison are eligible for HDC, but not all of those are offered HDC


“Regression-Discontinuity Designs and Popular Elections: Implications of Pro-Incumbent Bias in Close U.S. House Races”

by Caughey and Sekhon (CS)

Basic argument

• “Close elections are not like other elections”
– Strategic political actors have strong incentives to target their resources where they will have the greatest marginal impact

• There is an incumbency advantage even in very close elections
– The incumbent wins disproportionately and has greater financial resources

• This finding, along with other covariate imbalances at the cutpoint, calls into question the LMB incumbency advantage results and, more generally, the assumption that outcomes in close elections are “as good as randomly assigned”
– Note that CS critique Lee (2008), not LMB, but the implications for the incumbency advantage results in both papers are the same

The Lee (& McCrary) tests for manipulation

Figure 1: Local frequency counts of the Democratic margin in U.S. House elections, 1946-2008, with local linear estimate overlaid. Bandwidths were chosen by the algorithm proposed by Imbens and Kalyanaraman (2009).

A graph like (A) led Lee, and separately McCrary, to conclude that there is no manipulation. However, (B) and (C) begin to suggest another story. Remember, the concern is with the incumbent party’s vote share, not the Democratic vote share.

Density of the assignment variable

Figure 2: Histogram of the incumbent party’s margin in U.S. House elections, 1946-2008. The local linear estimate is based on a triangular kernel with a bandwidth of 10.8, which is optimal according to the Imbens-Kalyanaraman algorithm. The estimated discontinuity at the cut-point is 9.5 (SE = 3.7).

Key Takeaway: The candidate of the incumbent party is about three times more likely to win election by half a percentage point or less than to lose by a similar margin. The density of this variable appears to diverge rather than converge in the neighborhood of the cut-point.

Covariate imbalance

Based on correcting some of Lee’s data and adding some new variables, CS find imbalance at the cutoff in the following:
– Democratic margin in the previous election
– The parties’ relative campaign expenditures
– 1st dimension NOMINATE score of the current incumbent
– Whether the Democrat (Republican) candidate is the current incumbent
– Number of terms the Democrat (Republican) has served in the U.S. House of Representatives
– Whether the Democrat (Republican) has more political experience than the Republican (Democrat)
– Congressional Quarterly’s October prediction of which candidate will win the race

Page 45: Regression Discontinuity: Advanced Topicsusers.nber.org/~rdehejia/!@$AEM/Topic 07 RD... · assumptions. Although not required for the validity of the design, in most cases, the reason

Potential mechanisms

• Not likely to be outright fraud, because the significance of lagged vote share is increasing over time and we believe the potential for fraud has been decreasing.

• Control over recounts does not appear to be the key because they rarely happen and even more rarely change the outcome.

• But we don’t need an explanation based on vote counting. Differences between winners and losers in incumbency, money, political experience, and other pre-election resources are evident far before any votes are cast, counted, or manipulated.

• These differences can be seen in elections expected to be close ex ante and in those that were in fact decided by a narrow margin.

• These facts contradict the idea that resources, expectations, and all else should be balanced in the closest elections.


Lessons from LMB & CS

• This is a cautionary tale
– LMB are very good scholars.
– They did almost everything right.

• They did little to justify the functional form or to show robustness to different bandwidths.


Lessons from LMB & CS (cont’d)

• What can you learn from this exchange:

– Try to find problems in your design before someone else does it for you

– Identify and collect accurate data on the observable covariates most likely to reveal sorting at the cut-point. These may not be the covariates that happen to be sitting in your data set.
  • Lagged values of the treatment variable are always a good idea. In elections, this is the party that currently controls the office.

– Automated bandwidth selection algorithms do not guarantee good results. They are just a starting point.

– For RD purposes, what constitutes a “close” election appears to be closer than the 48-52% bandwidth widely used up to now. CS get most of their results using 49.5-50.5%.


Guide to Practice


Steps for sharp RD analysis

1. Graph the data by computing the average value of the outcome variable over a set of bins.
– The bin width has to be large enough to have a sufficient amount of precision so that the plot looks smooth on either side of the cutoff value, but at the same time small enough to make the jump around the cutoff value clear.

2. Estimate the treatment effect by running linear regressions on both sides of the cutoff point.
– With a rectangular kernel, these are just standard regressions estimated within a bin of width h on both sides of the cutoff point. Note that:
  • Standard errors can be computed using standard least squares methods (robust standard errors).
  • The optimal bandwidth can be chosen using cross validation or other methods.
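Steps 1 and 2 can be sketched in plain Python. This is an illustrative sketch under stated assumptions, not a reference implementation: the helper names (`ols_line`, `binned_means`, `sharp_rd`), the bin width, the bandwidth, and the noise-free toy data are all hypothetical, and standard errors and bandwidth selection are left out.

```python
def ols_line(x, y):
    """Intercept and slope of a simple least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def binned_means(x, y, cutoff, width, n_bins):
    """Step 1: average outcome in equal-width bins on each side of the cutoff.

    Returns (bin midpoint, mean outcome) pairs; no bin straddles the cutoff.
    """
    out = []
    for j in range(-n_bins, n_bins):
        left = cutoff + j * width
        ys = [yi for xi, yi in zip(x, y) if left <= xi < left + width]
        if ys:
            out.append((left + width / 2, sum(ys) / len(ys)))
    return out

def sharp_rd(x, y, cutoff, h):
    """Step 2: separate linear fits within bandwidth h on each side
    (rectangular kernel); the estimate is the gap between the two
    fitted lines at the cutoff."""
    left = [(xi - cutoff, yi) for xi, yi in zip(x, y) if cutoff - h <= xi < cutoff]
    right = [(xi - cutoff, yi) for xi, yi in zip(x, y) if cutoff <= xi <= cutoff + h]
    a_left, _ = ols_line(*zip(*left))
    a_right, _ = ols_line(*zip(*right))
    return a_right - a_left

# Toy data (made up, noise-free): a jump of 3 at x = 0.
x = [i / 10 for i in range(-20, 21)]
y = [1 + 2 * xi + (3 if xi >= 0 else 0) for xi in x]
for midpoint, avg in binned_means(x, y, cutoff=0.0, width=0.5, n_bins=2):
    print(f"bin at {midpoint:+.2f}: mean outcome {avg:.2f}")
print("estimated jump:", round(sharp_rd(x, y, cutoff=0.0, h=1.0), 3))
```

Centering the running variable at the cutoff makes each fitted intercept the predicted outcome at the cutoff, so the treatment effect is simply the difference between the two intercepts.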


Steps for sharp RD analysis

3. The robustness of the results should be assessed by employing various specification tests.
– Looking at possible jumps in the value of other covariates at the cutoff point
– Testing for possible discontinuities in the conditional density of the forcing variable
– Looking at whether the average outcome is discontinuous at other values of the forcing variable
– Using various values of the bandwidth, with and without other covariates that may be available.
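Three of these checks reuse the same local linear estimator with a different outcome, cutoff, or bandwidth. The sketch below assumes hypothetical helper names (`intercept_at_zero`, `rd_jump`) and made-up data with a small smooth wiggle added so the estimates are not exactly constant; the density test is a separate exercise and is not covered here.

```python
from math import sin

def intercept_at_zero(points):
    """Least-squares intercept for (x, v) pairs, x already centered at the cutoff."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    mv = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    sxv = sum((p[0] - mx) * (p[1] - mv) for p in points)
    return mv - (sxv / sxx) * mx

def rd_jump(x, v, cutoff, h):
    """Local linear jump in v at `cutoff`, rectangular kernel, bandwidth h."""
    left = [(xi - cutoff, vi) for xi, vi in zip(x, v) if cutoff - h <= xi < cutoff]
    right = [(xi - cutoff, vi) for xi, vi in zip(x, v) if cutoff <= xi <= cutoff + h]
    return intercept_at_zero(right) - intercept_at_zero(left)

# Toy data (made up): the outcome jumps by 3 at x = 0; the covariate does not jump.
x = [i / 20 for i in range(-40, 41)]
y = [1 + 2 * xi + 0.1 * sin(5 * xi) + (3 if xi >= 0 else 0) for xi in x]
covariate = [0.5 * xi + 0.1 * sin(5 * xi) for xi in x]

# (a) Pre-determined covariate as the "outcome": the jump should be near zero.
print("covariate jump:", round(rd_jump(x, covariate, 0.0, 1.0), 3))
# (b) Placebo cutoffs, each using data from only one side of the true cutoff.
for c in (-1.0, 1.0):
    print("placebo cutoff", c, "jump:", round(rd_jump(x, y, c, 0.5), 3))
# (c) Sensitivity of the main estimate to the bandwidth.
for h in (0.5, 1.0, 1.5):
    print("bandwidth", h, "jump:", round(rd_jump(x, y, 0.0, h), 3))
```

A credible design shows jumps near zero in (a) and (b) and a main estimate in (c) that does not swing wildly as the bandwidth changes.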


Steps for fuzzy RD analysis

1. Graph the average outcomes over a set of bins as in the case of SRD, but also graph the probability of treatment.

2. Estimate the treatment effect using 2SLS, which is numerically equivalent to computing the ratio of the estimated jump (at the cutoff point) in the outcome variable to the estimated jump in the treatment variable, provided the same bandwidth is used for both.

– Standard errors can be computed using the usual (robust) 2SLS standard errors

– The optimal bandwidth can again be chosen using one of the methods discussed above.

3. The robustness of the results can be assessed using the various specification tests mentioned in the case of SRD designs.
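The ratio form of the fuzzy RD estimate in step 2 can be sketched directly. This is a sketch under stated assumptions rather than a 2SLS implementation: the helper names (`intercept_at_zero`, `rd_jump`, `fuzzy_rd`) and the toy data are hypothetical, the same bandwidth is used in the numerator and denominator, and standard errors are omitted.

```python
def intercept_at_zero(points):
    """Least-squares intercept for (x, v) pairs, x already centered at the cutoff."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    mv = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points)
    sxv = sum((p[0] - mx) * (p[1] - mv) for p in points)
    return mv - (sxv / sxx) * mx

def rd_jump(x, v, cutoff, h):
    """Local linear jump in v at `cutoff`, rectangular kernel, bandwidth h."""
    left = [(xi - cutoff, vi) for xi, vi in zip(x, v) if cutoff - h <= xi < cutoff]
    right = [(xi - cutoff, vi) for xi, vi in zip(x, v) if cutoff <= xi <= cutoff + h]
    return intercept_at_zero(right) - intercept_at_zero(left)

def fuzzy_rd(x, y, d, cutoff, h):
    """Jump in the outcome divided by the jump in the treatment rate."""
    return rd_jump(x, y, cutoff, h) / rd_jump(x, d, cutoff, h)

# Toy data (made up): roughly 20% treated below the cutoff, roughly 80% above,
# and a constant treatment effect of 4.
idx = range(-40, 41)
x = [i / 20 for i in idx]
d = [(0 if i % 5 == 0 else 1) if i >= 0 else (1 if i % 5 == 0 else 0) for i in idx]
y = [1 + 2 * xi + 4 * di for xi, di in zip(x, d)]

print("jump in treatment:", round(rd_jump(x, d, 0.0, 1.0), 3))
print("jump in outcome:  ", round(rd_jump(x, y, 0.0, 1.0), 3))
print("fuzzy RD estimate:", round(fuzzy_rd(x, y, d, 0.0, 1.0), 3))
```

Because take-up changes by less than one at the cutoff, the raw jump in the outcome understates the effect of treatment; dividing by the jump in the treatment rate rescales it.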


Evaluating an RD Paper (possibly your own)

• Does the author show convincingly that
– Treatment changes discontinuously at the cutpoint
– Outcomes change discontinuously at the cutpoint
– Other covariates do not change discontinuously at the cutpoint
– Pretreatment outcomes do not change at the cutpoint
– There is no manipulation of the assignment variable (bunching near the cutpoint)

• Are the basic results evident from a simple graph?

• Are the results robust to different functional form assumptions about the assignment variable?
– For example, parametric and nonparametric fits, different bandwidths, etc.


Evaluating an RD Paper (possibly your own)

• Could other possibly unobserved treatments change discontinuously at the cutoff (bundling of institutions)?

– For example, the 18th birthday marks a discontinuous change in eligibility to vote, but also in eligibility for the draft, sentencing as an adult, and lots of other things, which may or may not be relevant depending on the outcome in question

• External validity

– Are cases near the cutpoint different from cases far from the cutpoint in other ways? Do these differences make them more or less relevant from a theoretical or policy perspective?


Examples in Stata