Randomized controlled trials and the evaluation of development programs
Chris Elbers
VU University and AIID
11 November 2015
• Joint work with Jan Willem Gunning
• Ideas developed when evaluating Dutch development programs (commissioned work)
• Related work by White (2006); Elbers, Gunning and de Hoop (WD 2009); De Janvry, Finan and Sadoulet (REStat 2012)
RCTs under fire
• Great successes trigger criticism
• RCTs’ claim to Gold Standard status has been attacked more or less aggressively
– External validity questioned
– Black box approach not scientific
– Cannot answer ‘big questions’ (e.g. on economic development)
– “experiments have no special ability to produce more credible knowledge than other methods” (Deaton, 2010, JEL)
Practical considerations
• External validity not an issue if the goal is to evaluate a particular project
• RCTs are great for providing proof of concept
But…
• Actual programs not always amenable to evaluation by RCT
– Of course, the program could be changed…
• Salvage old-fashioned regression using observational (i.e. non-experimental) data?
Outline
• Internal validity of RCTs not automatic
• Programs vs projects
• What do we want to estimate when evaluating a program?
– The total program effect
• Application to health insurance in Vietnam
• Conclusion
When internal validity of RCTs could fail
• Example: program implemented at arm’s length:
– Program officers (POs) select (intended) participants based on information specific to them
• Evaluation of the program must follow this design
– Direct random assignment of (intended) treatment to ultimate participants misses the effect of POs’ selection activity → internal validity violated
– Randomization must be over POs …
– … killing statistical power
• Precision is of the order of the number of POs in the sample
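The loss of precision from randomizing over POs can be illustrated with the standard design-effect formula for cluster-randomized designs. The sketch below is only illustrative: it assumes equal-sized clusters (one per PO), an equal split between arms, and an intra-cluster correlation `icc` whose value is made up; the function names are hypothetical.

```python
import math

def effective_sample_size(n_clusters, cluster_size, icc):
    """Number of independent observations that would give the same precision."""
    design_effect = 1.0 + (cluster_size - 1) * icc
    return n_clusters * cluster_size / design_effect

def se_diff_in_means(sigma, n_clusters, cluster_size, icc):
    """Standard error of the treatment/control difference in means when
    whole clusters (here: POs) are randomized, half to each arm."""
    n_eff = effective_sample_size(n_clusters, cluster_size, icc)
    # With equal arms, Var(difference) = 4 * sigma^2 / n under individual
    # randomization; clustering replaces n by the effective sample size.
    return math.sqrt(4.0 * sigma**2 / n_eff)

# Illustrative numbers (assumptions): 40 POs, 50 participants each.
for icc in (0.0, 0.2, 1.0):
    print(icc, effective_sample_size(40, 50, icc), se_diff_in_means(1.0, 40, 50, icc))
```

With `icc = 1` the effective sample size equals the number of POs, the limiting case behind the remark that precision is of the order of the number of POs.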
Regression alternative (simplest case)
• Take random sample from potential beneficiaries of program
• Observe (intended) treatment status T and outcome y
• Regress y on T
– Regression coefficient on T is the ATET (average treatment effect on the treated), assuming absence of confounders
– ATET times treatment fraction is per capita effect of program (‘total program effect’)
– Precision is of the order of the number of sampled individuals
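The simplest case can be sketched on simulated data. Everything in the example is an assumption made for illustration: the sample size, the 30% treatment fraction, a homogeneous effect of 2.0, and the absence of confounders by construction. The function name is hypothetical.

```python
import numpy as np

def per_capita_program_effect(y, treated):
    """Regress y on a constant and a 0/1 treatment indicator, then scale
    the slope by the treated fraction."""
    t = np.asarray(treated, dtype=float)
    X = np.column_stack([np.ones_like(t), t])
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    atet_hat = coef[1]            # slope on T: difference in group means
    treated_fraction = t.mean()
    return atet_hat, treated_fraction, atet_hat * treated_fraction

# Simulated random sample of potential beneficiaries (made-up numbers).
rng = np.random.default_rng(0)
n = 10_000
treated = rng.random(n) < 0.3
y = 1.0 + 2.0 * treated + rng.normal(0.0, 1.0, n)

atet_hat, frac, effect = per_capita_program_effect(y, treated)
print(atet_hat, frac, effect)
```

With a binary T the product of the slope and the treated fraction equals the sample mean of y minus the mean of y among the untreated, which is one way to read "per capita effect".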
Projects and programs
• RCTs best suited for evaluating simple interventions in homogeneous group of subjects with strong supervision of implementation
• Real-life interventions are messier
– They are a change of already existing policies
– They implement different policies simultaneously, with different intensity, involving officers with varying degrees of enthusiasm, …
– Selective participation is part and parcel of a typical program
• Should we not also try to quantify the impact of such programs? Can we?
A regression approach for evaluation of programs
• Consider a model of outcomes with:
– (at least) two observations per unit, t=0 and t=1
– a random sample of beneficiaries i
– a bundle of policy variables and a vector of impact coefficients
• We want to link the change in the outcome variable over time to the change in policy
• The contribution of the policy change to the change in outcome per beneficiary in the population is the total program effect (TPE)
• Policy mix must be observed at the level of the observation unit
– Intervention histories
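The equations on this slide are not reproduced in the transcript. As a hedged sketch, a two-period model consistent with the bullets could be written as follows. All symbols are notation assumed here for illustration, not taken from the slide: outcome $y_{it}$, policy bundle $P_{it}$, controls $X_{it}$, impact coefficients $\beta_i$, common coefficients $\gamma$, fixed effect $\eta_i$, error $\varepsilon_{it}$.

```latex
% Assumed notation; a sketch, not the slide's own equations.
\begin{align}
y_{it} &= \beta_i' P_{it} + \gamma' X_{it} + \eta_i + \varepsilon_{it}, \qquad t = 0, 1 \\
\Delta y_i &= \beta_i' \Delta P_i + \gamma' \Delta X_i + \Delta \varepsilon_i \\
\mathrm{TPE} &= \mathrm{E}\left[ \beta_i' \Delta P_i \right]
\end{align}
```

Differencing removes $\eta_i$, and the TPE is the population average of each beneficiary's own impact coefficients applied to that beneficiary's own policy change.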
The total program effect
• The TPE averages, over beneficiaries, the contribution of policy changes to outcome changes
• Different combinations of interventions affect different individuals
• Allow for selectivity: differences in policies will be correlated with impact parameters (e.g. POs selecting participants based on perceived likelihood of success)
– Selectivity compromises simple estimation of impact coefficients even if the error term is independent of the policy variables and the control variables (usual assumptions needed for regressions)
– This problem of treatment heterogeneity can be tackled by modeling the impact coefficients as a function of the policy variables and the control variables
Formally:
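The slide's formal statement is not reproduced in the transcript. One way to formalize treatment heterogeneity is sketched below, for a scalar policy variable and a linear specification of the impact coefficient. Both the notation and the linear form are assumptions made here for illustration: $\Delta y_i$ is the outcome change, $\Delta P_i$ the policy change, $\Delta X_i$ the change in controls, $\beta_i$ the unit-specific impact coefficient, and $\nu_i$ its unexplained part.

```latex
% Assumed notation and functional form; a sketch, not the slide's own equations.
\begin{align}
\Delta y_i &= \beta_i \Delta P_i + \gamma' \Delta X_i + \Delta \varepsilon_i \\
\beta_i &= b_0 + b_1 \Delta P_i + b_2' \Delta X_i + \nu_i,
  \qquad \mathrm{E}\left[ \nu_i \mid \Delta P_i, \Delta X_i \right] = 0 \\
\Delta y_i &= b_0 \Delta P_i + b_1 (\Delta P_i)^2 + b_2' \Delta X_i \, \Delta P_i
  + \gamma' \Delta X_i + \left( \nu_i \Delta P_i + \Delta \varepsilon_i \right) \\
\mathrm{TPE} &= \mathrm{E}\left[ \left( b_0 + b_1 \Delta P_i + b_2' \Delta X_i \right) \Delta P_i \right]
\end{align}
```

Substituting the second line into the first gives the third, a regression with a squared term and interactions whose composite error has conditional mean zero. The last line follows because $\mathrm{E}[\nu_i \Delta P_i] = 0$ under the stated assumption.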
Example: health insurance in Vietnam
• Using data from a study by Wagstaff and Pradhan (WB, 2005)
• Health insurance introduced in 1990s
• Wagstaff and Pradhan try to avoid bias from treatment heterogeneity by matching insured and uninsured households on the likelihood of being insured (propensity score matching)
• This technique not suitable for TPE
– Sample with matched treatment/control (T/C) individuals is no longer representative of population
Table 1: Data for the Vietnam Insurance Example
| Variable: change in (average) | Mean | Std. Dev | Min | Max |
|---|---|---|---|---|
| Arm circumference (cm) | 1.154 | 2.013 | -7.3 | 9.4 |
| Height (cm) | 5.175 | 11.35 | -49.57 | 39.84 |
| Body weight (kg) | 2.983 | 6.544 | -27.75 | 26.25 |
| Health expenditure (‘000 Dong) | 1,081 | 5,519 | -8,808 | 233,965 |
| Total consumption expenditure (‘000 Dong) | 6,513 | 8,009 | -22,988 | 116,826 |
| Insurance (binary at individual level) | 0.170 | 0.268 | 0 | 1 |
| School attended | -0.017 | 0.683 | -3.5 | 3 |
| Currently attending school (binary at individual level) | 0.082 | 0.388 | -2 | 2 |
| Gender | 0.002 | 0.138 | -0.75 | 1 |
| Age | 3.522 | 8.299 | -48.43 | 48.6 |
| Farm dummy | -0.079 | 0.421 | -1 | 1 |
| Household size | -0.267 | 1.696 | -18 | 11 |

The number of observations varies between 4299 and 4305.
Source: authors’ calculations using the Vietnam Living Standard Surveys 1992-3, 1997-8.
Estimation of TPE
• Program variable is fraction of insured household members
– No ITT (intention-to-treat) approach: (self) selection is part of an insurance program
• Unit of analysis is household
– Individual outcomes averaged per household
• TPE estimated ‘naively’
– Assuming impact coefficients constant across households i
• TPE estimated as proposed above, allowing impact coefficients to be household specific
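The contrast between the two estimates can be sketched on simulated data. This is not the authors' estimation code. The data-generating process, the coefficient values, and the choice of a squared term plus one interaction are assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def simulate(n=20_000, seed=0):
    rng = np.random.default_rng(seed)
    d_p = rng.random(n)              # change in insured fraction, in [0, 1)
    d_x = rng.normal(size=n)         # change in one control variable
    nu = rng.normal(scale=0.5, size=n)
    # Selectivity (assumed): households with more take-up have larger impact.
    beta_i = 1.0 + 2.0 * d_p + 0.5 * d_x + nu
    d_y = beta_i * d_p + 0.3 * d_x + rng.normal(size=n)
    return d_p, d_x, d_y, beta_i

def naive_tpe(d_p, d_x, d_y):
    """Constant impact coefficient: slope on d_p times the mean of d_p."""
    X = np.column_stack([np.ones_like(d_p), d_p, d_x])
    b = ols(X, d_y)
    return b[1] * d_p.mean()

def interacted_tpe(d_p, d_x, d_y):
    """Household-specific impact coefficient, modeled as linear in d_p and d_x."""
    X = np.column_stack([np.ones_like(d_p), d_p, d_p**2, d_p * d_x, d_x])
    b = ols(X, d_y)
    beta_hat = b[1] + b[2] * d_p + b[3] * d_x   # fitted impact per household
    return np.mean(beta_hat * d_p)

d_p, d_x, d_y, beta_i = simulate()
true_tpe = np.mean(beta_i * d_p)     # known only because the data are simulated
print(true_tpe, naive_tpe(d_p, d_x, d_y), interacted_tpe(d_p, d_x, d_y))
```

In this setup the naive estimate overstates the TPE because the slope on the policy variable absorbs the selection of high-impact households into high take-up, while the interacted specification is correctly specified by construction.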
Table 2: Total Program Effects

| Dependent variable | Naïve program effect† (I) (s.e.) | Total program effect†† (II) (s.e.) | R-squared of underlying regression (I) | R-squared of underlying regression (II) | Remarks |
|---|---|---|---|---|---|
| Arm circumference | 0.022 (0.029) | 0.090*** (0.027) | 0.22 | 0.23 | |
| Height | -0.190 (0.154) | 0.095 (0.139) | 0.34 | 0.36 | |
| Body weight | 0.167* (0.083) | 0.384*** (0.074) | 0.31 | 0.33 | |
| Health expenditure | -28.08 (60.59) | -52.79 (51.01) | 0.03 | 0.04 | Total consumption included in controls |
| Health expenditure | 55.41 (66.42) | 64.32 (52.87) | 0.00 | 0.00 | Total consumption expenditure not included |
| Total consumption expenditure | 626.7*** (110.9) | 888.8*** (105.7) | 0.10 | 0.12 | Total consumption expenditure not included |
Conclusions
• RCTs ill suited for evaluation of ‘programs’
• Programs involve strategic participation by potential participants and implementers
– Evaluation must take that into account
• Regression techniques combined with proper sampling can identify combined impact of program elements
– under nontrivial assumptions
• Simplest approximation of treatment heterogeneity suggests extensive use of interactions