Session 1: General Overview
Chris Nicoletti
Activity #267: Analysing the socio-economic impact of the Water Hibah on beneficiary households and communities (Stage 1)
Impact Evaluation Training Curriculum Session 1 April 16, 2013
This material constitutes supporting material for the "Impact Evaluation in Practice" book. This additional material is made freely available, but please acknowledge its use as follows: Gertler, P. J., Martinez, S., Premand, P., Rawlings, L. B., and Vermeersch, C. M. J., 2010, Impact Evaluation in Practice: Ancillary Material, The World Bank, Washington, DC (www.worldbank.org/ieinpractice). The content of this presentation reflects the views of the authors and not necessarily those of the World Bank.
MEASURING IMPACT: Impact Evaluation Methods for Policy Makers
Introduction…
• My name is Chris Nicoletti.
• I am from NORC, where I am a Senior Impact Evaluation Analyst.
• I have worked in Zambia, Ghana, Cape Verde, the Philippines, Indonesia, Colombia, Burkina Faso, etc.
• I live in Colorado.
  – I like to ski, hike, climb, bike, etc.
  – Married, and do not have any children.
• What is your name? Let's go around the room and do introductions…
Impact Evaluation Training Curriculum - Activity 267
Outline: topics being covered

Tuesday - Session 1: INTRODUCTION AND OVERVIEW
1) Introduction
2) Why is evaluation valuable?
3) What makes a good evaluation?
4) How to implement an evaluation?

Wednesday - Session 2: EVALUATION DESIGN
5) Causal Inference
6) Choosing your IE method/design
7) Impact Evaluation Toolbox

Thursday - Session 3: SAMPLE DESIGN AND DATA COLLECTION
9) Sample Designs
10) Types of Error and Biases
11) Data Collection Plans
12) Data Collection Management

Friday - Session 4: INDICATORS & QUESTIONNAIRE DESIGN
1) Results chain/logic models
2) SMART indicators
3) Questionnaire Design
Today, we will answer these questions…
1. Why is evaluation valuable?
2. What makes a good impact evaluation?
3. How to implement an impact evaluation?
Why Evaluate?
1. Need evidence on what works: budgets are limited, and bad policies could hurt.
2. Information is key to sustainability: budget negotiations; informing beliefs and the press; the results agenda and aid effectiveness.
3. Improve program/policy implementation: design (eligibility, benefits) and operations (efficiency & targeting).
What is new about results?
Results-Based Management is a global trend:
• Establishing links between monitoring and evaluation, policy formulation, and budgets.
• Managers are judged by their programs' performance, not their control of inputs: a shift in focus from inputs to outcomes.
• Critical to effective public sector management.
Monitoring vs. Evaluation

Dimension              Monitoring                                  Evaluation
Frequency              Regular, continuous                         Periodic
Coverage               All programs                                Selected programs, aspects
Data                   Universal                                   Sample-based
Depth of information   Tracks implementation; looks at WHAT        Tailored, often to performance and impact; looks at WHY
Cost                   Spread out                                  Can be high
Utility                Continuous program improvement, management  Major program decisions
Monitoring
A continuous process of collecting and analyzing information:
• to compare how well a project, program, or policy is performing against expected results, and
• to inform implementation and program management.
Impact Evaluation Answers
• What was the effect of the program on outcomes?
• How much better off are the beneficiaries because of the program/policy?
• How would outcomes change if the program design changed?
• Is the program cost-effective?
Evaluation
A systematic, objective assessment of an ongoing or completed project, program, or policy, covering its design, implementation, and/or results. Its aims are:
• to determine the relevance and fulfillment of objectives, development efficiency, effectiveness, impact, and sustainability, and
• to generate lessons learned to inform the decision-making process, tailored to key questions.
Impact Evaluation
An assessment of the causal effect of a project, program, or policy on beneficiaries. It uses a counterfactual:
• to estimate what the state of the beneficiaries would have been in the absence of the program (the control or comparison group), compared with the observed state of beneficiaries (the treatment group), and
• to determine intermediate or final outcomes attributable to the intervention.
Impact Evaluation Answers
• What is the effect of a household (hh) water connection on hh water expenditure?
• Does contracting out primary health care lead to an increase in access?
• Does replacing dirt floors with cement reduce parasites & improve child health?
• Do improved roads increase access to labor markets & raise income?
Answer these questions
1. Why is evaluation valuable?
2. What makes a good impact evaluation?
3. How to implement an impact evaluation?
How to assess impact
What is the beneficiary's test score with the program compared to without the program?
Ideally, we would compare the same individual with and without the program at the same point in time.
Formally, program impact is: α = (Y | P=1) − (Y | P=0)
e.g. How much does an education program improve test scores (learning)?
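When treatment is randomly assigned, the formula above can be estimated as a simple difference in group means. A minimal Python sketch on invented test-score data (the 5-point effect size and the score distributions are assumptions for illustration, not results from any real program):

```python
import random

random.seed(0)

def impact_estimate(treated, control):
    """alpha = (Y | P=1) - (Y | P=0), estimated as a difference in group means."""
    return sum(treated) / len(treated) - sum(control) / len(control)

# Invented test scores: the program shifts the mean score from 60 to 65.
control_scores = [random.gauss(60, 10) for _ in range(1000)]
treated_scores = [random.gauss(65, 10) for _ in range(1000)]

alpha = impact_estimate(treated_scores, control_scores)
# alpha lands near the true effect of 5 points (sampling noise aside)
```

Note that this only recovers the causal effect when the two groups are comparable, which is exactly the counterfactual problem discussed next.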
Solving the evaluation problem
We never observe the same individual with and without the program at the same point in time, so we must estimate the counterfactual: what would have happened without the program.
Estimated impact is the difference between the treated observation and the counterfactual.
The counterfactual is the key to impact evaluation.
Counterfactual Criteria
The treated group and the counterfactual (1) have identical characteristics, (2) except for benefiting from the intervention.
There is no other reason for differences in the outcomes of the treated and the counterfactual: the only reason for the difference in outcomes is the intervention.
Two Counterfeit Counterfactuals
1. Before and after: the same individual before the treatment.
2. Those not enrolled: those who choose not to enroll in the program, or those who were not offered the program.
1. Before and After: Example
An agricultural assistance program provides financial assistance to purchase inputs. Compare rice yields before and after: before is normal rainfall, but after is a drought. We find a fall in rice yield. Did the program fail? We could not separate (identify) the effect of the financial assistance program from the effect of rainfall.
The problem: a before-and-after comparison does not take into account things that are changing over the intervention period.
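A tiny simulation makes the pitfall concrete (all numbers invented): the program truly raises yields, but a drought in the "after" period swamps the comparison.

```python
import random

random.seed(1)

TRUE_EFFECT = 0.5   # assumed program gain in rice yield (tonnes/ha)
DROUGHT = -1.0      # assumed yield shock hitting the whole "after" period

# Yields for program farmers before and after; the drought only hits "after".
before = [random.gauss(4.0, 0.3) for _ in range(500)]
after = [random.gauss(4.0 + TRUE_EFFECT + DROUGHT, 0.3) for _ in range(500)]

mean = lambda xs: sum(xs) / len(xs)
before_after = mean(after) - mean(before)
# before_after comes out near -0.5: the naive comparison blames the program
# for the drought, even though the program's true effect is +0.5.
```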
2. Those Not Enrolled: Example 1
A job training program is offered. We compare the employment & earnings of those who sign up to those who did not.
Who signs up? Those who are most likely to benefit, i.e., those with more ability, who would have had higher earnings than non-participants even without job training.
This is a poor estimate of the counterfactual.
What's wrong?
1. Selection bias: people choose to participate for specific reasons.
2. Many times those reasons are related to the outcome of interest: job training (ability and earnings); health insurance (health status and medical expenditures).
3. We cannot separately identify the impact of the program from these other factors/reasons.
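Selection bias can also be shown with a small simulation (all parameters invented): even when the training has zero true effect, comparing enrollees to non-enrollees shows a large gap, because ability drives both enrollment and earnings.

```python
import math
import random

random.seed(2)

TRUE_EFFECT = 0.0  # suppose the training itself changes earnings by nothing

records = []
for _ in range(5000):
    ability = random.gauss(0, 1)
    # Higher-ability workers are more likely to sign up (self-selection)...
    enrolls = random.random() < 1 / (1 + math.exp(-2 * ability))
    # ...and also earn more, with or without the program.
    earnings = 100 + 20 * ability + TRUE_EFFECT * enrolls + random.gauss(0, 5)
    records.append((enrolls, earnings))

mean = lambda xs: sum(xs) / len(xs)
naive = mean([y for s, y in records if s]) - mean([y for s, y in records if not s])
# naive is far above the true effect of zero: the gap reflects ability,
# not the program.
```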
Possible Solutions???
We need to guarantee comparability of the treatment and control groups, so that the ONLY remaining difference is the intervention.
In this training we will consider:
• Experimental designs
• Quasi-experimental designs (regression discontinuity, difference-in-differences)
• Non-experimental designs (instrumental variables)
EXPERIMENTAL DESIGN!!!
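The simplest way to guarantee that comparability is randomized assignment. A minimal sketch (the unit IDs and 50/50 split are hypothetical, not any program's actual assignment rule):

```python
import random

def randomize(eligible_ids, treat_fraction=0.5, seed=42):
    """Give every eligible unit the same chance of treatment by shuffling
    the list and splitting it at the chosen fraction."""
    rng = random.Random(seed)
    ids = list(eligible_ids)
    rng.shuffle(ids)
    cut = round(len(ids) * treat_fraction)
    return ids[:cut], ids[cut:]

treatment, control = randomize(range(100))
# A fixed seed makes the assignment reproducible and auditable.
```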
Answer these questions
1. Why is evaluation valuable?
2. What makes a good impact evaluation?
3. How to implement an impact evaluation?
When to use Impact Evaluation?
Evaluate impact when the project is:
• Innovative
• Replicable/scalable
• Strategically relevant for reducing poverty
• Filling a knowledge gap
• Of substantial policy impact
Use evaluation within a program to test alternatives and improve programs.
Choosing what to evaluate
Criteria:
• Large budget share
• Affects many people
• Little existing evidence of impact for the target population (IndII examples?)
No need to evaluate everything; spend evaluation resources wisely.
IE for ongoing program development
• Are there potential program adjustments that would benefit from a causal impact evaluation?
• Are there parts of a program that may not be working?
• Implementing parties have specific questions they are concerned with.
How to make an impact evaluation policy focused
Address policy-relevant questions:
• What policy questions need to be answered?
• What outcomes answer those questions?
• What indicators measure those outcomes?
• How much of a change in the outcomes would determine success?
Example: scale up the pilot? (e.g., the Water Hibah.) Criterion: we need at least an X% average increase in the beneficiary outcome over a given period.
Policy impact of evaluation
• What is the policy purpose?
• Provide evidence for pressing decisions.
• Design the evaluation with policy makers.
• IndII examples???
Policy impact of evaluation
• Decide what you need to learn.
• Experiment with alternatives.
• Measure and inform.
• Adopt better alternatives over time.
Change in incentives:
• Rewards for changing programs.
• Rewards for generating knowledge.
• Separating job performance from knowledge generation.
Cultural shift: from retrospective evaluation (look back and judge) to prospective evaluation.
• Choosing what to evaluate is something that should take time and careful consideration.
• Impact evaluation is more expensive and often requires third-party consultation.
• The questions that require an IE to answer should be evident in your logic models and M&E plans from the beginning.
• Remember, IE is an assessment of the causal effect of a project, program or policy on beneficiaries.
Choice should come from existing logic models and M&E plans.
CHOICE #1
Retrospective Design or Prospective Design?
Retrospective Analysis
Retrospective analysis is necessary when we have to work with a pre-assigned program (e.g., expanding an existing program) and existing data (is there a baseline?).
Examples:
• Regression discontinuity: Education Project (Ghana)
• Difference-in-differences: RPI (Zambia)
• Instrumental variables: Piso Firme (México)
Retrospective Designs
• Use whatever is available: the data were not collected for the purposes at hand.
• The researcher gets to choose what variables to test, based on previous knowledge and theory.
• Subject to misspecification bias.
• Theory is used instrumentally, as a way to provide a structure justifying the identifying assumptions.
• Less money on data collection (sometimes), more money on analysis.
• Does not really require "buy in" from implementers or field staff.
Prospective Analysis
In prospective analysis, the evaluation is designed in parallel with the assignment of the program, and baseline data can be gathered.
Examples: Progresa/Oportunidades (México); CDSG (Colombia)
Prospective Designs
• Intentionally collect data for the purposes of the impact evaluation.
• The variables collected in a prospective evaluation are collected because they were considered potential outcome variables.
• You should report on all of your outcome variables.
• The evaluation itself may be a form of treatment.
• It is the experimental design that is instrumental: it gives more power both to test the theory and to challenge it.
• More money on data collection, less money on analysis.
• Requires "buy in" from implementers and field staff.
Prospective Designs
Use opportunities to generate good controls: most programs cannot assign benefits to the entire eligible population, so not all eligibles receive the program.
• Budget limitations: eligible beneficiaries that receive benefits are potential treatments; eligible beneficiaries that do not receive benefits are potential controls.
• Logistical limitations: those that go first are potential treatments; those that go later are potential controls.
An example: Socio-economic impact of the Water Hibah (endline)
• The decision to conduct an impact evaluation was made after the program began, and ex post control households were identified.
• We are now trying to use health data from Puskesmas to "fill in the gaps" of the baseline.
• This is a retrospective design, because there was no experimental design in place for the rollout of the program.
CHOICE #2
What type of Evaluation Design do you use?
Types of Designs
Prospective:
• Randomized assignment
• Randomized promotion
• Regression discontinuity
Retrospective:
• Regression discontinuity
• Differences-in-differences
• Matching
• Model-based / instrumental variables
How to choose?
The identification strategy depends on the implementation of the program; the evaluation strategy depends on the rules of operation.
Who gets the program?
Eligibility criteria:
• Are benefits targeted? How are they targeted?
• Can we rank eligibles by priority? Are the measures good enough for fine rankings?
Rollout:
• Does everyone have an equal chance to go first, second, or third?
• Is the rollout based on budget/administrative constraints?
Ethical Considerations
• Equally deserving beneficiaries deserve an equal chance of going first.
• Equity: give everyone eligible an equal chance. If ranking is based on some criteria, the criteria should be quantitative and public.
• Use a transparent & accountable method.
• Do not delay benefits.
The method depends on the rules of operation

Rollout        Eligibility        Targeted              Universal
In stages      Without cut-off    Randomization         Randomized rollout
In stages      With cut-off       RD/DiD, Match/DiD     RD/DiD, Match/DiD
Immediately    Without cut-off    Randomized promotion  Randomized promotion
Immediately    With cut-off       RD/DiD, Match/DiD     Randomized promotion
An example: Socio-economic impact of the Water Hibah (endline)
• Provision of services to villages and households under the Water Hibah is not determined by randomization, but by assessment and willingness to pay (WTP).
• The dataset design exhibits some characteristics of a controlled experiment, with connected and unconnected households, but the connection decision is not determined by randomization.
• Household matching is not an efficient method given the potential discrepancies we identified in the pilot test, and it does not work very well with the sample design that was chosen.
• Village-level matching is not feasible because there are usually both connected and unconnected households in a single village (locality).
• The design we have chosen is a pretest-posttest nonequivalent-control-group quasi-experimental design that will use regression-adjusted difference-in-differences impact estimators.
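The regression-adjusted estimator generalizes the simple difference-in-differences of group means, which can be sketched as follows (the expenditure numbers are invented for illustration, not Hibah results):

```python
def did_estimate(t_before, t_after, c_before, c_after):
    """Difference-in-differences: the change in the treated group's mean
    outcome minus the change in the control group's mean outcome."""
    return (t_after - t_before) - (c_after - c_before)

# Both groups drift up by 3 over time; the treated group gains an extra 4.
impact = did_estimate(t_before=20.0, t_after=27.0, c_before=18.0, c_after=21.0)
# impact == 4.0: the common time trend is differenced away
```

The key identifying assumption is that, absent the program, both groups would have followed the same trend.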
CHOICE #3
What type of Sample Design do you use?
Types of Sample Designs
• Random sampling
• Multi-stage sampling
• Systematic sampling
• Stratified sampling
• Convenience sampling
• Snowball sampling
Plus any combination of them!
A good sample design requires expert knowledge
• It is important to note that sample design can be extremely complex.
• A good summary is provided by Duflo (2006): the power of the design is the probability that, for a given effect size and a given statistical significance level, we will be able to reject the hypothesis of zero effect. Sample sizes, as well as other (evaluation & sample) design choices, affect the power of an experiment.
• There are many things to consider: the impact estimator to be used; the test parameters (power level, significance level); the minimum detectable effect; characteristics of the sampled (target) population (population sizes for potential levels of sampling, means, standard deviations, and intra-unit correlation coefficients if multistage sampling is used); and the sample design to be used for the sample survey.
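To make those inputs concrete, here is the textbook normal-approximation sample-size formula for a two-arm comparison of means, inflated by the design effect 1 + (m − 1)·ICC for clustered samples. This is a generic illustration, not the calculation actually used for the Hibah design:

```python
import math
from statistics import NormalDist

def n_per_arm(mde, sd, alpha=0.05, power=0.80, icc=0.0, cluster_size=1):
    """Approximate sample size per arm to detect a minimum detectable
    effect `mde` on an outcome with standard deviation `sd`."""
    z = NormalDist().inv_cdf  # standard normal quantile function
    n = 2 * (z(1 - alpha / 2) + z(power)) ** 2 * (sd / mde) ** 2
    deff = 1 + (cluster_size - 1) * icc  # design effect for multi-stage samples
    return math.ceil(n * deff)

# A 0.25-standard-deviation effect needs about 252 units per arm...
simple = n_per_arm(mde=0.25, sd=1.0)
# ...and clustering (ICC 0.05, 10 units per cluster) inflates that by 45%.
clustered = n_per_arm(mde=0.25, sd=1.0, icc=0.05, cluster_size=10)
```

Smaller detectable effects, higher power, or stronger within-cluster correlation all push the required sample size up quickly.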
The basic process is this…
• Level of power
• Level of hypothesis tests
• Correlations in outcomes within groups (ICCs)
• Mean and variance of outcomes & minimum detectable effect sizes (MDES)
The reality is…
• Most times, you do not have all of this information: use existing studies, other data sources, and assumptions.
• You may work backwards to fit a certain power level, or backwards from the expected level of impact that you want to test for.
• Often you are working backwards to fit a certain budget!
• Build in marginal costs for each stage of sampling.
• Decide whether or not to pursue the project.
An example: Socio-economic impact of the Water Hibah (endline)
• Outcome indicators: we have simplified versions of them in the baseline, but they have been modified for the endline; we used the baseline dataset to calculate ICCs.
• The highest variation in outcome indicators was identified across villages (localities), so the primary sampling unit is the village.
• The number of households in a village was found to improve the efficiency of the design, so villages are stratified based on the number of households.
• The marginal costs of a village visit vs. a household visit were included.
• The final sample design is stratified multi-stage sampling, with 250 villages and 7-14 households per experimental group, for a total of 7,000 households.
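A toy sketch of the stratified two-stage idea (the village IDs, the 50-household stratum cut-off, and the sample sizes are invented for illustration; the real design has its own strata and 250 villages):

```python
import random

def stratified_two_stage_sample(villages, villages_per_stratum, hh_per_village, seed=7):
    """Stage 1: sample villages within strata defined by village size;
    stage 2: sample households within each selected village.
    `villages` maps a village id to its list of household ids."""
    rng = random.Random(seed)
    # Stratify villages by number of households (small vs. large, for illustration).
    small = [v for v, hhs in villages.items() if len(hhs) < 50]
    large = [v for v, hhs in villages.items() if len(hhs) >= 50]
    chosen = []
    for stratum in (small, large):
        for village in rng.sample(stratum, min(villages_per_stratum, len(stratum))):
            hhs = villages[village]
            chosen += [(village, hh) for hh in rng.sample(hhs, min(hh_per_village, len(hhs)))]
    return chosen
```

For example, sampling 2 villages per stratum and 7 households per village from 10 villages yields 2 × 2 × 7 = 28 sampled households.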
What can IndII Do?
Ensure your M&E systems are relevant and reliable…
Data: Coordinate IE & Monitoring Systems
Projects/programs regularly collect data for management purposes. Typical content:
• Lists of beneficiaries
• Distribution of benefits
• Expenditures
• Outcomes
• Ongoing process evaluation
This information is also needed for impact evaluation.
Manage M&E for results
Prospective evaluations are easier and better with reliable M&E:
• Tailor policy questions; obtain precise, unbiased estimates.
• Use your resources wisely: better methods, cheaper data.
• Get timely feedback, make program changes, and improve results on the ground.
Evaluation uses information to verify:
• who is a beneficiary,
• when they started, and
• what benefits were actually delivered.
A necessary condition for a program to have an impact: benefits need to get to the targeted beneficiaries.
Overall Messages
Impact evaluation is useful for:
• Validating program design
• Adjusting program structure
• Communicating to the finance ministry & civil society
A good evaluation design requires estimating the counterfactual:
• What would have happened to beneficiaries if they had not received the program?
• We need to know all the reasons why beneficiaries got the program & others did not.
Other messages
• Good M&E is crucial not only to effective project management; it can also be a driver for reform.
• Monitoring and evaluation are separate, complementary functions, but both are key to results-based management.
• Have a good M&E plan before you roll out your project, and use it to inform the journey!
• Design the timing and content of M&E results to further evidence-based dialogue.
• Good monitoring systems & administrative data can improve IE.
• Prospective designs are the easiest to use.
• Stakeholder buy-in is very important.
Thank You!