Implications of complication and complexity for evaluation
Patricia J. Rogers, CIRCLE (Collaboration for Interdisciplinary Research, Consulting and Learning in Evaluation), Royal Melbourne Institute of Technology, Australia. [email protected]
Evaluation Revisited: Improving the quality of evaluative practice by embracing complexity
Utrecht, the Netherlands May 20-21 2010
2
The naïve experimentalism view of evaluation and evidence-based policy and practice
(Flow diagram: researchers, through a single study and then several studies, find that thing ‘A’ works; policymakers decide to do thing ‘A’; practitioners do thing ‘A’; intended beneficiaries benefit as expected.)
3
But things are often more complicated or complex than this …
4
What can (and does) go wrong with naïve experimentalism
(The same flow diagram, annotated with failure points: narrow studies that ignore important evidence; random error; misrepresentation of results; differential effects – thing ‘A’ only works in some contexts; not feasible in other locations; not scaleable; negative effects ignored.)
5
An alternative view of knowledge-building
6
An approach to evaluation and evidence-based policy and practice that recognizes the complicated and complex aspects of situations and interventions
Questions: What is needed? What is possible? What works? What works for whom in what situations? What is working?
Stakeholders: researchers and evaluators; community and civil society; practitioners and managers; policymakers
7
Advocacy for RCTs (Randomised Controlled Trials) in development evaluation
2003 – “J-PAL is best understood as a network of affiliated researchers … united by their use of the randomized trial methodology”
2006 – Advocated more use of RCTs; argued that experimental and quasi-experimental designs had a comparative advantage because they provide an unbiased numeric estimate of impact
2009–2010 – TED talk used leeches to illustrate the alternative to using RCTs as evidence
8
Distinguishing between RCTs and naïve experimentalism
RCT (Randomised Controlled Trial)
– one of many research designs that can be suitable
– involves randomly assigning (truly randomly, not ad hoc) potential participants either to receive the treatment (or one of several versions of the treatment) or to be in the control group (who might receive nothing or the current standard treatment) – see the assignment sketch after this slide
– in ‘double blind’ RCTs neither the participants nor the researchers know who is in the treatment group (eg the control group gets pills that look the same, and the details of the group are kept secret until after the results are recorded)
Naïve experimentalism
– believes that RCTs always provide the best evidence (the ‘gold standard’ approach)
– ignores (or is ignorant of) the potential risks in using RCTs and the other approaches that can be appropriate
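To make the contrast concrete, here is a minimal sketch (my illustration, not from the slides; the participant names are hypothetical) of what ‘truly random, not ad hoc’ assignment means in practice, using only Python’s standard library:

    import random

    def randomize(participants, arms=("treatment", "control"), seed=None):
        # Shuffle the pooled list and split it: every participant has the
        # same chance of landing in each arm ('truly random'), unlike ad hoc
        # schemes such as alternating names or taking volunteers first.
        rng = random.Random(seed)
        pool = list(participants)
        rng.shuffle(pool)
        k = len(pool) // len(arms)
        # Any remainder after an even split is left unassigned in this sketch.
        return {arm: pool[i * k:(i + 1) * k] for i, arm in enumerate(arms)}

    # Six hypothetical participants, two arms:
    print(randomize(["P1", "P2", "P3", "P4", "P5", "P6"], seed=1))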
9
Exploring complication and complexity in evaluation
(Timeline of publications: 1997, 2006, 2008, 2008, 2009, 2010.)
10
Some unhelpful ways ‘complex’ is used
• Difficult – eg little available data, hard to get additional data
• Beyond scrutiny – eg too technical for others to understand or challenge
• Ad hoc – eg too overwhelmed with implementation to think about planning or follow through
11
Two framings of simple, complicated and complex
Simple
– Glouberman and Zimmerman 2002: tested ‘recipes’ assure replicability; expertise is not needed
– Kurtz and Snowden 2003: the domain of the ‘known’; cause and effect are well understood; best practices can be confidently recommended
Complicated
– Glouberman and Zimmerman 2002: success requires a high level of expertise in many specialized fields, plus coordination
– Kurtz and Snowden 2003: the domain of the ‘knowable’; expert knowledge is required
Complex
– Glouberman and Zimmerman 2002: every situation is unique – previous success does not guarantee success; expertise can help but is not sufficient; relationships are key
– Kurtz and Snowden 2003: the domain of the ‘unknowable’; patterns are only evident in retrospect
12
Using the framework
– Can be used to refer to a situation or to an intervention
– Not useful as a way of classifying the whole situation or intervention; most useful for considering aspects of interventions
– Not normative: complex is not better than simple, and simple interventions can still be difficult to do well, or to get good data about
13
Simple can sometimes be appropriate
“It can scarcely be denied that the supreme goal of all theory is to make the irreducible basic elements as simple and as few as possible without having to surrender the adequate representation of a single datum of experience.”
Albert Einstein, Oxford University, 1933
“Everything should be made as simple as possible, but no simpler.”
14
Implications of complicated and complex situations and interventions for evaluation
1. Focus
2. Governance
3. Consistency
4. Necessariness
5. Sufficiency
6. Change trajectory
7. Unintended outcomes
(Funnell and Rogers 2010, Purposeful Program Theory. Jossey-Bass)
15
1. Focus - implications for evaluation?
Simple: Single set of objectives
Complicated: Different objectives valued by different stakeholders; multiple, competing imperatives; objectives at multiple levels of a system
Complex: Emergent objectives
(Funnell and Rogers 2010, Purposeful Program Theory. Jossey-Bass)
16
Focus – objectives at multiple levels of a system
(Diagram: the intervention drives activities at client, site and system level; these produce shorter-term outcomes at client, site and system level, which lead to longer-term outcomes.)
(Funnell and Rogers 2010, Purposeful Program Theory. Jossey-Bass)
17
2. Governance - implications for evaluation?
Simple: Single organization
Complicated: Specific organizations with formalized requirements
Complex: Emergent organizations working together in flexible ways
(Funnell and Rogers 2010, Purposeful Program Theory. Jossey-Bass)
18
3. Consistency - implications for evaluation?
Simple: Standardized
Complicated: Adapted
Complex: Adaptive
(Funnell and Rogers 2010, Purposeful Program Theory. Jossey-Bass)
19
What interventions look like – teaching reading
Simple – best practice
Teachers select a reading program which has been shown in RCTs to be effective (eg the Reading First program – $1 billion p.a.)
Complicated – adapted
Teachers identify children’s learning stage and provide exercises to match it (eg the Victorian Catholic Education System’s Literacy Assessment Project)
Griffin, P. (2009) ‘Ambitious new project to raise literacy and numeracy levels in Victorian schools’. http://newsroom.melbourne.edu/studio/ep-29
Griffin, P., Murray, L., Care, E., Thomas, A., & Perri, P. (2009). Developmental Assessment: Lifting literacy through Professional Learning Teams. Assessment in Education. In press.
20
What interventions look like – supporting small businesses
Complicated – what are the ‘active ingredients’?
An RCT compares the effect on small businesses of providing (see the sketch after this slide):
(i) business training
(ii) savings incentive
(iii) wages support
(iv) business training and savings incentive
(v) business training and wages support
(vi) savings incentive and wages support (McKenzie, 2010)
Complex – adaptive
A program works with small businesses to iteratively identify what they need, and to meet those needs.
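A small sketch (my addition) of the factorial structure behind those six arms: every single component and every pair of the three components, which is what lets the analysis separate the ‘active ingredients’ rather than testing one bundled package.

    from itertools import combinations

    components = ("business training", "savings incentive", "wages support")

    # The six arms above are every single component plus every pair
    # (no arm combines all three, and the control arm is implicit).
    arms = [c for r in (1, 2) for c in combinations(components, r)]
    for i, arm in enumerate(arms, 1):
        print(f"({i}) " + " + ".join(arm))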
21
4. Necessariness - implications for evaluation?
Simple: The only way to achieve the intended impacts
Complicated: One of several ways to achieve the intended impacts – which can be identified in advance
Complex: One of several ways to achieve the intended impacts – which are only evident in retrospect
(Funnell and Rogers 2010, Purposeful Program Theory. Jossey-Bass)
22
Necessariness – with/without comparisons
A US program to assist poor families through linking them to services found that families receiving the program experienced improvements in welfare — but so did the families that were randomly assigned to a control group that did not receive the visits (St. Pierre and Layzer 1999).
‘[As this case shows], a good study helps avoid spending funds on ineffective programs and redirects attention to improving designs or to more promising alternatives.’ (When Will We Ever Learn?)
But families in the control group had also accessed services.
The appropriate comparison would have been to compare the costs incurred in the different groups.
St. Pierre et al. (1996) Report on the National Evaluation of the Comprehensive Child Development Program. Summary and links to reports available at http://www.researchforum.org/project_abstract_166.html
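A toy illustration (the figures are entirely hypothetical, not taken from the St. Pierre evaluation) of why the cost comparison is the informative one here: when both groups reach roughly the same outcomes, the groups differ in what it cost to get there.

    # Hypothetical figures: both groups improved about equally, so
    # outcomes alone cannot distinguish the program from the status quo.
    program = {"families": 2000, "outcome_gain": 1.0, "cost_usd": 4_000_000}
    control = {"families": 2000, "outcome_gain": 1.0, "cost_usd": 1_500_000}  # services reached elsewhere

    for name, group in (("program", program), ("control", control)):
        per_family = group["cost_usd"] / group["families"]
        print(f"{name}: ${per_family:.0f} per family for the same outcome gain")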
23
5. Sufficiency - implications for evaluation?
Simple: Sufficient to produce the intended impacts; works the same for everyone
Complicated: Only works in conjunction with other interventions (previously, concurrently, or subsequently), and/or only works for some people, and/or only works in some circumstances – which can be identified in advance
Complex: Only works in conjunction with other interventions (previously, concurrently, or subsequently), and/or only works for some people, and/or only works in some circumstances – which is only evident in retrospect
(Funnell and Rogers 2010, Purposeful Program Theory. Jossey-Bass)
24
False negatives – the potted plant thought experiment
If 200 potted plants are randomly assigned either to a treatment group that receives daily water or to a control group that receives none, and both groups are placed in a dark cupboard, the treatment group does not have better outcomes than the control.
Possible conclusion: Watering plants is ineffective in making them grow.
Better conclusion: Water is not sufficient.
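A toy simulation (my addition, with made-up growth numbers) of the thought experiment: growth in this model requires both water and light, so a trial run entirely in a dark cupboard reports no treatment effect even though water genuinely matters.

    import random

    def growth(watered, in_dark_cupboard):
        # Toy model: a plant grows only if it gets BOTH water and light.
        if watered and not in_dark_cupboard:
            return random.gauss(5.0, 1.0)
        return 0.0

    plants = list(range(200))
    random.shuffle(plants)
    treatment, control = plants[:100], plants[100:]

    # Every plant sits in the dark cupboard, so light is absent for all.
    t_mean = sum(growth(True, True) for _ in treatment) / len(treatment)
    c_mean = sum(growth(False, True) for _ in control) / len(control)
    print(t_mean - c_mean)  # ~0: a false negative for 'does water work?'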
25
False positives – Early Head Start
• Early Head Start program – on average effective; listed as an ‘evidence-based program’
• But unfavourable outcomes for children in families with high levels of demographic risk factors (Mathematica Policy Research Inc 2002; Westhorp 2008)
Westhorp, G. (2008) Development of Realist Evaluation Methods for Small Scale Community Based Settings. Unpublished PhD thesis, Nottingham Trent University.
Mathematica Policy Research Inc (2002). Making a Difference in the Lives of Infants and Toddlers and Their Families: The Impacts of Early Head Start, Vol 1. US Department of Health and Human Services.
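A toy calculation (the numbers are hypothetical, not the Mathematica results) showing how a favourable average effect can coexist with harm in a subgroup – the pattern that makes ‘on average effective’ a potential false positive:

    # Hypothetical subgroup results: a weighted average can look
    # favourable even though one subgroup experiences unfavourable outcomes.
    subgroups = {
        "low demographic risk":  {"n": 800, "avg_effect": +0.30},
        "high demographic risk": {"n": 200, "avg_effect": -0.20},
    }

    total_n = sum(g["n"] for g in subgroups.values())
    overall = sum(g["n"] * g["avg_effect"] for g in subgroups.values()) / total_n
    print(f"overall: {overall:+.2f}")             # +0.20 -> 'evidence-based'
    for name, g in subgroups.items():
        print(f"{name}: {g['avg_effect']:+.2f}")  # harm visible only by subgroup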
26
6. Change trajectory - implications for evaluation?
Simple: Simple relationship – readily understood
Complicated: Complicated relationship – needs expertise to understand and predict
Complex: Complex relationship (including tipping points) – cannot be predicted, only understood in retrospect
(Funnell and Rogers 2010, Purposeful Program Theory. Jossey-Bass)
27
Complicated dose-response relationship – does stress improve performance?
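The slide presumably illustrates this with a curve; as a stand-in, here is a toy inverted-U dose-response (coefficients invented) in which moderate stress helps and high stress hurts, so a linear ‘more is better’ reading of any one segment of the curve would mislead:

    # Toy inverted-U dose-response (hypothetical coefficients):
    # performance peaks at a moderate stress level, then declines.
    def performance(stress: float) -> float:
        return 25.0 - (stress - 5.0) ** 2

    for dose in (0, 2, 5, 8, 10):
        print(f"stress={dose:2d} -> performance={performance(dose):.0f}")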
28
7. Unintended outcomes - implications for evaluation?
Simple: Unintended outcomes can be anticipated and monitored
Complicated: Different unintended outcomes are likely in particular combinations of circumstances – expertise is needed to anticipate them and identify them
Complex: Unintended outcomes cannot be anticipated, but only identified (and addressed) as they emerge or in retrospect
(Funnell and Rogers 2010, Purposeful Program Theory. Jossey-Bass)
29
Some thoughts on how evaluation might help us to understand the complicated and the complex
Issues that may need to be addressed
1. Focus
2. Governance
3. Consistency
4. Necessariness
5. Sufficiency
6. Change trajectory
7. Unintended outcomes
Possible evaluation methods, approaches and methodologies
• Emergent evaluation design that can accommodate emergent program objectives and emergent evaluation issues
• Collaborative evaluation across different stakeholders and organisations
• Non-experimental approaches to causal attribution/contribution that don’t rely on a standardized ‘treatment’
• Realist evaluation that pays attention to the contexts in which causal mechanisms operate
• Realist synthesis that can integrate diverse evidence (including credible single case studies) in different contexts
• ‘Butterfly nets’ to catch unanticipated results
30
Looking forward to hearing about your approaches to addressing these issues in evaluation