Achieved Relative Intervention Strength:
Models and Methods
Chris S. Hulleman
David S. Cordray
Presentation for the SREE Research Conference
Washington, DC
March 5, 2010
Overview
• Conceptual Framework
– Definitions and Importance
– Indexing Fidelity as Achieved Relative Strength (ARS)
• Three examples
– Lab and Field Experiments
– Reading First
• Practical Considerations and Challenges
• Questions and discussion
Definitions and Implications
Fidelity
– The extent to which the implemented Tx (tTx) was faithful to the intended Tx (TTx)
– Measure core intervention components
Achieved Relative Strength (ARS)
– The difference between implemented causal components in the Tx and C: tTx – tC
– ARS is a default index of fidelity
Implications– Infidelity reduces construct, external, and statistical conclusion validity
[Figure: Treatment Strength (left axis, .00 to .45) plotted against Outcome (right axis, 50 to 100). The intended conditions are TTx = 0.40 and TC = 0.15, so Expected Relative Strength = TTx – TC = 0.40 – 0.15 = 0.25. With full fidelity, the outcome effect size is d = (Yt – Yc)/sd_pooled = (90 – 65)/30 = 0.83. With infidelity in both conditions, the Achieved Relative Strength shrinks to 0.15, and the outcome effect falls to d = (85 – 70)/30 = 0.50.]
Indexing Fidelity as Achieved Relative Strength
Intervention Strength = Treatment – Control
Achieved Relative Strength (ARS) Index
• Standardized difference in fidelity index across Tx and C
• Based on Hedges' g (Hedges, 2007)
• Corrected for clustering in the classroom (ICCs from .01 to .08)
• See Hulleman & Cordray (2009)

ARS Index = (tTx – tC) / S_pooled
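As a rough illustration, the ARS Index can be computed as a standardized mean difference in observed fidelity between conditions. The sketch below uses plain Hedges' g with the small-sample correction but omits the Hedges (2007) clustering adjustment mentioned above; the fidelity means and pooled SD come from the classroom example in this talk, while the group sizes are hypothetical.

```python
import math

def ars_index(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Achieved Relative Strength as Hedges' g: the standardized
    difference in observed fidelity between Tx and C conditions.
    (Omits the cluster/ICC adjustment of Hedges, 2007.)"""
    df = n_t + n_c - 2
    sd_pooled = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_t - mean_c) / sd_pooled
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    return j * d

# Classroom example: tTx = 0.74, tC = 0.04, sd_pooled = 0.53
# (n = 100 per condition is assumed for illustration)
ars = ars_index(0.74, 0.53, 100, 0.04, 0.53, 100)  # approx. 1.32
```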
Indexing Fidelity
Average
– Mean levels of observed fidelity (tTx)
Absolute
– Compare observed fidelity (tTx) to absolute or maximum level of fidelity (TTx)
Binary
– Yes/No treatment receipt based on fidelity scores
– Requires selection of cut-off value
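The three indexing schemes can be sketched for the 0-to-3 responsiveness scale used in the motivation examples that follow; the participant scores below are invented for illustration, and the cut-off of 2 is an assumed choice.

```python
def average_index(scores):
    """Average: mean level of observed fidelity (tTx)."""
    return sum(scores) / len(scores)

def absolute_index(scores, max_fidelity=3.0):
    """Absolute: observed fidelity relative to the maximum level (TTx)."""
    return average_index(scores) / max_fidelity

def binary_index(scores, cutoff=2.0):
    """Binary: proportion judged to have received treatment,
    i.e. fidelity score at or above a chosen cut-off value."""
    return sum(s >= cutoff for s in scores) / len(scores)

# Hypothetical participant responsiveness ratings (0 to 3 scale)
scores = [3, 2, 0, 1, 3, 2]
```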
Assessing Implementation Fidelity in the Lab and in Classrooms: The Case of a Motivation Intervention
Examples 1 and 2
The Theory of Change (model adapted from Eccles et al., 1983; Hulleman et al., 2009):

MANIPULATED RELEVANCE → PERCEIVED UTILITY VALUE → INTEREST → PERFORMANCE
Fidelity Measure: Quality of participant responsiveness (0 to 3 scale)
Achieved Relative Strength Indices

Observed Fidelity       Lab     Class   Lab – Class
Average    Tx           1.73    0.74
           C            0.00    0.04
           g            2.52    1.32    1.20
Absolute   Tx           0.58    0.25
           C            0.00    0.01
           g            1.72    0.80    0.92
Binary     Tx           0.65    0.15
           C            0.00    0.00
           g            1.88    0.80    1.08
[Figure: Observed fidelity in the classroom study, with Treatment Strength (0 to 100) plotted against the Average ARS Index (0 to 3 fidelity scale). The intended levels TTx and TC bracket the observed means tTx = 0.74 and tC = 0.04, a raw difference of (0.74) – (0.04) = 0.70. Standardizing by sd_pooled = 0.53 gives ARS (g) = (Xt – Xc)/sd_pooled = (0.74 – 0.04)/0.53 = 1.32.]
Assessing Implementation Fidelity in a Large-Scale Policy Intervention: The Case of Reading First
Example 3
In Education, Intervention Models are Multi-faceted (from Gamse et al., 2008)
Use of research-based reading programs, instructional materials, and assessment, as articulated in the LEA/school application
Teacher professional development in the use of materials and instructional approaches
1) Teacher use of instructional strategies and content based on five essential components of reading instruction
2) Use of assessments to diagnose student needs and measure progress
3) Classroom organization and supplemental services and materials that support five essential components
From Major Components to Indicators…

Major Components → Sub-components → Facets → Indicators
– Major Components: Professional Development, Reading Instruction, Support for Struggling Readers, Assessment
– Sub-components (of Reading Instruction): Instructional Time, Instructional Material, Instructional Activities/Strategies
– Facets (of Instructional Time): Block, Actual Time
– Indicators: Scheduled block? (for Block); Reported time (for Actual Time)
Reading First Implementation: Specifying Components and Operationalization

Components                            Sub-components                         Facets   Indicators (I/F)
Reading Instruction                   Instructional Time                     2        2 (1)
                                      Instructional Materials                4        12 (3)
                                      Instructional Activities/Strategies    8        28 (3.5)
Support for Struggling Readers (SR)   Intervention Services                  3        12 (4)
                                      Supports for Struggling Readers        2        16 (8)
                                      Supports for ELL/SPED                  2        5 (2.5)
Assessment                            Selection/Interpretation               5        12 (2.4)
                                      Types of Assessment                    3        9 (3)
                                      Use by Teachers                        1        7 (7)
Professional Development              Improved Reading Instruction           11       67 (6.1)
Total: 4 components                   10 sub-components                      41       170 (4)

Adapted from Moss et al. (2008)
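Rolling indicator-level fidelity scores up this hierarchy (indicators → facets → sub-components → components) can be sketched with simple unweighted averaging. All scores and most labels below are hypothetical, and real analyses must choose weights and scaling deliberately; this only illustrates the aggregation step.

```python
def aggregate(node):
    """Recursively average a nested dict of fidelity scores (0 to 1),
    giving each child equal weight at every level of the hierarchy."""
    if isinstance(node, dict):
        vals = [aggregate(v) for v in node.values()]
        return sum(vals) / len(vals)
    return node

# Hypothetical scores for part of the Reading Instruction component
reading_instruction = {
    "Instructional Time": {
        "Block": {"Scheduled block?": 1.0, "Reported time": 0.8},
    },
    "Instructional Materials": {"Core program in use": 0.9},
}

score = aggregate(reading_instruction)  # component-level fidelity
```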
Reading First Implementation: Some Results

                                                        Performance Levels
                                                        (% of Absolute Standard)    Absolute
Components                Sub-components                RF            Non-RF        Standard   ARSI
Reading Instruction       Daily (min.)                  105 (117%)    87 (97%)      90         0.63
                          Daily in 5 components (min.)  59            50.8          --         0.35
                          Daily with high-quality       18.13         16.2          --         0.11
                          practice
Professional Development  Hours of PD                   25.8          13.7          --         0.51
                          Five reading dimensions       4.3 (86%)     3.7 (74%)     5          0.31
Average ARSI                                                                                   0.38

Adapted from Gamse et al. (2008) and Moss et al. (2008)
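The summary ARSI of 0.38 is consistent with a simple unweighted mean of the five component-level values shown in the table (whether the original authors weighted the components is an assumption here):

```python
# Component-level ARSI values from the Reading First results table
arsi = [0.63, 0.35, 0.11, 0.51, 0.31]

average_arsi = sum(arsi) / len(arsi)  # 0.382, reported as 0.38
```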
Linking Fidelity to Outcomes
ARS: How Big is Big Enough?
Study                 Fidelity ARS    Outcome Effect Size
Motivation – Lab      1.88            0.83
Motivation – Field    0.80            0.33
Reading First*        0.35            0.05
*Averaged over 1st, 2nd, and 3rd grades (Gamse et al., 2008).
What Do I Do With Fidelity Indices?
Start with:
– Scale construction, aggregation over model sub-components and components
Use as:
– Descriptive analyses
– Causal analyses (Intent-to-Treat: ITT)
– Explanatory (AKA exploratory) analyses
• E.g., LATE, instrumental variables, TOT
Except for descriptive analyses, most approaches are relatively new and not fully tested.
In Practice…
• Identify core intervention components
– e.g., via a Model of Change
• Establish benchmarks for TTx and TC
• Measurement
– Determine indicators of core components
– Derive tTx and tC
– Develop scales
– Convert to ARS
• Incorporate into intervention analyses
– Multi-level analyses (Justice, Mashburn, Pence, & Wiggins, 2008)
Some Challenges
Intervention models
– Often unclear
– Scripted vs. unscripted
Measurement
– Novel constructs
– Multiple levels
– Aggregation (within and across levels)
Analyses
– Weighting of components
– Uncertainty about psychometric properties
– Functional form not always known
Summary of Key Points
• Identify and measure core components
• Fidelity assessment serves two roles:
– Average causal difference between conditions
– Using fidelity measures to assess the effects of variation in implementation on outcomes
• Post-experimental (re)specification of the intervention
• ARS: How much is enough?
– Need more data!
Thank You
Questions and Discussion