Developing an evaluation of professional development
Webinar #4: Going Deeper into Analyzing Results
Information and materials mentioned or shown during this presentation are provided as resources and examples for the viewer's convenience. Their inclusion is not intended as an endorsement by the Regional Educational Laboratory Southeast or its funding source, the Institute of Education Sciences (Contract ED-IES-12-C-0011).
In addition, the instructional practices and assessments discussed or shown in these presentations are not intended to mandate, direct, or control a State’s, local educational agency’s, or school’s specific instructional content, academic achievement system and assessments, curriculum, or program of instruction. State and local programs may use any instructional content, achievement system and assessments, curriculum, or program of instruction they wish.
Webinar 4: Outline
• Considerations for quantitative analyses – Dr. Sharon Koon
  – WWC evidence standards & strong studies
  – Calculating attrition
  – Calculating baseline equivalence
  – Statistical adjustments
• Considerations for qualitative analyses – Dr. La’Tara Osborne-Lampkin
• Question & answer session
CONSIDERATIONS FOR QUANTITATIVE ANALYSES
Dr. Sharon Koon
Distinction between WWC evidence standards and additional qualities of strong studies
• WWC design considerations:
  – Two groups: treatment (T) and comparison (C).
  – For randomized controlled trials (RCTs), low attrition.
  – For quasi-experimental designs (QEDs), baseline equivalence between T and C groups.
  – Contrast between T and C groups measures impact of the treatment.
  – Valid and reliable outcome data used to measure the impact of a treatment.
  – No known confounding factors.
  – Outcome(s) not overaligned with the treatment.
  – Same data collection process (same instruments, same time/year) for the T and C groups.
Source: http://www.dir-online.com/wp-content/uploads/2015/11/Designing-and-Conducting-Strong-Quasi-Experiments-in-Education-Version-2.pdf
Distinction between WWC evidence standards and additional qualities of strong studies (cont.)
• Additional qualities of strong studies:
  – Pre-specified and clear primary and secondary research questions.
  – Generalizability of the study results.
  – Clear criteria for research sample eligibility and matching methods.
  – Sample size large enough to detect meaningful and statistically significant differences between the T and C groups overall and for specific subgroups of interest.
  – Analysis methods reflect the research questions, design, and sample selection procedures.
  – A clear plan to document the implementation experiences of the T and C conditions.
Source: http://www.dir-online.com/wp-content/uploads/2015/11/Designing-and-Conducting-Strong-Quasi-Experiments-in-Education-Version-2.pdf
Determinants of a What Works Clearinghouse (WWC) study rating
Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19
Topics for discussion
• Attrition
• Baseline equivalence
  – Calculation of baseline equivalence
  – Adjustments for nonequivalence
• Effect-size corrections
  – Cluster correction
  – Multiple comparison correction
• Handling missing data
Attrition
• For RCTs, the WWC is concerned about both overall attrition (i.e., the rate of attrition for the entire sample) and differential attrition (i.e., the difference in the rates of attrition for the intervention and comparison groups) because both types of attrition contribute to the potential bias of the estimated effect.
Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19
Attrition (cont.)
• Overall attrition = number without observed data / number randomized
• Differential attrition = [T without observed data / number T randomized] – [C without observed data / number C randomized]
• Attrition boundaries: liberal or conservative
• In order to be deemed an RCT with low attrition, a cluster RCT that reports an individual-level analysis must have low attrition at two levels. First, it must have low attrition at the cluster level. Second, the study must have low attrition at the subcluster level, with attrition based only on the clusters remaining in the sample.
Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19
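The two attrition formulas on this slide can be sketched in a few lines of Python. This is a minimal illustration, not WWC code: the function name `attrition_rates` and the sample counts are hypothetical, and the liberal and conservative boundaries themselves are not reproduced here.

```python
def attrition_rates(t_randomized, t_observed, c_randomized, c_observed):
    """Return (overall, differential) attrition as proportions.

    Counts are units randomized and units with observed outcome data
    in the treatment (T) and comparison (C) groups.
    """
    t_missing = t_randomized - t_observed
    c_missing = c_randomized - c_observed
    # Overall attrition: all units without data over all units randomized.
    overall = (t_missing + c_missing) / (t_randomized + c_randomized)
    # Differential attrition: T attrition rate minus C attrition rate.
    differential = t_missing / t_randomized - c_missing / c_randomized
    return overall, differential


# Hypothetical example: 100 units randomized per group,
# 80 observed in T and 90 observed in C.
overall, differential = attrition_rates(100, 80, 100, 90)
print(f"overall = {overall:.1%}, differential = {differential:.1%}")
```

The sign of the differential rate shows which group lost more units; it is assumed here that the magnitude is what would be compared against an attrition boundary.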
Attrition (cont.)
Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19
Baseline equivalence
• For continuous outcomes, it is determined by the difference between the mean outcome for the T group and the mean outcome for the C group, divided by the pooled within-group standard deviation of the outcome measure (i.e., standardized mean difference).
• For dichotomous outcomes, it is determined by the difference in the probability of the occurrence of an event.
Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19
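As a rough illustration of the two baseline measures described on this slide, the sketch below computes a standardized mean difference with a pooled within-group standard deviation and a simple difference in event probabilities. The function names and the numbers are hypothetical, and any small-sample correction the WWC applies to effect sizes is deliberately left out.

```python
import math


def pooled_sd(sd_t, n_t, sd_c, n_c):
    """Pooled within-group standard deviation of the T and C groups."""
    return math.sqrt(
        ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    )


def standardized_mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Continuous measure: (mean T - mean C) / pooled SD."""
    return (mean_t - mean_c) / pooled_sd(sd_t, n_t, sd_c, n_c)


def probability_difference(events_t, n_t, events_c, n_c):
    """Dichotomous measure: difference in the probability of the event."""
    return events_t / n_t - events_c / n_c


# Hypothetical baseline test scores for two groups of 50 teachers each.
d = standardized_mean_difference(52, 10, 50, 50, 10, 50)
print(f"baseline difference = {d:.2f} standard deviations")
```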
Statistical adjustment for nonequivalence at baseline
• For differences in baseline characteristics that are between 0.05 and 0.25 standard deviations, the analysis must include a statistical adjustment for the baseline characteristics to meet the baseline equivalence requirement.
• A number of different techniques can be used, including regression adjustment and analysis of covariance (ANCOVA).
• The critical factor is that the appropriate baseline characteristics must be included in the analysis at the individual level (i.e., the unit of analysis).
Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19
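One way to picture the adjustment is a classic one-covariate ANCOVA, sketched below in plain Python. The function names are hypothetical. The 0.05 to 0.25 band comes from the slide; treating differences of 0.05 or less as needing no adjustment and differences above 0.25 as not equivalent is an assumption based on the WWC standards, not something stated on the slide. The sketch also assumes a single baseline covariate and a common slope in both groups.

```python
from statistics import mean


def baseline_category(effect_size):
    """Classify a baseline difference given in standard deviation units."""
    d = abs(effect_size)
    if d <= 0.05:
        return "no adjustment required"          # assumption, see lead-in
    if d <= 0.25:
        return "statistical adjustment required"  # band stated on the slide
    return "not equivalent"                       # assumption, see lead-in


def ancova_adjusted_difference(y_t, x_t, y_c, x_c):
    """T minus C outcome difference, adjusted for one baseline covariate x.

    Uses the pooled within-group slope of outcome y on covariate x.
    """
    def within_group_sums(xs, ys):
        mx, my = mean(xs), mean(ys)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx = sum((x - mx) ** 2 for x in xs)
        return sxy, sxx

    sxy_t, sxx_t = within_group_sums(x_t, y_t)
    sxy_c, sxx_c = within_group_sums(x_c, y_c)
    slope = (sxy_t + sxy_c) / (sxx_t + sxx_c)
    # Remove the part of the raw difference explained by the baseline gap.
    return (mean(y_t) - mean(y_c)) - slope * (mean(x_t) - mean(x_c))


# Hypothetical data built so that y = 2 + 3*T + 0.5*x exactly.
x_t, y_t = [1, 2, 3], [5.5, 6.0, 6.5]
x_c, y_c = [2, 3, 4, 5], [3.0, 3.5, 4.0, 4.5]
print(baseline_category(0.12))
print(ancova_adjusted_difference(y_t, x_t, y_c, x_c))
```

In this constructed example the raw difference in means is 2.25, while the adjusted difference recovers the built-in treatment effect of 3 because the treatment group started lower on the baseline measure.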
Difference-in-difference adjustment
• The WWC applies this adjustment to effect size calculations based on unadjusted group means when the study is:
  – a QED with differences in baseline characteristics less than 0.05
  – an RCT with low attrition and differences in baseline characteristics
  – an RCT with high attrition and differences in baseline characteristics less than 0.05
Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19
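The slide does not give the formula, so the sketch below shows only one generic form of a difference-in-difference adjustment: subtract each group's baseline mean from its outcome mean, take the T minus C difference of those gains, and standardize by the pooled outcome standard deviation. The function name, the choice of denominator, and the numbers are assumptions for illustration; the exact WWC calculation should be taken from the WWC handbook.

```python
import math


def did_effect_size(post_t, pre_t, post_c, pre_c,
                    sd_post_t, n_t, sd_post_c, n_c):
    """Difference-in-difference effect size from unadjusted group means.

    Numerator: (T gain) - (C gain).
    Denominator: pooled SD of the outcome (posttest), an assumption here.
    """
    pooled = math.sqrt(
        ((n_t - 1) * sd_post_t**2 + (n_c - 1) * sd_post_c**2)
        / (n_t + n_c - 2)
    )
    return ((post_t - pre_t) - (post_c - pre_c)) / pooled


# Hypothetical means: T moves from 50 to 60, C moves from 49 to 55.
g = did_effect_size(60, 50, 55, 49, 10, 50, 10, 50)
print(f"difference-in-difference effect size = {g:.2f}")
```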
Cluster correction
• A “mismatch” problem occurs when random assignment is carried out at the cluster level (e.g., school level) and the analysis is conducted at the individual level (e.g., teacher level), but the correlation among individuals within the same clusters is ignored in computing the standard errors of the impact estimates.
• The standard errors of the impact estimates generally will be underestimated, thereby leading to overestimates of statistical significance.
Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19
Cluster correction (cont.)
• The WWC computes clustering-corrected statistical significance estimates.
• The basic approach to the clustering correction is first to compute the t-statistic corresponding to the effect size that ignores clustering and then correct both the t-statistic and the associated degrees of freedom for clustering based on sample sizes, number of clusters, and an estimate of the intra-class correlation (ICC).
• The default ICC value is 0.20 for achievement outcomes and 0.10 for behavioral and attitudinal outcomes.
• The statistical significance estimate corrected for clustering is then obtained from the t-distribution using the corrected t-statistic and degrees of freedom.
Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19
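The correction steps described above can be sketched as follows. The slide does not print the formulas, so this sketch assumes the equal-cluster-size correction attributed to Hedges (2007), which is understood to be the basis of the WWC approach; the function name and the example values are hypothetical.

```python
import math


def cluster_corrected_t(t, n_total, n_clusters, icc):
    """Return (corrected t-statistic, corrected degrees of freedom).

    t          : t-statistic computed while ignoring clustering
    n_total    : total number of individuals across both groups
    n_clusters : total number of clusters across both groups
    icc        : intra-class correlation (e.g., 0.20 or 0.10 defaults)
    Assumes equal cluster sizes (n_total / n_clusters).
    """
    m = n_total / n_clusters  # average cluster size
    core = (n_total - 2) - 2 * (m - 1) * icc
    t_adj = t * math.sqrt(core / ((n_total - 2) * (1 + (m - 1) * icc)))
    df = core**2 / (
        (n_total - 2) * (1 - icc) ** 2
        + m * (n_total - 2 * m) * icc**2
        + 2 * (n_total - 2 * m) * icc * (1 - icc)
    )
    return t_adj, df


# Hypothetical study: 200 teachers in 10 schools, uncorrected t = 2.5,
# default ICC of 0.20 for an achievement outcome.
t_adj, df = cluster_corrected_t(2.5, 200, 10, 0.20)
print(f"corrected t = {t_adj:.3f}, corrected df = {df:.1f}")
```

With an ICC of zero the function returns the original t-statistic and N − 2 degrees of freedom, which is a quick sanity check. The corrected p-value would then come from the t-distribution, for example with `scipy.stats.t.sf(abs(t_adj), df) * 2` for a two-sided test if SciPy is available.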
Multiple comparisons correction
• Repeated tests of highly correlated outcomes will lead to a greater likelihood of mistakenly concluding that the differences in means for outcomes of interest between the T and C groups are significantly different from zero (called Type I error in hypothesis testing).
• The WWC uses the Benjamini-Hochberg (BH) correction to reduce the possibility of making this type of error.
Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19
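For readers unfamiliar with the BH correction, the sketch below implements the textbook Benjamini-Hochberg step-up procedure: sort the p-values, compare the p-value at rank i with i × alpha / m, and treat every test up to the largest rank that passes as significant. The function name, the alpha of 0.05, and the p-values are illustrative assumptions; details of how the WWC applies the procedure within outcome domains are not shown.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the test remains significant
    after the Benjamini-Hochberg correction (same order as the input)."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        # Critical value for this rank is rank * alpha / m.
        if p_values[idx] <= rank * alpha / m:
            largest_k = rank
    significant = [False] * m
    # Everything ranked at or below the largest passing rank is significant.
    for idx in order[:largest_k]:
        significant[idx] = True
    return significant


# Hypothetical p-values for four outcome measures in one domain.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.20]))
```

In this example only the first outcome stays significant: 0.03 and 0.04 are below 0.05 on their own but exceed their rank-based critical values of 0.025 and 0.0375.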
Multiple comparisons correction (cont.)
• The BH correction is used in three types of situations:
  – studies that estimated effects of the intervention for multiple outcome measures in the same outcome domain using a single comparison group,
  – studies that estimated effects of the intervention for a given outcome measure using multiple comparison groups, and
  – studies that estimated effects of the intervention for multiple outcome measures in the same outcome domain using multiple comparison groups.
Source: http://ies.ed.gov/ncee/wwc/documentsum.aspx?sid=19
Handling missing data
• RCTs & QEDs: Complete case analysis with no adjustment for missing data (for example, unadjusted means) or with covariate adjustment
• Low-attrition RCTs only:
  – Complete case analysis with nonresponse weights
  – Multiple imputation, separately by condition and using an established method (however, cannot be used to meet the attrition standard)
  – Maximum likelihood, separately by condition and using an established method
Source: http://ies.ed.gov/ncee/wwc/multimedia.aspx?sid=18
CONSIDERATIONS FOR QUALITATIVE ANALYSES
Dr. La’Tara Osborne-Lampkin
Qualitative and quantitative approaches (Minchiello et al., 1990, p. 5)
• Conceptual
  – Qualitative: Concerned with understanding human behavior from the informant’s perspective. Assumes a dynamic & negotiated reality.
  – Quantitative: Concerned with discovering facts about human behavior. Assumes a fixed & measurable reality.
• Methodological
  – Qualitative: Data are collected through participant observation & interviews. Data are analyzed by themes from descriptions by informants. Data are reported through narratives and the language of the informant.
  – Quantitative: Data are collected through measuring things. Data are analyzed through numerical comparisons & statistical inferences. Data are reported through statistical analyses.
When analyzing qualitative data
• Use an iterative approach guided by prior data collection and analysis
• Use multiple, “reliable” researchers for analysis and interpretation
• Document and outline steps and decisions made for data analysis (i.e., develop an audit trail)
• Document the basis for inferences
• Establish structural corroboration or coherence
(Miles & Huberman, 1994; Miles, Huberman & Saldaña, 2014; Patton, 1987)
Five key considerations
• Triangulate data
• Ensure representativeness
• Look for competing explanations
• Analyze negative cases
• Keep methods & data in context
Triangulate data
• Use multiple methods to study a program
• Collect multiple types of data on the same question
• Use different data collectors and interviewers to avoid the biases of any one person working alone
• Use multiple perspectives (or theories) to interpret a set of data
Practical strategies to triangulate
• Data sources
  – Compare interview data (focus group and semi-structured interview data) with observational data and documents
  – Validate observational data with documents
• Participants
  – Compare what participants say in public with what they say in private (for example, focus group data vs. individual interview data)
  – Check consistency of what participants do and say over time
  – Compare the perspectives of individuals within and across stakeholder groups
• Verify “fit” and “work” “representativeness” in the data…
Look for rival and competing explanations
• May employ an inductive or logical process
• Use data to support alternative explanations that are grounded in logic and theory
• Weigh the evidence and look for best fit
Negative cases
• “Exceptions that prove the rule”
• Counter evidence
Keeping methods and data in context
Limit conclusions to:
• those situations,
• time periods,
• persons, and
• contexts
for which the data are applicable.
Keeping things in context is the cardinal principle of qualitative analysis (Patton, 1987).
Questions & Answers
Homework: Bring remaining questions to session 5
Developing an evaluation of professional development
• Webinar 5: Going Deeper into Interpreting Results & Presenting Findings 1/21/2016, 2:00pm