CLASS Project
DESCRIPTION
CLASS Project. Value-Added Measures Workshop. Central Oregon, January 20, 2011; Western Oregon, January 21, 2011. Technical Assistance Consultants: Dr. Allison McKie Seifullah, Mathematica; Dr. Kevin Booker, Mathematica. Albany, Bend.

TRANSCRIPT
Inspiring Oregonians… to do what it takes to make our schools among the nation’s best.
CLASS Project
Value-Added Measures Workshop
Central Oregon, January 20, 2011
Western Oregon, January 21, 2011
Technical Assistance Consultants
Dr. Allison McKie Seifullah, Mathematica
Dr. Kevin Booker, Mathematica
Albany, Bend
Technical Assistance Consultants
Jackson Miller, MPP, Westat
Dr. Chris Thorn, Westat
Crook County, Lebanon, Oregon City, Redmond, Salem
Three Frames:
Time
Process
A Story
Timeline
1. TIF Award – Oct 1
2. VAM Introduction – Oct 25
3. Research and Policy – Dec 2
4. Begin Deeper Study – Jan 20
5. Initial Construction – Jan–June
6. Trial First Run – July–Sept
7. Editing and Adjusting – Oct–May
Process
Commitment to multiple measures
VAM not a basis for high-stakes decisions
Multiple-year process
Don’t jump ahead to compensation
A Story
Teacher Incentive Fund Overview
January 2011
Presentation to Chalkboard Project & Partner Districts
Allison McKie Seifullah
"To support projects that develop and implement performance-based compensation systems (PBCSs) for teachers, principals, and other personnel in order to increase educator effectiveness and student achievement, measured in significant part by student growth, in high-need schools"
Teacher Incentive Fund (TIF) Purpose
Source: Teacher Incentive Fund Frequently Asked Questions
Improve student achievement by increasing teacher and principal effectiveness
Reform educator compensation systems so that educators are rewarded for increases in student achievement
Increase the number of effective teachers teaching poor, minority, and disadvantaged students in hard-to-staff subjects
Create sustainable PBCSs
TIF Goals
Source: http://www2.ed.gov/programs/teacherincentive/index.html
Mathematica: a nonpartisan policy research firm with over 40 years of experience
TIF roles:
– Conduct national evaluation
– Provide technical assistance to evaluation districts
Mathematica Policy Research
Research questions:
– What is the impact of differentiated effectiveness incentives (DEIs) on student achievement and educator effectiveness and mobility?
• DEIs reward, at differentiated levels, teachers and principals who demonstrate their effectiveness by improving student achievement.
• Incentive amounts vary based on performance.
– Is a particular type of DEI associated with greater student achievement gains?
– What are the experiences and challenges districts face in implementing these programs?
National TIF Evaluation
Evaluation Design
Schools Participating in the Evaluation – assigned by lottery to:
GROUP 1 SCHOOLS: Differentiated Effectiveness Incentive
GROUP 2 SCHOOLS: 1% Across-the-Board Bonus
GROUPS 1 & 2 SCHOOLS: Roles & Responsibilities Incentives, Evaluations, Professional Development, All Other PBCS Components
Technical Assistance
– From Mathematica and Vanderbilt for evaluation districts (Albany and Bend-La Pine in Oregon)
– From Westat for all other districts
Center for Educator Compensation Reform website:
http://cecr.ed.gov/
Resources
Differentiated effectiveness incentives
Additional responsibilities and leadership roles incentives
Rigorous, transparent, and fair evaluation systems for teachers and principals
Needs-based professional development
Data management system that links student achievement data to payroll and HR systems
Plan for effectively communicating PBCS elements and promoting use of PBCS data
Required PBCS Components
Must give "significant weight" to student growth
Must include observation-based assessments of teacher and principal performance at multiple points in the year
Must be “substantial”: “likely high enough to create change in behavior…in order to ultimately improve student outcomes”
May be individual (e.g. teacher), group (e.g. grade, team, or school), or mixed incentives
Differentiated Effectiveness Incentives
Student growth: change in student achievement
Chalkboard was given competitive preference for using a value-added measure of the impact on student growth as a “significant factor” in calculating differentiated effectiveness awards
Student growth → VAM (Kevin’s discussion)
Student Growth and Value-Added Models
Background on Value-Added Models
January 2011
Presentation to Chalkboard Project & Partner Districts
Kevin Booker
VAMs aim to estimate contribution of teacher or school to student learning growth
Use prior achievement and other student data to factor out external influences on achievement
Assess whether students across a classroom or school are doing better or worse than predicted
Can be used to assess performance at different levels, including school, teacher teams, grade/subject teams, and individual teachers
Background on value-added models (VAMs)
How does value-added compare to alternatives?
Percent proficient
– Unfair, makes inefficient use of data
Average test scores
– Unfair, doesn’t account for baseline performance
Changes in average test scores
– Better, but changing samples of tested students over time make it problematic
Average test score gains
– This is closest to value-added conceptually
Value-added = Average test scores of own students – scores of similar students
[Figure: Beginning-of-year scores – predicted: 440, own students: 440. End-of-year test scores – predicted: 535, own students: 540.]
Value-added = 5
Account for everything we know
Assume that prior scores capture other factors that would be unobservable
– Student’s innate ability, accumulated achievement, and family, neighborhood, and peer influences that affect achievement also affected achievement last year
Time-specific events for individual students add “noise,” reduce precision of estimates
VAMs rely on residuals: what is left after accounting for other known factors
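The residual logic above can be sketched in a few lines of code. This is a minimal illustration with fabricated scores and school names, not the model Mathematica uses: each student’s current score is predicted from the prior score with a one-variable least-squares fit, and a school’s value-added estimate is its students’ average residual (actual minus predicted).

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b  # intercept a, slope b

def school_value_added(students):
    """students: list of (school, prior_score, current_score) tuples."""
    a, b = fit_line([s[1] for s in students], [s[2] for s in students])
    residuals = {}
    for school, prior, current in students:
        # Residual = what is left after accounting for prior achievement.
        residuals.setdefault(school, []).append(current - (a + b * prior))
    return {school: sum(r) / len(r) for school, r in residuals.items()}

# Fabricated example: School A's students beat their predicted scores,
# School B's fall short by the same margin.
va = school_value_added([
    ("School A", 400, 415), ("School A", 440, 455),
    ("School B", 400, 405), ("School B", 440, 445),
])
# va["School A"] = +5 points, va["School B"] = -5 points
```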
There will be some tested grades/subjects where a VAM is infeasible
– Earliest tested grade
– If prior test scores are not a good predictor of current performance
The results from a VAM are inherently relative to the sample included, rather than benchmarked to an external standard
– When the analysis sample includes schools for the entire state, the VAM can tell you how a particular school performed compared to other schools in the state
– Could say that School A is in the 85th percentile of schools in the state, based on this VAM
Issues to consider when using VAMs
School Value-Added Estimates (Math)
[Figure: distribution of school value-added estimates in math; axis from −15 to 15]
Hypothetical statewide VAM distribution in math, grades 4-8
[Figure: scatter of school VAM estimates in state SD units (−2 to 2) against school percentile rank in state (0–100); PA Schools (Not PPS) vs. Pittsburgh Schools]
Don’t measure student learning that isn’t captured in student assessments
Don’t adjust for differences in resources
Don’t account for unobserved changes in student circumstances
Don’t determine how or why some teachers/ schools are performing better than others
What VAMs don’t do
Like all evaluation methods, VAMs are susceptible to some error
Unlike most other methods (e.g. classroom observation), VAM error rates are measured and reported
Particular error rate adopted is a policy question that depends on tolerance for different kinds of mistakes
Confidence level/error rate might vary depending on use of results
Balancing risks of errors: a policy decision
How does a VAM compare schools with different grade ranges?
Which factors should a VAM control for?
How many students are necessary to get a valid VAM estimate?
How will issues of data quality be addressed?
Can a VAM work when the test changes from one year/grade to the next?
Can a VAM incorporate multiple measures of performance?
Frequently asked questions about VAM
Hypothetical teacher value-added report
Teacher Performance Report – Template #1
Teacher Name: Teacher X. School: School X. Grade: 4. Academic Years: 2007-08, 2008-09, 2009-10 (3-year average). Number of students: 60.
[Figure: Mathematics 2007-10 – predicted score based on similar students: 1200; actual average score of your students: 1250; your value added: +50 points; math VA percentile: 54 (scale 1–100). Reading 2007-10 – predicted score based on similar students: 1225; actual average score of your students: 1150; your value added: −75 points; reading VA percentile: 40 (scale 1–100).]
Three-year average (2007-10)
– Mathematics: value added percentile 54th; value added range 48th–60th; statistically significant: No
– Reading: value added percentile 40th; value added range 34th–46th; statistically significant: Yes
Last year (2009-10)
– Mathematics: value added percentile 52nd; value added range 42nd–62nd; statistically significant: No
– Reading: value added percentile 43rd; value added range 33rd–53rd; statistically significant: No
Value Added Percentiles (5-Year Trend):
Value Added Estimates for Student Subgroups (2007-10 average):
[Table: compared with your school’s overall value added, the VA estimate for each subgroup – African American, low-income, low achievers, high achievers – is flagged as Significantly Lower, Statistically similar, or Significantly Higher, separately for reading and math.]
School Performance Report – Template #1
School: School X. Academic Year: 2009-10. Grades: 6-8. Number of students: 60.
[Figure: Mathematics 2009-10 – predicted score based on similar students: 1200; actual average score of your students: 1250; your school’s value added: +50 points; math VA percentile: 54. Reading 2009-10 – predicted score based on similar students: 1225; actual average score of your students: 1150; your school’s value added: −75 points; reading VA percentile: 40.]
[Figure: value added percentiles, 5-year trend (2005-06 through 2009-10), for Mathematics and Reading.]
Most common mistake when rolling out VAM is to push to use VAM for high stakes too soon
– Typically it is the data linking students to classrooms and teachers that is most problematic
Need both short-term and long-term goals
Short-term goals:
– Identify VAM models that can be reliably estimated with existing data
– Start the process of improving data systems so that more and better VAM measures can be included moving forward
Common pitfalls when rolling out VAM
Identify VAM levels feasible in first year
– School-level VAM
– Grade-level team VAM
– Subject or grade-by-subject team VAMs?
Identify tests to include in first year
– State-wide assessments a good starting point
– Tests need to be comparable across schools
– Can add additional tests in future years
– VAM is flexible in terms of including different types of tests
Aim for a trial run of a teacher-level VAM sometime during first year
Goals for Year 1 VAM
An advantage of a statewide test is that the VAM can identify when all schools in the district improve
– Can set performance standards based on meeting a certain percentile in the state performance distribution
– Allows for more schools to meet the standard as the district improves
The VAM can use tests given only within the district, but results will be relative to other schools in the district
– For instance, reward schools in the top 30%
Which tests to include in the VAM?
Multiple VAM levels can be included in the measure of teacher performance
– Could be 30% teacher team VAM, 30% school-level VAM, and 40% other outcome measures
Which VAM levels are included can vary across teachers
– Teachers in tested grades and subjects
– Teachers in untested grades
– Teachers in untested subjects
VAM as part of teacher performance measure
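The blended measure above can be sketched as a weighted average, using the slide’s illustrative 30/30/40 split; the component scores below are made up:

```python
def composite_score(components, weights):
    """Weighted average of named performance components (weights sum to 1)."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    return sum(weights[name] * components[name] for name in weights)

# Hypothetical standardized component scores for one teacher:
score = composite_score(
    {"team_vam": 0.20, "school_vam": -0.10, "other_measures": 0.50},
    {"team_vam": 0.30, "school_vam": 0.30, "other_measures": 0.40},
)
# 0.3*0.20 + 0.3*(-0.10) + 0.4*0.50 = 0.23
```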
Teacher-level VAM is a useful tool to inform district policy, even if not used for high stakes
– Many interventions take place at the classroom level
Successful rollout of VAM takes small steps to build trust
– Start with school and team-level VAM to build understanding and confidence
– As data systems improve, roll out teacher-level VAM in a low-stakes setting
– Once trust and understanding are in place and multiple years of teacher VAM are available, build up to other uses
Using a teacher-level VAM
Key challenge is to correctly identify the source(s) of instruction associated with each outcome, for each student
– Student mobility
– Team teaching
– Teaching assistants
– Other sharing arrangements
Policy question: how much time is necessary to be instructionally relevant?
Roster verification is a crucial component
Once data is available, the VAM can allocate credit appropriately
Improving Data Quality
Whenever multiple sources share responsibility for a particular student outcome, VAM uses dosage to allocate credit
– A student who switched schools during the year may get 60% dosage at one school and 40% at another
Even if not interested in teacher-level VAM, improved data quality can allow for more realistic groupings of teacher teams
Not necessary for entire analysis sample to have the same data quality
VAMs with team teaching
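The dosage idea above can be sketched as a proportional split; the 60/40 shares mirror the slide’s example, and the residual value is fabricated:

```python
def allocate_by_dosage(residual, dosages):
    """Split one student's residual across sources by dosage share."""
    total = sum(dosages.values())
    return {source: residual * share / total
            for source, share in dosages.items()}

# A student who beat their predicted score by 10 points, spending 60% of
# the year at School X and 40% at School Y:
credit = allocate_by_dosage(10.0, {"School X": 0.6, "School Y": 0.4})
# School X is credited 6 points, School Y 4 points
```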
Potential control variables include:
– Student characteristics such as gender, race/ethnicity, disability status, parental income/education
– Student programmatic variables such as ELL status, free or reduced-price lunch status, special education status, gifted status
– Student mobility, indicator for grade repeater
– Classroom-level aggregates, class size
– “Dosage” variables indicating how much of the year each student spent with that school
Is the control variable outside of the control of the school, and comparable across schools?
Key VAM decision: Which control variables?
Shrinkage Estimator
Not fair to treat estimate based on 40 students with same weight as estimate based on 400 students
Final estimate for each school is a weighted average of:
– Value-added estimate calculated for the school
– Value-added estimate for the average school
The fewer students a school has:
– The less evidence we have for this school
– The more we weight the average school
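A minimal sketch of this weighting, using the common n / (n + k) form for the weight on the school’s own estimate; k is an assumed tuning constant, not a value from the workshop:

```python
def shrink(school_estimate, overall_mean, n_students, k=40):
    """Weighted average of a school's own estimate and the average school.

    The weight on the school's own estimate grows with its student count,
    so estimates for small schools are pulled toward the overall mean.
    """
    w = n_students / (n_students + k)
    return w * school_estimate + (1 - w) * overall_mean

small = shrink(8.0, 0.0, n_students=40)   # 40 students: pulled halfway, to 4.0
large = shrink(8.0, 0.0, n_students=400)  # 400 students: stays near 8.0
```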
Shrinkage: An Example
[Figure: three-step sequence showing school value-added estimates pulled toward the district average, relative to the top 20% cutoff]
1. Unweighted average across all tests
2. Give each subject equal weight
3. Base weights on precision: More weight to outcomes that are more precisely estimated
4. Use weights chosen according to district policy priorities
Options for combining outcome measures
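Option 3 above can be sketched as an inverse-variance (precision) weighted average; the estimates and standard errors below are fabricated:

```python
def precision_weighted(estimates, std_errors):
    """Weight each outcome by 1/SE^2 so noisier outcomes count for less."""
    weights = [1.0 / se ** 2 for se in std_errors]
    return sum(w * e for w, e in zip(weights, estimates)) / sum(weights)

# Math is estimated with half the standard error of reading, so it gets
# four times the weight:
combined = precision_weighted([4.0, 1.0], [1.0, 2.0])
# (1.0*4.0 + 0.25*1.0) / 1.25 = 3.4
```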
Can be difficult to accurately predict EOC scores
– Prior test scores are typically from different subject areas
Students take EOC exams in different grades
– Algebra I EOC exam typically taken in 8th or 9th grade
– Differences likely related to high-school academic track
– Patterns can vary dramatically across schools
HS issues: End-of-course (EOC) assessments
Attrition includes dropouts and students who leave the data for other reasons
Rates of attrition from 8th grade to 12th grade vary substantially across high schools
– Commonly see attrition ranges from 10% to 50%
If dropouts would have experienced lower growth, then schools with high dropout rates would show artificially inflated value-added
HS issues: Dropouts
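A toy illustration of the dropout concern above, with fabricated growth numbers: when low-growth students leave the data, the average growth of the remaining students overstates the school’s effect.

```python
growth = [-4, -2, 0, 2, 4, 6]              # hypothetical growth for six students
true_avg = sum(growth) / len(growth)        # with everyone counted: 1.0

remaining = [g for g in growth if g >= 0]   # low-growth students drop out
observed_avg = sum(remaining) / len(remaining)  # inflated to 3.0
```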
Questions and Discussion