new york town hall value added - varc
DESCRIPTION
Slide presentation on VARC for the New York Value Added Town Hall
TRANSCRIPT
VALUE-ADDED NEW YORK TOWN HALL MEETING
Value-Added Research Center’s (VARC) Role in NWEA’s APPR Strategy
Testing
Metric (Growth Score)
Analysis (Value Added)
State APPR Rating (0-20)
NWEA
VARC
The Power of Two
A more complete picture of student learning
Achievement
• Compares students’ performance to a standard
• Does not factor in students’ background characteristics
• Measures students’ performance at a single point in time
• Critical to students’ post-secondary opportunities
Value-Added
• Measures students’ individual academic growth longitudinally
• Factors in students’ background characteristics outside of the school’s control
• Critical to ensuring students’ future academic success
• Measures the impact of teachers and schools on academic growth
Adapted from materials created by Battelle for Kids
Value-Added Basics – The Oak Tree Analogy
The Oak Tree Analogy
Explaining Value-Added by Evaluating Gardener Performance
For the past year, Gardener A and Gardener B have been tending to their oak trees, trying to maximize the height of the trees.
Method 1: Measure the Height of the Trees Today (One Year After the Gardeners Began)
Oak A (Gardener A): 61 in.
Oak B (Gardener B): 72 in.
Using this method, Gardener B is the more effective gardener.
This method is analogous to using an Achievement Model.
Pause and Reflect
How is this similar to how schools have been evaluated in the past?
What information is missing from our gardener evaluation?
Oak A (Gardener A): 47 in. at age 3 (1 year ago), 61 in. at age 4 (today)
Oak B (Gardener B): 52 in. at age 3 (1 year ago), 72 in. at age 4 (today)
This Achievement Result is not the Whole Story
We need to find the starting height for each tree in order to more fairly evaluate each gardener’s performance during the past year.
This is analogous to a Simple Growth Model, also called Gain.
Method 2: Compare Starting Height to Ending Height
Oak A (Gardener A): 47 in. at age 3 (1 year ago), 61 in. at age 4 (today), +14 in.
Oak B (Gardener B): 52 in. at age 3 (1 year ago), 72 in. at age 4 (today), +20 in.
Oak B had more growth this year, so Gardener B is the more effective gardener.
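The first two methods can be sketched in a few lines of Python. This is only an illustration of the arithmetic on the slides: the heights are the ones shown for Oak A and Oak B, while the function names are made up for this example.

```python
# Heights in inches, taken from the oak tree slides.
trees = {
    "Oak A": {"gardener": "Gardener A", "start": 47, "end": 61},
    "Oak B": {"gardener": "Gardener B", "start": 52, "end": 72},
}

def best_by_achievement(trees):
    """Method 1 (Achievement Model): look only at today's height."""
    return max(trees, key=lambda name: trees[name]["end"])

def gains(trees):
    """Method 2 (Simple Growth / Gain): ending height minus starting height."""
    return {name: t["end"] - t["start"] for name, t in trees.items()}

def best_by_gain(trees):
    """Tree with the largest one-year gain."""
    g = gains(trees)
    return max(g, key=g.get)

print(best_by_achievement(trees))  # Oak B
print(gains(trees))                # {'Oak A': 14, 'Oak B': 20}
print(best_by_gain(trees))         # Oak B
```

Both methods favor Gardener B here, and neither accounts for growing conditions.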
What About Factors Outside the Gardener’s Influence?
This is an “apples to oranges” comparison. For our oak tree example, three environmental factors we will examine are: Rainfall, Soil Richness, and Temperature.
External condition    Oak Tree A    Oak Tree B
Rainfall amount       High          Low
Soil richness         Low           High
Temperature           High          Low
How Much Did These External Factors Affect Growth?
We need to analyze real data from the region to predict growth for these trees. We compare the actual height of the trees to their predicted heights to determine if the gardener’s effect was above or below average.
In order to find the impact of rainfall, soil richness, and temperature, we will plot the growth of each individual oak in the region compared to its environmental conditions.
Calculating Our Prediction Adjustments Based on Real Data
Growth in inches relative to the average:
Condition        Low    Medium    High
Rainfall         -5     -2        +3
Soil Richness    -3     -1        +2
Temperature      +5     -3        -8
Make Initial Prediction for the Trees Based on Starting Height
Oak A (Gardener A): 47 in. at age 3 (1 year ago) + 20 (average) = 67 in. predicted
Oak B (Gardener B): 52 in. at age 3 (1 year ago) + 20 (average) = 72 in. predicted
Next, we will refine our prediction based on the growing conditions for each tree. When we are done, we will have an “apples to apples” comparison of the gardeners’ effect.
Based on Real Data, Customize Predictions based on Rainfall
Oak A (Gardener A): 47 in. + 20 (average) + 3 for rainfall = 70 in.
Oak B (Gardener B): 52 in. + 20 (average) - 5 for rainfall = 67 in.
For having high rainfall, Oak A’s prediction is adjusted by +3 to compensate.
Similarly, for having low rainfall, Oak B’s prediction is adjusted by -5 to compensate.
Adjusting for Soil Richness
Oak A (Gardener A): 47 in. + 20 (average) + 3 for rainfall - 3 for soil = 67 in.
Oak B (Gardener B): 52 in. + 20 (average) - 5 for rainfall + 2 for soil = 69 in.
For having poor soil, Oak A’s prediction is adjusted by -3. For having rich soil, Oak B’s prediction is adjusted by +2.
Adjusting for Temperature
Oak A (Gardener A): 47 in. + 20 (average) + 3 for rainfall - 3 for soil - 8 for temp = 59 in.
Oak B (Gardener B): 52 in. + 20 (average) - 5 for rainfall + 2 for soil + 5 for temp = 74 in.
For having high temperature, Oak A’s prediction is adjusted by -8. For having low temperature, Oak B’s prediction is adjusted by +5.
Our Gardeners are Now on a Level Playing Field
Oak A (Gardener A): +20 average, +3 for rainfall, -3 for soil, -8 for temp = +12 inches during the year; 47 in. + 12 in. = 59 in.
Oak B (Gardener B): +20 average, -5 for rainfall, +2 for soil, +5 for temp = +22 inches during the year; 52 in. + 22 in. = 74 in.
The predicted height for trees in Oak A’s conditions is 59 inches. The predicted height for trees in Oak B’s conditions is 74 inches.
Compare the Predicted Height to the Actual Height
Oak A (Gardener A): predicted 59 in., actual 61 in. (+2)
Oak B (Gardener B): predicted 74 in., actual 72 in. (-2)
Oak A’s actual height is 2 inches more than predicted. We attribute this to the effect of Gardener A.
Oak B’s actual height is 2 inches less than predicted. We attribute this to the effect of Gardener B.
This is analogous to a Value-Added measure.
Method 3: Compare the Predicted Height to the Actual Height
Gardener A: Above Average Value-Added (Oak A predicted 59 in., actual 61 in., +2)
Gardener B: Below Average Value-Added (Oak B predicted 74 in., actual 72 in., -2)
By accounting for last year’s height and environmental conditions of the trees during this year, we found the “value” each gardener “added” to the growth of the trees.
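The oak tree calculation can be reproduced with a short Python sketch. The adjustment values, the +20 in. average growth, and the tree heights come from the slides; the dictionary layout and function names are assumptions made for this illustration, not VARC's implementation.

```python
# Growth adjustments in inches relative to the average, from the slides' table.
ADJUSTMENTS = {
    "rainfall": {"low": -5, "medium": -2, "high": +3},
    "soil": {"low": -3, "medium": -1, "high": +2},
    "temperature": {"low": +5, "medium": -3, "high": -8},
}
AVERAGE_GROWTH = 20  # inches per year, as used in the slides

def predicted_height(start, conditions):
    """Starting height + average growth + adjustment for each condition."""
    growth = AVERAGE_GROWTH + sum(
        ADJUSTMENTS[factor][level] for factor, level in conditions.items()
    )
    return start + growth

def value_added(start, actual, conditions):
    """Method 3: actual height minus predicted height."""
    return actual - predicted_height(start, conditions)

oak_a = {"rainfall": "high", "soil": "low", "temperature": "high"}
oak_b = {"rainfall": "low", "soil": "high", "temperature": "low"}

print(predicted_height(47, oak_a), value_added(47, 61, oak_a))  # 59 2
print(predicted_height(52, oak_b), value_added(52, 72, oak_b))  # 74 -2
```

Once conditions are accounted for, the ranking from Methods 1 and 2 reverses: Gardener A is at +2 and Gardener B at -2.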
Value-Added Basics – Linking the Oak Tree Analogy to Education
How does this analogy relate to value added in the education context?
What are we evaluating?
• Oak Tree Analogy: Gardeners
• Value-Added in Education: Districts, Schools, Grades, Classrooms
What are we using to measure success?
• Oak Tree Analogy: Relative height improvement in inches
• Value-Added in Education: Relative improvement on standardized test scores
Sample
• Oak Tree Analogy: Single oak tree
• Value-Added in Education: Groups of students
Control factors
• Oak Tree Analogy: Tree’s prior height; other factors beyond the gardener’s control (Rainfall, Soil richness, Temperature)
• Value-Added in Education: Students’ prior test performance (usually most significant predictor); other demographic characteristics such as Grade level, Gender, Race / Ethnicity, Low-Income Status, ELL Status, Disability Status, Section 504 Status
Another Visual Representation: The Education Context
[Figure: a student’s starting achievement (Fall NWEA MAP RIT score) and actual achievement (Spring NWEA MAP RIT score), shown against predicted achievement (based on observationally similar students). Value-Added is the gap between actual and predicted achievement.]
VARC Data Output
What do Value-Added Results Look Like?
The Value-Added model typically generates a set of results measured in scale scores.
Teacher      Value-Added    Interpretation
Teacher A    +10            This teacher’s students gained 10 more points on the RIT scale than observationally similar students across the state (10 points more than predicted)
Teacher B    -10            10 points fewer than predicted
Teacher C    0              These students gained exactly as many points as predicted
Value-Added in “Tier” Units
[Figure: sample report row labeled “Grade 4” and “30”, with an estimate of 0.9 marked on a scale running from -2 to +2]
In some cases, Value-Added is displayed on a “Tier” scale based on standard deviations (z-scores) for reporting purposes.
About 95% of estimates will fall between -2 and +2 on the scale.
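A minimal sketch of the tier conversion, assuming a tier is simply each estimate minus the mean of all estimates, divided by their standard deviation. The teacher estimates below are hypothetical, and the "about 95%" figure is checked against a standard normal distribution, which assumes the estimates are roughly normally distributed.

```python
from statistics import NormalDist, mean, pstdev

def to_tiers(estimates):
    """Convert scale-score estimates to z-scores (mean 0, standard deviation 1)."""
    mu, sigma = mean(estimates), pstdev(estimates)
    return [(e - mu) / sigma for e in estimates]

# Hypothetical teacher-level estimates in RIT points (illustration only).
estimates = [10, -10, 0, 4, -4]
tiers = to_tiers(estimates)
print([round(t, 2) for t in tiers])

# Share of a normal distribution lying between -2 and +2 standard deviations.
share_within_2 = NormalDist().cdf(2) - NormalDist().cdf(-2)
print(round(share_within_2, 3))  # 0.954
```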
Using NWEA’s MAP + VARC within New York’s Annual Professional Performance Review (APPR)
State Tested Grades / Subjects
• APPR: 60% Observations, 20% State Test Growth, 20% NWEA + VARC
Other Grades / Subjects for which there is an approved NWEA test
• APPR: 60% Observations, 20% Local Measure, 20% NWEA + VARC
APPR’s 0-20 Local Measure: Descriptions of Categories
A teacher’s results are compared to district or BOCES-adopted expectations for growth or achievement of student learning standards for grade/subject.
• Ineffective – Results are well-below expectations
• Developing – Results are below expectations
• Effective – Results meet expectations
• Highly Effective – Results are well-above expectations
What are the Rules for APPR’s Local 0-20?
Score Ranges
• 0-2 Ineffective
• 3-8 Developing
• 9-17 Effective
• 18-20 Highly Effective
Scores must use the full range (for example: not all teachers can be labeled “Effective”).
How can we translate Value-Added estimates into this 0-20 scale in a fair and responsible way?
• Who gets labeled “Ineffective”
• Resources to support these teachers
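The score ranges above can be written as a simple lookup. The sketch below covers only the last step, from a 0-20 score to its category label. It says nothing about how a Value-Added estimate becomes a 0-20 score, which the slides leave open. The function name is made up for illustration.

```python
# Inclusive score bands, as listed in the APPR local-measure rules above.
BANDS = [
    (0, 2, "Ineffective"),
    (3, 8, "Developing"),
    (9, 17, "Effective"),
    (18, 20, "Highly Effective"),
]

def appr_category(score):
    """Return the category label for an integer APPR local score (0-20)."""
    if not isinstance(score, int) or not 0 <= score <= 20:
        raise ValueError("score must be an integer from 0 to 20")
    for low, high, label in BANDS:
        if low <= score <= high:
            return label

print([appr_category(s) for s in (18, 10, 7)])
# ['Highly Effective', 'Effective', 'Developing']
```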
Transformation Example
[Figure series: example mappings of Value-Added estimates onto the 0-20 scale (axis ticks at 0, 5, 10, 15, 20), with bands labeled Ineffective, Developing, Effective, and Highly Effective]
VARC Data Output File
Example VARC Output File
What is included in these results?
Levels of Results
District School Teacher Grade Subject
District A School 1 Ms. Smith 4 Math
District A School 1 Ms. Smith 4 Reading
District A School 2 Mr. Jones 6 Math
District A School 3 Mr. Thomas 1 Language Usage
District A School 4 Mrs. Meyer 10 Reading
Results will be provided (given a large enough sample of students) for:
• Math grades K-10
• Reading grades K-10
• Language Usage grades K-10
Result Formats
RIT Score: difference in scale score growth from the average for observationally similar students
RIT Score    Confidence Interval
+10          +7 to +13
0            -2 to +2
-4           -6 to -2
Tier: “z-scores” of the RIT score differences. This answers the question of “how good is good?”
Tier    Confidence Interval
+1.9    +1.7 to +2.1
0       -0.2 to +0.2
-0.8    -1.0 to -0.6
0-20 APPR: default 0-20 to comply with law (to be decided)
0-20 APPR
18
10
7
VARC Data Needs
What Data Does VARC Need?
Data identifying and linking students/teachers
• State Student ID linkable to NWEA data
• School ID
• Teacher ID
Student Test Data
• Fall Test Data for Math, Reading, Language Usage (Date, Score, SEM)
• Spring Test Data for Math, Reading, Language Usage (Date, Score)
Student Demographics
• Grade, Gender, Race/Ethnicity, Special Education Status, ELL Status, FRL Status, etc.
What is the Timeline?
• Testing windows in the 2012-2013 school year
  • Need Fall/Spring testing
• Collection strategy for student demographic data
  • Data from the state update
  • Contingency plan for collection from RIC/district
• Our production timeline can only begin once we’ve received clean student-teacher linking data from the supplier (state, RIC, district)
• Timeline for Value-Added analysis
  • Drop-dead date for data transfer to VARC
  • Time to run analysis and quality check
  • Return results to districts’ superintendents or designees
  • Special case of summer 2012
Questions / concerns for the advisory committee to address?
• Individual student-level MAP growth targets vs. the need for Value-Added for APPR
• 0-20 local measure within APPR 0-100
• Transformation of Value-Added to 0-20
• Consistent messaging and meaning across NWEA partners
• Approving this solution through the New York SED
VALUE-ADDED NEW YORK ADVISORY GROUP MEETING
Existing VARC Projects
Districts and States Working with VARC
• Districts: Minneapolis, Milwaukee, Racine, Chicago, Madison, Tulsa, Atlanta, New York City, Los Angeles, Hillsborough County, Denver, Collier County
• States labeled on the map: North Dakota, South Dakota, Minnesota, Wisconsin, Illinois
Wisconsin
• Opt-in statewide Value-Added system (2010)
• Statewide advisory group with quarterly meetings
• District-led annual meetings on responsible use and messaging
• Expansion of piloted MAP Value-Added (Racine and Milwaukee) to statewide model
• Same model and messaging across districts
A Value-Added Model of Classroom Performance: Recipe for a Statistician
$$Y_{1i} = \lambda\, Y_{0i} + \beta' X_{i} + \sum_{k} \alpha_{k}^{(\text{school})} S_{ik} + \sum_{k}\sum_{j} \alpha_{jk}^{(\text{classroom})} C_{ijk} + \varepsilon_{i}$$
What does that mean in English?
Post-Test = Post-on-Pre Link * Pre-Test + Student Characteristics + Classroom Effect + Unknown Student Characteristics
• Post-Test: Spring MAP Result
• Post-on-Pre Link: Adjustment to account for starting point
• Pre-Test: Fall MAP Result
• Student Characteristics: Adjustment to account for student demographics
• Classroom Effect: Classroom contribution to student learning (Value-Added)
• Unknown Student Characteristics: Error term for unknown factors (reduces with increased sample size)
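To make the plain-English equation concrete, here is a small least-squares sketch with one pre-test score, one demographic flag, and a separate effect per classroom. It is an assumption-laden illustration, not VARC's production model: it leaves out measurement error correction, dosage, and shrinkage, and it runs on synthetic, noise-free students invented for the example. The function name `fit_classroom_effects` is hypothetical.

```python
from statistics import mean

def fit_classroom_effects(rows):
    """Least-squares fit of post = link * pre + coef * demo + classroom effect.

    rows: (classroom, pre_score, demo_flag, post_score), one per student.
    Returns (link, demo_coef, effects); effects are centered on the average
    classroom, so 0 means "grew as predicted".
    """
    by_class = {}
    for classroom, pre, demo, post in rows:
        by_class.setdefault(classroom, []).append((pre, demo, post))
    means = {c: [mean(col) for col in zip(*vals)] for c, vals in by_class.items()}

    # Subtract classroom means so slopes are fit on within-classroom variation.
    sxx = sxz = szz = sxy = szy = 0.0
    for c, vals in by_class.items():
        m_pre, m_demo, m_post = means[c]
        for pre, demo, post in vals:
            x, z, y = pre - m_pre, demo - m_demo, post - m_post
            sxx += x * x
            sxz += x * z
            szz += z * z
            sxy += x * y
            szy += z * y

    # Solve the 2x2 normal equations for the two slopes.
    det = sxx * szz - sxz * sxz
    link = (sxy * szz - szy * sxz) / det
    demo_coef = (szy * sxx - sxy * sxz) / det

    # Classroom effect = what is left of the classroom mean after the adjustments.
    raw = {c: m[2] - link * m[0] - demo_coef * m[1] for c, m in means.items()}
    average = mean(raw.values())
    return link, demo_coef, {c: v - average for c, v in raw.items()}

# Synthetic, noise-free students (not real data): classroom A adds +3, B adds -3.
TRUE_EFFECT = {"A": 3.0, "B": -3.0}
students = [("A", 200, 0), ("A", 210, 1), ("A", 190, 0), ("A", 205, 1),
            ("B", 195, 1), ("B", 215, 0), ("B", 185, 1), ("B", 220, 0)]
rows = [(c, pre, demo, 0.9 * pre - 2.0 * demo + 25 + TRUE_EFFECT[c])
        for c, pre, demo in students]

link, demo_coef, effects = fit_classroom_effects(rows)
print(round(link, 3), round(demo_coef, 3),
      {c: round(v, 3) for c, v in effects.items()})
```

Because the synthetic data has no error term, the fit recovers the link (0.9), the demographic adjustment (-2), and the classroom effects (+3 and -3) up to rounding. With real data the error term makes each estimate uncertain.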
Los Angeles, California
Phase 1 (May 2011)
• Grades 3-8 Math and ELA
• Grade 9 ELA
Phase 2 (Nov 2011)
• Grades 3-11 ELA
• Grades 3-8 General Math
• High School subjects: Math, ELA, Science, Social Studies
Phase 3 (Nov 2012)
• Other Assessments
Example Documentation
http://portal.battelleforkids.org/BFK/LAUSD/Training_Materials.html?sflang=en
Excerpt from LAUSD’s teacher-level Value-Added Model documentation
Transparency of the model is our goal
Hillsborough County, Florida
• Began July 2010
• Subject / Grade Coverage
  • Models from Art to Welding
• Multiple Measures
  • Charlotte Danielson observational ratings
  • Combined use of student outcomes and observational data in evaluation system
• Use of Value-Added
  • Fiscal awards
  • Future uses being developed together with union
New York, New York
• In the past, Value-Added based on state exams
• Dangers related to the release of teacher-level data
• Constructive use of data
• Currently calculating local measures based on MAP
• Advising NYC on Transformation to 0-20
Some Common Features of VARC’s Value-Added Models
• Prior test scores to predict current test scores
  • Single prior test or multiple tests (sometimes across subjects)
  • Growth of a teacher’s students is compared to growth of similarly achieving students across the state
• Student demographics
  • Typically Gender, Race/Ethnicity, Low-Income Status, Special Education Status, English Language Learner Status, other student-level data available for all students
• Measurement error correction
• Dosage (when enrollment data is available)
• Statistical shrinkage estimation
• VARC motto: Simpler is better unless it’s wrong
• Continuous improvement of the model based on latest research and improving data quality
Translating Value-Added to the 0-20 Scale Required by APPR
Using NWEA’s MAP + VARC within New York’s Annual Professional Performance Review (APPR)
State Tested Grades / Subjects
• APPR: 60% Observations, 20% State Test Growth, 20% NWEA + VARC
Other Grades / Subjects for which there is an approved NWEA test
• APPR: 60% Observations, 20% Local Measure, 20% NWEA + VARC
Can NWEA’s MAP be used for the other 20% where NWEA tests are approved?
What about grades / subjects not covered by NWEA’s assessments?
APPR’s 0-20 Local Measure: Descriptions of Categories
A teacher’s results are compared to district or BOCES-adopted expectations for growth or achievement of student learning standards for grade/subject.
• Ineffective – Results are well-below expectations
• Developing – Results are below expectations
• Effective – Results meet expectations
• Highly Effective – Results are well-above expectations
What are the Rules for APPR’s Local 0-20?
Score Ranges
• 0-2 Ineffective
• 3-8 Developing
• 9-17 Effective
• 18-20 Highly Effective
Scores must use the full range (for example: not all teachers can be labeled “Effective”).
How can we translate Value-Added estimates into this 0-20 scale in a fair and responsible way?
Transformation Example
[Figure series: example mappings of Value-Added estimates onto the 0-20 scale (axis ticks at 0, 5, 10, 15, 20), with bands labeled Ineffective, Developing, Effective, and Highly Effective]
0-20 Consideration Topics
• Implications of a given translation
  • Percentage of teachers labeled “Ineffective” relative to resources for support
• Disagreement between Value-Added in subject areas
  • For example: a 4th grade teacher gets a “0” in math and “20” in reading
  • Do we do a weighted average of those two to get a single cross-subject Value-Added?
  • Do we take the higher of the two?
• What about teachers teaching multiple grades? Same solution as multi-subject?
• Once multiple years of data are available, do we use the most recent year or a multi-year average? If an average, how many years?
• What about estimates based on very few students?
  • Is there a minimum threshold for reporting out?
  • Is there any way to consider the confidence interval around estimates?
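To see how much the cross-subject choice matters, the two options raised above can be compared on the slide's own example of a “0” in math and a “20” in reading. This is only an illustration of the arithmetic, not a recommendation. The weights are hypothetical student counts, and the function names are made up.

```python
def weighted_average(scores, weights):
    """Option 1: average the subject scores, weighted (here by student count)."""
    total = sum(weights.values())
    return sum(scores[subject] * weights[subject] for subject in scores) / total

def higher_of(scores):
    """Option 2: keep the higher of the subject scores."""
    return max(scores.values())

scores = {"math": 0, "reading": 20}    # example from the slide
weights = {"math": 25, "reading": 25}  # hypothetical student counts

print(weighted_average(scores, weights))  # 10.0
print(higher_of(scores))                  # 20
```

Under the score ranges listed earlier, the first option lands in “Effective” and the second in “Highly Effective”.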
15 Minutes
Break
Modeling Decisions (Proxy measures for causal factors)
Why does VARC recommend including student demographic data? How do we decide what to include?
How does VARC choose what to control for?
• Imagine we want to evaluate another pair of gardeners and we notice that there is something else different about their trees that we have not controlled for in the model.
• In this example, Oak F has many more leaves than Oak E.
• Is this something we could account for in our predictions?
Oak E (Gardener E), Age 5: 73 in.
Oak F (Gardener F), Age 5: 73 in.
In order to be considered for inclusion in the Value-Added model, a characteristic must meet several requirements:
Check 1: Is this factor outside the gardener’s influence?
Check 2: Do we have reliable data?
Check 3: If not, can we pick up the effect by proxy?
Check 4: Does it increase the predictive power of the model?
Check 1: Is this factor outside the gardener’s influence?
Outside the gardener’s influence
Starting tree height
Rainfall
Soil Richness
Temperature
Starting leaf number
Gardener can influence
Nitrogen fertilizer
Pruning
Insecticide
Watering
Mulching
Check 2: Do we have reliable data?
Category                        Measurement                              Coverage
Yearly record of tree height    Height (Inches)                          100%
Rainfall                        Rainfall (Inches)                        98%
Soil Richness                   Plant Nutrients (PPM)                    96%
Temperature                     Average Temperature (Degrees Celsius)    100%
Starting leaf number            Individual Leaf Count                    7%
Canopy diameter                 Diameter (Inches)                        97%
Check 3: Can we approximate it with other data?
Category                Measurement              Coverage
Starting leaf number    Individual Leaf Count    7%
Canopy diameter         Diameter (Inches)        97%
[Figure: the full data coverage table, with a question mark linking starting leaf number to canopy diameter]
Canopy diameter as a proxy for leaf count
• The data we do have available about canopy diameter might help us measure the effect of leaf number.
• The canopy diameter might also be picking up other factors that may influence tree growth.
• We will check its relationship to growth to determine if it is a candidate for inclusion in the model.
Oak E (Gardener E), Age 5: canopy diameter 33 in.
Oak F (Gardener F), Age 5: canopy diameter 55 in.
If we find a relationship between starting tree diameter and growth, we would want to control for starting diameter in the Value-Added model.
[Figure: scatter plot “The Effect of Tree Diameter on Growth”; horizontal axis: Tree Diameter (Year 5 Diameter in Inches), 0 to 80; vertical axis: Growth from Year 5 to 6 (inches), 0 to 40]
What happens in the education context?
Check 1: Is this factor outside the school or teacher’s influence?
Check 2: Do we have reliable data?
Check 3: If not, can we pick up the effect by proxy?
Check 4: Does it increase the predictive power of the model?
Check 1: Is this factor outside the school or teacher’s influence?
Outside the school’s influence
• At home support
• English language learner status
• Gender
• Household financial resources
• Learning disability
• Prior knowledge
School can influence
• Curriculum
• Classroom teacher
• School culture
• Math pull-out program at school
• Structure of lessons in school
• Safety at the school
Let’s use “Household financial resources” as an example
Check 2: Do we have reliable data?
What we want
• Household financial resources
Check 3: Can we approximate it with other data?
What we have (related data)
• Free/reduced lunch status
What we want
• Household financial resources
Using your knowledge of student learning, why might “household financial resources” have an effect on student growth?
Check 4 (“Does it increase the predictive power of the model?”) will be answered by a multivariate linear regression model based on real data from your district or state (not pictured), which determines whether FRL status had an effect on student growth.
What about race/ethnicity?
What we have
• Race/ethnicity
What we want
• General socio-economic status
• Family structure
• Family education
• Social capital
• Environmental stress
Related complementary data may correlate with one another (not a causal relationship). The model does not claim that race/ethnicity causes higher or lower performance.
Check 4 will use real data from your district or state to determine if race/ethnicity has an effect on student growth. If there is no effect, it will not be included in the model.
What about race/ethnicity?
If there is a detectable difference in growth rates, we attribute this to a district or state challenge to be addressed, not to something an individual teacher or school should be expected to overcome on their own.
Checking for Understanding
What would you tell a 5th grade teacher who said they wanted to include the following in the Value-Added model for their results?
A. 5th grade reading curriculum
B. Their students’ attendance during 5th grade
C. Their students’ prior attendance during 4th grade
D. Student motivation
Check 1: Is this factor outside the school or teacher’s influence?
Check 2: Do we have reliable data?
Check 3: If not, can we pick up the effect by proxy?
Check 4: Does it increase the predictive power of the model?
Small Group Discussion
Group 1
Nate (NWEA)
Sean (VARC)
Group 2
John (NWEA)
Andrew (VARC)
Key discussion topics:
• Advisory council’s role in selecting a consistent “standard” Value-Added model and 0-20 translation
• Questions / concerns about selecting a 0-20 translation of Value-Added
• Questions / concerns about modeling features (we do not yet know what data will be available to VARC)
Wrap-Up
• Top concerns and questions from small group discussion
• Where do we need more information?
• What are the challenges we face?
• How can we work together to address those challenges?
• What are our next steps?
  • Next advisory group meeting
  • What topics should we cover?
Additional Resources
• Quasi-experimental design structure
• Visualizing Achievement vs. Value-Added
• Controlling for starting point
• Comparison to a different model – Student Growth Percentiles
Value-Added Model Description
Design
• Quasi-experimental statistical model
• Controls for non-school factors (prior achievement, student and family characteristics)
Output
• Productivity estimates for contribution of educational units (schools, classrooms, teachers) to student achievement growth
Objective
• Valid and fair comparisons of school productivity, given that schools may serve very different student populations
The Power of Two - Revisited
Scatter plots are a way to represent Achievement and Value-Added together
[Figure: scatter plot with Value-Added (2009-2010) on the horizontal axis, scale 1 to 5, and Achievement, as Percent Prof/Adv (2009), on the vertical axis, scale 0 to 100]
The Power of Two - Revisited
[Figure: scatter plot of schools in your district, Value-Added (2009-2010) against Percent Prof/Adv (2009), with five example schools labeled A to E]
A. Students know a lot and are growing faster than predicted
B. Students are behind, but are growing faster than predicted
C. Students know a lot, but are growing slower than predicted
D. Students are behind, and are growing slower than predicted
E. Students are about average in how much they know and how fast they are growing
1. What about tall or short trees? (high or low achieving students)
• If we were using an Achievement Model, which gardener would you rather be?
• How can we be fair to these gardeners in our Value-Added Model?
Oak C (Gardener C), Age 4: 28 in.
Oak D (Gardener D), Age 4: 93 in.
Why might short trees grow faster?
• More “room to grow”
• Easier to have a “big impact”
Why might tall trees grow faster?
• Past pattern of growth will continue
• Unmeasured environmental factors
How can we determine what is really happening?
In the same way we measured the effect of rainfall, soil richness, and temperature, we can determine the effect of prior tree height on growth.
[Figure: scatter plot “The Effect of Prior Tree Height on Growth”; horizontal axis: Prior Tree Height (Year 4 Height in Inches), 0 to 120; vertical axis: Growth from Year 4 to 5 (inches), 0 to 40. Marked points: Oak C (28 in) at 9 in of growth, Oak D (93 in) at 30 in of growth.]
Our initial predictions now account for this trend in growth based on prior height.
• The final predictions would also account for rainfall, soil richness, and temperature.
Oak C (Age 4): prediction for Age 5 is +9 in.
Oak D (Age 4): prediction for Age 5 is +30 in.
How can we accomplish this fairness factor in the education context?
Analyzing test score gain to be fair to teachers
Student 3rd Grade Score 4th Grade Score
Abbot, Tina 244 279
Acosta, Lilly 278 297
Adams, Daniel 294 301
Adams, James 275 290
Allen, Susan 312 323
Alvarez, Jose 301 313
Alvarez, Michelle 256 285
Anderson, Chris 259 277
Anderson, Laura 304 317
Anderson, Steven 288 308
Andrews, William 238 271
Atkinson, Carol 264 286
If we sort 3rd grade scores high to low, what do we notice about the students’ gain from test to test?
Student 3rd Grade Score 4th Grade Score Gain in Score from 3rd to 4th
Allen, Susan 312 323 11
Anderson, Laura 304 317 13
Alvarez, Jose 301 313 12
Adams, Daniel 294 301 7
Anderson, Steven 288 308 20
Acosta, Lilly 278 297 19
Adams, James 275 290 15
Atkinson, Carol 264 286 22
Anderson, Chris 259 277 18
Alvarez, Michelle 256 285 29
Abbot, Tina 244 279 35
Andrews, William 238 271 33
If we find a trend in score gain based on starting point, we control for it in the Value-Added model.
What do we usually find in reality?
Looking purely at a simple growth model, high achieving students tend to gain about 10% fewer points on the test than low achieving students.
In a Value-Added model we can take this into account in our predictions for your students, so their growth will be compared to similarly achieving students.
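The pattern in the example table can be checked with a few lines of Python. The scores are the twelve students from the table. Splitting the sorted list into a top and a bottom half is just one simple way to show the trend, chosen for this sketch. It is not the method a Value-Added model uses.

```python
from statistics import mean

# (student, 3rd grade score, 4th grade score) from the example table.
scores = [
    ("Abbot, Tina", 244, 279), ("Acosta, Lilly", 278, 297),
    ("Adams, Daniel", 294, 301), ("Adams, James", 275, 290),
    ("Allen, Susan", 312, 323), ("Alvarez, Jose", 301, 313),
    ("Alvarez, Michelle", 256, 285), ("Anderson, Chris", 259, 277),
    ("Anderson, Laura", 304, 317), ("Anderson, Steven", 288, 308),
    ("Andrews, William", 238, 271), ("Atkinson, Carol", 264, 286),
]

# Sort by 3rd grade score, high to low, and compute each student's gain.
ranked = sorted(scores, key=lambda row: row[1], reverse=True)
gains = [(name, pre, post - pre) for name, pre, post in ranked]

half = len(gains) // 2
top_half_gain = mean(g for _, _, g in gains[:half])
bottom_half_gain = mean(g for _, _, g in gains[half:])
print(round(top_half_gain, 1), round(bottom_half_gain, 1))  # 13.7 25.3
```

In this example the higher-scoring half gains fewer points on average than the lower-scoring half. A Value-Added model controls for that kind of trend.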
Comparisons of gain at different schools before controlling for prior performance
[Figure: Schools A, B, and C serve different student populations (mixes of Advanced, Proficient, Basic, and Minimal students). The high-achievement school shows artificially lower gain; the low-achievement school shows artificially inflated gain.]
Why isn’t this fair?
Comparisons of Value-Added at different schools after controlling for prior performance
[Figure: the three schools with their student populations (Advanced, Proficient, Basic, Minimal); School A, School B, and School C are each marked “Fair”.]
Checking for Understanding
What would you tell a teacher or principal who said Value-Added was not fair to schools with: High achieving students? Low achieving students?
Is Value-Added incompatible with the notion of high expectations for all students?
STUDENT GROWTH PERCENTILES (SGP)
Draft Explanation
How Would SGP Measure Oak A?
Oak A (Gardener A) was 47 in. at age 3 (1 year ago). Oak A’s growth will be compared to all oaks in the region that started at the same height last year.
Identify all Oaks that were 47” last year
• Oak A, Oak Z, Oak Y, Oak X, Oak W, Oak V, Oak U, Oak T
Find the Height of Those Trees Today
Reorder the Trees Shortest to Tallest
• Oak W, Oak A, Oak U, Oak T, Oak Z, Oak Y, Oak X, Oak V
The percentage of trees equal to or shorter than Oak A is Oak A’s growth percentile.
2/8 = 0.25, the 25th Growth Percentile
Assigning SGP to the Gardener
Oak A (Gardener A): 47 in. at age 3 (1 year ago), 61 in. at age 4 (today), 25th Percentile
If Gardener A is assigned to multiple trees, the median SGP of Gardener A’s trees is assigned to the Gardener.
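The SGP steps can be sketched as follows. Only Oak A's heights (47 in. last year, 61 in. today) and the shortest-to-tallest ordering come from the slides. The other trees' heights and the extra percentiles in the median example are hypothetical values chosen to match that ordering.

```python
from statistics import median

def growth_percentile(tree, heights_today):
    """Percent of peer trees (same starting height) equal to or shorter than `tree` today."""
    target = heights_today[tree]
    at_or_below = sum(1 for h in heights_today.values() if h <= target)
    return 100 * at_or_below / len(heights_today)

# All eight oaks were 47 in. last year. Oak A's 61 in. is from the slides;
# the other heights are hypothetical and only preserve the slide's ordering.
heights_today = {
    "Oak W": 58, "Oak A": 61, "Oak U": 63, "Oak T": 64,
    "Oak Z": 66, "Oak Y": 68, "Oak X": 70, "Oak V": 73,
}
print(growth_percentile("Oak A", heights_today))  # 25.0

# Gardener-level SGP: the median over the gardener's trees
# (25 is Oak A's percentile; 40 and 60 are hypothetical).
print(median([25.0, 40.0, 60.0]))  # 40.0
```

Nothing in this calculation uses rainfall, soil richness, or temperature.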
Pause and Reflect
What might happen if Oak A is in a different environment than the other trees it was compared against?
Is SGP measuring the effect of just the gardener?