TRANSCRIPT
1
Assessments to support deeper learning and ambitious instruction
P. David Pearson, UC Berkeley
Slides will be posted at www.scienceandliteracy.org
3
Assessing Literacy: Problems and Possibilities???
How did we get to where we are?
Where are we headed?
Where should and could we be headed?
P. David Pearson, UC Berkeley, www.scienceandliteracy.org
4
Some Context for Today’s Talk
My interest stems from several sources:
Multiple attempts to build state assessment systems
Several years of attempts to build alternative assessments: portfolios, performance assessments, formative systems
Attempts to work with schools to build assessment systems rather than collections of tests
A conviction that accountability should not drive us into a corner in which we engage in practices that are not in the best interests of students, teachers, or the public.
5
Part I: How did we get to where we are? Where did it all start?
Early 1960s: A kinder, gentler time
Title I reauthorization in 1967: the first trade of accountability for flexibility
6
The 1970s
Behavioral objectives
Criterion-referenced assessments
Curriculum-embedded assessments
Minimal competency tests: New Jersey
Statewide assessments: Michigan & Minnesota
7
Historical relationships between instruction and assessment
The 1970s skills management mentality: Teach a skill, assess it for mastery, reteach it if necessary, and then go on to the next skill.
[Diagram: Skill 1, Skill 2, ... each cycling through Teach → Assess → Conclude]
Foundation: Benjamin Bloom’s ideas of mastery learning
8
The 1970s, cont.
And we taught each of these skills until we had covered the entire curriculum for a grade level.
[Diagram: Skill 1 through Skill 6, each cycling through Teach → Assess → Conclude]
9
Rule of Thumb
The finer the grain size at which we monitor a process like reading and writing, the greater the likelihood that we will end up teaching and testing bits and pieces rather than global processes like comprehension and composition.
10
The 1980s
In like a lamb, out like a lion
Quiet consolidation of skills-based learning (especially in commercial materials)
A Nation at Risk leads to calls for school reform
The evolution of “higher order” thinking in all subject matter areas
Late 1980s: Do we have assessments to match our models of learning?
Performance and authentic assessment movements
Eradicate the distinction between instruction and assessment
11
Key movements of the late 1980s/1990s
Kentucky and Vermont portfolio assessments
Graduation by portfolio presentation: Central Park East, Walden
Assessment by exhibition: Coalition
Performance assessment: New Standards, CLAS, Maryland…
Some models of teacher performance assessment: NBPTS, Connecticut
12
Late 1980s/early 1990s: Portfolios and Performance Assessments Make Assessment Look Like Instruction
We engage in instructional activities, from which we collect evidence which permits us to draw conclusions about student growth or accomplishment on several dimensions (standards) of interest.
[Diagram: Activities → evidence from which we draw → Conclusions on standards 1–n]
13
The complexity of performance assessment practices: one to many
Any given activity may offer evidence for many standards, e.g., responding to a story.
[Diagram: Activity X maps to Standards 1–5]
14
The complexity of performance assessment practices: many to one
For any given standard, there are many activities from which we could gather relevant evidence about growth and accomplishment, e.g., reads fluently.
[Diagram: Activities 1–5 all map to Standard X]
15
The complexity of portfolio assessment practices: many to many
[Diagram: Activities 1–5 cross-mapped to Standards 1–5]
• Any given artifact/activity can provide evidence for many standards
• Any given standard can be indexed by many different artifacts/activities
16
· Some people can tell what time it is by looking at the sun, but I never have been able to make out the numbers.
· There are four seasons: salt, pepper, mustard, and catsup.
The perils of performance assessment: or maybe those multiple-choice assessments aren’t so bad after all…
17
"Water is composed of two gins, Oxygin and Hydrogin. Oxygin is pure gin. Hydrogin is gin and water."
The perils of performance assessment
18
"Germinate: To become a naturalized German."
"Vacumm: A large, empty space where the pope lives."
The perils of performance assessment
19
· Genetics explains why you look like your father, and if you don't, why you should.
The perils of performance assessment
20
You can listen to thunder and tell how close you came to getting hit. If you don't hear it, you got hit, so never mind.
"When you breath, you inspire. When you do not breath, you expire."
The perils of performance assessment
21
1990s: Standards-based reform
State initiatives
IASA model
Trading flexibility for accountability
Move from being accountable for the means and leaving the ends up for grabs (doctor or lawyer model)
TO being accountable for the ends and leaving the means up for grabs (carpenter or product model)
22
Standards-Based Reform: The Initial Theory of Action
[Diagram: standards, assessment, and accountability produce clear expectations and motivation, leading to higher student learning]
A la Tucker and Resnick in the early 1990s
23
More Recent Expanded Theory of Action
[Diagram: standards, assessment, and accountability produce clear expectations and motivation, now working through instruction and professional development, leading to higher student learning]
A la Elmore and Resnick in the late 1990s.
24
No Child Left Behind
A definite retreat from performance-based assessment as a wide-scale tool
The persistence of standards-based reform…
NCLB and Reading First:
Every-grade-level outcomes assessment
Assessments for placement, progress monitoring, diagnosis, and outcomes
Mania for alignment…
25
Alignment: Part of the theory of action
Specify the content standards
Align everything else to them:
Assessments
Instruction
Curriculum materials
Professional development
Standards for teacher learning
26
Alignment: The centrality of content standards
[Diagram: content standards at the center, connected to performance standards for students, student assessments, professional development, standards for teacher education, accreditation, standards for licensure, and teacher evaluation; wild card]
27
Problems in aligning assessments to standards
The “foxes guarding the henhouse” problem (the test companies do the alignment)
The backwards mapping problem (when the standards and the instruction look just like the test): we will deal with DIBELS later…
The “alignment is a coat of many colors” problem (recall the earlier mapping)
28
The bureaucratization problem
Remember the lessons of the 1970s!!!
The 1970s skills management mentality: Teach a skill, assess it for mastery, reteach it if necessary, and then go on to the next skill.
[Diagram: Skill 1, Skill 2, ... each cycling through Teach → Assess → Conclude]
29
The bureaucratization problem
The 1970s, cont.
And we taught each of these skills until we had covered the entire curriculum for a grade level.
[Diagram: Skill 1 through Skill 6, each cycling through Teach → Assess → Conclude]
30
The bureaucratization problem
What could happen if we are not careful with standards:
We could be back where we started.
[Diagram: Standard 1, Standard 2, ... each cycling through Teach → Assess → Conclude]
31
The bureaucratization problem
What could happen if we are not careful with standards:
In the name of ensuring a place for each standard in our assessment systems, we could end up in the same place we were in the 1970s.
[Diagram: Standard 1 through Standard 6, each cycling through Teach → Assess → Conclude]
32
DIBELS Detour
Schooling the cognitive process of test taking: kids who are alike in terms of accuracy will differ substantially in terms of speed and fluency
Face validity problem: What virtue is there in doing things faster (naming letters, sounds, words, ideas)? What would you do differently if you knew that Susie was faster than Ted at X, Y, or Z???
33
The Achilles Heel: Consequential Validity
[Diagram: give DIBELS → use results to craft instruction → give DIBELS again, with a comprehension test administered before and after]
The emperor has no clothes.
34
Key Concept: Haladyna
Test Score Pollution: a rise or fall in a score on a test without an accompanying rise or fall in the cognitive or affective outcome allegedly measured by the test
35
Rule of thumb re-emerges
The finer the grain size at which we monitor a process like reading and writing, the greater the likelihood that we will end up teaching and testing bits and pieces rather than global processes like comprehension and composition.
36
Rule of Thumb for the Future
Wherever we go, we must not go back to the world in which we link instruction to assessment at the very finest grain size.
Keep our eyes on the prize: what does accomplished performance look like?
37
Where are we headed in the post-NCLB era?
Obama administration keeping up the accountability pressure
Teacher accountability for student achievement
$350,000,000 investment in new accountability assessments (PARCC, SBAC)
CCSS movement: another round of curriculum reform
You are a part of that effort
38
A New Opportunity: A Confluence of Forces
CCSS movements / foundation funding
Disciplinary delivery of literacy
Reincarnation of ambitious pedagogy
Deeper learning
PARCC and SBAC: hybrid assessments; increased weight given to open-ended responses
Performance Tasks: Why bother?
External validity: college ready, career ready, PARCC ready
Curricular validity: higher order thinking, powerful learning, deeper learning
Performance Tasks: Why bother?
Consequential validity: What curricular activities will it lead teachers and students toward? Scaffolding toward independence
Utility: formative assessment
Feedback to the student → differentiation
Feedback to you as the teacher → reteaching
42
Problems in aligning assessments to standards
The assessment system problem: one assessment will not do the job. You need:
Big outcomes
Milestone assessments along the way (charting the trajectory of progress)
Analyses of the skill infrastructure
Matching tools to audience and purpose
47
My analysis
Students who have learned how to read and write in response to your tasks
AND whatever curriculum it is that supports progress that will lead to good performance on those assessments
will do well on these PARCC tasks.
48
Why?
They will have developed some transferable practices that will serve them well in these new circumstances.
That’s what we are about: developing transferable knowledge and skills.
49
Closing Advice…
Things you never really wanted to know or even ask about
Three things:
Think about assessment as a system of indicators
Return to our accountability roots
Be careful about how we link instruction and assessment
51
Accountability
Accountability follows responsibility
Based upon multiple indicators (both external and internal), e.g.:
External assessments (both state and standardized)
Internal assessments (progress in our curriculum)
Activity indicators (attendance, pass rates, library use)
Client satisfaction
Accountability should lead to assistance to build capacity
Accountability should be reciprocal:
Teachers take responsibility for learning
Government for resources to support learning
Society for resources that lead to healthy lives
52
Aligning everything to the standards: a model worth rejecting
[Diagram: Standards → Assessment → Instruction]
• This model is likely to shape the instruction too narrowly.
• It is likely to lead to test score pollution.
53
A better way of thinking about how standards can link instruction and assessment
Standards: our way of operationalizing our values, what we care about in teaching and learning literacy.
They guide the development of both instruction and assessment.
[Diagram: standards feed both teaching and learning activities and assessment activities]
This relationship can operate at the state or local level.
The logic of lots of good reform projects!
54
My bottom line
We desperately need instructionally sensitive assessments with first-rate psychometric characteristics so that we can build trustworthy internal systems for monitoring student progress.
No decision of consequence about any individual, school, district, or other aggregation should be based upon a single indicator of anything.
Tests are a means to an end: their value is measured by the degree to which they allow us to make good decisions and provide good instruction. They are not the ends themselves.
High stakes and low challenge is the worst possible scenario…