TRANSCRIPT
1
Assessments to support deeper learning and ambitious instruction
P. David Pearson, UC Berkeley
Slides will be posted at www.scienceandliteracy.org
3
Assessing Literacy: Problems and Possibilities???
How did we get to where we are?
Where are we headed?
Where should and could we be headed?
P. David Pearson, UC Berkeley, www.scienceandliteracy.org
4
Some Context for Today’s Talk
My interest stems from several sources:
Multiple attempts to build state assessment systems
Several years of attempts to build alternative assessments: portfolios, performance assessments, formative systems
Attempts to work with schools to build assessment systems rather than collections of tests
A conviction that accountability should not drive us into a corner in which we engage in practices that are not in the best interests of students, teachers, or the public.
5
Part I: How did we get to where we are? Where did it all start?
Early 1960s: A kinder, gentler time
Title I reauthorization in 1967: the first trade of accountability for flexibility
6
The 1970s
Behavioral objectives
Criterion-referenced assessments
Curriculum-embedded assessments
Minimal competency tests: New Jersey
Statewide assessments: Michigan & Minnesota
7
Historical relationships between instruction and assessment
The 1970s skills management mentality: Teach a skill, assess it for mastery, reteach it if necessary, and then go on to the next skill.
[Diagram: Skill 1, Skill 2, ... each cycling through Teach → Assess → Conclude]
Foundation: Benjamin Bloom’s ideas of mastery learning
8
The 1970s, cont.
And we taught each of these skills until we had covered the entire curriculum for a grade level.
[Diagram: Skill 1 through Skill 6, each cycling through Teach → Assess → Conclude]
9
Rule of Thumb
The finer the grain size at which we monitor a process like reading and writing, the greater the likelihood that we will end up teaching and testing bits and pieces rather than global processes like comprehension and composition.
10
The 1980s
In like a lamb, out like a lion
Quiet consolidation of skills-based learning (especially in commercial materials)
A Nation at Risk leads to calls for school reform
The evolution of “higher order” thinking in all subject matter areas
Late 1980s: Do we have assessments to match our models of learning?
Performance and authentic assessment movements
Eradicate the distinction between instruction and assessment
11
Key movements of the late 1980s/1990s
Kentucky and Vermont portfolio assessments
Graduation by portfolio presentation: Central Park East, Walden
Assessment by exhibition: Coalition
Performance assessment: New Standards, CLAS, Maryland…
Some models of teacher performance assessment: NBPTS, Connecticut
12
Late 1980s/early 1990s: Portfolios and Performance Assessments Make Assessment Look Like Instruction
We engage in instructional activities, from which we collect evidence which permits us to draw conclusions about student growth or accomplishment on several dimensions (standards) of interest.
[Diagram: Activities → evidence from which we draw → Conclusions on standards 1–n]
13
The complexity of performance assessment practices: one to many
Any given activity may offer evidence for many standards, e.g., responding to a story.
[Diagram: Activity X maps to Standards 1–5]
14
The complexity of performance assessment practices: many to one
For any given standard, there are many activities from which we could gather relevant evidence about growth and accomplishment, e.g., reads fluently.
[Diagram: Activities 1–5 all map to Standard X]
15
The complexity of portfolio assessment practices: many to many
[Diagram: Activities 1–5 cross-mapped to Standards 1–5]
• Any given artifact/activity can provide evidence for many standards
• Any given standard can be indexed by many different artifacts/activities
16
· Some people can tell what time it is by looking at the sun, but I never have been able to make out the numbers.
· There are four seasons: salt, pepper, mustard, and catsup.
The perils of performance assessment: or maybe those multiple-choice assessments aren’t so bad after all…
17
"Water is composed of two gins, Oxygin and Hydrogin. Oxygin is pure gin. Hydrogin is gin and water."
The perils of performance assessment
18
"Germinate: To become a naturalized German."
"Vacumm: A large, empty space where the pope lives."
The perils of performance assessment
19
· Genetics explains why you look like your father, and if you don't, why you should.
The perils of performance assessment
20
You can listen to thunder and tell how close you came to getting hit. If you don't hear it, you got hit, so never mind.
"When you breath, you inspire. When you do not breath, you expire."
The perils of performance assessment
21
1990s: Standards-based reform
State initiatives
IASA model
Trading flexibility for accountability
Move from being accountable for the means and leaving the ends up for grabs (doctor or lawyer model)
TO being accountable for the ends and leaving the means up for grabs (carpenter or product model)
22
Standards-Based Reform: The Initial Theory of Action
[Diagram: standards, assessment, and accountability produce clear expectations and motivation, leading to higher student learning]
A la Tucker and Resnick in the early 1990s
23
More Recent Expanded Theory of Action
[Diagram: standards, assessment, and accountability produce clear expectations and motivation, now working through instruction and professional development, leading to higher student learning]
A la Elmore and Resnick in the late 1990s.
24
No Child Left Behind
A definite retreat from performance-based assessment as a wide-scale tool
The persistence of standards-based reform…
NCLB and Reading First:
Every-grade-level outcomes assessment
Assessments for placement, progress monitoring, diagnosis, and outcomes
Mania for alignment…
25
Alignment: Part of the theory of action
Specify the content standards
Align everything else to them:
Assessments
Instruction
Curriculum materials
Professional development
Standards for teacher learning
26
Alignment: The centrality of content standards
[Diagram: content standards at the center, connected to performance standards for students, student assessments, professional development, standards for teacher education, accreditation, standards for licensure, and teacher evaluation; wild card]
27
Problems in aligning assessments to standards
The “foxes guarding the henhouse” problem (the test companies do the alignment)
The backwards mapping problem (when the standards and the instruction look just like the test): we will deal with DIBELS later…
The “alignment is a coat of many colors” problem (recall the earlier mapping)
28
The bureaucratization problem
Remember the lessons of the 1970s!!!
The 1970s skills management mentality: Teach a skill, assess it for mastery, reteach it if necessary, and then go on to the next skill.
[Diagram: Skill 1, Skill 2, ... each cycling through Teach → Assess → Conclude]
29
The bureaucratization problem
The 1970s, cont.
And we taught each of these skills until we had covered the entire curriculum for a grade level.
[Diagram: Skill 1 through Skill 6, each cycling through Teach → Assess → Conclude]
30
The bureaucratization problem
What could happen if we are not careful with standards:
We could be back where we started.
[Diagram: Standard 1, Standard 2, ... each cycling through Teach → Assess → Conclude]
31
The bureaucratization problem
What could happen if we are not careful with standards:
In the name of ensuring a place for each standard in our assessment systems, we could end up in the same place we were in the 1970s.
[Diagram: Standard 1 through Standard 6, each cycling through Teach → Assess → Conclude]
32
DIBELS Detour
Schooling the cognitive process of test taking: kids who are alike in terms of accuracy will differ substantially in terms of speed and fluency
Face validity problem: What virtue is there in doing things faster (naming letters, sounds, words, ideas)? What would you do differently if you knew that Susie was faster than Ted at X, Y, or Z???
33
The Achilles Heel: Consequential Validity
[Diagram: give DIBELS → use results to craft instruction → give DIBELS again, with a comprehension test administered before and after]
The emperor has no clothes.
34
Key Concept: Haladyna
Test Score Pollution: a rise or fall in a score on a test without an accompanying rise or fall in the cognitive or affective outcome allegedly measured by the test
35
Rule of thumb re-emerges
The finer the grain size at which we monitor a process like reading and writing, the greater the likelihood that we will end up teaching and testing bits and pieces rather than global processes like comprehension and composition.
36
Rule of Thumb for the Future
Wherever we go, we must not go back to the world in which we link instruction to assessment at the very finest grain size.
Keep our eyes on the prize: what does accomplished performance look like?
37
Where are we headed in the post-NCLB era?
Obama administration keeping up the accountability pressure
Teacher accountability for student achievement
$350,000,000 investment in new accountability assessments (PARCC, SBAC)
CCSS movement: another round of curriculum reform
You are a part of that effort
38
A New Opportunity: A Confluence of Forces
CCSS movements / foundation funding
Disciplinary delivery of literacy
Reincarnation of ambitious pedagogy
Deeper learning
PARCC and SBAC: hybrid assessments; increased weight given to open-ended responses
Performance Tasks: Why bother?
External validity: college ready, career ready, PARCC ready
Curricular validity: higher order thinking, powerful learning, deeper learning
Performance Tasks: Why bother?
Consequential validity: What curricular activities will it lead teachers and students toward? Scaffolding toward independence
Utility: formative assessment
Feedback to the student → differentiation
Feedback to you as the teacher → reteaching
42
Problems in aligning assessments to standards
The assessment system problem: one assessment will not do the job. You need:
Big outcomes
Milestone assessments along the way (charting the trajectory of progress)
Analyses of the skill infrastructure
Matching tools to audience and purpose
47
My analysis
Students who have learned how to read and write in response to your tasks
AND whatever curriculum it is that supports progress that will lead to good performance on those assessments
will do well on these PARCC tasks.
48
Why?
They will have developed some transferable practices that will serve them well in these new circumstances.
That’s what we are about: developing transferable knowledge and skills.
49
Closing Advice…
Things you never really wanted to know or even ask about
Three things:
Think about assessment as a system of indicators
Return to our accountability roots
Be careful about how we link instruction and assessment
51
Accountability
Accountability follows responsibility
Based upon multiple indicators (both external and internal), e.g.:
External assessments (both state and standardized)
Internal assessments (progress in our curriculum)
Activity indicators (attendance, pass rates, library use)
Client satisfaction
Accountability should lead to assistance to build capacity
Accountability should be reciprocal:
Teachers take responsibility for learning
Government for resources to support learning
Society for resources that lead to healthy lives
52
Aligning everything to the standards: a model worth rejecting
[Diagram: Standards → Assessment → Instruction]
• This model is likely to shape the instruction too narrowly.
• It is likely to lead to test score pollution.
53
A better way of thinking about how standards can link instruction and assessment
Standards: our way of operationalizing our values, what we care about in teaching and learning literacy.
They guide the development of both instruction and assessment.
[Diagram: standards feed both teaching and learning activities and assessment activities]
This relationship can operate at the state or local level.
The logic of lots of good reform projects!
54
My bottom line
We desperately need instructionally sensitive assessments with first-rate psychometric characteristics so that we can build trustworthy internal systems for monitoring student progress.
No decision of consequence about any individual, school, district, or other aggregation should be based upon a single indicator of anything.
Tests are a means to an end: their value is measured by the degree to which they allow us to make good decisions and provide good instruction. They are not the ends themselves.
High stakes and low challenge is the worst possible scenario…