dylan wiliam (@dylanwiliam) - amazon s3 · powerpoint presentation author: dylan wiliam created...

Dylan Wiliam (@dylanwiliam) Principled assessment design www.dylanwiliamcenter.com www.dylanwiliam.net


Post on 05-Jun-2020


TRANSCRIPT

Page 1: Dylan Wiliam (@dylanwiliam) - Amazon S3 · PowerPoint Presentation Author: Dylan Wiliam Created Date: 7/6/2016 12:54:54 PM

Dylan Wiliam (@dylanwiliam)

Principled assessment design

www.dylanwiliamcenter.com www.dylanwiliam.net

Page 2:

Initial assumptions

• The assessment system should be designed to assess the school’s curriculum rather than having to design the curriculum to fit the school’s assessment system.

• Since each school’s curriculum should be designed to meet local needs, there cannot be a one-size-fits-all assessment system—each school’s assessment system will be different.

• There are, however, a number of principles that should govern the design of assessment systems, and

• There is some science here—knowledge that people need in order to avoid doing things that are just wrong.

Page 3:

Outline

• Essential preconditions

• Principled assessment design

• Recording

• Reporting

• Putting it all together

Page 4:

Before we can assess…

• The ‘backward design’ of an education system

– Where do we want our students to get to?

• ‘Big ideas’

– What are the ways they can get there?

• Learning progressions

– When should we check on/report progress?

• Inherent and useful checkpoints

Page 5:

Big ideas

• A “big idea”

– helps make sense of apparently unrelated phenomena

– is generative in that it can be applied in new areas

Page 6:

Some big ideas of the school curriculum

Subject | Big idea
English | The “hero’s journey” as a useful framework for understanding myths and legends.
Geography | Patterns of human development are influenced by, and in turn influence, physical features of the environment.
History | Sources are products of their time, but knowing the circumstances of their creation helps resolve conflicts.
Mathematics | Fractions, decimals, percentages and ratios are ways of expressing numbers that can be represented as points on a number line.
Science | All matter is made of very small particles.
Sociology | The way people behave is the result of interplay between who they are (agency) and where they are (structure).

Page 7:

Learning progressions

• Learning progressions

– only make sense with respect to particular learning sequences;

– are therefore inherently local; and consequently

– those developed by national experts are likely to be difficult to use and often just plain wrong

– have two defining properties

• Empirical basis: almost all students demonstrating a skill must also demonstrate subordinate skills

• Logical basis: there must be a clear theoretical rationale for why the subordinate skills are required
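
One way to probe the empirical basis is to count, among the students who demonstrate a skill, how many also demonstrate the skill assumed to sit beneath it. The Python sketch below is illustrative only: the function names, the skill labels, the student records and the 0.95 cut-off standing in for "almost all" are assumptions, not part of the presentation.

```python
from typing import Dict, Set


def conditional_mastery_rate(
    demonstrated: Dict[str, Set[str]], skill: str, subordinate: str
) -> float:
    """Share of students showing `skill` who also show `subordinate`."""
    with_skill = [skills for skills in demonstrated.values() if skill in skills]
    if not with_skill:
        raise ValueError(f"no student has demonstrated {skill!r}")
    return sum(subordinate in skills for skills in with_skill) / len(with_skill)


def supports_ordering(rate: float, threshold: float = 0.95) -> bool:
    """True when the rate meets the (assumed) threshold for 'almost all'."""
    return rate >= threshold


if __name__ == "__main__":
    # Hypothetical records: student id -> skills demonstrated so far.
    records = {
        "s1": {"equivalent_fractions", "add_fractions"},
        "s2": {"equivalent_fractions", "add_fractions"},
        "s3": {"equivalent_fractions"},
        "s4": {"add_fractions"},
    }
    rate = conditional_mastery_rate(records, "add_fractions", "equivalent_fractions")
    print(f"rate = {rate:.2f}, supports ordering: {supports_ordering(rate)}")
```

A high rate is consistent with the proposed ordering but does not establish it; the logical basis still has to be argued separately.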

Page 8:

Significant stages in development

• Rationales for assessing the learning journey

– Intrinsic: developmental levels inherent in the discipline

– Extrinsic: the need to inform decisions

Page 9:

Significant stages in development: intrinsic

• Intrinsic: developmental levels inherent in the discipline

– Stages of development (e.g., Piaget, Vygotsky)

– Structure in the student’s work (e.g., SOLO taxonomy)

– ‘Troublesome knowledge’ (Perkins, 1999)

• Threshold concepts (Meyer & Land, 2003)

• Alien concepts

• Burdensome knowledge

Page 10:

Significant stages in development: extrinsic

• Extrinsic: the need to inform decisions

– Decision-driven data collection

• key transitions in learning (years, phases)

• timely information for stakeholders

• monitoring student progress

• informing teaching and learning

Page 11:

Why all assessment should be for learning

“We shouldn't want a shift to formative assessment because research shows how it improves learning (we don't need to be told that—it has to be true). We should want it because schools are places where learners should be learning more often than they are being selected, screened or tested in order to check up on their teachers. The latter are important; the former are why schools exist.” (Silcock, 1998)

Page 12:

Page 13:

Ten principles for assessment

• Assessment

– is part of effective learning

– focuses on how students learn

– is central to classroom practice

– is a key professional skill

– is sensitive and constructive

– fosters motivation

– promotes understanding of goals and criteria

– helps learners know how to improve

– develops the capacity for self-assessment

– recognises all educational achievement


Assessment Reform Group (2002)

Page 14:

Assessment: good servant, bad master

Page 15:

Written examinations

“They have perverted the best efforts of teachers, and narrowed and grooved their instruction; they have occasioned and made well nigh imperative the use of mechanical and rote methods of teaching; they have occasioned cramming and the most vicious habits of study; they have caused much of the overpressure charged upon schools, some of which is real; they have tempted both teachers and pupils to dishonesty; and last but not least, they have permitted a mechanical method of school supervision.” (White, 1888, pp. 517-518)

Page 16:

The Macnamara Fallacy (Handy, 1994 p. 219)

• The first step is to measure whatever can be easily measured.

– This is OK as far as it goes.

• The second step is to disregard that which can’t easily be measured or to give it an arbitrary quantitative value.

– This is artificial and misleading.

• The third step is to presume that what can’t be measured easily really isn’t important.

– This is blindness.

• The fourth step is to say that what can’t be easily measured really doesn’t exist.

– This is suicide.

Page 17:

Campbell’s law

“The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor.” (Campbell, 1976 p. 49)

– All performance indicators lose their meaning when adopted as policy targets

– The clearer you are about what you want, the more likely you are to get it, but the less likely it is to mean anything

Page 18:

The “Lake Wobegon” effect

[Figure: grade equivalents (vertical axis, 3.4 to 4.4) by year (1986 to 1990) for Test C, Test B and Test C.]

Koretz, Linn, Dunbar and Shepard (1991)

Page 19:

Effects of narrow assessment

• Incentives to teach to the test

– Focus on some subjects at the expense of others

– Focus on some aspects of a subject at the expense of others

– Focus on some students at the expense of others (“bubble” students)

• Consequences

– Learning that is

• Narrow

• Shallow

• Transient

Page 20:

And yet…

• High-stakes assessment systems do improve outcomes for students on a range of measures

• The effects can be substantial (an extra two months of learning per year)

• The challenge:

– Realize the benefits of high-stakes assessments

– Avoid the unintended adverse consequences

Page 21:

The challenge comes down to…

• Designing an assessment system that is:

– Distributed

• So that evidence collection is not undertaken entirely at the end

– Synoptic

• So that learning has to accumulate

– Extensive

• So that all important aspects are covered (breadth and depth)

– Manageable

• So that costs are proportionate to benefits

– Trusted

• So that stakeholders have faith in the outcomes

Page 22:

Quality in assessment

Page 23:

What is an assessment?

• An assessment is a procedure for making inferences

– We give students things to do

– We collect the evidence

– We draw conclusions

• Key question: “Once you know the assessment outcome, what do you know?”

• For any test:

– some inferences are warranted (valid)

– some are not

Page 24:

Validity

• Evolution of the idea of validity

– A property of a test

– A property of students’ scores on a test

– A property of inferences drawn on the basis of test results

• “One validates not a test but an interpretation of data arising from a specified procedure” (Cronbach, 1971)

• Consequences

– No such thing as a valid (or indeed invalid) assessment

– No such thing as a biased assessment

– Formative and summative are descriptions of inferences

Page 25:

Meanings and consequences of assessment

• Evidential basis

– What does the assessment result mean?

• Consequential basis

– What does the assessment result do?

• Assessment literacy (Stiggins, 1991)

– Do you know what this assessment result means?

– Does it have utility for its intended use?

– What message does this assessment send to students (and other stakeholders) about the achievement outcomes we value?

– What is likely to be the effect of this assessment on students?

Page 26:

Recording

Page 27:

Sylvie and Bruno Concluded (Carroll, 1893)

“That’s another thing we’ve learned from your Nation,” said Mein Herr, “map-making. But we’ve carried it much further than you. What do you consider the largest map that would be really useful?”

“About six inches to the mile.”

“Only six inches!” exclaimed Mein Herr. “We very soon got to six yards to the mile. Then we tried a hundred yards to the mile. And then came the grandest idea of all! We actually made a map of the country, on the scale of a mile to the mile!”

“Have you used it much?” I enquired.

“It has never been spread out, yet,” said Mein Herr: “the farmers objected: they said it would cover the whole country, and shut out the sunlight! So we now use the country itself, as its own map, and I assure you it does nearly as well.”

Page 28:

Reporting

Page 29:

What is a grade?

“…an inadequate report of an inaccurate judgment by a biased and variable judge of the extent to which a student has attained an undefined level of mastery of an unknown proportion of an indefinite material.” (Dressel, quoted in Chickering, 1983 p. 12)

Page 30:

Scores versus grades

• Precision is not the same as accuracy

– The more precise the score, the lower the accuracy.

– Less precise scores are more accurate, but less useful

• Scores suffer from spurious precision

– Given that no score is perfectly reliable, small differences in scores are unlikely to be meaningful

• Grades suffer from spurious accuracy

– When we use grades or categories, we tend to regard performance in different categories as qualitatively different
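
The point about spurious precision can be made concrete with the standard error of measurement from classical test theory, SEM = SD × sqrt(1 − reliability). The Python sketch below is illustrative only: the function names, the standard deviation of 15, the reliability of 0.85 and the example scores are hypothetical, and it assumes the two scores have independent errors with the same SEM and uses a rough 95% band (z = 1.96).

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def difference_is_meaningful(
    score_a: float, score_b: float, sd: float, reliability: float, z: float = 1.96
) -> bool:
    """True when the gap between two scores exceeds z standard errors of the difference."""
    sem = standard_error_of_measurement(sd, reliability)
    # Assumes independent errors with equal SEM for both scores.
    se_difference = sem * math.sqrt(2.0)
    return abs(score_a - score_b) > z * se_difference


if __name__ == "__main__":
    # Hypothetical test: SD of 15 points, reliability of 0.85.
    sem = standard_error_of_measurement(15, 0.85)
    print(f"SEM is about {sem:.1f} points")
    print("62 vs 65 meaningful?", difference_is_meaningful(62, 65, 15, 0.85))
    print("50 vs 70 meaningful?", difference_is_meaningful(50, 70, 15, 0.85))
```

Under these assumed values a three-point gap sits well inside the measurement error, which is why reporting scores to the nearest point can suggest more than the assessment supports.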

Page 31:

Meanings and consequences of school grades

• Two rationales for grading

– Meanings

• Assessment as evidentiary reasoning

• Assessment outcomes as supports for making inferences (e.g., about student achievement)

– Consequences

• Assessment outcomes as rewards and punishments

• Assessments create incentives for students to do what we want them to do

– These two rationales interact, and conflict

• achievement grades for completion of homework

• achievement grades for effort

• penalties for late submission

• zeroes for missing work

Page 32:

Designing an assessment system

Page 33:

Mapping out the terrain

[Figure: grid of assessment types by function and timescale. Function axis: Instructional Guidance (“formative”), Describing Individuals (“summative”), Institutional Accountability (“evaluative”). Timescale axis: Annual, Interim, Weekly, Daily, Hourly. Entries: High-stakes accountability; Academic promotion; Hinge-point questions; End-of-unit tests; Benchmarks; Exit pass; Common assessments; End-of-course exams; Before the end-of-unit tests; Growth.]