When is assessment learning-oriented?
Dylan Wiliam
www.dylanwiliam.net
4th Biennial EARLI/Northumbria Assessment Conference, Potsdam, Germany, August 2008
Overview of presentation
Why do we need formative assessment?
Clarifying assumptions and definitions about formative assessment
A theoretically-based definition of formative assessment
How formative assessment relates to other aspects of education
Which of the following categories of skill has disappeared from the work-place most rapidly over the last 40 years?
1. Routine manual
2. Non-routine manual
3. Routine cognitive
4. Complex communication
5. Expert thinking/problem-solving
There is only one 21st century skill
So the model that says learn while you’re at school, while you’re young, the skills that you will apply during your lifetime is no longer tenable. The skills that you can learn when you’re at school will not be applicable. They will be obsolete by the time you get into the workplace and need them, except for one skill. The one really competitive skill is the skill of being able to learn. It is the skill of being able not to give the right answer to questions about what you were taught in school, but to make the right response to situations that are outside the scope of what you were taught in school. We need to produce people who know how to act when they’re faced with situations for which they were not specifically prepared. (Papert, 1998)
A convergence of interests
Philosophies of education (Williams, 1966)…
Transmission of culture
Preparation for employment
Self-actualization
…all require preparation for future learning (PFL)
Cannot be taught in isolation from other learning
Students still need the basic skills of literacy, numeracy, concepts and facts
Learning power is developed primarily through pedagogy, not curriculum
We have to develop the way teachers teach, not what they teach
Learning power environments
Key concept:
Teachers do not create learning
Learners create learning
Teaching is the engineering of effective learning environments
Key features of effective learning environments:
Create student engagement (pedagogies of engagement)
Well-regulated (pedagogies of contingency)
Why pedagogies of engagement?
Intelligence is partly inherited
So what?
Intelligence is partly environmental
Environment creates intelligence
Intelligence creates environment
Learning environments
High cognitive demand
Inclusive
Obligatory
Motivation: cause or effect?
[Figure: Csikszentmihalyi’s (1990) flow model — competence (low to high) on the horizontal axis, challenge (low to high) on the vertical axis; regions run from apathy and boredom through relaxation and control to arousal, worry and anxiety, with flow where high challenge meets high competence.]
Why pedagogies of contingency?
Learners do not learn what they are taught
Assessment is the bridge between teaching and learning, and thus the central process of teaching (as opposed to lecturing).
Pedagogies of contingency
Personalisation
Mass customization (rather than mass production or individualisation)
Diversity
A valuable teaching resource (rather than a challenge to be minimized)
What gets learnt?
Denvir & Brown (1996)
Understanding of basic number in low-achieving 7-9 year olds
Extensive conceptual hierarchy developed
Students assessed
Teacher plans teaching programme
Students re-assessed
The research evidence
Several major reviews of the research…
Natriello (1987)
Crooks (1988)
Kluger & DeNisi (1996)
Black & Wiliam (1998)
Nyquist (2003)
… all find consistent, substantial effects
It’s the cost-benefit ratio, stupid…

Intervention                                             Extra months of learning per year   Cost/classroom/yr
Class-size reduction (by 30%)                            4                                   €25k
Increase teacher content knowledge from weak to strong   2                                   ?
Formative assessment/Assessment for learning             8                                   €2.5k
Independent dimensions of assessment
Scale: large-scale (nomothetic) versus small-scale (idiographic)
Locus: classroom versus examination hall
Authority: teacher-produced versus expert-produced
Scope: continuous versus one-off
Format: multiple-choice versus constructed response
Function: formative versus summative
No such thing as formative assessment
Purposes of assessments: evaluative, summative, formative
Instruments, purposes, functions
Prospects for integration are bleak
Formative assessment involves the creation of, and capitalization upon, moments of contingency in instruction
“An assessment functions formatively when evidence about student achievement elicited by the assessment is interpreted and used to make decisions about the next steps in instruction that are likely to be better, or better founded, than the decisions that would have been made in the absence of that evidence.”
Black and Wiliam, 2009 (we hope!)
Some principles
A commitment to formative assessment:
Does not entail any view of what is to be learned
Does not entail any view of what happens when learning takes place
…although clarity on these is essential.
Evolving conceptions of formative assessment
“Feedback” metaphor
Components of a feedback system:
data on the actual level of some measurable attribute;
data on the reference level of that attribute;
a mechanism for comparing the two levels and generating information about the ‘gap’ between the two levels;
a mechanism by which the information can be used to alter the gap.
Feedback system:
Importance of eliciting the right data
The role of the learner
The role of the learning milieu (e.g., as activity system)
Unpacking formative assessment
Key processes:
Establishing where the learners are in their learning
Establishing where they are going
Working out how to get there
Participants:
Teachers
Peers
Learners
Aspects of formative assessment

          Where the learner is going   Where the learner is                 How to get there
Teacher   Clarify and share            Engineering effective discussions,   Providing feedback that
          learning intentions          tasks and activities that elicit     moves learners forward
                                       evidence of learning
Peer      Understand and share         Activating students as learning resources for one another
          learning intentions
Learner   Understand                   Activating students as owners of their own learning
          learning intentions
Five “key strategies”…
1. Clarifying, understanding, and sharing learning intentions (curriculum philosophy)
2. Engineering effective classroom discussions, tasks and activities that elicit evidence of learning (classroom discourse, interactive whole-class teaching)
3. Providing feedback that moves learners forward (feedback)
4. Activating students as learning resources for one another (collaborative learning, reciprocal teaching, peer-assessment)
5. Activating students as owners of their own learning (metacognition, motivation, interest, attribution, self-regulated learning, self-assessment)
(Wiliam & Thompson, 2007)
Keeping learning on track (KLT)
A pilot guides a plane or boat toward its destination by taking constant readings and making careful adjustments in response to wind, currents, weather, etc.
A KLT teacher does the same:
Plans a carefully chosen route ahead of time (in essence building the track)
Takes readings along the way
Changes course as conditions dictate
Effects of formative assessment
Long-cycle:
Span: across units, terms
Length: four weeks to one year
Impact: student monitoring; curriculum alignment
Medium-cycle:
Span: within and between teaching units
Length: one to four weeks
Impact: improved, student-involved assessment; teacher cognition about learning
Short-cycle:
Span: within and between lessons
Length: day-by-day (24 to 48 hours); minute-by-minute (5 seconds to 2 hours)
Impact: more responsive classroom practice; increased student engagement
System responsiveness and time-frames
If evidence is to inform decision-making, the evidence needs to be available before the decision…
Long-cycle: Are our professional development programmes well-aligned with the needs of our teachers? Cycle-length: two years
Long-cycle: Does our curriculum adequately cover the state standards as operationalized in the annual state test? Cycle-length: one year
Medium-cycle: Is this student responding adequately to the tier 1 intervention for reading or do they require a tier 2 intervention? Cycle-length: one month
Short-cycle: Does the class understand the generation of equivalent fractions well enough to move on to the addition of fractions? Cycle-length: five minutes
The formative assessment hi-jack…
“Statistical process control” models of learning
USA: “Formative tests”
Tests administered at intervals of 6 to 10 weeks
Often not even keyed to instruction
England (5 to 16 year olds): “Assessment for learning strategy”
Government policy focused on target-setting and level chasing
Focus on “tracking achievement”
England (Higher Education):
Portfolio assessment
New focus on formative e-assessment
Ideas whose time has come… or gone…
Diagnostic analysis of standardized tests is probably dead:
Lack of agreement about models
Models make assumptions not about items, but about how students answer them
Dearth of assessment developers who know enough about learners
Poor efficiency
More promising developments
Use of Bayesian inference networks to build proficiency models
But:
Proficiency models are not necessarily developmental models
Models need large amounts of data to run
Getting the cycle right (and the right cycle)
Within this view of formative assessment, feedback is not the whole of formative assessment; it is not even the most important component of formative assessment.
Medium- and long-cycle formative assessments:
Are supported by existing psychometrics
Are easy to manage, but
Generally produce small effects
Short-cycle formative assessments:
Contradict important psychometric assumptions (reliability; monotonicity of ICCs)
Are difficult to establish, but
Generally produce large effects
The overlap between age-cohorts is large…
The spread of achievement within each cohort is greater than generally assumed
…so individual progress is hard to track
On typical standardized tests, growth is slow…
Average annual growth of achievement of individuals is around 0.4 sd
So monthly growth of individual achievement is 0.03 sd
…and the reliability of the test is limited…
A reliability of 0.90 corresponds to a standard error of measurement of 0.3 sd
In other words, the SEM of a highly reliable test is ten times the monthly growth in achievement.
So standardized tests are completely useless for monitoring individual progress in achievement—they are insensitive to instruction.
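The arithmetic behind these two figures can be written out explicitly (taking achievement scaled so that the population sd is 1, and test reliability r = 0.90):

```latex
\mathrm{SEM} = \sigma\sqrt{1 - r} = \sqrt{1 - 0.90} \approx 0.32\,\mathrm{sd}
\qquad
\text{monthly growth} \approx \frac{0.4\,\mathrm{sd}}{12} \approx 0.033\,\mathrm{sd}
\qquad
\frac{0.32}{0.033} \approx 10
```

So even a highly reliable test has a measurement error roughly ten times the signal it would need to detect month-to-month.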
…and the data is no use when it arrives…
Traditional testing deals with individuals, but teachers mostly deal with groups
Data-push vs. decision-pull
“Data-push”:
Quality control at the end of an instructional sequence
Monitoring assessment that identifies that remediation is needed, but not what
Requires new routines to utilize the information
“a series of unwanted answers to unasked questions” (Popper)
“Decision-pull”:
Starts with the decisions teachers make daily
Supports teachers’ “on-the-fly” decisions
If a 30-item test provides useful information on an individual, then responses from 30 individuals on a single item might provide useful information on a class
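This symmetry can be sketched in code: a quick tally of one hinge-point item across a class is enough to inform a whole-class decision. The function below is a minimal sketch, not from the talk; the 80% threshold and the response data are illustrative assumptions.

```python
from collections import Counter

def class_decision(responses, key, threshold=0.8):
    """Tally one hinge-point item across a class and decide whether to move on.

    responses: each student's chosen option, e.g. ["A", "C", ...]
    key: the correct option
    threshold: proportion correct needed to proceed (an illustrative choice)
    """
    counts = Counter(responses)
    proportion_correct = counts[key] / len(responses)
    # The distractor counts show *which* misconception to address,
    # not merely that one exists.
    distractors = {opt: n for opt, n in counts.items() if opt != key}
    action = "move on" if proportion_correct >= threshold else "reteach"
    return action, proportion_correct, distractors

# 30 students answering a single item give a class-level picture:
responses = ["C"] * 22 + ["B"] * 6 + ["A"] * 2
action, p, distractors = class_decision(responses, key="C")
```

Here 22 of 30 correct falls below the threshold, so the teacher reteaches, and the distractor tally (six students choosing B) points at the misconception to target.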
Characteristics of hinge-point questions
Relate to important learning outcomes necessary for progression in learning
Can be used at any point in a learning sequence:
Beginning (range-finding)
Middle (mid-course correction)
End (e.g., “exit pass”)
When used in “real-time”, the teacher must be able to collect and interpret the responses of all students in 30 seconds
Low probability of correct guessing
In which of these right-angled triangles is a² + b² = c²?
[Figure: six right-angled triangles, labeled A to F, with the sides a, b and c attached to different positions in each triangle; students identify those in which a² + b² = c² holds.]
Build on key (mis-)conceptions… in math
What can you say about the means of the following two data sets?
Set 1: 10 12 13 15
Set 2: 10 12 13 15 0
A. The two sets have the same mean.
B. The two sets have different means.
C. It depends on whether you choose to count the zero.
(Wilson & Draney, 2004)
…in Science…
The ball sitting on the table is not moving. It is not moving because:
A. no forces are pushing or pulling on the ball.
B. gravity is pulling down, but the table is in the way.
C. the table pushes up with the same force that gravity pulls down.
D. gravity is holding it onto the table.
E. there is a force inside the ball keeping it from rolling off the table.
… and History.
Why are historians concerned with bias when analyzing sources?
A. People can never be trusted to tell the truth
B. People deliberately leave out important details
C. People are only able to provide meaningful information if they experienced an event firsthand
D. People interpret the same event in different ways, according to their experience
E. People are unaware of the motivations for their actions
F. People get confused about sequences of events
[Figure: cognitive rules (correct and incorrect) mapped to the responses A to D that they produce.]
Requirements for hinge-point questionsFor an item to support instructional decision-making, the key requirement is that in no case do incorrect and correct cognitive rules map on to the same response (Wylie & Wiliam, 2007)
Item improvement
The discovery of new incorrect cognitive rules that nevertheless produce the item key leads to item improvement
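The requirement above can be expressed as a simple disjointness check. The function and the example items below are hypothetical illustrations, not from the talk: given a mapping from cognitive rules to the responses they produce, no response may be reachable from both a correct and an incorrect rule.

```python
def supports_decision_making(rule_to_response, correct_rules):
    """Check the hinge-point requirement (Wylie & Wiliam, 2007):
    no response may be produced by both a correct and an incorrect cognitive rule.

    rule_to_response: dict mapping each cognitive rule to the response it produces
    correct_rules: set of rules that reflect correct understanding
    """
    correct_responses = {r for rule, r in rule_to_response.items() if rule in correct_rules}
    incorrect_responses = {r for rule, r in rule_to_response.items() if rule not in correct_rules}
    return correct_responses.isdisjoint(incorrect_responses)

# Hypothetical item: rule1 is the correct rule; options are A-D.
good_item = {"rule1": "A", "rule2": "B", "rule3": "C", "rule4": "D"}
# Flawed item: an incorrect rule (rule3) also selects the key "A",
# so a correct answer no longer distinguishes understanding from misconception.
flawed_item = {"rule1": "A", "rule2": "B", "rule3": "A", "rule4": "D"}
```

Discovering a new incorrect rule that maps to the key (as in the flawed item) is exactly the signal for item improvement.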
In which of these figures is one-quarter shaded?
Feedback
Kinds of feedback in Higher Education (Nyquist, 2003):
Weaker feedback only: knowledge of results (KoR)
Feedback only: KoR + clear goals or knowledge of correct results (KCR)
Weak formative assessment: KCR + explanation (KCR+e)
Moderate formative assessment: (KCR+e) + specific actions for gap reduction
Strong formative assessment: (KCR+e) + activity
Effect of formative assessment (HE)

                                 N    Effect*
Weaker feedback only             31   0.14
Feedback only                    48   0.36
Weaker formative assessment      49   0.29
Moderate formative assessment    41   0.39
Strong formative assessment      16   0.56

*corrected values
Feedback
Feedback should:
Cause thinking
Provide guidance on how to improve
Focus on what to take forward to the next assignment rather than what is deficient about the last assignment
Be used
Techniques:
Delayed scores/grades
Learning portfolios
“Five of these answers are wrong. Find them and fix them”
“Three-quarters of the way through a unit” test
Sharing learning intentions
Effective summative assessment:
Requires raters to share a construct of quality
Effective formative assessment:
Requires learners to share the same construct of quality as the raters
Requires teachers to have an anatomy of quality
Techniques:
Explaining learning intentions at start of lesson/unit
Learning intentions
Success criteria
Intentions/criteria in students’ language
Posters of key words to talk about learning
Planning/writing frames
Annotated examples to ‘flesh out’ assessment standards (e.g. lab reports)
Opportunities for students to design their own tests
Students owning their learning and as learning resources for one another
Techniques:
Students assessing their own/peers’ work:
with rubrics
with exemplars
“two stars and a wish”
Training students to pose questions/identifying group weaknesses
Self-assessment of understanding:
Traffic lights
Red/green discs
The learning milieu
Dual processing theory (Boekaerts, 1993)
Self-regulated learning is both metacognitively governed and affectively charged (Boekaerts, 2006, p. 348)
Students form mental representations of the task-in-context and appraise:
current perceptions of the task and the physical, social, and instructional context within which it is embedded;
activated domain-specific knowledge and (meta)cognitive strategies related to the task;
beliefs about motivation (including domain-specific capacity), interest and the effects of effort.
The resulting appraisal generates activity along one of two pathways: the ‘well-being’ pathway or the ‘growth’ pathway.
When is assessment learning-oriented?
Assessment is learning-oriented when it:
Is integrated into instructional design so that it becomes invisible
Creates engagement in learning
Helps learners to understand what successful performance looks like
Generates information that can be interpreted in terms of a learning progression
Focuses attention on growth rather than well-being
Provides a focus for supportive conversations between learners