national center on educational outcomes assessment architecture: building universally designed...

Assessment Architecture: Building Universally Designed Large-Scale Assessments CCSSO Preconference Clinic Saturday, June 21 1:00 – 5:00pm

Post on 21-Dec-2015


Assessment Architecture: Building Universally Designed Large-Scale Assessments

CCSSO Preconference Clinic, Saturday, June 21, 1:00 – 5:00 pm

Goals for Today

• Identify and give examples of key elements of universally designed assessments

• Use assessment results to determine whether items are universally designed

• Apply considerations for item review to sample test items

Goals for Today

• Explore universally designed assessments in state assessment RFPs and Test Specifications

• Know where to go for information, resources, and support

Agenda

• Building Design: Form Follows Function (and Taste!)

• Welcome from NCEO!

• Foundations of Universally Designed Assessments

• Break

• Measure Twice, Cut Once

• Check Out the Materials

• Keep it on the Level

• Nail Down the Bids

• Final Inspection

Title I Regulations introduce the need for universally designed assessments –

[Assessments must be] designed to be accessible and valid with respect to the widest possible range of students, including students with disabilities and students with limited English proficiency.

Sec. 200.2(b)(2)

Caution

While universally designed assessments can make tests more equitable, producing results that are more valid for all students, they cannot replace instructional opportunity!

Elements of Universally Designed Assessments

• Inclusive assessment population

• Precisely defined constructs

• Items developed and reviewed for bias and accessibility

• Amenable to accommodations

Elements of Universally Designed Assessments

• Simple, clear, and intuitive instructions and procedures

• Maximum readability/comprehensibility

• Maximum legibility: text, graphs, tables, illustrations, and response formats

Inclusive Assessment Population

Universally designed assessments:

• Consider all types of students in the general curriculum from the beginning

• Include students with disabilities and ELLs in item tryouts and field testing

Universally designed assessments reflect good measurement qualities:

Actually measure what they are intended to measure

Remove all non-construct-oriented cognitive, sensory, emotional, and physical barriers

Is the use of “hold” as a noun familiar to students?

Is the concept of a “rock climbing wall” familiar to most students?

Will students be distracted by the odd shapes on the diagram?

“Four holds on one of the rock climbing walls are labeled on the diagram below. Matthew first climbs vertically 10 feet from Hold A to Hold B, horizontally 25 feet from Hold B to Hold C, and then vertically 15 feet from Hold C to Hold D. How many fewer feet would Matthew have climbed if he had climbed directly from Hold A to Hold D?”

Amenable to Accommodations

Universally designed assessments allow needed accommodations to be used

• Plan for students who continue to need accommodations

• Facilitate the use of accommodations such as Braille, assistive technology, bilingual dictionaries or translations

American Printing House for the Blind

Accessible Test Department

APH’s Commitment

• Provide high quality tests in accessible formats for students with visual impairments

• Build understanding of accessibility in testing students with visual impairments

Braille Issues

• Pictures

• Graphics

• Appropriate test items

Print Issues

• Photocopying

• Use of gray scale

• Measurement items

We Promote…

• Using VI expert during test item development

• On time tests and practice materials

• Teaching skills that students need

We have plans…

• Publish manual on making tests accessible for VI

• Research on what works!

• Test publisher workshop

• State assessment personnel workshop

We Can Do This

• Have VI students taking and passing high standards tests

• Have access to tests in formats needed, on time and of high quality

• Raise the expectations of the general public

Before you criticize someone, walk a mile in his shoes….

…then when you do criticize that person, you’ll be a mile away and have his shoes!!

Assessments designed to better include English language learners benefit all types of kids!

• Students have the experience to understand the items

• Language is clear, simple and indicates precisely what is required from student (“Plain language”)

• Questions are amenable to supports that ELLs might use

• Cognitive demands are reasonable

“While writers might think certain expectations are obvious, if they are not explicit in the item, then they are subject to honest misinterpretation in the responses.”

(Kopriva, 2000, p. 39)

To raise money for a trip to the Wolfridge Environmental Learning Center, sixth graders at Johnson Middle School are selling raffle tickets. The raffle prize is an electric scooter worth more than $300. A total of 500 tickets were sold. You bought two raffle tickets, your sister bought three and your father bought one. What is the probability that someone in your family will win the prize?

Recommendations to Improve Accessibility of Text (Kopriva, 2000)

• Simple, brief and consistent sentence structure in items

• Consistent and clear paragraph structure

• Present tense and active voice

• Minimal paraphrasing and rewording. If used, identify the original statement in parentheses

• Minimal use of pronouns. Follow a pronoun with the term it refers to in parentheses

• High frequency words

• Avoid words with double meanings and colloquialisms. If used, define them in the text.
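A rough way to screen item text against two of the recommendations above (brief sentences, minimal pronouns) is sketched below. This is only an illustrative heuristic, not a procedure from Kopriva (2000): the function name, the pronoun list, and the sample item are all hypothetical, and the sentence splitter is naive (it would break on decimals such as "2.5").

```python
import re

# Hypothetical pronoun list for illustration only; not taken from Kopriva (2000).
PRONOUNS = {"he", "she", "it", "they", "him", "her", "them", "his", "its", "their"}


def sentence_stats(text):
    """Return (sentence count, average words per sentence, pronoun count)."""
    # Naive split on sentence-ending punctuation; empty fragments are dropped.
    sentences = [s for s in re.split(r"[.?!]+", text) if s.strip()]
    words_per_sentence = [len(re.findall(r"[\w']+", s)) for s in sentences]
    words = re.findall(r"[\w']+", text.lower())
    pronoun_count = sum(1 for w in words if w in PRONOUNS)
    average = sum(words_per_sentence) / len(sentences)
    return len(sentences), average, pronoun_count


# Hypothetical item text, written for this sketch.
item = "Maria has 3 apples. She gives 1 apple to Sam. How many apples does Maria have now?"
count, average, pronouns = sentence_stats(item)
print(count, round(average, 2), pronouns)
```

Counts like these can only flag items for a human reviewer; they do not measure whether the language is actually accessible to students.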

Young historians take projects to the granddaddy of museums

by Jennifer Corbett, Staff Correspondent
Star Tribune

Para. 1 “When Nicole Zachor, Laura Swanson and Carol Hinz started work on a project for history class a few months ago, the White Bear Lake sophomores had no idea that it would be displayed at the Smithsonian Institution’s National Museum of American History…”

Para. 12 “This year junior and senior high students started work on their projects in January or February. A project can be a research paper but it can also be a group or individual media presentation, display presentation or performance.”

Which of the following is a condition for a student to participate in National History Day?

A. The student must be a junior or senior in high school.

B. The student must be able to go to Washington, D.C.

C. The student must do a project related to the national topic.

D. The student must do the project on his or her own by himself or herself.

¿Cuál de las siguientes es una condición para que un estudiante participe en el Día Nacional de Historia?

A. Los estudiantes deben estar en grados once o doce de High School.

B. El estudiante debe estar en posibilidad de ir a Washington, D.C

C. El estudiante debe hacer un proyecto relacionado a un tema nacional.

D. El estudiante debe hacer el proyecto por sí solo.

Cognitive Demands

• Amount of text not relevant to items

• Length of text

• Number of long texts

• Timing (may be unspoken)

• Amount of unfamiliar words

• Placement of definitions (in text, to side, separate)

• Location of native language text

• Computerized/Hypertext

Preliminary Research in Universal Design

• Sample of 230 students taken from four schools in U.S. Southwest.

• Two schools were “town” schools (pop. 20,000) and two were “rural” schools.

• Students chosen from sixth grade teams that had populations of students with disabilities.

• Two tests were created, one from sample statewide test items, the other re-designed using UD principles.

• Each student took both tests.

• Students randomly assigned to take a particular test first to prevent practice effect.

• Constructs held constant for each item.

• Advisory Board trained in principles of Universal Design and asked to comment / suggest improvements based on their perspectives.

• Team consisted of three parents of children in special education program (one Navajo, one Latina, one Anglo) and one community member with dyslexia.

Sample Original Item

Ramón is building a doghouse. He wants the roof of the doghouse to be at an angle that is more than 90° but less than 110°. Which angle below could he use for the roof?

A. B.

C. D.

Revised Item

Which angle is more than 90° and less than 110°?

A. B.

C. D.

What changed?

• Design element #2: Construct more precisely defined.

• Design element #3: Bias eliminated (dog house, Ramón)

• Design element #4: “Built in accommodations” – un-timed, students circled answer on paper, did not bubble

• Design element #5: Simple instructions and procedures

• Design element #6: More comprehensible language

• Design element #7: Larger font

• Means of two tests were compared and t-tests performed.

• A difference of 8.16 (1.67 sig.) was found between means, a statistically significant finding.

• Effect size calculated using Cohen’s d. Effect of design = .61 (about 6/10 of a standard deviation), a “moderate effect”

• Students with largest difference between two tests were interviewed to determine difference for them.

• Students noted that: more direct language made it easier for them to “understand” items and unlimited time helped them to “think better” about items. Students also said they “remembered” content better on UD test.
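The mean comparison and effect size described above can be sketched roughly as follows. This is a minimal illustration, not the study's analysis: the scores are invented, the function names are hypothetical, a paired t statistic is assumed because each student took both tests, and Cohen's d is computed here with the pooled standard deviation of the two tests (the slides do not say which variant was used).

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Paired-samples t statistic for the mean of the differences (b - a)."""
    diffs = [b - a for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    return mean_diff / (sd_diff / math.sqrt(len(diffs)))


def cohens_d(scores_a, scores_b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    mean_diff = statistics.mean(scores_b) - statistics.mean(scores_a)
    pooled_sd = math.sqrt(
        (statistics.variance(scores_a) + statistics.variance(scores_b)) / 2
    )
    return mean_diff / pooled_sd


# Hypothetical scores for five students; NOT data from the study.
traditional = [10, 12, 14, 16, 18]
redesigned = [13, 14, 17, 18, 21]

t_stat = paired_t_statistic(traditional, redesigned)
d = cohens_d(traditional, redesigned)
print(round(t_stat, 2), round(d, 2))
```

By Cohen's common rule of thumb, d near 0.2 is read as small, 0.5 as medium, and 0.8 as large, which is the scale behind a label such as “moderate effect.”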

What have we learned?

• Design matters!! How a test is designed may affect how a student scores on that test.

• Better-designed items appear to help students who are English Language Learners with disabilities “show what they know.”

• This leads to more valid assessment of traditionally “under-performing” students.

Usability

Universally designed assessments use text that enables people to read quickly, effortlessly and with understanding

The physical appearance of text – shapes of letters and numbers – conforms to several dimensions that characterize legible text

OFFICIAL BALLOT, PALM BEACH COUNTY, FLORIDA

Contrast – Black type on matte pastel or off-white paper produces good contrast and reduces eye strain

Type Size – Print larger than 12 point increases legibility

Spacing – Space between letters and between words is wide

Leading – White space between lines of type (leading) is larger

Typeface – Standard typeface, with upper and lower case letters, is better than italic, small caps, or all caps

Justification – Unjustified text is easier to read, especially for poor readers

Legible Graphs, Tables, Illustrations

Universally designed assessments use non-text materials just as carefully as text materials

• Symbols are highly distinguishable

• Only essential illustrations are used (ones referred to in text and necessary to answer question) [illustrations for interest often draw attention away from construct being assessed]

Is the border distracting?

Legible graphs, tables, illustrations

What’s that big black rectangle?

Could this item be presented in an alternate format? Braille?

Is the high number of items on the map and long list of cities necessary to respond to this item?

“According to this weather page, which place is the warmest on December 28?”

If you were flying to Chicago the day this weather page was printed, what information could you learn for your trip from this page?

Here is an example of an item that could more easily be translated into an alternate format.

Legible Response Formats

Universally designed assessments consider the design of the response venue as well as the assessment itself

• Large bubbles that avoid most challenges of low vision or difficulty with fine motor skills

• Consideration of age of students in selecting format (avoid separate answer sheets for younger students)

More information?

Visit: http://education.umn.edu/nceo

or Search for NCEO

Web site includes: Topic Introduction

Frequently Asked Questions

Online and Other Resources

UD and Data Analysis

Goal = Increase validity for all

Focus = Reduce differential validity

Impact = May or may not reduce differential difficulty (p values)

Process = Go beyond internal validity statistics such as DIF (Differential Item Functioning)
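As a rough illustration of the "differential difficulty (p values)" mentioned above, the sketch below computes the classical item p value (proportion answering correctly) for two groups and the gap between them. The group labels, scores, and function name are hypothetical, and a gap in p values alone says nothing about differential validity, which is why the slide points beyond such statistics.

```python
def p_value(item_scores):
    """Classical item difficulty: proportion of students scoring 1 on the item."""
    return sum(item_scores) / len(item_scores)


# Hypothetical 0/1 scores on a single item for two student groups.
group_a = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1]  # 7 of 10 correct
group_b = [1, 0, 0, 1, 0, 0, 1, 0, 0, 1]  # 4 of 10 correct

p_a = p_value(group_a)
p_b = p_value(group_b)
gap = p_a - p_b  # difference in difficulty between the two groups
print(p_a, p_b, round(gap, 2))
```

Running the same comparison on an item before and after redesign shows whether the gap narrowed, stayed the same, or widened.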

Salvage Example - Day 1

7+ Feet

ND SD

30% 10%

Salvage Example - Day 2

7+ Feet

          ND     SD
Pole      30%    10%
String    40%    40%

Salvage In Detail

[Figure: Pole and String results for the ND and SD groups]

Keep It On the Level

Karen Barton
CTB/McGraw-Hill

UD – How do we know . . .

• . . .if something is UD?

• . . .if the UD is a valid and reliable approach for students?

What are we looking for?

• Elements of UD

• Content representation

• Construct irrelevant barriers

• Effect on student performance

• Effect of accommodations

Check the Design

• Item reviews
  – UD elements
    • In place? What’s missing? What’s appropriate?
  – Content validation – test specs., content & standard rep.
  – By experts on various ability groups (SD, LEP, Gifted) and intended constructs
  – What about including students in the review?

Check the Construction

• Construct validation
  – irrelevant variance, dimensionality
  – factor analyses, structural equation modeling, etc.

• For target groups:
  – What elements improve accessibility?
  – What elements decrease accessibility?
  – What elements have no effect?

• If there is any effect, is it desired and feasible?

Check the effects: Pre vs Post UD

• Pilot Administration:
  – Student centered focus groups, “think-alouds,” interviews, questionnaires/surveys

• Item Review
  – Accessibility expectations
    • possible impediments (linguistic loads, other elements not being implemented)
    • Amenable to accommodations
  – P-values by item, point biserials, etc.
  – DIF
    • Inferential - limited by sample sizes of subgroups
    • Descriptive - mean parameter values, objective score compares
  – Distracter analyses
  – Omit rates
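Two of the item statistics listed above, the point biserial and the omit rate, can be sketched as follows. This is a minimal illustration with invented data and hypothetical function names; the point biserial here is the uncorrected version (the total score includes the item itself), and blank responses are assumed to be recorded as None.

```python
import math
import statistics


def point_biserial(item_scores, total_scores):
    """Correlation between a 0/1 item score and the total test score."""
    correct = [t for i, t in zip(item_scores, total_scores) if i == 1]
    incorrect = [t for i, t in zip(item_scores, total_scores) if i == 0]
    p = len(correct) / len(item_scores)  # proportion answering correctly
    sd_total = statistics.pstdev(total_scores)  # population SD of total scores
    mean_gap = statistics.mean(correct) - statistics.mean(incorrect)
    return mean_gap / sd_total * math.sqrt(p * (1 - p))


def omit_rate(responses):
    """Share of students who left the item blank (recorded as None)."""
    return sum(1 for r in responses if r is None) / len(responses)


# Hypothetical data for six students on one item; not from any real test.
item_scores = [1, 1, 1, 0, 0, 0]
total_scores = [9, 8, 7, 4, 3, 5]
r_pb = point_biserial(item_scores, total_scores)

# Hypothetical selected responses for eight students; None marks an omitted item.
responses = ["A", None, "C", "B", None, "A", "D", "A"]
omits = omit_rate(responses)
print(round(r_pb, 3), omits)
```

Computing these separately for each student group, before and after redesign, gives the descriptive comparison the slide describes when subgroup samples are too small for inferential DIF.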

Check the effects: Pre vs Post UD

• Total score
  – Mean comparisons
  – Score changes: mean difference, effect size
  – External validation

• Is re-construction required? (If it ain’t broke . .)
  – What are the stakes? Who is affected?
  – What are the costs?
    • To students
    • To contract
    • To test design
    • Time, money, experience

Plan ahead!

• RFP should request studies be conducted to assure the UD is being done, done correctly, and is a positive approach for improving the accessibility for students with diverse ability levels – BEFORE students receive high stakes consequences!