Building the NCSC Summative Assessment: Towards a Stage-Adaptive Design. Sarah Hagge, Ph.D., and Anne Davidson, Ed.D., McGraw-Hill Education CTB. CCSSO, New Orleans, LA, June 25, 2014. Copyright © 2014 CTB/McGraw-Hill LLC.


Building the NCSC Summative Assessment: Towards a Stage-Adaptive Design

Sarah Hagge, Ph.D., and Anne Davidson, Ed.D.

McGraw-Hill Education CTB

CCSSO

New Orleans, LA

June 25, 2014


Overview

Rationale for stage-adaptive test
Proposed stage-adaptive design
Overview of pilot testing: Plan and goals
Summary of results from Pilot Phase I
Main findings and next steps


Rationale for Stage-Adaptive Test

Targeted to student proficiency levels
Improved precision of student test scores
Reduced total testing time
Reduced testing burden to students and teacher test administrators


Proposed Stage-Adaptive Design

All students will receive tests with the same content distribution
Tests will be adaptive based on tiers and item difficulty
  – All students receive the same or a similar first stage, or testlet, of items
  – Students will receive a second stage of items of lower, higher, or about the same difficulty based on their performance on the first stage of the test
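The routing rule on this slide can be illustrated with a minimal sketch. The slides do not state how Stage 1 performance is scored or where the routing cut points fall, so the function name `route_second_stage`, the proportion-correct rule, and the cut values 0.4 and 0.7 are all hypothetical and chosen only for illustration.

```python
from typing import Sequence


def route_second_stage(stage1_scores: Sequence[int],
                       low_cut: float = 0.4,
                       high_cut: float = 0.7) -> str:
    """Choose a second-stage testlet from dichotomous (0/1) Stage 1 scores.

    low_cut and high_cut are hypothetical cut points, not values from the
    presentation.
    """
    if not stage1_scores:
        raise ValueError("Stage 1 must contain at least one scored item")
    proportion_correct = sum(stage1_scores) / len(stage1_scores)
    if proportion_correct < low_cut:
        return "lower"      # route to a lower-difficulty second stage
    if proportion_correct >= high_cut:
        return "higher"     # route to a higher-difficulty second stage
    return "similar"        # route to a second stage of about the same difficulty


if __name__ == "__main__":
    print(route_second_stage([1, 0, 0, 0, 0]))  # one of five correct
    print(route_second_stage([1, 1, 0, 0]))     # half correct
    print(route_second_stage([1, 1, 1, 1, 0]))  # four of five correct
```

An operational design would more likely route on an IRT ability estimate than on raw proportion correct; the raw-score rule is used here only to keep the sketch short.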


Example of a Stage-Adaptive Design

Stage 1: Moderate difficulty, all students
Stage 2A: Lower difficulty, lower performing students from Stage 1
Stage 2B: Higher difficulty, higher performing students from Stage 1


Overview of Pilot Testing


Purpose of Pilot Testing

Collect information necessary to support development and refinement of NCSC summative assessment design
Pilot Phase 1 – Item tryout – Spring 2014
  – Generate student performance data
  – Investigate administration conditions
  – Understand how the items are functioning
  – Investigate the proposed item scoring processes and procedures
Pilot Phase 2 – Test forms – Fall 2014
  – Investigate the adaptive algorithm
  – Collect form and student performance data


Broad goals

Try out items
Evaluate items
Understand administration policies
Understand administration processes
  – Computer based system
  – Accommodations
Investigate building an IRT scale
Develop the stage adaptive design specification


ELA Content and Forms

Grades 3-8, 11
8 forms/grade
  – Four reading passages
    • Two literary and two informational
    • Foundational items in Grades 3 and 4
  – 22–35 items/form
  – One passage at each of the four tiers
  – Selected response and dichotomously scored constructed response items


Math Content and Forms

Grades 3-8, 11
8 forms/grade
  – 25 items per form
  – Each form contained a mix of all four item tiers
  – Content distribution percentages similar across the 8 forms
  – Selected response and dichotomously scored constructed response items


Initial Analysis

Demographic characteristics of student sample
  – Descriptive statistics (e.g., gender, ethnicity) were collected for the sample of students.
  – Learner characteristic inventory was used to collect profile information about students who participated.
  – Accommodations data were collected prior to administration, along with whether each eligible student used the accommodation.
Form-level results
Classical item analysis
Tier analysis
Item response time


Flagging Criteria for Item Reviews

Classical Item Analysis
  – Low p-value, < 0.50 (note Tier-1 items have 2 answer choices)
  – High p-value, > 0.90
  – Low point-biserial correlation, < 0.20
  – High option point-biserial correlation, > 0.05
  – Omit rate, > 5%
Tier reversals (Tier 1 p-value < Tier 4)
Key checks (Distractor analysis)
Survey and student interaction study results
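As a rough illustration of how the classical flagging criteria above might be applied to one item, here is a minimal sketch. The thresholds come from the slide; everything else is an assumption: the function names, the data layout (one selected option per student, `None` for an omit), the use of an uncorrected total score that still includes the item, and the choice to count omits as incorrect when computing the p-value.

```python
from statistics import mean, pstdev
from typing import List, Optional, Sequence


def point_biserial(indicator: Sequence[int], totals: Sequence[float]) -> float:
    """Pearson correlation between a 0/1 indicator and total test scores."""
    sd_x, sd_y = pstdev(indicator), pstdev(totals)
    if sd_x == 0 or sd_y == 0:
        return 0.0  # assumption: treat a constant column as uncorrelated
    mean_x, mean_y = mean(indicator), mean(totals)
    cov = mean([(x - mean_x) * (y - mean_y) for x, y in zip(indicator, totals)])
    return cov / (sd_x * sd_y)


def flag_item(responses: Sequence[Optional[str]], key: str,
              totals: Sequence[float]) -> List[str]:
    """Return the flags raised for one item under the slide's thresholds."""
    n = len(responses)
    correct = [1 if r == key else 0 for r in responses]  # omits count as incorrect
    flags = []

    p_value = sum(correct) / n
    if p_value < 0.50:
        flags.append("low p-value")
    if p_value > 0.90:
        flags.append("high p-value")

    if point_biserial(correct, totals) < 0.20:
        flags.append("low point-biserial")

    # Option point-biserial: correlation between choosing a distractor and total score.
    distractors = sorted({r for r in responses if r is not None and r != key})
    for option in distractors:
        chose = [1 if r == option else 0 for r in responses]
        if point_biserial(chose, totals) > 0.05:
            flags.append("high option point-biserial")
            break

    omit_rate = sum(r is None for r in responses) / n
    if omit_rate > 0.05:
        flags.append("high omit rate")
    return flags


if __name__ == "__main__":
    totals = [10, 9, 8, 3, 2, 7]  # hypothetical total scores
    print(flag_item(["A", "A", "A", "B", "B", "A"], "A", totals))
    print(flag_item(["B", "B", "B", "A", None, "B"], "A", totals))
```

The second call mimics a possible miskey: higher-scoring students prefer option B, so the keyed answer has a negative point-biserial and the distractor a positive one, which is the pattern the key checks are meant to catch.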


Pilot Phase I Results


Summary of Student Counts

3832 students overall took ELA (n forms = 8/grade)
3703 students overall took Math (n forms = 8/grade)

Grade      N   N ELA   N Math
3        717     518      533
4        742     576      514
5        723     526      550
6        766     598      544
7        722     533      540
8        756     546      555
11       735     535      467
Total   5161    3832     3703


Summary of Descriptive Statistics

Subgroup      Category                          N      %
Gender*       Male                           3329   64.8
              Female                         1811   35.2
Ethnicity**   White                          2690   52.1
              Asian                           159    3.1
              Hawaiian or Pacific Islander     88    1.7
              Indian or Alaska Native         205    4.0
              Hispanic                       1296   25.1
              African American                697   13.5


Summary of Accommodations

Subgroup                   Category      N      %
Assistive Presentation     Needs       278    5.4
                           Used        107    2.1
Assistive Response         Needs       457    8.9
                           Used        191    3.7
Braille Form               Needs        **     **
                           Used         **     **
Large Print Form           Needs       229    4.4
                           Used         82    1.6
Paper Version              Needs       512    9.9
                           Used        349    6.8
Read or Reread             Needs      4471   86.6
                           Used       2930   56.8
Text to Speech             Needs      1263   24.5
                           Used        582   11.3
Scribe                     Needs      1103   21.4
                           Used        446    8.6
Speech to Text             Needs       338    6.5
                           Used         86    1.7
Sign Interpretation        Needs        98    1.9
                           Used         40    0.8
No Accommodation Needed    Needs      2069   40.1
                           Used       1429   27.7


ELA Form-Level Results

Note. * Forms included all ELA items except the extended Writing prompt.
Note. Cronbach alpha coefficients ranged from 0.56 to 0.90 on ELA forms.


Math Form-Level Results

Note. Cronbach alpha coefficients ranged from 0.31 to 0.83 on math forms.
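For readers unfamiliar with the reliability coefficient reported in these notes, the following is a minimal sketch of the standard Cronbach's alpha formula, alpha = k/(k-1) * (1 - sum of item variances / variance of total scores), applied to dichotomously scored items. The function name and the tiny score matrices are hypothetical; they are not the pilot data.

```python
from statistics import pvariance
from typing import Sequence


def cronbach_alpha(scores: Sequence[Sequence[int]]) -> float:
    """Cronbach's alpha for a score matrix: one row per student, one column per item."""
    n_items = len(scores[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item's scores across students.
    item_variances = [pvariance([row[j] for row in scores]) for j in range(n_items)]
    # Variance of students' total scores.
    total_variance = pvariance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    consistent = [[1, 1, 1], [0, 0, 0], [1, 1, 1], [0, 0, 0]]  # items agree perfectly
    unrelated = [[1, 0], [0, 1], [1, 1], [0, 0]]               # items are uncorrelated
    print(cronbach_alpha(consistent))
    print(cronbach_alpha(unrelated))
```

Population variances are used throughout; using sample variances for both the items and the totals gives the same ratio, so the choice does not change alpha.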


Classical Item Results

Range of item p-values
  – 0.05 to 0.95
  – P-value standard deviation of 0.11 to 0.23 depending on test form
  – Very few items with low or high p-values
Item omit rates less than 3% across all items
Majority of flagged items had low point-biserial or high option point-biserial


Tier results: Mean p-values

[Figure: two panels, ELA and Math. Each plots mean p-value (y-axis, 0 to 1) by grade (x-axis: 3, 4, 5, 6, 7, 8, 11) for Tier 1, Tier 2, Tier 3, and Tier 4.]


Discussion and Next Steps


Main Findings

– Evidence that content is appropriate for students in the Phase I Pilot sample.
  • Range of p-values
  • Relatively few items flagged for high or low p-values
  • Item omit rates and not-reached rates 3% or less
  • Form percent correct range of approximately 45-70%
– Evidence that tiers are functioning according to design at an aggregate level
  • Tier 1 easier than the other three tiers
  • Tiers 2, 3 and 4 tended to have a pattern of difficulty ranging from least to most difficult
– Evidence that item bank can support forms at different difficulty levels
  • Items exhibit a range of p-values
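The aggregate tier check behind the second finding can be sketched as follows. This is only an illustration under stated assumptions: the function names and the item p-values are hypothetical, and the sketch compares every pair of tiers, which is broader than the Tier 1 versus Tier 4 reversal criterion used for item review.

```python
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def tier_mean_p_values(items: Iterable[Tuple[int, float]]) -> Dict[int, float]:
    """Average item p-values within each tier; items are (tier, p_value) pairs."""
    by_tier: Dict[int, List[float]] = {}
    for tier, p_value in items:
        by_tier.setdefault(tier, []).append(p_value)
    return {tier: mean(values) for tier, values in sorted(by_tier.items())}


def tier_reversals(tier_means: Dict[int, float]) -> List[Tuple[int, int]]:
    """List (lower_tier, higher_tier) pairs where the lower tier looks harder.

    By design a lower tier should be easier, i.e. have the higher mean p-value.
    """
    tiers = sorted(tier_means)
    return [(low, high)
            for i, low in enumerate(tiers)
            for high in tiers[i + 1:]
            if tier_means[low] < tier_means[high]]


if __name__ == "__main__":
    # Hypothetical item statistics, not pilot results.
    items = [(1, 0.80), (1, 0.70), (2, 0.60), (3, 0.50), (4, 0.55)]
    means = tier_mean_p_values(items)
    print(means)
    print(tier_reversals(means))  # Tier 3 has a lower mean p-value than Tier 4 here
```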


Next Steps

– Investigate IRT scaling on forms with higher N counts
– Conduct item and form-level analysis by student subgroups
– Conduct simulation studies of the adaptive design
– Pilot Phase 2
  • Field-test items to obtain statistics for operational item bank
  • Evaluate stage-adaptive design


Thank you!