clear 2008 annual conference anchorage, alaska “computerized mastery testing a testing...

CLEAR 2008 Annual Conference

Anchorage, Alaska

“Computerized Mastery Testing

A Testing Architecture”

F. Jay Breyer, Prometric

Bob Riley, NCTRC

Decision Error: The Problem CMT Was Designed to Solve

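[Figure: 2×2 decision table crossing true status (Mastery, Nonmastery) with the decision (Pass, Fail). The two cells where decision and status agree are marked OK; the other two cells are the decision errors, Type A (or I) and Type B (or II).]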

What is CMT and How Does it Work?

• Examinees take the test in stages (separately timed sections):
  – Stage 1 is longer than subsequent stages
  – Stage 2 through the last (kth) stage each consist of a single packet of test questions called a testlet

• Following each stage (except the last) one of three decisions is made:
  – Pass
  – Continue
  – Fail

• After the last stage one of two decisions is made:
  – Pass
  – Fail

From Psychometrics: Testlets – Packets of Test Questions

• Divide the content of a test into the smallest number of questions or tasks possible so that each
  – covers the entire test specifications consistently
  – has the same difficulty
  – spreads people out similarly
  – from 10 to 25 questions – 15 questions is most common

• Also useful for tasks
  – All testlets are equal to each other in content, form, and difficulty and have no repeated items or tasks

From Psychometrics: Testlets – Packets of Test Questions

• Build testlets from client full-length test forms
  – Doesn’t use IRT

• Does use empirical Bayes small sample procedures and the psychometrics of testlets
  – requires a minimum of 125 candidates per original test form

• We can start with 3 to 5 linear CBT forms administered in a single window

• Or we can divide up a performance test into comparable sections

From Manufacturing Engineering Destructive Testing: Sequential Analysis

[Figure: score at each stage (0% to 100%) plotted against stage of testing (0 to 7), divided into a Fail region, a Continue region, and a Pass region.]

From Mathematical Statistics: Bayesian Loss Functions

At Each Stage

• Calculate a loss value at each raw score for
  – Passing a nonmaster
  – Failing a master

• Calculate a cost value of exposing an additional testlet at each raw score

From Mathematical Statistics: Bayesian Loss Functions

The Technical Details

• There is a science behind this that uses Bayesian loss functions and
• Two weights (A:B) representing the client’s perception of the seriousness of each decision error (e.g., 100:50)
  – Type I (A)
  – Type II (B)

• These two weights build the client’s view of decision errors into the cuts used at each stage of testing except the last
  – the last stage uses the cut for a full-length test
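The loss calculation at a single stage can be sketched as follows. This is an illustration only, not the procedure used in the program: it assumes a simple two-group binomial model, and the proportion-correct values for a typical master and nonmaster (`p_master`, `p_nonmaster`), the prior, and the assignment of the 100:50 weights to the two errors are all hypothetical. The loss of continuing (which also needs the cost of exposing another testlet) is left out.

```python
from math import comb


def posterior_master(score, n_items, p_master=0.75, p_nonmaster=0.60, prior_master=0.5):
    """Posterior probability of mastery given a raw score (binomial model, assumed)."""
    def likelihood(p):
        return comb(n_items, score) * p**score * (1 - p) ** (n_items - score)

    weight_master = prior_master * likelihood(p_master)
    weight_nonmaster = (1 - prior_master) * likelihood(p_nonmaster)
    return weight_master / (weight_master + weight_nonmaster)


def expected_losses(score, n_items, loss_false_pass=100.0, loss_false_fail=50.0):
    """Expected loss of passing or failing an examinee with this raw score.

    loss_false_pass: weight for passing a nonmaster (assumed to be the 100).
    loss_false_fail: weight for failing a master (assumed to be the 50).
    """
    pm = posterior_master(score, n_items)
    return {
        "pass": loss_false_pass * (1 - pm),  # loss is incurred only if a nonmaster
        "fail": loss_false_fail * pm,        # loss is incurred only if a master
    }


if __name__ == "__main__":
    for raw in (40, 50, 60):
        print(raw, expected_losses(raw, n_items=75))
```

Under these assumptions, low scores make failing the cheaper decision and high scores make passing cheaper; scores in between are where continuing can be worth the cost of another testlet.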

How Do We Get the Cuts at One Stage of Testing?

[Figure: expected losses plotted against scores (0 to 120). The score range is divided into Fail, Continue, and Pass regions, with the two cuts marked "Fail below" and "Pass above".]

What do the Raw-Score Cuts Look Like in a Program?

• With 165 items in a full-length test
• With 75 items in the base test
• Sequential Testing gives us the concept of testing to a limit and making a decision

Stage of Testing             Fail Below   Pass At or Above
1 (75 items – 86 min.)*      42           52
2 (90 items – 100 min.)      52           62
3 (105 items – 114 min.)     61           71
4 (120 items – 128 min.)     71           80
5 (135 items – 142 min.)     82           90
6 (150 items – 156 min.)     93           98
7 (165 items – 170 min.)     106          106

* Pretest questions are included here
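The way these cuts drive the pass/continue/fail decision can be sketched in a few lines. The cut values are copied from the table above; the function name `classify` and the assumption that each score is the cumulative raw score at the end of that stage are illustrative, not taken from the program's software.

```python
from typing import Sequence, Tuple

# (fail_below, pass_at_or_above) for stages 1 to 7, from the table above.
CUTS = [(42, 52), (52, 62), (61, 71), (71, 80), (82, 90), (93, 98), (106, 106)]


def classify(cumulative_scores: Sequence[int], cuts=CUTS) -> Tuple[str, int]:
    """Return (decision, stage) for cumulative raw scores listed stage by stage."""
    stage = 0
    for stage, (score, (fail_below, pass_at)) in enumerate(
        zip(cumulative_scores, cuts), start=1
    ):
        if score >= pass_at:
            return "Pass", stage
        if score < fail_below:
            return "Fail", stage
    # No cut was crossed yet: the examinee would receive another testlet.
    return "Continue", stage


if __name__ == "__main__":
    print(classify([55]))              # passes on the base test
    print(classify([45, 55, 65, 80]))  # continues three times, then passes
    print(classify([45]))              # inside the continue region after stage 1
```

At the last stage the two cuts coincide (106), so every examinee who reaches it is classified.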

A Simulation Result

         Simulated Examinee Sample                 Tested Sample
         Based on Statistics from Year 2001        May 1-5 and July 10-August 2 2006
Stage    Pass   Fail   Classified   Continue       Pass   Fail   Classified   Continue
1        61%    6%     67%          33%            61%    7%     68%          32%
2        5%     2%     7%           26%            3%     4%     7%           25%
3        4%     1%     5%           21%            6%     1%     7%           18%
4        3%     2%     5%           16%            4%     1%     4%           14%
5        1%     2%     4%           12%            2%     4%     6%           8%
6        3%     3%     6%           6%             2%     0%     3%           5%
7        3%     4%     6%           0%             3%     2%     5%           0%
Total    80%    20%    100%         N = 10,000     81%    19%    100%         N = 225
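The Classified column above gives the share of examinees whose test ends at each stage, so it can be turned into an average test length. The short calculation below is a sketch: it uses the rounded percentages from the table and the items-per-stage counts from the raw-score cuts table, so the result is approximate.

```python
# Items administered by the end of each stage (from the raw-score cuts table).
ITEMS_AT_STAGE = [75, 90, 105, 120, 135, 150, 165]

# Percent of examinees classified (pass or fail) at each stage, as printed.
CLASSIFIED_SIMULATED = [67, 7, 5, 5, 4, 6, 6]
CLASSIFIED_TESTED = [68, 7, 7, 4, 6, 3, 5]


def expected_items(classified_pct, items=ITEMS_AT_STAGE):
    """Average number of items taken, weighting each stage by its share of examinees."""
    total = sum(classified_pct)
    return sum(pct * n for pct, n in zip(classified_pct, items)) / total


if __name__ == "__main__":
    print(round(expected_items(CLASSIFIED_SIMULATED), 1))
    print(round(expected_items(CLASSIFIED_TESTED), 1))
```

With these rounded inputs the average comes to roughly 92 items for the simulated sample and roughly 90 for the tested sample, against 165 items for the full-length test.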

Who Gets Longer Tests?

[Figure: stage of testing (0 to 4.5) plotted against true score (0.29 to 0.93).]

• Borderline examinees

Who Would Use It?

• CMT is useful for clients who
  – value the time of their candidates and wish to reduce testing time for most examinees
  – wish to reduce the exposure of their test questions (increase in test security)
    • results of year 2001 assembly – same as 2006 window
  – don’t have the money or resources for CAT
    • CAT is resource intensive – staff, items, examinees
  – desire a CBT method more powerful than a linear test (CLT)
    • helps control classification errors
  – want accurate assignment of candidates to pass/fail status

Who Should Not Use It?

• CMT is not recommended for clients where
  – small social groups share items with each other
  – item harvesting is a known issue
  – group culture studies old test items

Where Else Can We Apply This?

• How about a Performance Test
  – Imagine a performance test that consists of multiple separate observations of a candidate across different occasions
  – For example:
    • 5 separate observations each worth 10 points
    • The observations are accumulated (summed)
    • But they are expensive – since you need two observers

Where Else Can We Apply This?

• Let’s imagine that out of a total test worth 50 points
  – You want to make at least two observations – then a decision
  – Let’s also assume that the raw cut has been established as 34 out of 50
    • About 68% of the points in the test

A Performance Test Example 1

• With 5 observations, a minimum of 2 observations before the first decision, and a cut of 34:

Stage of Testing               Fail Below   Pass At or Above
1 (first two observations)     10           17
2 (third observation)          17           24
3 (fourth observation)         25           29
4 (fifth observation)          34           34

Discussion & Reactions

NCTRC Mission Statement

“To protect the consumer of Therapeutic Recreation Services by promoting the provision of quality services offered by NCTRC certificants”

NCTRC Profile

NCTRC was incorporated in 1981 as an independent nonprofit organization

• Internationally recognized credentialing body for therapeutic recreation
• Accredited in 1993 by National Commission for Certifying Agencies (NCCA)
• 15,000 member CTRS Registry
• Approximately 1200 exam candidates per year

NCTRC Board Exam Involvement

The NCTRC Board:

• Attended a demo of CMT at Prometric HQ

• Supported CMT because
  – Limited exposure of item pool
  – Overall cost effectiveness
  – Better value to the candidate

• Participated in Cut Score Process

• Appoints the Exam Management Committee

NCTRC Testing Program

• Began in 1990 with a 200-item written exam

• Based on Job Analysis of Certified Therapeutic Recreation Specialist (CTRS)

• Transformed in 2001 to a Linear Computer Based Exam (200 items)

• Transformed in 2002 to a Computer Mastery Exam

NCTRC Exam

• Administered three times each year

• 5-day testing window

• Conducted at Prometric Testing Centers across the US, Canada and Puerto Rico

• Offered to qualified candidates who have been granted professional eligibility by NCTRC

NCTRC Job Analysis

• Assures test specifications and the exam are related to the practice of Therapeutic Recreation

• Delineates the important tasks and knowledge deemed necessary for competent practice

• Job Tasks - practical experience

• Knowledge Areas - theoretical knowledge

• Conducted in 1987, 1997, and 2007

NCTRC EMC

Exam Management Committee Function:

• To monitor and make revisions to NCTRC’s testing procedures

• To work with and monitor the administration of NCTRC’s tests; such administration may be contracted for with private testing services

• To collect data necessary to periodically check for adverse impact or inadvertent bias

• To collect data necessary to demonstrate reliability and validity of the testing procedures

• To ensure reasonable accommodation of testing procedures for individuals with disabilities

NCTRC EMC

The responsibilities of the Exam Management Committee include:

• Item writing committee
• Item review committee
• Updating and maintaining exam reference list
• Review current items in operational pool for overlap and currency
• Job analysis
• Update and maintain practice tests

Customer Preparation

• Certification standards

• Conference workshops

• Exam content outline

• Practice exam

• Sample items

• Reference list

NCTRC CMT

• Base test consists of 90 multiple choice items (87 minutes)

• 15 items are pre-test items that are not part of the score

• Depending on performance, a candidate can receive up to 6 additional testlets (14 minutes each)

• One testlet equals 15 items

• Each testlet is a mirror reflection of the Exam Content Outline (same proportions as full exam)

NCTRC Exam Content Outline

Content Areas                     Percent of Exam   No. of Test Items (per testlet)
Foundational Knowledge            33.3%             5
Practice of TR/RT                 46.7%             7
Organization of TR/RT             13.3%             2
Advancement of the Profession     6.7%              1
Total                             100%              15
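The item counts in the outline above follow from applying each content area's percentage to a 15-item testlet. A minimal check of that arithmetic, with the percentages copied from the outline and the helper name `items_per_testlet` chosen only for illustration:

```python
# Percent of exam for each content area, from the outline above.
OUTLINE = {
    "Foundational Knowledge": 33.3,
    "Practice of TR/RT": 46.7,
    "Organization of TR/RT": 13.3,
    "Advancement of the Profession": 6.7,
}


def items_per_testlet(outline, testlet_size=15):
    """Round each area's share of a testlet to a whole number of items."""
    return {area: round(pct / 100 * testlet_size) for area, pct in outline.items()}


if __name__ == "__main__":
    counts = items_per_testlet(OUTLINE)
    for area, n in counts.items():
        print(f"{area}: {n}")
    print("Total:", sum(counts.values()))
```

Simple rounding is enough here because the percentages were evidently derived from whole-item counts; an outline with less convenient percentages would need a rule for distributing the remainder.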

NCTRC Exam Experience

• Immediate feedback
• Faster test time
• Candidate satisfaction
• Some dissatisfaction and confusion
• Less exposure to item pool via random assignment of testlets
• Computer-based exam a “plus” with candidates
• Positive feedback re: NCTRC prep material

NCTRC Special Accommodations

• NCTRC approval process

• Relatively large percent of special accommodations

• Advanced registration with designated reservation

• Ability to offer a wide range of accommodations

• CBT and CMT conducive to special needs

Speaker Contact Information

F. Jay Breyer, Ph.D.

Executive Director of Psychometric Consulting Services

Prometric

2000 Lenox Drive

Lawrenceville, NJ 08648

e-mail: [email protected]

Speaker Contact Information

Bob Riley, Ph.D., CTRS

NCTRC Executive Director

7 Elmwood Dr

New City, NY 10956

[email protected]
