CLEAR 2008 Annual Conference
Anchorage, Alaska
“Computerized Mastery Testing
A Testing Architecture”
F. Jay Breyer, Prometric
Bob Riley, NCTRC
Decision Error: The Problem CMT Was Designed to Solve
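[Table: true status (Mastery, Nonmastery) crossed with decision (Pass, Fail). The two correct decisions are marked OK; the two decision errors are labeled Type A or I and Type B or II.]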
What is CMT and How Does it Work?
• Examinees take the test in stages (separately timed sections):
– Stage 1 is longer than subsequent stages
– Each stage from Stage 2 through the last (kth) stage consists of a single packet of test questions called a testlet
• Following each stage (except the last), one of three decisions is made:
– Pass
– Continue
– Fail
• After the last stage, one of two decisions is made:
– Pass
– Fail
From Psychometrics: Testlets – Packets of Test Questions
• Divide the content of a test into the smallest number of questions or tasks possible so that each testlet
– covers the entire test specifications consistently
– has the same difficulty
– spreads people out similarly
– contains from 10 to 25 questions (15 questions is most common)
• Also useful for tasks
– All testlets are equal to each other in content, form, and difficulty and have no repeated items or tasks
From Psychometrics: Testlets – Packets of Test Questions
• Build testlets from client full-length test forms
– Doesn’t use IRT
• Does use empirical Bayes small-sample procedures and the psychometrics of testlets
– Requires a minimum of 125 candidates per original test form
• We can start with 3 to 5 linear CBT forms administered in a single window
• Or we can divide up a performance test into comparable sections
From Manufacturing Engineering Destructive Testing: Sequential Analysis
[Figure: score at each stage (0% to 100%) plotted against stage of testing (1 to 7), divided into a Fail Region, a Continue Region, and a Pass Region.]
From Mathematical Statistics: Bayesian Loss Functions
At Each Stage
• Calculate a loss value at each raw score for
– Passing a nonmaster
– Failing a master
• Calculate a cost value of exposing an additional testlet at each raw score
From Mathematical Statistics: Bayesian Loss Functions
The Technical Details
• There is a science behind this that uses Bayesian loss functions and
• Two weights (A:B) symbolizing the client’s perception of the seriousness of making each kind of decision error (e.g., 100:50)
– Type I (A) and
– Type II (B)
• These two weights build the client’s view of decision errors into the cuts used at each stage of testing except the last; the last stage uses the cut for a full-length test
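A minimal sketch of how expected losses could be turned into a pass, continue, or fail decision at each raw score. Every number here is an assumption made for illustration: a two-state model (master vs. nonmaster) with fixed per-item success rates, a 50/50 prior, a per-testlet exposure cost C, hypothetical stage lengths, and the reading that A is the loss for passing a nonmaster and B the loss for failing a master. Unlike the procedure described on the slide, this sketch also derives the last-stage decision from the losses instead of using a full-length-test cut.

```python
from functools import lru_cache
from math import comb

# All values below are illustrative assumptions, not program parameters.
A = 100.0                   # loss for passing a nonmaster (assumed meaning of A)
B = 50.0                    # loss for failing a master (assumed meaning of B)
C = 2.0                     # hypothetical cost of exposing one more testlet
P_MASTER = 0.75             # assumed chance a master answers an item correctly
P_NONMASTER = 0.60          # assumed chance a nonmaster answers an item correctly
PRIOR_MASTER = 0.5          # assumed prior probability of mastery
STAGE_ITEMS = (30, 15, 15)  # hypothetical number of items added at each stage


def posterior_master(score, n_items):
    """Posterior probability of mastery after `score` correct out of `n_items`."""
    like_m = P_MASTER ** score * (1 - P_MASTER) ** (n_items - score)
    like_n = P_NONMASTER ** score * (1 - P_NONMASTER) ** (n_items - score)
    weighted_m = PRIOR_MASTER * like_m
    return weighted_m / (weighted_m + (1 - PRIOR_MASTER) * like_n)


def binom_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


@lru_cache(maxsize=None)
def best_action(stage, score):
    """Return (expected loss, decision) at a 0-based stage and cumulative score."""
    n_seen = sum(STAGE_ITEMS[: stage + 1])
    p = posterior_master(score, n_seen)
    losses = {"pass": A * (1 - p), "fail": B * p}
    if stage + 1 < len(STAGE_ITEMS):
        # Continuing costs C plus the expected loss of acting optimally next stage.
        n_next = STAGE_ITEMS[stage + 1]
        future = 0.0
        for x in range(n_next + 1):
            prob_x = (p * binom_pmf(x, n_next, P_MASTER)
                      + (1 - p) * binom_pmf(x, n_next, P_NONMASTER))
            future += prob_x * best_action(stage + 1, score + x)[0]
        losses["continue"] = C + future
    decision = min(losses, key=losses.get)
    return losses[decision], decision


def decisions_for_stage(stage):
    """Decision at every possible cumulative raw score for one stage."""
    n_seen = sum(STAGE_ITEMS[: stage + 1])
    return [best_action(stage, s)[1] for s in range(n_seen + 1)]


if __name__ == "__main__":
    for stage in range(len(STAGE_ITEMS)):
        d = decisions_for_stage(stage)
        fail_below = next(s for s, x in enumerate(d) if x != "fail")
        print(f"stage {stage + 1}: fail below {fail_below}, pass at or above {d.index('pass')}")
```

Raising A relative to B makes passing a doubtful candidate more expensive, which pushes the pass cut upward; raising C shrinks the continue region because another testlet has to buy a larger reduction in expected loss.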
How Do We Get the Cuts at One Stage of Testing?
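[Figure: expected losses plotted against scores (0 to 120) at one stage of testing. The score axis divides into Fail, Continue, and Pass regions: fail below the lower cut, pass above the upper cut.]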
What do the Raw-Score Cuts Look Like in a Program?
• With 165 items in a full-length test
• With 75 items in the base test
• Sequential testing gives us the concept of testing to a limit and making a decision

Stage of Testing              Fail Below   Pass At or Above
1 (75 items – 86 min.)*       42           52
2 (90 items – 100 min.)       52           62
3 (105 items – 114 min.)      61           71
4 (120 items – 128 min.)      71           80
5 (135 items – 142 min.)      82           90
6 (150 items – 156 min.)      93           98
7 (165 items – 170 min.)      106          106
* Pretest questions are included here
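The cuts listed above amount to a lookup applied to the cumulative raw score after each stage. A minimal sketch, with the cut values copied from the slide; the function names and the example score sequence are hypothetical:

```python
# Raw-score cuts from the slide: stage -> (fail below, pass at or above).
CUTS = {
    1: (42, 52),
    2: (52, 62),
    3: (61, 71),
    4: (71, 80),
    5: (82, 90),
    6: (93, 98),
    7: (106, 106),
}


def decide(stage, cumulative_score):
    """Return 'fail', 'pass', or 'continue' for a cumulative raw score at a stage."""
    fail_below, pass_at = CUTS[stage]
    if cumulative_score < fail_below:
        return "fail"
    if cumulative_score >= pass_at:
        return "pass"
    return "continue"


def run_test(cumulative_scores):
    """Walk through the stages; return (stage, decision) where testing stops."""
    for stage, score in enumerate(cumulative_scores, start=1):
        decision = decide(stage, score)
        if decision != "continue":
            return stage, decision
    raise ValueError("scores ended before a pass/fail decision was reached")


if __name__ == "__main__":
    # Hypothetical examinee: 48 after stage 1, 58 after stage 2, 72 after stage 3.
    print(run_test([48, 58, 72]))
```

At stage 7 the two cuts coincide at 106, so "continue" is impossible and every examinee who reaches the last stage receives a pass or fail decision.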
A Simulation Result
Simulated examinee sample based on statistics from year 2001; tested sample from May 1-5 and July 10-August 2, 2006

        Simulated Sample                        Tested Sample
Stage   Pass   Fail   Classified   Continue     Pass   Fail   Classified   Continue
1       61%    6%     67%          33%          61%    7%     68%          32%
2       5%     2%     7%           26%          3%     4%     7%           25%
3       4%     1%     5%           21%          6%     1%     7%           18%
4       3%     2%     5%           16%          4%     1%     4%           14%
5       1%     2%     4%           12%          2%     4%     6%           8%
6       3%     3%     6%           6%           2%     0%     3%           5%
7       3%     4%     6%           0%           3%     2%     5%           0%
Total   80%    20%    100%         N=10,000     81%    19%    100%         N = 225
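One way to read these percentages is as a distribution of test lengths. The sketch below weights the number of items seen at each stage by the share of examinees classified at that stage. It assumes the per-stage item counts from the raw-score cut table (75 to 165 items) apply to both samples, and because it works from rounded percentages the results are approximate.

```python
# Items seen by the end of each stage (assumed from the raw-score cut table).
STAGE_ITEMS = [75, 90, 105, 120, 135, 150, 165]

# "Classified" column for each sample, in percent, stages 1 through 7.
CLASSIFIED_SIMULATED = [67, 7, 5, 5, 4, 6, 6]
CLASSIFIED_TESTED = [68, 7, 7, 4, 6, 3, 5]


def mean_items(classified_pct):
    """Average number of items seen, weighted by percent classified per stage."""
    total = sum(classified_pct)
    return sum(p * n for p, n in zip(classified_pct, STAGE_ITEMS)) / total


def mean_stage(classified_pct):
    """Average stage at which a pass/fail decision is reached."""
    total = sum(classified_pct)
    return sum(p * s for s, p in enumerate(classified_pct, start=1)) / total


if __name__ == "__main__":
    for label, pct in (("simulated", CLASSIFIED_SIMULATED), ("tested", CLASSIFIED_TESTED)):
        print(f"{label}: {mean_items(pct):.1f} items, stage {mean_stage(pct):.2f} on average")
```

Under these assumptions the mean comes to about 92 items for the simulated sample and about 90 for the tested sample, against 165 items for the full-length test.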
Who Gets Longer Tests?
[Figure: stage of testing at which a decision is reached (0 to 4.5) plotted against true score (0.29 to 0.93).]
• Borderline examinees
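The pattern can be reproduced with a small simulation. This is a sketch under stated assumptions, not the simulation behind the slide: it reuses the raw-score cuts from the cut table, treats every counted item as scored, and lets each item be answered correctly with probability equal to the examinee's true score.

```python
import random

# (cumulative items, fail below, pass at or above) per stage, from the cut table.
STAGES = [
    (75, 42, 52), (90, 52, 62), (105, 61, 71), (120, 71, 80),
    (135, 82, 90), (150, 93, 98), (165, 106, 106),
]


def stopping_stage(true_score, rng):
    """Simulate one examinee and return the stage at which testing stops."""
    score = 0
    items_seen = 0
    for stage, (n_items, fail_below, pass_at) in enumerate(STAGES, start=1):
        # Answer only the new items added at this stage.
        score += sum(rng.random() < true_score for _ in range(n_items - items_seen))
        items_seen = n_items
        if score < fail_below or score >= pass_at:
            return stage
    return len(STAGES)  # not reached: the last stage always classifies


def mean_stopping_stage(true_score, n_examinees=500, seed=0):
    """Average stopping stage over simulated examinees with the same true score."""
    rng = random.Random(seed)
    stops = [stopping_stage(true_score, rng) for _ in range(n_examinees)]
    return sum(stops) / n_examinees


if __name__ == "__main__":
    for true_score in (0.29, 0.45, 0.57, 0.65, 0.73, 0.85, 0.93):
        print(f"true score {true_score:.2f}: mean stage {mean_stopping_stage(true_score):.2f}")
```

Examinees whose true score sits near the final cut (106 of 165, about 0.64 under these assumptions) stay in the continue region longest, while clear masters and clear nonmasters are classified at stage 1.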
Who Would Use It?
• CMT is useful for clients who
– value the time of their candidates and wish to reduce testing time for most examinees
– wish to reduce the exposure of their test questions (increase in test security)
  • results of year 2001 assembly – same as 2006 window
– don’t have the money or resources for CAT
  • CAT is resource intensive – staff, items, examinees
– desire a CBT method more powerful than a linear test (CLT)
  • helps control classification errors
– want accurate assignment of candidates to pass/fail status
Who Should Not Use It?
• CMT is not recommended for clients where
– small social groups share items with each other
– item harvesting is a known issue
– group culture studies old test items
Where Else Can We Apply This?
• How about a Performance Test
– Imagine a performance test that consists of multiple separate observations of a candidate across different occasions
– For example:
  • 5 separate observations each worth 10 points
  • The observations are accumulated (summed)
  • But they are expensive – since you need two observers
Where Else Can We Apply This?
• Let’s imagine that out of a total test worth 50 points
– You want to make at least two observations before a decision
– Let’s also assume that the raw cut has been established as 34 out of 50
  • About 68% of the points in the test
A Performance Test Example 1
• With 5 observations (4 decision stages), a minimum of 2 observations, and a cut of 34:

Stage of Testing              Fail Below   Pass At or Above
1 (first two observations)    10           17
2 (third observation)         17           24
3 (fourth observation)        25           29
4 (fifth observation)         34           34
A Performance Test Example 2
• With 5 observations (4 decision stages), a minimum of 2 observations, and a cut of 34:

Stage of Testing              Fail Below   Pass At or Above
1 (first two observations)    10           21
2 (third observation)         17           31
3 (fourth observation)        25           41
4 (fifth observation)         34           34
A Performance Test Example 3
• With 5 observations (4 decision stages), a minimum of 2 observations, and a cut of 34:

Stage of Testing              Fail Below   Pass At or Above
1 (first two observations)    0            17
2 (third observation)         0            24
3 (fourth observation)        0            29
4 (fifth observation)         34           34
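The three cut tables can be compared by running the same candidates through each. A minimal sketch: the cuts are copied from Examples 1 to 3, while the two candidates and their observation scores are hypothetical.

```python
# One (fail below, pass at or above) pair per stage, copied from Examples 1-3.
EXAMPLES = {
    "Example 1": [(10, 17), (17, 24), (25, 29), (34, 34)],
    "Example 2": [(10, 21), (17, 31), (25, 41), (34, 34)],
    "Example 3": [(0, 17), (0, 24), (0, 29), (34, 34)],
}


def evaluate(observation_scores, cuts):
    """Return (decision, observations used) for a list of five observation scores."""
    for stage, (fail_below, pass_at) in enumerate(cuts):
        used = stage + 2  # stage 1 covers the first two observations
        total = sum(observation_scores[:used])
        if total < fail_below:
            return "fail", used
        if total >= pass_at:
            return "pass", used
    raise ValueError("the last stage must have equal fail and pass cuts")


if __name__ == "__main__":
    # Hypothetical candidates, each observation scored out of 10.
    candidates = {"strong": [9, 8, 7, 6, 8], "weak": [4, 5, 6, 7, 8]}
    for name, scores in candidates.items():
        for label, cuts in EXAMPLES.items():
            decision, used = evaluate(scores, cuts)
            print(f"{name} candidate, {label}: {decision} after {used} observations")
```

With these scores, Example 1 stops both candidates after two observations. Example 2's higher pass cuts keep the strong candidate under observation through all five, and Example 3's fail cuts of 0 do the same for the weak candidate, who cannot fail before the last stage.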
Discussion & Reactions
NCTRC Mission Statement
“To protect the consumer of Therapeutic Recreation Services by promoting the provision of quality services offered by NCTRC certificants”
NCTRC Profile
NCTRC was incorporated in 1981 as an independent nonprofit organization
• Internationally recognized credentialing body for therapeutic recreation
• Accredited in 1993 by National Commission for Certifying Agencies (NCCA)
• 15,000 member CTRS Registry
• Approximately 1200 exam candidates per year
NCTRC Board Exam Involvement
The NCTRC Board:
• Attended a demo of CMT at Prometric HQ
• Supported CMT because
– Limited exposure of item pool
– Overall cost effectiveness
– Better value to the candidate
• Participated in Cut Score Process
• Appointed Exam Management Committee
NCTRC Testing Program
• Began in 1990 with a 200-item written exam
• Based on Job Analysis of Certified Therapeutic Recreation Specialist (CTRS)
• Transformed in 2001 to a Linear Computer-Based Exam (200 items)
• Transformed in 2002 to a Computerized Mastery Exam
NCTRC Exam
• Administered three times each year
• 5-day testing window
• Conducted at Prometric Testing Centers across the US, Canada and Puerto Rico
• Offered to qualified candidates who have been granted professional eligibility by NCTRC
NCTRC Job Analysis
• Assures test specifications and the exam are related to the practice of Therapeutic Recreation
• Delineates the important tasks and knowledge deemed necessary for competent practice
• Job Tasks - practical experience
• Knowledge Areas - theoretical knowledge
• Conducted in 1987, 1997, and 2007
NCTRC EMC
Exam Management Committee Function:
• To monitor and make revisions to NCTRC’s testing procedures
• To work with and monitor the administration of NCTRC’s tests; such administration may be contracted out to private testing services
• To collect data necessary to periodically check for adverse impact or inadvertent bias
• To collect data necessary to demonstrate reliability and validity of the testing procedures
• To ensure reasonable accommodation of testing procedures for individuals with disabilities
NCTRC EMC
The responsibilities of the Exam Management Committee include:
• Item writing committee
• Item review committee
• Updating and maintaining exam reference list
• Review current items in operational pool for overlap and currency
• Job analysis
• Update and maintain practice tests
Customer Preparation
• Certification standards
• Conference workshops
• Exam content outline
• Practice exam
• Sample items
• Reference list
NCTRC CMT
• Base test consists of 90 multiple-choice items (87 minutes)
• 15 items are pre-test items that are not part of the score
• Depending on performance, a candidate can receive up to 6 additional testlets (14 minutes each)
• One testlet equals 15 items
• Each testlet is a mirror reflection of the Exam Content Outline (same proportions as full exam)
NCTRC Exam Content Outline
Content Areas                    Percent of Exam   No. of Test Items (per testlet)
Foundational Knowledge           33.3%             5
Practice of TR/RT                46.7%             7
Organization of TR/RT            13.3%             2
Advancement of the Profession    6.7%              1
Total                            100%              15
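The per-testlet item counts follow from applying the outline percentages to a 15-item testlet. A short check of that arithmetic, using the figures from the outline; the function name is hypothetical, and simple rounding happens to sum to 15 here but is not guaranteed to do so for other percentages.

```python
# Percent of exam per content area, from the Exam Content Outline.
CONTENT_OUTLINE = {
    "Foundational Knowledge": 33.3,
    "Practice of TR/RT": 46.7,
    "Organization of TR/RT": 13.3,
    "Advancement of the Profession": 6.7,
}
ITEMS_PER_TESTLET = 15


def items_per_area(outline, n_items):
    """Round each area's share of the testlet to a whole number of items."""
    return {area: round(pct / 100 * n_items) for area, pct in outline.items()}


if __name__ == "__main__":
    allocation = items_per_area(CONTENT_OUTLINE, ITEMS_PER_TESTLET)
    for area, count in allocation.items():
        print(f"{area}: {count}")
    print("Total:", sum(allocation.values()))
```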
NCTRC Exam Experience
• Immediate feedback
• Faster test time
• Candidate satisfaction
• Some dissatisfaction and confusion
• Less exposure to item pool via random assignment of testlets
• Computer-based exam a “plus” with candidates
• Positive feedback re: NCTRC prep material
NCTRC Special Accommodations
• NCTRC approval process
• Relatively large percent of special accommodations
• Advanced registration with designated reservation
• Ability to offer a wide range of accommodations
• CBT and CMT conducive to special needs
Speaker Contact Information
F. Jay Breyer, Ph.D.
Executive Director of Psychometric Consulting Services
Prometric
2000 Lenox Drive
Lawrenceville, NJ 08648
Speaker Contact Information
Bob Riley, Ph.D., CTRS
NCTRC Executive Director
7 Elmwood Dr
New City, NY 10956