Caveon Webinar Series: Applying Information Integration Theory to Setting Cutscores and Other Tasks
David Foster, CEO, Caveon
August 20, 2014

Upload: caveon-test-security

Post on 28-Jun-2015


DESCRIPTION

From Dr. David Foster, Caveon CEO

Have you ever felt the angst, doubt, and concern that come from using current methods for setting cutscores? Well, I have, and that's why I am presenting this month's session of the Caveon Webinar Series.

This month's webinar presents a promising new method for helping to make pass/fail decisions. Borrowed from Cognitive Science, Information Integration Theory (IIT) is a quantitative method for comparing human rater judgments. It is a method that adds a scientific foundation to the way we determine who's qualified and at what level. Standard setting using IIT is based on well-established, researched principles that explain and predict how we combine information in our brains in order to form consistent judgments. Since setting cutscores today is all about rater judgments, these methods should provide us with a quantitative basis for better establishing and evaluating the outcomes of our cutscore setting efforts.

By attending this informative session, you'll have the chance to:
• Participate in an actual "hands-on" (or more appropriately "brains-on") live pilot test of the methodology
• Learn the advantages of cut score setting using IIT
• Discover how the method may help in other routine psychometric analysis tasks that involve judgment (e.g., gender bias and content alignment reviews)
• Better understand the concepts behind using this new method for setting cutscores
• Use a software tool built on this methodology for calculating cut scores on your next test

TRANSCRIPT

Page 1: Caveon webinar series    Standard Setting for the 21st Century, Using Information Integration Theory to Produce Cut Scores - August 2014

Caveon Webinar Series:

Applying Information Integration Theory to

Setting Cutscores and Other Tasks

David Foster, CEO, Caveon

August 20, 2014

Page 2:

My Personal Issues with Current Cutscore Methods

1. There are too many methods/variations, perhaps hundreds. Why is that?

2. The cutscore point seems almost pre-determined.

3. The methods try to direct and conform judgments (e.g., adding item statistics).

4. There is no check on the consistency and quality of the judgments made.

5. The rating task is difficult to do.

6. There is a lack of confidence in the cutscore.

Page 3:

So, What Is the Point?

• Why propose another method of setting cutscores?
– To perhaps solve many of the issues above
– For added value: IIT can apply to other “judgment” tasks in testing

• Introducing Information Integration Theory or IIT, borrowed from the Cognitive Sciences
– 50+ years of theoretical and scientific support

Page 4:

Reference Material

• Contributions to Information Integration Theory Volume I: Cognition. Edited by Norman H. Anderson (2009).

• Foundations of Information Integration Theory by Norman H. Anderson (1981).

• Methods of Information Integration Theory by Norman H. Anderson (1982).

Page 5:

IIT: How Is Information Integrated?

3 Fruits

2 Dips

Page 6:

Poll

Of the 6 given, which is your most preferred combination of a dip and a fruit?
• Chocolate and strawberry
• Chocolate and apple slice
• Chocolate and orange slice
• Caramel and strawberry
• Caramel and apple slice
• Caramel and orange slice

Page 7:

Poll Results

• From the poll data:
– There are differences in your top choice, which is normal for food preference ratings
– MORE IMPORTANTLY, you were able to combine or integrate the information quickly, imagine the taste of the combinations, rate the combinations, and make your top pick

Page 8:

Much of What We Do Is Integrating Information and Making Judgments

• Choosing a vacation place
• Buying a car
• Leaving a job for a better one
• Choosing a mate
• Voting
• Picking foods to eat
• …and everything else we do

We are constantly integrating various pieces of information, then judging, rating, and eventually deciding and acting based on the integrated value.

How we do the cognitive part of these tasks is explained by IIT.

Page 9:

Schematic of IIT

Source: Wikipedia

Basic Cognitive Algebra Models: ADDITIVE AND MULTIPLICATIVE

Not Directly Observable

+?x?
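The two basic models named on this slide can be written out as a short worked derivation. This is a sketch under assumed notation, not the notation of the slide: R_{ij} stands for the observed rating of the combination of level i of one stimulus factor A and level j of a second factor B, and s_{A_i} and s_{B_j} stand for the unobservable subjective values of those stimuli.

```latex
\begin{align*}
\text{Additive:} \quad
  R_{ij} &= s_{A_i} + s_{B_j}
  &\Rightarrow\quad R_{ij} - R_{i'j} &= s_{A_i} - s_{A_{i'}} \\
\text{Multiplicative:} \quad
  R_{ij} &= s_{A_i} \cdot s_{B_j}
  &\Rightarrow\quad R_{ij} - R_{i'j} &= \left(s_{A_i} - s_{A_{i'}}\right) s_{B_j}
\end{align*}
```

Under the additive form, the gap between two levels of A does not depend on j, so the plotted lines stay a constant distance apart. Under the multiplicative form, the gap grows in proportion to s_{B_j}, so the lines spread apart.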

Page 10:

Cognitive Algebra: ADDITIVE MODEL Examples

• Individuals are adding the stimuli before judging

• Produces parallelism when charted

Statesmanship rated after reading two biographical paragraphs

Cookie size evaluated by 5-year-olds given length and width

Page 11:

Value of a lottery ticket given odds of winning and value of the ticket

Cognitive Algebra: MULTIPLICATIVE MODEL Examples

• Individuals are multiplying the stimuli before judging

• Produces linear fan when charted

Rating of likeableness given adjective and adverb
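The parallelism and linear-fan patterns described on these two slides can be illustrated numerically. The following is a minimal sketch, assuming a two-factor design with made-up scale values; the function name `row_gaps` and all numbers are hypothetical and are not taken from the webinar data.

```python
def row_gaps(table):
    """Gap between the first two rows of a factorial table, column by column."""
    top, bottom = table[0], table[1]
    return [t - b for t, b in zip(top, bottom)]


# Hypothetical subjective scale values for the two stimulus factors.
row_values = [4, 2]
col_values = [1, 2, 3]

# Each cell is the predicted judgment for one row/column combination.
additive = [[r + c for c in col_values] for r in row_values]
multiplicative = [[r * c for c in col_values] for r in row_values]

# Constant gaps mean the two lines are parallel when charted.
additive_gaps = row_gaps(additive)

# Gaps that grow in step with the column values mean the lines fan out.
multiplicative_gaps = row_gaps(multiplicative)

print(additive_gaps, multiplicative_gaps)
```

With real rating data the gaps would only be approximately constant or approximately proportional, which is why the patterns are judged from charts and then tested statistically.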

Page 12:

Not Just Humans

Research I conducted in 1976 using pigeons

Information integrated:
• Type of food
• Amount of work to obtain the food

Page 13:

Individual Pigeon Results

Page 14:

Mid-Webinar Summary of IIT Benefits for Judgment Tasks in Testing

• Easy visual evaluation of overall ratings and individual raters

• Better understanding of the judgment process

• Production of results (e.g., item difficulty ratings) on interval-level scales

• Quantitative comparison of performance levels

• Practical benefits: Quicker, easier, less expensive

Page 15:

Item Judgment Exercise

You were asked to go to a Caveon site and to provide a rating of the difficulty of 3 math questions for students who had completed the 2nd and 10th grades.

Information that was integrated:
A. Test item content (3 items)
B. Student performance level (2 grade levels)

Page 16:

Example Individual Rater #12

Parallelism? Additive Model?

Page 17:

Example Individual Rater #2

Parallelism? Additive Model?

Page 18:

Example Individual Rater #6

Parallelism? Additive Model?

Page 19:

Evaluation of Individual Raters

Here are the results for Rater #21, who either didn’t try, didn’t understand the task, or simply answered randomly.

His results were removed from the analysis.

Page 20:

Graphical Results for IIT Data

N = 47

Multiplicative Model!

Page 21:

ANOVA Results for IIT Data

Factors | F Score | Probability
Items | 208.48 | 6.70 × 10^-35
Proficiency Levels (Grades) | 483.97 | 4.71 × 10^-26
Items x Proficiency Levels | 26.93 | 6.21 × 10^-10

Confirms the multiplicative model

Page 22:

Hewlett-Packard Certification Exam

Unqualified

Ideally Qualified

Highly Qualified

Page 23:

Excelsior College Exam

Highly Competent

Competent

Marginally Competent

Weak

Page 24:

So, What Can We Do with These Results?

Whether the model is ADDITIVE or MULTIPLICATIVE, interpreting the results is the same:

1. A model is confirmed.

2. Raters performed the task consistently and properly.

3. Marginal means of item ratings can be used as difficulty estimates on an interval scale.

4. Marginal means of performance level ratings can be used for setting cutscores or other purposes.
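The marginal means mentioned in this list can be sketched in a few lines. This is an illustration only, assuming the ratings are stored as `ratings[rater][item][level]` on the 0 to 20 difficulty scale; the data values and function names are hypothetical and are not the webinar's data or software.

```python
from statistics import mean

# Hypothetical ratings for 2 raters, 3 items, and 2 performance levels.
# ratings[rater][item][level]; higher numbers mean "more difficult".
ratings = [
    [[12, 4], [16, 6], [20, 8]],
    [[10, 2], [14, 4], [18, 6]],
]


def item_marginal_means(ratings):
    """Average each item over all raters and all performance levels."""
    n_items = len(ratings[0])
    return [
        mean(value for rater in ratings for value in rater[i])
        for i in range(n_items)
    ]


def level_marginal_means(ratings):
    """Average each performance level over all raters and all items."""
    n_levels = len(ratings[0][0])
    return [
        mean(item[k] for rater in ratings for item in rater)
        for k in range(n_levels)
    ]


item_means = item_marginal_means(ratings)    # difficulty estimate per item
level_means = level_marginal_means(ratings)  # one value per performance level

print(item_means, level_means)
```

In this layout the item means play the role of the difficulty estimates in point 3, and the level means are the values point 4 proposes using for cutscores.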

Page 25:

How to Set a Cutscore Using IIT

At this point, the process is not very different from what occurs with other methods.

It is always a challenge to get from ratings or judgment data to a corresponding value on the score scale.

Page 26:

Use Mean Ratings of Items for Each Proficiency Level

• 2nd Grade = 4.95
– Average Difficulty Rating of 15.05
– Subtract from 20 to reverse the scale

• 10th Grade = 15.47
– Average Difficulty Rating of 4.53
– Subtract from 20 to reverse the scale

Remember that these are cutscores based on the IIT rating scale of 0 - 20
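The scale reversal on this slide is simple arithmetic and can be checked directly. A minimal sketch follows, using the two average difficulty ratings shown above; the function name `rating_to_iit_cutscore` is hypothetical, and rounding to two decimals is an assumption made to match the precision on the slide.

```python
SCALE_MAX = 20  # top of the 0-20 IIT rating scale stated on the slide


def rating_to_iit_cutscore(avg_difficulty_rating, scale_max=SCALE_MAX):
    """Reverse the difficulty scale by subtracting the rating from the maximum."""
    return round(scale_max - avg_difficulty_rating, 2)


second_grade_cut = rating_to_iit_cutscore(15.05)
tenth_grade_cut = rating_to_iit_cutscore(4.53)

print(second_grade_cut, tenth_grade_cut)
```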

Page 27:

Graphical Display of IIT Cutscores

Cutscore for 2nd Grade = 4.95 (20 - avg rating of 15.05)

Cutscore for 10th Grade = 15.47 (20 - avg rating of 4.53)

Page 28:

One Conceptual Process for Converting IIT Ratings to a Score Scale

For a particular IIT ratings-based cutscore, how many items (or what % of items) have IIT difficulty ratings below that IIT cutscore?

That number (or %) becomes an equivalent cutscore on the score scale.

There will likely need to be some adjustments for error.
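The counting step described on this slide can be sketched as follows. This is only one possible reading of the conceptual process: the item difficulty values below are invented for illustration, the function names are hypothetical, and no adjustment for error is attempted.

```python
def items_below(item_difficulties, iit_cutscore):
    """Count the items whose IIT difficulty rating falls below the IIT cutscore."""
    return sum(1 for d in item_difficulties if d < iit_cutscore)


def percent_below(item_difficulties, iit_cutscore):
    """Express the same count as a percentage of all items."""
    return 100 * items_below(item_difficulties, iit_cutscore) / len(item_difficulties)


# Hypothetical mean IIT difficulty ratings for a 10-item test (0-20 scale).
item_difficulties = [2.0, 4.5, 7.5, 11.0, 14.0, 15.0, 16.5, 18.0, 19.0, 19.5]

# IIT cutscores from the earlier slide: 4.95 (2nd grade) and 15.47 (10th grade).
second_grade_items = items_below(item_difficulties, 4.95)
tenth_grade_items = items_below(item_difficulties, 15.47)
tenth_grade_percent = percent_below(item_difficulties, 15.47)

print(second_grade_items, tenth_grade_items, tenth_grade_percent)
```

The count (or percentage) returned for each IIT cutscore is the candidate cutscore on the test's own score scale, before any adjustment for error.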

Page 29:

Converting IIT Ratings to Score Scale: Number of Items

Pretend we have 100 items instead of only 3, and this graph is a cumulative frequency distribution of those items and mean ratings.

[Figure: cumulative frequency distribution. X-axis: Mean Ratings. Y-axis: Number of Items. Marked point: 10th Grade, mean rating 15.46, 80 items.]

Page 30:

Converting IIT Ratings to Score Scale: Number of Items

Pretend we have 100 items instead of only 3.

[Figure: cumulative frequency distribution. X-axis: Mean Ratings. Y-axis: Number of Items. Marked point: 2nd Grade, mean rating 4.95, 7 items.]

Page 31:

Other Applications of IIT in Testing

• Besides determining cutscores, where else do we require ratings or judgments?
– Item accuracy reviews
– Essay scoring
– Bias reviews (gender, race, age, etc.)
– Item quality (e.g., alignment with objectives)
– Others?

Page 32:

Thank you!

Follow Caveon on Twitter @caveon

Check out our blog www.caveon.com/blog

LinkedIn Group “Caveon Test Security”

Dr. David Foster
CEO, Caveon Test

[email protected]