Multiple Choice Test Item Analysis

Facilitator: Sophia Scott


Post on 21-Dec-2015


Page 1: Multiple Choice Test Item Analysis

Facilitator: Sophia Scott

Page 2: Workshop Format

1. What is Multiple Choice Test Item Analysis?

2. Background information

3. Fundamentals

4. Guided Practice

5. Individual Practice

Page 3: What is Multiple Choice Test Item Analysis?

Statistically analyzing your multiple choice test items so that you can ensure that your items are effectively evaluating student learning.

Page 4: Background information

• What does a test score mean?
• Reliability and Validity
• Norm-referenced or Criterion-referenced

Page 5: What does a Test Score Mean?

• A test score reflects what you really knew (true score) plus error (things like atmosphere, nerves, etc. that modify your true score).

• The purpose of a systematic approach to test design is to reduce error in test taking.

Page 6: Reliability and Validity

• Reliability – the test scores are consistent
– Test-retest reliability (an individual's score is consistent over time)
– Inter-rater reliability (consistency of individual judges’ ratings of a performance)

• Validity – the test measured what it was supposed to measure.

You want your test to be both reliable and valid.

Page 7: Norm-referenced or Criterion-referenced

• Norm-referenced – defines the performance of test-takers in relation to one another. Uses the frequency distribution and can rank students. Often used to predict success, as with the GRE or GMAT.

• Criterion-referenced – defines the performance of each test-taker without regard to the performance of others. Success means being able to perform a specific task or set of competencies. Uses a mastery curve.

Page 8: Item analysis

How you interpret the results of a test and use individual item statistics to improve the quality of a test

Terms used
– Standard deviation (SD) – how far scores spread above and below the average score; the more the scores are spread out, the higher the SD
– Mean – average score
– N – number of items on the test
– Raw scores – actual scores
– Variance = standard deviation squared
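As a small illustration of these terms, the sketch below computes the mean, variance, and SD of the ten raw scores listed in the Guided Practice table later in the workshop. It uses the population formulas (dividing by the number of scores); that choice is an assumption, since the slides do not say whether a sample formula is intended.

```python
import statistics

# Raw scores for students A-J, copied from the Guided Practice table.
raw_scores = [8, 6, 6, 4, 2, 8, 10, 6, 8, 4]

mean_score = statistics.mean(raw_scores)

# Population variance and SD (divide by the number of scores).
# Assumption: the slides do not specify population vs. sample formulas.
variance = statistics.pvariance(raw_scores)
std_dev = statistics.pstdev(raw_scores)

print(f"mean = {mean_score:.2f}")
print(f"variance = {variance:.2f}")
print(f"SD = {std_dev:.2f}")
```

Squaring the SD gives back the variance, which is the relationship stated in the last term above.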

Page 9: Fundamentals of Item Analysis

1. Were any of the items too difficult or easy?

2. Do the items discriminate between those students who really knew the material and those who did not?

3. What is the reliability of the exam?

Page 10: 1. Were any of the items too difficult or too easy?

• Use the Difficulty Factor of a question
– Proportion of respondents selecting the right answer to that item

D = c / n

D = difficulty factor
c = number of correct answers
n = number of respondents

• Range 0 to 1
• The HIGHER the difficulty factor, the easier the question is, so a value of 1 would mean all the students got the question correct and it may be too easy

Page 11: Difficulty Factor

• Optimal level is .5
• To be able to discriminate between different levels of achievement, the difficulty factor should be between .3 and .7
• If you want the students to master the topic area, high difficulty values should be expected.

D = c / n
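The formula and the .3 to .7 guideline can be sketched as two small functions. The function names and the wording of the returned labels are hypothetical, chosen only for illustration; the thresholds are the ones given on this slide.

```python
def difficulty_factor(correct: int, respondents: int) -> float:
    """D = c / n: proportion of respondents who answered the item correctly."""
    if respondents <= 0:
        raise ValueError("respondents must be a positive number")
    return correct / respondents


def interpret_difficulty(d: float) -> str:
    """Label a difficulty factor using the .3 to .7 guideline.

    The label wording is illustrative, not taken from the slides.
    """
    if d < 0.3:
        return "may be too difficult"
    if d > 0.7:
        return "may be too easy"
    return "within the .3 to .7 range"


# Example: 8 of 10 respondents answered correctly.
d = difficulty_factor(8, 10)
print(d, interpret_difficulty(d))
```

For a mastery-oriented (criterion-referenced) test, a high D is expected, so the "may be too easy" label would not necessarily signal a problem.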

Page 12: Guided Practice

What is the D for Items 1-3?

Student  Raw score  Item 1  Item 2  Item 3  Item 4  Item 5
A        8          a       b       a       d       e
B        6          c       b       e       c       e
C        6          a       c       e       c       b
D        4          a       b       e       a       c
E        2          c       a       b       d       c
F        8          a       b       c       c       e
G        10         a       b       a       c       e
H        6          a       b       c       d       e
I        8          a       c       a       c       e
J        4          a       c       a       d       b

Page 13: Difficulty Factor

• Item # 1 = .8
• Item # 2 = .6
• Item # 3 = .4

What does it mean?
• Item # 1 = .8 may be too easy
• Item # 2 = .6 good
• Item # 3 = .4 good
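The guided practice can be checked with a short script. The slides do not state the answer key, so the key used here for Items 1-3 (a, b, a) is an assumption: it is the set of options that reproduces the D values reported above.

```python
# Responses to Items 1-5 for students A-J, copied from the practice table.
responses = {
    "A": "abade", "B": "cbece", "C": "acecb", "D": "abeac", "E": "cabdc",
    "F": "abcce", "G": "abace", "H": "abcde", "I": "acace", "J": "acadb",
}

# ASSUMPTION: the answer key is not given in the slides. These options
# for Items 1-3 are inferred from the reported difficulty factors.
ASSUMED_KEY = "aba"


def difficulty(item_index: int) -> float:
    """D = c / n for one item (0-based index into ASSUMED_KEY)."""
    correct = sum(
        1 for answers in responses.values()
        if answers[item_index] == ASSUMED_KEY[item_index]
    )
    return correct / len(responses)


for i in range(len(ASSUMED_KEY)):
    print(f"Item # {i + 1}: D = {difficulty(i):.1f}")
```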

Page 14: Individual Practice

What is the D for Items 4-5?

Student  Raw score  Item 1  Item 2  Item 3  Item 4  Item 5
A        8          a       b       a       d       e
B        6          c       b       e       c       e
C        6          a       c       e       c       b
D        4          a       b       e       a       c
E        2          c       a       b       d       c
F        8          a       b       c       c       e
G        10         a       b       a       c       e
H        6          a       b       c       d       e
I        8          a       c       a       c       e
J        4          a       c       a       d       b

Page 15: Difficulty Factor

• Item # 4 = .5
• Item # 5 = .6

What does it mean?
• Item # 4 = .5 optimal
• Item # 5 = .6 good

Overall, you can say that only item #1 may be too easy

Page 16: 2. Do the items discriminate between those students who really knew the material and those who did not?

• The Discrimination Index
– DI = (a - b) / n
– a = response frequency of the High group
– b = response frequency of the Low group
– n = number of respondents

• Point-Biserial Correlation

Page 17: 2. Do the items discriminate between those students who really knew the material and those who did not?

• Correlates the test-takers’ performance on a single test item with their total score.
• Range +1.00 to -1.00
• Items which discriminate well are those which have difficulties between .3 and .7

Page 18: 2. Do the items discriminate between those students who really knew the material and those who did not?

• Positive coefficient means that test-takers who got the item right generally did well on the test as a whole, while those who got the item wrong generally did poorly on the test.

• Negative coefficient means that test-takers who did well on the test tended to miss the item, while those who did poorly tended to get the item right.

• Zero coefficient means that the item does not discriminate, for example when all test-takers got the item correct or all got it incorrect.
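A minimal sketch of the point-biserial correlation, computed as the ordinary Pearson correlation between a 0/1 item score and the raw test score. The data come from the practice table; the answer key (a, b, a, c, e) is not stated in the slides and is an assumption inferred from the reported difficulty factors.

```python
import math

# Responses to Items 1-5 and raw scores, copied from the practice table.
responses = {
    "A": "abade", "B": "cbece", "C": "acecb", "D": "abeac", "E": "cabdc",
    "F": "abcce", "G": "abace", "H": "abcde", "I": "acace", "J": "acadb",
}
raw_scores = {"A": 8, "B": 6, "C": 6, "D": 4, "E": 2,
              "F": 8, "G": 10, "H": 6, "I": 8, "J": 4}

# ASSUMPTION: the key is inferred, not given in the slides.
ASSUMED_KEY = "abace"


def point_biserial(item_index: int) -> float:
    """Pearson correlation between item correctness (0/1) and raw score."""
    students = list(responses)
    x = [1 if responses[s][item_index] == ASSUMED_KEY[item_index] else 0
         for s in students]
    y = [raw_scores[s] for s in students]
    n = len(students)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        # Everyone answered the same way (or scored the same). The
        # correlation is strictly undefined; report 0 as on the slide.
        return 0.0
    return cov / (spread_x * spread_y)


for i in range(len(ASSUMED_KEY)):
    print(f"Item # {i + 1}: r = {point_biserial(i):.2f}")
```

Because the item score is included in the total score, short tests tend to give somewhat inflated coefficients; some analyses correlate the item with the total of the remaining items instead.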

Page 19: 2. Do the items discriminate between those students who really knew the material and those who did not?

The Discrimination Index Steps

1. Rank test scores from highest to lowest, so the highest is at the top of the list

2. Define high group (top 27%)

3. Define low group (bottom 27%)

4. Calculate DI = (a - b) / n
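The four steps can be sketched as follows, using the practice table. Several choices here are assumptions rather than part of the slides: the answer key (a, b, a, c, e) is inferred from the reported difficulty factors, 27% of ten students is rounded to a group of three, ties at the group boundary are broken by table order, and n is taken as the size of one group (a common convention; the slides define n only as the number of respondents).

```python
# Responses to Items 1-5 and raw scores, copied from the practice table.
responses = {
    "A": "abade", "B": "cbece", "C": "acecb", "D": "abeac", "E": "cabdc",
    "F": "abcce", "G": "abace", "H": "abcde", "I": "acace", "J": "acadb",
}
raw_scores = {"A": 8, "B": 6, "C": 6, "D": 4, "E": 2,
              "F": 8, "G": 10, "H": 6, "I": 8, "J": 4}

# ASSUMPTION: the key is inferred, not given in the slides.
ASSUMED_KEY = "abace"


def discrimination_index(item_index: int, fraction: float = 0.27) -> float:
    """DI = (a - b) / n with n taken as the size of one group (assumption)."""
    # Step 1: rank students from highest to lowest raw score.
    ranked = sorted(responses, key=lambda s: raw_scores[s], reverse=True)
    # Steps 2-3: top and bottom groups, each about 27% of the students.
    group_size = max(1, round(fraction * len(ranked)))
    high, low = ranked[:group_size], ranked[-group_size:]

    def count_correct(group):
        return sum(1 for s in group
                   if responses[s][item_index] == ASSUMED_KEY[item_index])

    # Step 4: difference in correct responses, scaled by group size.
    return (count_correct(high) - count_correct(low)) / group_size


for i in range(len(ASSUMED_KEY)):
    print(f"Item # {i + 1}: DI = {discrimination_index(i):.2f}")
```

Dividing by the total number of respondents instead of the group size would only rescale the index; the ranking of items by DI stays the same.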

Page 20: What does it mean?

Point Biserial
• Item # 1 = .48
• Item # 2 = .43
• Item # 3 = .47
• Item # 4 = .62
• Item # 5 = .83

Item 5 is close to not discriminating

Overall the test does discriminate

Page 21: 3. What is the reliability of the exam?

1. Kuder-Richardson 20

2. Kuder-Richardson 21

3. Cronbach alpha

Page 22: 3. What is the reliability of the exam?

• Range 0 to 1
• Higher value indicates a strong relationship between items and test
• Lower value indicates a weaker relationship between test item and test

r = [n / (n - 1)] × [(s² - Σ p_i q_i) / s²]

n = number of items on test
s = standard deviation of the total test scores
p_i = proportion of correct responses to item i
q_i = 1 - p_i
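A minimal sketch of the Kuder-Richardson 20 calculation in its standard form, r = [n / (n - 1)] × (1 - Σpq / s²), applied to the five items of the practice table. The answer key (a, b, a, c, e) is an assumption inferred from the reported difficulty factors, total scores are counted as number of items correct rather than taken from the raw-score column, and the population variance is used; the slides specify none of these choices.

```python
import statistics

# Responses to Items 1-5 for students A-J, copied from the practice table.
responses = {
    "A": "abade", "B": "cbece", "C": "acecb", "D": "abeac", "E": "cabdc",
    "F": "abcce", "G": "abace", "H": "abcde", "I": "acace", "J": "acadb",
}

# ASSUMPTION: the key is inferred, not given in the slides.
ASSUMED_KEY = "abace"

# One row per student, one 0/1 entry per item.
item_scores = [
    [1 if answer == key else 0 for answer, key in zip(answers, ASSUMED_KEY)]
    for answers in responses.values()
]


def kr20(scores):
    """KR-20 reliability for a matrix of 0/1 item scores."""
    n_students = len(scores)
    n_items = len(scores[0])
    totals = [sum(row) for row in scores]
    # s^2: population variance of the total scores (assumption).
    variance = statistics.pvariance(totals)
    sum_pq = 0.0
    for i in range(n_items):
        p = sum(row[i] for row in scores) / n_students  # proportion correct
        sum_pq += p * (1 - p)                           # q = 1 - p
    return (n_items / (n_items - 1)) * (1 - sum_pq / variance)


reliability = kr20(item_scores)
print(f"KR-20 = {reliability:.2f}")
```

KR-20 is a single value for the whole test. A per-item figure is usually obtained by recomputing it with that item removed.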

Page 23: What does it mean?

Kuder 20
• Item # 1 = .88
• Item # 2 = .63
• Item # 3 = .40
• Item # 4 = .76
• Item # 5 = .89

Item 3 may not relate as well

Overall the test is reliable

Page 24: Review

Purpose - statistically analyze multiple choice test items to ensure items are effectively evaluating student learning.

1. Were any of the items too difficult or easy? (Difficulty index)

2. Do the items discriminate between those students who really knew the material and those who did not? (Discrimination index or Point Biserial)

3. What is the reliability of the exam? (Kuder 20)

Page 25: More Practice…

Item  Difficulty  Discrimination  Reliability
# 1   .28         .40             .80
# 2   .30         .68             .76
# 3   .80         .78             .70
# 4   .10         -1.00           .20

Page 26: Thank you for your Time

Any Questions or Comments?