
Upload: denis-walton

Post on 12-Jan-2016


Chapter 7 Item Analysis

In constructing a new test (or shortening or lengthening an existing one), the final set of items is usually identified through a process known as item analysis.

—Linda Croker

Both the validity and the reliability of any test depend ultimately on the characteristics of its items.

Two Approaches to Item Analysis

• Qualitative Analysis

• Quantitative Analysis

• Qualitative Analysis

includes the consideration of content validity (content and form of items), as well as the evaluation of items in terms of effective item-writing procedures.

• Quantitative Analysis

includes principally the measurement of item difficulty and item discrimination.

§1 Item Difficulty

1. Definition

The item difficulty for item i, $p_i$, is defined as the proportion of examinees who get that item correct.

Though the proportion of examinees passing an item traditionally has been called the item difficulty, this proportion logically should be called item easiness, because the proportion increases as the item becomes easier.

2. Estimation Methods

• Method for Dichotomously Scored Items

• Method for Polytomously Scored Items

• Grouping Method

• Method for Dichotomously Scored Items

$$P = \frac{R}{N} \tag{7.1}$$

P is the difficulty of a certain item.

R is the number of examinees who get that item correct.

N is the total number of examinees.

Example 1

Eighty high school students take a science achievement test; 61 students pass item 1 and 32 students pass item 10. Calculate the difficulty of item 1 and of item 10.
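Example 1 can be worked directly from Formula 7.1. The following is a minimal Python sketch; the function name `item_difficulty` is illustrative and does not come from the slides.

```python
def item_difficulty(num_correct: int, num_examinees: int) -> float:
    """Formula 7.1: P = R / N for a dichotomously scored item."""
    if num_examinees <= 0:
        raise ValueError("num_examinees must be positive")
    return num_correct / num_examinees


# Example 1: 80 examinees, 61 pass item 1 and 32 pass item 10.
p_item1 = item_difficulty(61, 80)
p_item10 = item_difficulty(32, 80)

print(f"item 1:  P = {p_item1:.2f}")   # item 1:  P = 0.76
print(f"item 10: P = {p_item10:.2f}")  # item 10: P = 0.40
```

Item 10 has the smaller P, so it is the harder of the two items.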

• Method for Polytomously Scored Items

$$P = \frac{\bar{X}}{X_{\max}} \tag{7.2}$$

$\bar{X}$ is the mean of all examinees' scores on the item.

$X_{\max}$ is the perfect (maximum) score for that item.

Example 2

The perfect score for an open-ended item is 20 points, and the average score of all examinees on this item is 11 points. What is the item difficulty?

Key: .55
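Substituting the two given values into Formula 7.2 reproduces the key:

$$P = \frac{\bar{X}}{X_{\max}} = \frac{11}{20} = .55$$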

• Grouping Method (Use of Extreme Groups)

Upper (U) and lower (L) criterion groups are selected from the extremes of the distribution of test scores or job ratings.

T. L. Kelley (1939) showed that using the upper and lower 27% is optimal when the total test scores are normally distributed.

$$P = \frac{P_U + P_L}{2} \tag{7.3}$$

$P_U$ is the proportion of examinees in the upper group who get the item correct.

$P_L$ is the proportion of examinees in the lower group who get the item correct.

Example 3

A language test is taken by 370 examinees. In the upper 27% extreme group, 64 examinees pass item 5; in the lower 27% extreme group, 33 examinees pass the same item. Compute the difficulty of item 5.

Key: .49
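The key for Example 3 can be checked with a short Python sketch of Formula 7.3. Rounding 27% of 370 (99.9) to 100 examinees per group is an assumption, since the slide does not state the group size, and the function names are illustrative.

```python
def extreme_group_size(num_examinees: int, fraction: float = 0.27) -> int:
    """Size of each extreme group, rounded to a whole examinee (assumed rule)."""
    return round(num_examinees * fraction)


def grouped_difficulty(correct_upper: int, correct_lower: int, group_size: int) -> float:
    """Formula 7.3: P = (P_U + P_L) / 2."""
    p_upper = correct_upper / group_size
    p_lower = correct_lower / group_size
    return (p_upper + p_lower) / 2


group_size = extreme_group_size(370)          # 0.27 * 370 = 99.9, taken as 100
p_item5 = grouped_difficulty(64, 33, group_size)
print(group_size, p_item5)
```

With 100 examinees per group, P_U = .64 and P_L = .33, so P = .485, which the key reports as .49.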

3. Correcting for Chance Effects on Item Difficulty for Multiple-Choice Items

$$CP = \frac{KP - 1}{K - 1} \tag{7.4}$$

CP is the corrected item difficulty.

P is the uncorrected item difficulty.

K is the number of choices for that item.

Example 4

The difficulty of one five-choice item is .50, and the difficulty of another four-choice item is .53. Which item is more difficult?

ANSWER

$$CP_1 = \frac{KP - 1}{K - 1} = \frac{5 \times .50 - 1}{5 - 1} = .38$$

$$CP_2 = \frac{KP - 1}{K - 1} = \frac{4 \times .53 - 1}{4 - 1} = .37$$

So, the four-choice item is more difficult.
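The comparison in Example 4 can also be sketched in Python. The function name `corrected_difficulty` is illustrative; the formula is 7.4 as given above.

```python
def corrected_difficulty(p: float, num_choices: int) -> float:
    """Formula 7.4: CP = (K * P - 1) / (K - 1)."""
    if num_choices < 2:
        raise ValueError("a multiple-choice item needs at least two choices")
    return (num_choices * p - 1) / (num_choices - 1)


cp_five = corrected_difficulty(0.50, 5)   # (2.50 - 1) / 4 = 0.375
cp_four = corrected_difficulty(0.53, 4)   # (2.12 - 1) / 3, about 0.373

# The item with the smaller corrected proportion is the harder one.
harder = "four-choice" if cp_four < cp_five else "five-choice"
print(f"{cp_five:.3f} {cp_four:.3f} -> {harder} item is harder")
```

Before correction the four-choice item looks easier (.53 against .50); after removing the larger share of lucky guesses that four choices allow, it turns out to be slightly harder.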

4. Item Difficulty and Discrimination

[Figure: discrimination plotted against difficulty]

If there are 100 persons in a population, the number of discriminations an item can make (each person who passes is distinguished from each person who fails) can be calculated as follows:

P = .01, 1 × 99 = 99

P = .02, 2 × 98 = 196

P = .30, 30 × 70 = 2100

P = .50, 50 × 50 = 2500
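The four products above follow one rule: the number of examinees who pass times the number who fail. A small Python sketch (the function name is illustrative) reproduces them and shows that the count peaks at P = .50:

```python
def num_discriminations(p: float, population: int = 100) -> int:
    """Pass/fail pairs an item separates: (number passing) * (number failing)."""
    passed = round(p * population)
    failed = population - passed
    return passed * failed


for p in (0.01, 0.02, 0.30, 0.50):
    print(f"P = {p:.2f}: {num_discriminations(p)}")

# Scan every whole-percent difficulty to find where the count is largest.
best_p = max((k / 100 for k in range(101)), key=num_discriminations)
print("maximum at P =", best_p)
```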

5. Test Difficulty and the Distribution of Test Scores

• How to Calculate the Test Difficulty?

Two Methods

A. Calculate the mean of all item difficulties of the test.

B. Compute the ratio of the mean test score to the perfect (maximum) test score.

• Test Difficulty and the Distribution of Test Scores

(a) Positively Skewed Distribution

(b) Negatively Skewed Distribution

§2 Item Discrimination

When the test as a whole is to be evaluated by means of criterion-related validation, the items may themselves be evaluated and selected on the basis of their relationships to the external criterion.

When we identify an item that high-scoring examinees have a high probability of answering correctly and low-scoring examinees have a low probability of answering correctly, we say that the item discriminates, or differentiates, among the examinees.

1. Interpretation

Item discrimination refers to the degree to which an item differentiates correctly among test takers in the behavior that the test is designed to measure.

2. Estimation Methods

• Index of Discrimination (used for dichotomously scored items)

$$D = P_H - P_L \tag{7.5}$$

We need to set one or two cutting scores to divide the examinees into an upper-scoring group and a lower-scoring group.

$P_H$ is the proportion in the upper group who answer the item correctly, and $P_L$ is the proportion in the lower group who answer the item correctly.

Values of D may range from -1.00 to 1.00.

Example 1

A world history test is taken by 140 students.

(1) If we use the 27% rule to determine the upper and lower groups, how many examinees are there in each group?

(2) If 18 examinees in the upper group and 6 examinees in the lower group answer item 5 correctly, calculate the discrimination index for item 5.
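Both parts of this example can be worked with a few lines of Python. This is a sketch under one assumption: 27% of 140 is 37.8, which is rounded here to 38 examinees per group. The function name is illustrative.

```python
def discrimination_index(correct_upper: int, correct_lower: int, group_size: int) -> float:
    """Formula 7.5: D = P_H - P_L, with equal-sized upper and lower groups."""
    return correct_upper / group_size - correct_lower / group_size


num_examinees = 140
group_size = round(num_examinees * 0.27)   # 37.8, taken here as 38 examinees
d_item5 = discrimination_index(18, 6, group_size)

print(group_size, round(d_item5, 2))       # 38 0.32
```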

Example 2

Test data of 50 examinees on an 8-item scale about job stress.

Item   1     2     3     4     5     6     7     8
PH    .54   .81   .47   .32   .51   .18   .63   .56
PL    .32   .56   .11   .05   .10   .23   .25   .19
D     .18   .25   .36   .27   .41  -.05   .38   .37

Guidelines for Interpretation of D Value

D ≥ .40, the item is functioning quite satisfactorily

.30 ≤ D ≤ .39, little or no revision is required

.20 ≤ D ≤ .29, the item is marginal and needs revision

D ≤ .19, the item should be eliminated or completely revised
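As an illustration, the guidelines can be applied to the eight job-stress items of Example 2. The sketch below uses the D values as printed in that table and also recomputes D = PH - PL as a consistency check; the function name and label strings are illustrative.

```python
def interpret_d(d: float) -> str:
    """Map a discrimination index to the guideline categories above."""
    d = round(d, 2)
    if d >= 0.40:
        return "satisfactory"
    if d >= 0.30:
        return "little or no revision"
    if d >= 0.20:
        return "marginal, needs revision"
    return "eliminate or completely revise"


# Values copied from the Example 2 table (items 1 to 8).
p_high = [0.54, 0.81, 0.47, 0.32, 0.51, 0.18, 0.63, 0.56]
p_low = [0.32, 0.56, 0.11, 0.05, 0.10, 0.23, 0.25, 0.19]
d_printed = [0.18, 0.25, 0.36, 0.27, 0.41, -0.05, 0.38, 0.37]

labels = {item: interpret_d(d) for item, d in enumerate(d_printed, start=1)}

# Items whose printed D differs from PH - PL computed from the same table.
mismatched = [
    item
    for item, (ph, pl, d) in enumerate(zip(p_high, p_low, d_printed), start=1)
    if round(ph - pl, 2) != d
]

for item, label in labels.items():
    print(item, label)
print("printed D differs from PH - PL for items:", mismatched)
```

For item 1 the printed D (.18) does not equal PH - PL (.54 - .32 = .22). The source gives no way to tell which of the three values is the misprint, and the classification changes with it: .18 falls in the "eliminate" band, while .22 would be "marginal".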

• Correlation Indices of Item Discrimination

(1) Pearson Product Moment Correlation Coefficient

This formula is commonly used to estimate the degree of the relationship between item scores and criterion scores.

$$r_{XY} = \frac{\sum xy}{N s_X s_Y}$$

Here $x = X - \bar{X}$ and $y = Y - \bar{Y}$ are deviation scores, and $s_X$ and $s_Y$ are the standard deviations of X and Y.

(2) Point Biserial Correlation

If we use the total test score as the criterion, and the test item is scored 0 or 1, then we can use the following formula:

$$r_{pbi} = \frac{\bar{X}_p - \bar{X}_t}{s_t} \sqrt{p/q} \tag{7.6}$$

$\bar{X}_p$ is the mean test score of those who answer the item correctly.

$\bar{X}_t$ is the mean test score of the entire group.

$s_t$ is the standard deviation of test scores for the entire group.

p is the pass ratio of that item (difficulty).

q is the fail ratio of that item.

Example 3

Test data of 15 examinees

Examinee     1   2   3   4   5   6   7   8   9  10  11  12  13  14  15
Test score  90  81  80  78  77  70  69  65  55  50  49  42  35  31  10
Item score   1   0   1   1   1   1   1   0   0   0   1   0   1   0   0

Note: $s_t = \sqrt{\dfrac{\sum (X - \bar{X})^2}{n}}$

$$\bar{X}_t = \frac{90 + 81 + 80 + \cdots + 10}{15} = 58.80$$

$$s_t = \sqrt{\frac{58936 - 882^2 / 15}{15}} = 21.72$$

$$r_{pbi} = \frac{68.5 - 58.80}{21.72} \sqrt{.5333 / .4667} = .48$$
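The Example 3 calculation can be reproduced in Python from the raw data. The sketch below implements Formula 7.6 and, as a cross-check, the deviation-score Pearson formula applied to the same 0/1 item scores; the function names are illustrative.

```python
import math

test_scores = [90, 81, 80, 78, 77, 70, 69, 65, 55, 50, 49, 42, 35, 31, 10]
item_scores = [1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0]


def point_biserial(item, total):
    """Formula 7.6, with s_t computed by dividing by n (as in the note)."""
    n = len(total)
    mean_t = sum(total) / n
    s_t = math.sqrt(sum((x - mean_t) ** 2 for x in total) / n)
    passed = [x for x, u in zip(total, item) if u == 1]
    p = len(passed) / n
    q = 1 - p
    mean_p = sum(passed) / len(passed)
    return (mean_p - mean_t) / s_t * math.sqrt(p / q)


def pearson(xs, ys):
    """Deviation-score Pearson r: sum(xy) / (N * s_X * s_Y)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sum_xy / (n * s_x * s_y)


r_pbi = point_biserial(item_scores, test_scores)
r_xy = pearson(item_scores, test_scores)
print(f"r_pbi = {r_pbi:.2f}, Pearson r = {r_xy:.2f}")   # both 0.48
```

The two values coincide because the point biserial is the Pearson coefficient for the special case in which one variable is scored 0 or 1.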

Transformation of Formula 7.6

$$r_{pbi} = \frac{\bar{X}_p - \bar{X}_q}{s_t} \sqrt{pq} \tag{7.7}$$

$\bar{X}_q$ is the mean test score of those who answer that item incorrectly.
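One way to see that Formulas 7.6 and 7.7 agree is to write the overall mean as a weighted average of the two subgroup means and use p + q = 1:

$$\bar{X}_t = p\bar{X}_p + q\bar{X}_q \quad\Rightarrow\quad \bar{X}_p - \bar{X}_t = q\,(\bar{X}_p - \bar{X}_q)$$

$$r_{pbi} = \frac{q\,(\bar{X}_p - \bar{X}_q)}{s_t}\sqrt{\frac{p}{q}} = \frac{\bar{X}_p - \bar{X}_q}{s_t}\sqrt{pq}$$

For the Example 3 data, the seven examinees who miss the item have a mean of 334/7 ≈ 47.71, so Formula 7.7 gives

$$r_{pbi} = \frac{68.5 - 47.71}{21.72}\sqrt{.5333 \times .4667} \approx .48$$

the same value obtained from Formula 7.6.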

(3) Biserial Correlation Coefficient

$$r_b = \frac{\bar{X}_p - \bar{X}_t}{s_t} \cdot \frac{p}{Y}$$

or

$$r_b = \frac{\bar{X}_p - \bar{X}_q}{s_t} \cdot \frac{pq}{Y}$$

Y is the ordinate (height) of the standard normal curve at the point that divides the area under the curve into the proportions p and q.
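As a sketch of how the biserial coefficient might be computed, the Python code below applies the first form of the formula to the 15-examinee data of Example 3. The ordinate Y is obtained from the standard library's `statistics.NormalDist`; reusing the Example 3 data here is a choice made for illustration, not something the slides do.

```python
import math
from statistics import NormalDist

test_scores = [90, 81, 80, 78, 77, 70, 69, 65, 55, 50, 49, 42, 35, 31, 10]
item_scores = [1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0]


def biserial(item, total):
    """r_b = (mean_p - mean_t) / s_t * p / Y."""
    n = len(total)
    mean_t = sum(total) / n
    s_t = math.sqrt(sum((x - mean_t) ** 2 for x in total) / n)
    passed = [x for x, u in zip(total, item) if u == 1]
    p = len(passed) / n
    mean_p = sum(passed) / len(passed)
    std_normal = NormalDist()              # mean 0, standard deviation 1
    z_cut = std_normal.inv_cdf(p)          # point splitting the area into p and q
    y = std_normal.pdf(z_cut)              # ordinate of the curve at that point
    return (mean_p - mean_t) / s_t * p / y


r_b = biserial(item_scores, test_scores)
print(f"r_b = {r_b:.2f}")
```

For these data r_b comes out near .60, larger than the point biserial of .48 obtained from the same scores. That is the expected direction: for the same data the biserial coefficient is always larger in absolute value than the point biserial.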

(4) Correlation Between Items

a) Tetrachoric Correlation Coefficient

Each variable is created through dichotomizing an underlying normal distribution.

$$r_t = \cos\left(\frac{\sqrt{AD}}{\sqrt{AD} + \sqrt{BC}} \times 180^\circ\right) \tag{7.8}$$

                 Item i
                 0      1
Item j     1     A      B      A+B
           0     C      D      C+D
                A+C    B+D

b) Phi Coefficient

$$r_\phi = \frac{BC - AD}{\sqrt{(A+B)(C+D)(A+C)(B+D)}} \tag{7.9}$$
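Both inter-item coefficients depend only on the four cell counts. The Python sketch below assumes the cell layout of the 2 × 2 table above, in which B and C hold the examinees who score the same on both items and A and D hold those who score differently. The counts in the example are hypothetical and chosen only for illustration.

```python
import math


def tetrachoric_approx(a: int, b: int, c: int, d: int) -> float:
    """Cosine approximation (Formula 7.8); 180 degrees is pi radians."""
    root_ad = math.sqrt(a * d)
    root_bc = math.sqrt(b * c)
    if root_ad + root_bc == 0:
        raise ValueError("undefined when both AD and BC are zero")
    return math.cos(math.pi * root_ad / (root_ad + root_bc))


def phi_coefficient(a: int, b: int, c: int, d: int) -> float:
    """Formula 7.9: (BC - AD) / sqrt of the product of the four marginals."""
    denominator = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    if denominator == 0:
        raise ValueError("undefined when a marginal total is zero")
    return (b * c - a * d) / denominator


# Hypothetical counts for 100 examinees: A=10, B=40, C=30, D=20.
r_t = tetrachoric_approx(10, 40, 30, 20)
r_phi = phi_coefficient(10, 40, 30, 20)
print(f"r_t = {r_t:.2f}, phi = {r_phi:.2f}")
```

When the discordant cells are empty (A = D = 0) both coefficients equal 1, and when all four cells are equal both are 0, which is a quick sanity check on the cell labeling.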

• Variance for an Item

$$s_i^2 = \frac{\sum_{j=1}^{n} (X_{ij} - \bar{X}_i)^2}{n} \tag{7.10}$$
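Reading $X_{ij}$ as examinee j's score on item i and $\bar{X}_i$ as the item mean (an assumption, since the slide does not define the symbols), Formula 7.10 is the ordinary variance with n in the denominator. For a 0/1 item it reduces to pq, which the Python sketch below checks on the item scores from Example 3 of §2.

```python
def item_variance(scores):
    """Formula 7.10: mean squared deviation of one item's scores."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((x - mean) ** 2 for x in scores) / n


# Item scores of the 15 examinees in Example 3 (8 correct, 7 incorrect).
item_scores = [1, 0, 1, 1, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0]

variance = item_variance(item_scores)
p = sum(item_scores) / len(item_scores)
q = 1 - p
print(f"variance = {variance:.4f}, p * q = {p * q:.4f}")
```

Because pq is largest at p = .50, a dichotomous item has its greatest variance at middle difficulty, which ties this formula back to the relation between difficulty and discrimination.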

Difficulty and Discrimination

P      D
1.00   0.00
0.90   0.20
0.70   0.60
0.60   0.80
0.50   1.00
0.40   0.80
0.30   0.60
0.10   0.20
0.00   0.00
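The D column matches the largest D an item of difficulty P can reach when the upper and lower groups are the two halves of the sample, namely 2 × min(P, 1 − P). The slide does not say how the values were obtained, so this reading is an assumption; the Python sketch below only checks that it reproduces the table.

```python
def max_discrimination(p: float) -> float:
    """Largest possible D = P_H - P_L when P = (P_H + P_L) / 2 is fixed."""
    return 2 * min(p, 1 - p)


difficulties = [1.00, 0.90, 0.70, 0.60, 0.50, 0.40, 0.30, 0.10, 0.00]
table_d = [0.00, 0.20, 0.60, 0.80, 1.00, 0.80, 0.60, 0.20, 0.00]

computed = [round(max_discrimination(p), 2) for p in difficulties]
for p, d in zip(difficulties, computed):
    print(f"{p:.2f}  {d:.2f}")
```

The reasoning: if P is at most .50, D is largest when every correct answer falls in the upper half (P_H = 2P, P_L = 0); if P exceeds .50, the upper half is saturated (P_H = 1) and P_L = 2P − 1, giving D = 2 − 2P.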

§3 Application Case of Item Analysis

1. Procedures

• Select a representative sample of examinees and administer the test;

• Divide the examinees into an upper 27% (or 30%, etc.) group and a lower 27% group according to their test scores;

• Calculate $P_U$ and $P_L$, then estimate P and D for each item;

• Compare the responses to the different choices for each item between the upper group and the lower group;

• Revise items.

2. Analysis Case

Number of examinees on each choice

Item  Group    A    B    C    D   Omit   Key    P      D      r_b
1     Upper    5   92    1    2     0     B    0.71   0.42   0.52
      Lower   22   50   12   16     0
2     Upper   58   10   15   16     1     A    0.42   0.32   0.33
      Lower   26   21   15   36     2
3     Upper   17   15   28   28    12     D    0.31  -0.06  -0.04
      Lower   25   11   19   34    11
4     Upper    1   44   14   36     5     C    0.12   0.04   0.08
      Lower    1   56   10   28     5
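The P and D columns of the analysis case can be recomputed from the choice counts with Formulas 7.3 and 7.5. The Python sketch below does so; the dictionary layout and function name are illustrative, and the r_b column is not recomputed because it requires the individual test scores, which the table does not give.

```python
# Choice counts copied from the analysis-case table (100 examinees per group).
case = {
    1: {"key": "B",
        "upper": {"A": 5, "B": 92, "C": 1, "D": 2, "Omit": 0},
        "lower": {"A": 22, "B": 50, "C": 12, "D": 16, "Omit": 0}},
    2: {"key": "A",
        "upper": {"A": 58, "B": 10, "C": 15, "D": 16, "Omit": 1},
        "lower": {"A": 26, "B": 21, "C": 15, "D": 36, "Omit": 2}},
    3: {"key": "D",
        "upper": {"A": 17, "B": 15, "C": 28, "D": 28, "Omit": 12},
        "lower": {"A": 25, "B": 11, "C": 19, "D": 34, "Omit": 11}},
    4: {"key": "C",
        "upper": {"A": 1, "B": 44, "C": 14, "D": 36, "Omit": 5},
        "lower": {"A": 1, "B": 56, "C": 10, "D": 28, "Omit": 5}},
}


def difficulty_and_discrimination(item):
    """Return (P, D) with P = (P_U + P_L) / 2 and D = P_U - P_L."""
    p_upper = item["upper"][item["key"]] / sum(item["upper"].values())
    p_lower = item["lower"][item["key"]] / sum(item["lower"].values())
    return (p_upper + p_lower) / 2, p_upper - p_lower


results = {
    number: tuple(round(value, 2) for value in difficulty_and_discrimination(item))
    for number, item in case.items()
}
for number, (p, d) in results.items():
    print(f"item {number}: P = {p:.2f}, D = {d:.2f}")
```

The output agrees with the table: item 1 discriminates well, item 2 is acceptable, and items 3 and 4 have near-zero or negative D and call for a closer look at their choices.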

Choice Analysis

• Whether more examinees choose the correct choice than choose the wrong choices

• Whether a lot of examinees choose the wrong choices

• Whether more examinees in the upper group than in the lower group choose the correct choice

• Whether more examinees in the upper group than in the lower group choose a wrong choice

• Whether there is any choice that few examinees choose

• Whether there is any item on which quite a number of examinees make no choice
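The checklist above can be turned into a simple screening routine. The Python sketch below applies it to items 1, 3 and 4 of the analysis case. The two thresholds (a choice counts as "rarely chosen" below 5% of examinees, and omissions as "many" above 10%) are illustrative assumptions, not values from the slides, as are the function name and message wording.

```python
def analyze_choices(upper, lower, key, rare_share=0.05, omit_share=0.10):
    """Return a list of warnings for one item.

    upper and lower map each choice label (plus "Omit") to a count.
    rare_share and omit_share are illustrative thresholds (assumed).
    """
    total = sum(upper.values()) + sum(lower.values())
    options = [choice for choice in upper if choice != "Omit"]
    chosen = {choice: upper[choice] + lower[choice] for choice in options}
    warnings = []
    if any(chosen[c] >= chosen[key] for c in options if c != key):
        warnings.append("key is not the most popular choice")
    if upper[key] <= lower[key]:
        warnings.append("key does not favor the upper group")
    for choice in options:
        if choice == key:
            continue
        if upper[choice] > lower[choice]:
            warnings.append(f"distractor {choice} attracts more upper-group examinees")
        if chosen[choice] / total < rare_share:
            warnings.append(f"distractor {choice} is rarely chosen")
    if (upper["Omit"] + lower["Omit"]) / total > omit_share:
        warnings.append("many examinees omit the item")
    return warnings


item1 = analyze_choices({"A": 5, "B": 92, "C": 1, "D": 2, "Omit": 0},
                        {"A": 22, "B": 50, "C": 12, "D": 16, "Omit": 0}, key="B")
item3 = analyze_choices({"A": 17, "B": 15, "C": 28, "D": 28, "Omit": 12},
                        {"A": 25, "B": 11, "C": 19, "D": 34, "Omit": 11}, key="D")
item4 = analyze_choices({"A": 1, "B": 44, "C": 14, "D": 36, "Omit": 5},
                        {"A": 1, "B": 56, "C": 10, "D": 28, "Omit": 5}, key="C")

for name, warnings in (("item 1", item1), ("item 3", item3), ("item 4", item4)):
    print(name, warnings or "no warnings")
```

Item 1 passes every check. Item 3 is flagged because its key attracts more lower-group than upper-group examinees, two distractors do the opposite, and many examinees omit it. Item 4 is flagged because choice B draws far more examinees than the key, choice A is almost never chosen, and choice D attracts the upper group.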