Administering, Analyzing, and Improving the Test or Assessment
Nema Grace B. Medillo

Page 1: Administering, analyzing, and improving the test or assessment

ADMINISTERING, ANALYZING, AND IMPROVING THE TEST OR

ASSESSMENT

NEMA GRACE B. MEDILLO

Page 2: Administering, analyzing, and improving the test or assessment

GOAL IN THIS CHAPTER:

To provide suggestions on how to avoid common pitfalls in test assembly, administration, and scoring.

Page 3: Administering, analyzing, and improving the test or assessment

[Flowchart] Process of Evaluating Classroom Achievement: Start → Assembling the Test → Administering the Test → Scoring the Test → Analyzing the Test → Debriefing

Page 4: Administering, analyzing, and improving the test or assessment

ASSEMBLING THE TEST

Packaging the Test
Reproducing the Test

Page 5: Administering, analyzing, and improving the test or assessment

Packaging the Test

Group together all items of similar format
Arrange test items from easy to hard
Space the items for easy reading
Keep the items and options on the same page
Position illustrations near their descriptions and above the item
Check your answer key
Determine how the students will record their answers
Provide space for name and date
Check test directions
Proofread the test

Page 6: Administering, analyzing, and improving the test or assessment

Reproducing the Test

Know the Photocopying Machine

Specify Copying Instructions

File Original Test

Page 7: Administering, analyzing, and improving the test or assessment

ADMINISTERING THE TEST

Maintain a Positive Attitude
Maximize Achievement Motivation
Equalize Advantages
Avoid Surprises
Clarify the Rules
Rotate Distribution

Page 8: Administering, analyzing, and improving the test or assessment

Remind Students to Check Their Copies

Monitor Students

Minimize Distractions

Give Time Warnings

Collect Test Uniformly

Page 9: Administering, analyzing, and improving the test or assessment

SCORING THE TEST

Prepare the Answer Key

Check the Answer Key

Score Blindly

Check Machine-Scored Answer Sheets

Check Scoring

Record Scores

Page 10: Administering, analyzing, and improving the test or assessment

ANALYZING THE TEST

Quantitative Item Analysis

Qualitative Item Analysis

Item Analysis Modifications for the Criterion-Referenced Test

Item Analysis Terminology

Page 11: Administering, analyzing, and improving the test or assessment

Item Analysis Terminology

Quantitative Item Analysis
• A numerical method for analyzing test items that employs students' responses to the item's alternatives or options.

Qualitative Item Analysis
• A non-numerical method for analyzing test items that does not employ students' responses but instead considers test objectives, content validity, and technical item quality.

Key
• The correct option in a multiple-choice item.

Page 12: Administering, analyzing, and improving the test or assessment

Item Analysis Terminology

Distractor
• An incorrect option in a multiple-choice item.

Difficulty Index (p)
• The proportion of students who answered the item correctly.

Discrimination Index (D)
• A measure of the extent to which a test item discriminates or differentiates between students who do well on the overall test and those who do not.

Page 13: Administering, analyzing, and improving the test or assessment

Discrimination Index (D)

Positive Discrimination Index
• Those who did well on the overall test chose the correct answer for a particular item more often than those who did poorly on the overall test.

Negative Discrimination Index
• Those who did poorly on the overall test chose the correct answer for a particular item more often than those who did well on the overall test.

Zero Discrimination Index
• Those who did well and those who did poorly on the overall test chose the correct answer for a particular item with equal frequency.

Page 14: Administering, analyzing, and improving the test or assessment

Quantitative Item Analysis

Case Sample and Guide Questions

Difficulty Level

Discrimination Index

Miskeying
Guessing
Ambiguity

Page 15: Administering, analyzing, and improving the test or assessment

Consider the case below. Suppose your students chose the options to a four-alternative multiple-choice item as shown, with C as the correct answer.

Item X
Options     A   B   C*  D
Responses   3   0   18  9

How does this information help us? Is the item too difficult or too easy for the students? Are the distractors of the item effective?

Page 16: Administering, analyzing, and improving the test or assessment

Guide questions in quantitative item analysis

1. What is the difficulty level?
2. What is the discrimination index?
3. Should this item be eliminated?
4. Should any distractor(s) be modified?

Page 17: Administering, analyzing, and improving the test or assessment

Difficulty Level

To compute the difficulty level of an item:

p = (Number selecting the correct answer) / (Total number taking the test)

Item X
Options     A   B   C*  D
Responses   3   0   18  9

What is the difficulty level of the item? Do you consider the item difficult or easy? Why?

Page 18: Administering, analyzing, and improving the test or assessment

Solving the difficulty index

p = (Number selecting the correct answer) / (Total number taking the test)
p = 18 / 30
p = 0.60

Since the difficulty level of the item is 0.60 (60%), the item is moderately difficult.

Item X
Options     A   B   C*  D
Responses   3   0   18  9

Note: If the p level is greater than 0.75, the item is considered relatively easy. If the p level is less than 0.25, the item is considered relatively difficult.
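To make the arithmetic above concrete, here is a minimal Python sketch (not part of the original slides; the function name and data layout are my own) that computes the difficulty index p from a tally of option counts:

```python
def difficulty_index(option_counts, key):
    """Proportion of examinees who chose the keyed (correct) option."""
    total = sum(option_counts.values())      # total number taking the test
    return option_counts[key] / total        # number selecting the correct answer / total

# Item X from the slides: A = 3, B = 0, C* = 18, D = 9
item_x = {"A": 3, "B": 0, "C": 18, "D": 9}
print(round(difficulty_index(item_x, key="C"), 2))   # 0.6 -> moderately difficult
```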

Page 19: Administering, analyzing, and improving the test or assessment

Discrimination Index

Steps in determining the Discrimination Index:

1. Arrange the papers from highest to lowest score.
2. Separate the papers into an upper group and a lower group.
3. For each item, count the number in the upper group and the number in the lower group that chose each alternative.

Page 20: Administering, analyzing, and improving the test or assessment

4. Record your information for each item. Example for item X (class size = 30):

Options   A   B   C*  D
Upper     1   0   11  3
Lower     2   0   7   6

Page 21: Administering, analyzing, and improving the test or assessment

5. Compute D by plugging the appropriate numbers into the formula:

D = (Number in the upper group choosing the correct answer - Number in the lower group choosing the correct answer) / Number of students in one group
(If group sizes are unequal, use the larger number.)

What is the discrimination index of item X? Is the discrimination index positive or negative? Which group more frequently got the item correct?

Page 22: Administering, analyzing, and improving the test or assessment

Plugging in the data:

D = (11 - 7) / 15 = 0.267

The discrimination index of item X is 0.267, which is positive: more students who did well on the overall test answered the item correctly than students who did poorly on the overall test.
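A companion sketch, again hypothetical rather than taken from the slides, applies the same logic: D is the difference between the upper-group and lower-group counts for the key, divided by the group size (using the larger group when sizes are unequal, as noted above):

```python
def discrimination_index(upper_counts, lower_counts, key):
    """D = (correct in upper group - correct in lower group) / group size."""
    # If the group sizes are unequal, the slide says to use the larger number.
    group_size = max(sum(upper_counts.values()), sum(lower_counts.values()))
    return (upper_counts[key] - lower_counts[key]) / group_size

# Item X (class size = 30, so 15 papers in each group)
upper_x = {"A": 1, "B": 0, "C": 11, "D": 3}
lower_x = {"A": 2, "B": 0, "C": 7, "D": 6}
print(round(discrimination_index(upper_x, lower_x, key="C"), 3))   # 0.267
```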

Page 23: Administering, analyzing, and improving the test or assessment

Implication

Difficulty Level (p) = 0.60
Discrimination Index (D) = 0.267

Should this item be eliminated? NO. The item is considered a moderately difficult item that has positive (desirable) discrimination ability.

Page 24: Administering, analyzing, and improving the test or assessment

Should any distractor(s) be modified? YES. Option B ought to be modified or replaced, since no one chose it.

Item X
Options     A   B   C*  D
Responses   3   0   18  9

Page 25: Administering, analyzing, and improving the test or assessment

Experts say:

• Build tests that have most items between p levels of 0.20 and 0.80, with an average p level of about 0.50.
• Consider acceptable any item with a positive D value.

Page 26: Administering, analyzing, and improving the test or assessment

Let's look at the responses for two more items.

Item Y (class size = 28)
Options   A*  B   C   D
Upper     4   1   5   4
Lower     1   7   3   3

Item Z (class size = 30)
Options   A   B*  C   D
Upper     3   4   3   5
Lower     0   10  2   3

1. What is the difficulty level?
2. What is the discrimination index?
3. Should this item be eliminated?
4. Should any distractor(s) be modified?
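As a quick check, applying the two hypothetical functions sketched earlier to items Y and Z reproduces the values discussed on the next two slides:

```python
# Item Y (class size = 28, key = A)
upper_y = {"A": 4, "B": 1, "C": 5, "D": 4}
lower_y = {"A": 1, "B": 7, "C": 3, "D": 3}
all_y = {opt: upper_y[opt] + lower_y[opt] for opt in upper_y}
print(round(difficulty_index(all_y, "A"), 2))                   # 0.18
print(round(discrimination_index(upper_y, lower_y, "A"), 3))    # 0.214

# Item Z (class size = 30, key = B)
upper_z = {"A": 3, "B": 4, "C": 3, "D": 5}
lower_z = {"A": 0, "B": 10, "C": 2, "D": 3}
all_z = {opt: upper_z[opt] + lower_z[opt] for opt in upper_z}
print(round(difficulty_index(all_z, "B"), 3))                   # 0.467
print(round(discrimination_index(upper_z, lower_z, "B"), 2))    # -0.4
```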

Page 27: Administering, analyzing, and improving the test or assessment

Item Y
Difficulty Level = 0.18
Discrimination Index = 0.214

Should this item be eliminated? No, since it is positively discriminating. However, it is a difficult item; only 18% of the class got it right.

Should any distractor(s) be eliminated? Yes; C and D attracted more students who did well on the test overall than students who did poorly.

Remember: Fewer students who do well on the test should choose each distractor than students who do poorly, and more students who do well on the test should choose the correct answer than students who do poorly.

Page 28: Administering, analyzing, and improving the test or assessment

Item Z
Difficulty Level = 0.467
Discrimination Index = -0.40

Should this item be eliminated? Yes! The item is moderately difficult, but it discriminates negatively.

Should any distractor(s) be eliminated? Since we have already decided to eliminate the item, this is a moot question.

Remember: One purpose of testing is to discriminate between those students who know their stuff and those who do not.

Page 29: Administering, analyzing, and improving the test or assessment

MISKEYING

Most students who did well on the test will likely select an option that is a distractor rather than the option that is keyed.

Page 30: Administering, analyzing, and improving the test or assessment

Consider the miskeyed item:

Who was the first astronaut to set foot on the moon?
a. John Glenn
b. Scott Carpenter
c. Neil Armstrong
*d. Alan Shepard

Responses   A   B   C   D*
Upper       1   1   9   2

Most students in the upper half of the class failed to select the keyed option.

Remember, just as you are bound to make scoring errors, you are bound to miskey an item occasionally.

Page 31: Administering, analyzing, and improving the test or assessment

GUESSING

Likely to occur when the item measures content that is:

• Not covered in class or in the text
• So difficult that even the upper-half students have no idea what the correct answer is
• So trivial that students are unable to choose from among the options provided

Page 32: Administering, analyzing, and improving the test or assessment

The following choice distribution suggests that guessing occurred:

Options      A   B   C*  D
Upper Half   4   3   3   3

Each alternative is about equally attractive to students in the upper half.

Page 33: Administering, analyzing, and improving the test or assessment

AMBIGUITY

Among the upper group, one of the distractors is chosen with about the same frequency as the correct answer.

The following distribution suggests that an item is ambiguous:

Options      A   B   C   D*
Upper Half   7   0   1   7
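The three patterns just illustrated (miskeying, guessing, and ambiguity) can also be screened for mechanically. The sketch below is a rough heuristic of my own, not a rule stated in the slides: it looks only at the upper-group counts and flags the pattern each of the three example distributions shows:

```python
def flag_upper_group_pattern(upper_counts, key):
    """Rough screen for miskeying, guessing, or ambiguity from upper-group counts."""
    key_count = upper_counts[key]
    top_distractor = max(n for opt, n in upper_counts.items() if opt != key)

    if top_distractor > 2 * key_count:
        return "possible miskeying"    # the upper group strongly prefers a distractor
    if max(upper_counts.values()) - min(upper_counts.values()) <= 1:
        return "possible guessing"     # every option is about equally attractive
    if key_count >= 3 and top_distractor >= key_count - 1:
        return "possible ambiguity"    # one distractor rivals the keyed answer
    return "no obvious problem"

print(flag_upper_group_pattern({"A": 1, "B": 1, "C": 9, "D": 2}, key="D"))  # possible miskeying
print(flag_upper_group_pattern({"A": 4, "B": 3, "C": 3, "D": 3}, key="C"))  # possible guessing
print(flag_upper_group_pattern({"A": 7, "B": 0, "C": 1, "D": 7}, key="D"))  # possible ambiguity
```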

Page 34: Administering, analyzing, and improving the test or assessment

QUALITATIVE ITEM ANALYSIS

• Matching items and objectives
• Editing poorly written items
• Improving the content validity of the test
• Analyzing grammatical cues, specific determiners, double negatives, multiple defensible answers, and items that fail to match instructional objectives

Page 35: Administering, analyzing, and improving the test or assessment

ITEM ANALYSIS MODIFICATIONS FOR THE CRITERION-REFERENCED TEST

Using Pre- and Posttests as Upper and Lower Groups

Comparing the Percentage Answering Each Item Correctly on Both Pre- and Posttest

Determining the Percentage of Items Answered in the Expected Direction

Limitations of the Modifications

Page 36: Administering, analyzing, and improving the test or assessment

Using Pre- and Posttests as Upper and Lower Groups

Pretest (prior to instruction): most of the test items are answered incorrectly, so the pretest results serve as the lower group (p level = 0.30 or lower).

Posttest (after instruction): most items are answered correctly, so the posttest results serve as the upper group (p level = 0.70 or higher).

Page 37: Administering, analyzing, and improving the test or assessment

Analyzing Sample Data

Example 1: Number of students choosing each option (n = 25)

Option      At pretest (L)   At posttest (U)
A           9                1
B           7                1
C           3                2
D* (Key)    6                21

Page 38: Administering, analyzing, and improving the test or assessment

Steps

1. Compute p levels for both tests.

Pretest: p = 6/25 = 0.24
Posttest: p = 21/25 = 0.84

This was an improvement from 24% to 84%.

Page 39: Administering, analyzing, and improving the test or assessment

2. Determine the discrimination index (D) for the key, treating the posttest results as the upper group and the pretest results as the lower group:

D = (21 - 6) / 25 = 0.60

The key has a positive discrimination index.

Page 40: Administering, analyzing, and improving the test or assessment

3. Determine whether each option (distractor) separately discriminates negatively:

Option A: (1 - 9) / 25 = -0.32
Option B: (1 - 7) / 25 = -0.24
Option C: (2 - 3) / 25 = -0.04

All three distractors discriminate negatively.
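The three checks above can be bundled into one routine. The sketch below is an assumption-laden illustration (the function name and the pass/fail summary are mine), treating the pretest as the lower group and the posttest as the upper group:

```python
def criterion_referenced_item_check(pretest_counts, posttest_counts, key):
    """Check that p rises from pre to post, the key discriminates positively,
    and every distractor discriminates negatively (pretest = L, posttest = U)."""
    n = sum(pretest_counts.values())                 # the same 25 students take both tests
    p_pre = pretest_counts[key] / n
    p_post = posttest_counts[key] / n
    d_key = (posttest_counts[key] - pretest_counts[key]) / n
    d_distractors = {opt: (posttest_counts[opt] - pretest_counts[opt]) / n
                     for opt in pretest_counts if opt != key}
    passed = p_post > p_pre and d_key > 0 and all(d < 0 for d in d_distractors.values())
    return p_pre, p_post, d_key, d_distractors, passed

pre = {"A": 9, "B": 7, "C": 3, "D": 6}
post = {"A": 1, "B": 1, "C": 2, "D": 21}
print(criterion_referenced_item_check(pre, post, key="D"))
# (0.24, 0.84, 0.6, {'A': -0.32, 'B': -0.24, 'C': -0.04}, True)
```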

Page 41: Administering, analyzing, and improving the test or assessment

Summary

1. There was a sizeable increase in p value from pretest to posttest.
2. The D index for the key was positive.
3. The distractors all discriminated negatively.

If a criterion-referenced test item manifests these features, it has passed our “test” and is probably a good item with little or no need for modification. If, by contrast, the test item fails this “test,” it is probably more efficient to replace it with another item than to modify it.

Page 42: Administering, analyzing, and improving the test or assessment

Comparing the Percentage Answering Each Item Correctly on Both Pre- and Posttest

This comparison tells you whether your test is sensitive to your objectives.

What to do? For each item, compute: Percentage passing posttest - Percentage passing pretest.

The more positive the difference, the more the item is tapping the content you are teaching.

Page 43: Administering, analyzing, and improving the test or assessment

Analyzing Sample Data

Consider the following percentages for five test items:

Item   % passing pretest   % passing posttest   Difference
1      16                  79                   +63%
2      10                  82                   +72%
3*     75                  75                   0%
4      27                  91                   +64%
5*     67                  53                   -14%

Item 3 may be eliminated (students already know the content it represents). Item 5 may be revised (instruction was not related to the item, or it confused some students).
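A small sketch (hypothetical, not from the slides) that computes these differences and applies the interpretation given above:

```python
# (item, % passing pretest, % passing posttest) taken from the table above
items = [(1, 16, 79), (2, 10, 82), (3, 75, 75), (4, 27, 91), (5, 67, 53)]

for item, pre_pct, post_pct in items:
    diff = post_pct - pre_pct
    if diff <= 0 and pre_pct >= 70:
        note = "consider eliminating: students already knew the content"
    elif diff < 0:
        note = "consider revising: instruction did not help, or confused students"
    else:
        note = "item is tapping the content taught"
    print(f"Item {item}: {diff:+d}%  ({note})")
```

The 70% threshold used to flag “already knew the content” is my own illustrative cutoff, not a figure from the slides.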

Page 44: Administering, analyzing, and improving the test or assessment

Determining the Percentage of Items Answered in the Expected Direction for the Entire Test

This approach uses the number of items each learner failed on the test prior to instruction but passed on the test after instruction.

Page 45: Administering, analyzing, and improving the test or assessment

Steps

1. Find the number of items each student failed on the pretest but passed on the posttest.

Results for Mary
Item   Pretest     Posttest
1*     Incorrect   Correct
2      Correct     Correct
3*     Incorrect   Correct
4      Correct     Incorrect
5      Incorrect   Incorrect

(The starred items are those Mary failed on the pretest but passed on the posttest.) Do the same for the other students.

Page 46: Administering, analyzing, and improving the test or assessment

2. Add the counts and divide by the number of students.

Example: Mary 18, Sharon 22, Charles 13, Carlos 15, Amanda 20

(18 + 22 + 13 + 15 + 20) / 5 = 88 / 5 = 17.6

Page 47: Administering, analyzing, and improving the test or assessment

3. Divide by the number of test items.

4. Multiply by 100.

0.70 x 100 = 70%

The greater the overall positive percentage of change, the more your test is likely to match your instruction and to be a content-valid test.
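To make steps 1 through 4 concrete, here is a hypothetical sketch; the 25-item test length is an assumption chosen so that the slides' example counts (average 17.6) give roughly the 70% shown above:

```python
def percent_in_expected_direction(counts_per_student, num_items):
    """Average the per-student counts of items failed on the pretest but passed
    on the posttest, divide by the number of test items, and express as a percent."""
    average_count = sum(counts_per_student) / len(counts_per_student)   # step 2
    proportion = average_count / num_items                              # step 3
    return proportion * 100                                             # step 4

# Step 1 counts from the slides: Mary 18, Sharon 22, Charles 13, Carlos 15, Amanda 20
counts = [18, 22, 13, 15, 20]
print(round(percent_in_expected_direction(counts, num_items=25)))   # 70 (assuming a 25-item test)
```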

Page 48: Administering, analyzing, and improving the test or assessment

Limitations of the Modifications

• Difficult to carry out
• The unit of instruction is brief
• Moving from a norm-referenced test to a criterion-referenced test
• The time devoted to instruction (pre- to posttest)

Page 49: Administering, analyzing, and improving the test or assessment

DEBRIEFING

Discuss Problem Items
Listen to Students' Reactions
Avoid On-the-Spot Decisions
Be Equitable with Changes
Ask Students to Double-Check
Ask Students to Identify Problems

Page 50: Administering, analyzing, and improving the test or assessment

PROCESS OF EVALUATING CLASSROOM ACHIEVEMENT

Page 51: Administering, analyzing, and improving the test or assessment

THANK YOU…
