Item Response Theory
Advanced Psychometric Theory
CPS723P
Dr. Carlo Magno
Importance of Test Theories
• Estimate examinee ability and how the contribution of error might be minimized
• Disattenuation of variables
• Reporting true scores or ability scores and associated confidence
Psychometric History
• Lord (1952, 1953) and other psychometricians were interested in psychometric models with which to assess examinees independently of the particular choice of items or assessment tasks that were used in the assessment.
• Measurement practices would be enhanced if item and test statistics would be made sample independent.
• Birnbaum (1957, 1958)
• George Rasch (1960)
• Wright (1968)
Limitations of the CTT
• Item difficulty and item discrimination are group dependent.
• The p and r values are dependent on the examinee sample from which they are taken.
• Scores are entirely test dependent.
• No basis to predict the performance of examinees on an item.
Assumptions in IRT
• Unidimensionality
– Examinee performance reflects a single ability.
• Dichotomous responses
– The relationship between examinee performance on each item and the ability measured by the test is described by a monotonically increasing function.
• Monotonicity of item performance and ability is typified in an item characteristic curve (ICC).
• Examinees with more ability have higher probabilities for giving correct answers to items than lower ability students (Hambleton, 1989).
• A mathematical model links the observable dichotomously scored data (item performance) to the unobservable data (ability).
• Pi(θ) gives the probability of a correct response to item i as a function of ability (θ).
• b = item difficulty: the ability level at which the probability of a correct response is (1+c)/2
• a = item discrimination
• c = pseudo-guessing parameter
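The three parameters combine in the standard three-parameter logistic (3PL) form of Pi(θ); the slide shows this only graphically, so the conventional equation is reconstructed here:

```latex
P_i(\theta) = c_i + (1 - c_i)\,\frac{1}{1 + e^{-a_i(\theta - b_i)}}
```

At θ = b_i this gives Pi(θ) = (1 + c_i)/2, consistent with the definition of b above. Some texts insert a scaling constant D = 1.7 in the exponent to approximate the normal-ogive metric.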
• Two-parameter model: c=0
• One-parameter model: c=0, a=1
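A minimal sketch of the three models in code (the function name `icc` and the example values are illustrative, not from the slides): the 3PL reduces to the 2PL when c = 0, and to the 1PL when, in addition, a = 1.

```python
import math

def icc(theta, a=1.0, b=0.0, c=0.0):
    """Probability of a correct response under the 3PL model.
    c=0 gives the 2PL; c=0 and a=1 gives the 1PL (Rasch form)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# At theta = b the 3PL probability is (1 + c) / 2:
icc(0.0, a=1.2, b=0.0, c=0.2)   # 0.6
icc(0.0)                        # 1PL at theta = b: 0.5
```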
• Three items showing different item difficulties (b)
• Different levels of item discrimination
Polychotomous IRT Models
• Responses have more than 2 categories (e.g., a 4-point scale)
• Partial credit model
• Graded response model
• Nominal model
• Rating scale model
Graded Response model for a 5-point scale
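As a sketch of how the graded response model produces the curves for a 5-point scale: Samejima's model computes category probabilities as differences between adjacent cumulative boundary curves, so five categories require four ordered thresholds. The function names and parameter values below are illustrative assumptions.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def grm_probs(theta, a, thresholds):
    """Category probabilities under the graded response model.
    thresholds: ordered b_k values, one per boundary between adjacent
    categories (4 thresholds -> 5 categories)."""
    # P(X >= k) for each boundary, with P(X >= lowest) = 1 and a final 0
    cum = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

probs = grm_probs(theta=0.0, a=1.5, thresholds=[-1.5, -0.5, 0.5, 1.5])
# probs has 5 entries that sum to 1
```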
• In the IRT measurement framework, ability estimates of an examinee obtained from tests that vary in difficulty will be the same.
• Because the ability estimate does not change across tests, measurement errors are smaller.
• A true score is determined for each test.
• Item parameters are independent of the particular examinee sample used.
• Measurement error is estimated at each ability level.
Test Characteristic Curve (TCC)
• TCC: the sum of the ICCs that make up a test or assessment; it can be used to predict the scores of examinees at given ability levels.
TCC(θ) = ∑Pi(θ)
• Links the true score to the underlying ability measured by the test.
• A TCC shifted to the right on the ability scale indicates difficult items.
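The relation TCC(θ) = ∑Pi(θ) can be sketched directly; a 3PL ICC is assumed, and the four-item parameters below are hypothetical.

```python
import math

def icc(theta, a, b, c):
    # 3PL item characteristic curve (assumed item model)
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def tcc(theta, items):
    """Test characteristic curve: the expected number-correct true score
    at ability theta, i.e. the sum of P_i(theta) over the items."""
    return sum(icc(theta, *item) for item in items)

# hypothetical four-item test: (a, b, c) per item
items = [(1.0, -1.0, 0.0), (1.2, 0.0, 0.1), (0.8, 0.5, 0.2), (1.5, 1.0, 0.0)]
true_score = tcc(0.0, items)   # between 0 and 4
```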
Item Information Function
• I(θ): the contribution of particular items to the assessment of ability.
• Items with higher discriminating power contribute more to measurement precision than items with lower discriminating power.
• Items tend to make their best contribution to measurement precision around their b value.
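These properties follow from the standard 3PL item information formula, sketched below; with c = 0 it reduces to the familiar 2PL form a²PQ, which peaks exactly at θ = b.

```python
import math

def icc(theta, a, b, c):
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b, c):
    """Fisher information of a 3PL item at ability theta:
    I(theta) = a^2 * (Q/P) * ((P - c) / (1 - c))^2"""
    p = icc(theta, a, b, c)
    q = 1.0 - p
    return a * a * (q / p) * ((p - c) / (1.0 - c)) ** 2

# a 2PL item (c = 0) with a = 1 yields its maximum a^2/4 = 0.25 at theta = b
```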
[Figure 6: Item characteristic curves and corresponding item information functions — left panel: four item characteristic curves (probability 0–1 vs. ability θ); right panel: item information for the four test items (information 0–2 vs. ability θ).]
Test Information Function
• The sum of the item information functions in a test.
• Higher values of the a parameter increase the amount of information an item provides.
• The lower the c parameter, the more information an item provides.
• The more information provided by an assessment at a particular ability level, the smaller the errors associated with ability estimation.
[Figure 7: Test information function for a four-item test (information 0–2 vs. ability θ).]
Item Parameter Invariance
• Item/test characteristic functions and item/test information functions are integral features of IRT.
Benefits of Item Response Models
• Item statistics that are independent of the groups from which they were estimated.
• Scores describing examinee proficiency or ability that are not dependent on test difficulty.
• Test models that provide a basis for matching items or assessment tasks to ability levels.
• Models that do not require strictly parallel tests or assessments for assessing reliability.
Application of IRT to Test Development
• Item Analysis
– Determining sample-invariant item parameters.
– Utilizing goodness-of-fit criteria to detect items that do not fit the specified response model (χ², analysis of residuals).
Application of IRT to Test Development
• Item Selection
– Assess the contribution of each item to the test information function independently of the other items.
– Using item information functions:
• Describe the shape of the desired test information function over the desired range of abilities.
• Select items with information functions that will fill the hard-to-fill areas under the target information function.
• Calculate the test information function for the selected assessment material.
• Continue selecting materials until the test information function approximates the target information function to a satisfactory degree.
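The selection loop above can be sketched as a greedy procedure: at each step, pick the bank item whose information function best fills the remaining gap under the target curve, evaluated on a grid of ability points. Everything here (function names, the gap criterion, the toy bank) is an illustrative assumption, not a prescribed algorithm.

```python
import math

def icc(theta, a, b, c):
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def info(theta, a, b, c):
    # 3PL item information
    p = icc(theta, a, b, c)
    return a * a * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2

def select_items(bank, target, grid, n_items):
    """Greedily pick items whose information best fills the gap between
    the target information curve and the running test information curve."""
    chosen, current = [], [0.0] * len(grid)
    pool = list(bank)
    for _ in range(n_items):
        def gap_filled(item):
            return sum(min(info(t, *item), max(target[k] - current[k], 0.0))
                       for k, t in enumerate(grid))
        best = max(pool, key=gap_filled)
        pool.remove(best)
        chosen.append(best)
        current = [current[k] + info(t, *best) for k, t in enumerate(grid)]
    return chosen
```

With a flat target, the procedure first picks the most informative item near the middle of the grid, then items that cover the still-underfilled ability regions.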
• Item Banking
– Test developers can build an assessment to fit any desired test information function from items with suitable properties.
– Comparisons of items can be made across dissimilar samples.