presented by zhu jinxin

Comparison of Reliability Measures under Factor Analysis and Item Response Theory

by Ying Cheng, Ke-Hai Yuan, and Cheng Liu

Presented by Zhu Jinxin

Outline of the Presentation

• Introduction of four reliability coefficients: α, ω, π, and ρ
• The relationship among them
• Conclusion and discussion

Cronbach’s alpha

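• One of the definitions is

$\alpha = \frac{K}{K-1}\left(1 - \frac{\sum_{i=1}^{K}\sigma^2_{Y_i}}{\sigma^2_X}\right)$

• K is the number of components (items or testlets)
• $\sigma^2_X$ is the variance of the observed total test scores
• $\sigma^2_{Y_i}$ is the variance of component i for the current sample of persons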
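The definition above can be turned into a few lines of code. The following is a minimal sketch, assuming the data arrive as a persons-by-components matrix and that sample variances (denominator n - 1) are wanted; the function name `cronbach_alpha` and the example scores are hypothetical.

```python
import numpy as np

def cronbach_alpha(scores):
    """Cronbach's alpha for a persons-by-components score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                          # K: number of components
    item_vars = scores.var(axis=0, ddof=1)       # variance of each component
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of the raw sum score
    return k / (k - 1) * (1.0 - item_vars.sum() / total_var)

# Hypothetical data: 5 persons answering 3 items.
example = np.array([[2, 3, 3],
                    [4, 4, 5],
                    [1, 2, 2],
                    [3, 3, 4],
                    [5, 4, 5]])
alpha_example = cronbach_alpha(example)
```

Only the raw sum score and the component variances enter the computation, which is why α is easy to report but blind to differences among items.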

Features of Cronbach’s alpha

• It is the most widely used coefficient
• It is based on the raw sum score
• α may underestimate reliability at the population level when the assumption of essential tau-equivalence is violated

About tau-equivalency

When the measures are congeneric rather than essentially tau-equivalent, α underestimates the reliability: it is only a lower-bound estimate of the true reliability of the scale.

ω & ρ in congeneric measures in a single-factor model


Suppose we have m items


Variance of true score

Variance of unweighted composite score
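The formulas on this slide are not preserved in the transcript. As a sketch under the usual single-factor notation (the symbols $\mu_j$, $\lambda_j$, $\xi$, $\varepsilon_j$, $\psi_{jj}$ are assumptions, with the factor variance fixed at 1 and uncorrelated errors), the two variances and the resulting coefficients would read:

```latex
% Single-factor (congeneric) model for m items
x_j = \mu_j + \lambda_j \xi + \varepsilon_j, \qquad
\operatorname{Var}(\xi) = 1, \quad \operatorname{Var}(\varepsilon_j) = \psi_{jj}, \quad j = 1, \dots, m

% Variance of the true score of the unweighted composite
\operatorname{Var}\Bigl(\sum_{j=1}^{m} \lambda_j \xi\Bigr) = \Bigl(\sum_{j=1}^{m} \lambda_j\Bigr)^2

% Variance of the unweighted composite score
\operatorname{Var}\Bigl(\sum_{j=1}^{m} x_j\Bigr) = \Bigl(\sum_{j=1}^{m} \lambda_j\Bigr)^2 + \sum_{j=1}^{m} \psi_{jj}

% Reliability of the unweighted composite
\omega = \frac{\bigl(\sum_{j} \lambda_j\bigr)^2}{\bigl(\sum_{j} \lambda_j\bigr)^2 + \sum_{j} \psi_{jj}}

% Reliability of the optimally weighted composite (maximal reliability)
\rho = \frac{\sum_{j} \lambda_j^2 / \psi_{jj}}{1 + \sum_{j} \lambda_j^2 / \psi_{jj}}
```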

Features of ω

1. It neglects that people with the same sum score can have completely different response patterns.
2. ω ≥ α, with equality when the items are essentially tau-equivalent (equal loadings).

ρ ≥ ω ≥ α

When is ω equal to ρ?
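The ordering and its equality cases can be checked numerically at the population level. This is a minimal sketch, assuming a single-factor model with factor variance 1 and uncorrelated errors; the function name `alpha_omega_rho` and all loading and uniqueness values are hypothetical. In this setup ω equals α when the loadings are equal, and ω equals ρ when the ratio of loading to unique variance is the same for every item, so that unit weights are already optimal.

```python
import numpy as np

def alpha_omega_rho(loadings, uniquenesses):
    """Population alpha, omega and rho under a one-factor model (factor variance 1)."""
    lam = np.asarray(loadings, dtype=float)
    psi = np.asarray(uniquenesses, dtype=float)
    m = lam.size
    sigma = np.outer(lam, lam) + np.diag(psi)   # model-implied covariance matrix
    total_var = sigma.sum()                     # variance of the unweighted sum
    alpha = m / (m - 1) * (1.0 - np.trace(sigma) / total_var)
    omega = lam.sum() ** 2 / total_var
    info = np.sum(lam ** 2 / psi)               # sum of lambda_j^2 / psi_jj
    rho = info / (1.0 + info)
    return alpha, omega, rho

# Hypothetical parameter sets.
congeneric = alpha_omega_rho([0.4, 0.6, 0.8], [0.84, 0.64, 0.36])  # unequal loadings
tau_equiv = alpha_omega_rho([0.6, 0.6, 0.6], [0.5, 0.7, 0.9])      # equal loadings
proportional = alpha_omega_rho([0.4, 0.8], [0.3, 0.6])             # equal lambda/psi ratios
```

With these values the congeneric set gives three distinct coefficients in the order α < ω < ρ, the equal-loading set gives α = ω < ρ, and the proportional set gives α < ω = ρ.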

Reliability in IRT

• The variance of the MLE is (approximately) given by the inverse of the information.
• In the ML setting, the variance of θ is 1.
• The study uses information in a broader sense, equating it with the inverse of a variance even when the parameter estimate is not an MLE.
• So reliability can be expressed in terms of the information.

ω from information perspective

ρ from information perspective

ω & ρ from information perspective
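The equations for these slides are not preserved. A sketch of how the information view could be written, assuming a single-factor model with loadings $\lambda_j$, unique variances $\psi_{jj}$ and $\operatorname{Var}(\theta) = 1$ (the labels $I_\omega$ and $I_\rho$ are introduced here only for illustration):

```latex
% Reliability as true variance over true-plus-error variance, with error variance 1/I
\text{reliability} = \frac{\operatorname{Var}(\theta)}{\operatorname{Var}(\theta) + 1/I} = \frac{I}{1 + I}

% Information carried by the rescaled unweighted sum score
I_\omega = \frac{\bigl(\sum_j \lambda_j\bigr)^2}{\sum_j \psi_{jj}}, \qquad
\omega = \frac{I_\omega}{1 + I_\omega}

% Information carried by the ML factor score
I_\rho = \sum_j \frac{\lambda_j^2}{\psi_{jj}}, \qquad
\rho = \frac{I_\rho}{1 + I_\rho}

% Cauchy-Schwarz gives I_rho >= I_omega, hence rho >= omega
\Bigl(\sum_j \lambda_j\Bigr)^2
= \Bigl(\sum_j \frac{\lambda_j}{\sqrt{\psi_{jj}}}\,\sqrt{\psi_{jj}}\Bigr)^2
\le \Bigl(\sum_j \frac{\lambda_j^2}{\psi_{jj}}\Bigr)\Bigl(\sum_j \psi_{jj}\Bigr)
```

Equality in the last line holds when $\lambda_j / \psi_{jj}$ is constant across items.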

Reliability in IRT

• With a single parameter, the information I is defined as the negative expected value of the second derivative of the log-likelihood function.
• The IRT models directly relate the discrete responses to an underlying latent factor.
• When θ is normally distributed, the normal ogive IRT models are equivalent to the item factor analysis model.
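The equivalence in the last bullet can be made concrete as a change of parameters. The sketch below assumes a standardized underlying response (unit variance, so the unique variance is 1 minus the squared loading) and a binary item that scores 1 when the underlying response exceeds a threshold; the function names and the parameter names `loading`, `threshold`, `a` and `b` are hypothetical.

```python
import math

def fa_to_normal_ogive(loading, threshold):
    """Map item factor analysis parameters to normal ogive (a, b)."""
    a = loading / math.sqrt(1.0 - loading ** 2)   # discrimination
    b = threshold / loading                       # difficulty
    return a, b

def normal_ogive_to_fa(a, b):
    """Inverse map: normal ogive (a, b) back to loading and threshold."""
    loading = a / math.sqrt(1.0 + a ** 2)
    threshold = b * loading
    return loading, threshold

# Hypothetical item: loading 0.6, threshold 0.3.
a_example, b_example = fa_to_normal_ogive(0.6, 0.3)
```

Under these assumptions the probability of a correct response is the standard normal CDF evaluated at a(θ - b), which is the two-parameter normal ogive model.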

Reliability in IRT

• For binary response

where [...] is the response and

Approximately

Reliability in IRT

• For binary response, the information is defined as the negative expected value of the second derivative of the log-likelihood function:

For each item

For test

Reliability in IRT

• For binary response, the reliability is

and (the derivation is given in the appendix)

Reliability in IRT

• For responses in ordered categories, suppose the continuous response to item j is discretized by g thresholds.
• The information of the jth item is given by
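The information formula for this slide is not preserved. A common form for a normal ogive graded response item sums, over categories, the squared derivative of the category probability divided by that probability. The sketch below assumes that form, with discrimination `a` and ascending thresholds on the θ scale (hypothetical names and values); g thresholds give g + 1 ordered categories.

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def graded_item_information(theta, a, thresholds):
    """Information of one ordered-category item; thresholds must be ascending."""
    # Cumulative probabilities of responding above each threshold,
    # padded with 1 (below the lowest) and 0 (above the highest).
    cum = [1.0] + [norm_cdf(a * (theta - b)) for b in thresholds] + [0.0]
    dcum = [0.0] + [a * norm_pdf(a * (theta - b)) for b in thresholds] + [0.0]
    info = 0.0
    for k in range(len(cum) - 1):
        p = cum[k] - cum[k + 1]       # probability of category k
        dp = dcum[k] - dcum[k + 1]    # its derivative with respect to theta
        info += dp * dp / p
    return info

# Hypothetical item with a = 1, evaluated at theta = 0.
info_2 = graded_item_information(0.0, 1.0, [0.0])               # 2 categories
info_4 = graded_item_information(0.0, 1.0, [-1.0, 0.0, 1.0])    # 4 categories
info_8 = graded_item_information(
    0.0, 1.0, [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5])           # 8 categories
```

Because each finer set of thresholds contains the coarser one, the information can only go up as categories are added, but under this parameterization it stays below a squared, the information of the undiscretized response, so the gains shrink as the number of categories grows.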

The relationship

• ρ ≥ ω ≥ α
• It is expected that
• There is no dominant relationship between π(2)
• Simulation demonstrated that, as the number of response categories increases, π can exceed ω in practice.

Conclusion

• Keep as many response categories as possible and use the ML factor score.

• However, after having a certain number of response options, it may not be worth adding more.

Discussion

• Only graded response (ordered-category) models are studied; other types of polytomous IRT models are not considered.

• Only unidimensional models are studied.

Thank you!
