Presented by Zhu Jinxin
Comparison of Reliability Measures under Factor Analysis and Item Response Theory, by Ying Cheng, Ke-Hai Yuan, and Cheng Liu. Presented by Zhu Jinxin. Outline of the presentation: introduction of four reliability coefficients (α, ω, π, and ρ) and the relationship among them.
![Page 1: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/1.jpg)
Comparison of Reliability Measures under Factor Analysis and Item Response
Theory
Ying Cheng, Ke-Hai Yuan, and Cheng Liu
Presented by Zhu Jinxin
![Page 2: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/2.jpg)
Outline of the Presentation
• Introduction of four reliability coefficients: α, ω, π, and ρ
• The relationship among them
• Conclusion and discussion
![Page 3: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/3.jpg)
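Cronbach’s alpha
• One of the definitions is
$$\alpha = \frac{K}{K-1}\left(1 - \frac{\sum_{i=1}^{K}\sigma_{Y_i}^{2}}{\sigma_{X}^{2}}\right)$$
• $K$ is the number of components (items or testlets),
• $\sigma_{X}^{2}$ is the variance of the observed total test scores,
• $\sigma_{Y_i}^{2}$ is the variance of component $i$ for the current sample of persons.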
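The definition above can be computed directly from a persons-by-components score matrix. The following is a minimal sketch, assuming complete data (no missing responses); the function name `cronbach_alpha` and the toy scores are illustrative and do not come from the slides.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row per person, K scores per row)."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two components")
    # Variance of each component across persons.
    component_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of the observed total (raw sum) scores.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(component_vars) / total_var)


# Toy data: three persons, two components (hypothetical values).
toy_scores = [[1, 2], [2, 1], [3, 3]]
print(round(cronbach_alpha(toy_scores), 4))
```

Because the same variance estimator is used in the numerator and the denominator, the choice between the sample and the population variance does not change the result.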
![Page 4: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/4.jpg)
Features of Cronbach’s alpha
• It is the most widely used reliability coefficient.
• It is based on the raw sum score.
• α may underestimate reliability at the population level when the assumption of essential tau-equivalency is violated.
![Page 5: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/5.jpg)
about Tau-equivalency
![Page 6: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/6.jpg)
about Tau-equivalency
![Page 7: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/7.jpg)
about Tau-equivalency
In this case, α underestimates the reliability: it is only a lower-bound estimate of the true reliability of the scale when the measures are congeneric.
![Page 8: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/8.jpg)
ω & ρ in congeneric measures in a single-factor model
![Page 9: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/9.jpg)
ω & ρ in congeneric measures in a single-factor model
Suppose we have m items
![Page 10: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/10.jpg)
ω & ρ in congeneric measures in a single-factor model
Variance of the true score
Variance of the unweighted composite score
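In the usual single-factor notation, the two variances and the resulting ω can be sketched as follows. The symbols $\mu_j$, $\lambda_j$, $\xi$, $\varepsilon_j$, and $\psi_j$ are assumed here and are not taken from the slides; the errors are assumed to be mutually uncorrelated and uncorrelated with the factor.

$$
\begin{aligned}
x_j &= \mu_j + \lambda_j \xi + \varepsilon_j, \qquad \operatorname{Var}(\xi) = 1,\quad \operatorname{Var}(\varepsilon_j) = \psi_j,\quad j = 1,\dots,m \\
\operatorname{Var}\Big(\sum_{j=1}^{m} \lambda_j \xi\Big) &= \Big(\sum_{j=1}^{m} \lambda_j\Big)^{2} \\
\operatorname{Var}\Big(\sum_{j=1}^{m} x_j\Big) &= \Big(\sum_{j=1}^{m} \lambda_j\Big)^{2} + \sum_{j=1}^{m} \psi_j \\
\omega &= \frac{\big(\sum_{j} \lambda_j\big)^{2}}{\big(\sum_{j} \lambda_j\big)^{2} + \sum_{j} \psi_j}
\end{aligned}
$$

Under these assumptions ω is simply the ratio of the true-score variance to the variance of the unweighted composite.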
![Page 11: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/11.jpg)
Features of ω
1. It neglects that people with the same sum score can have completely different response patterns.
2. ω≧α, when
![Page 12: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/12.jpg)
ω & ρ in congeneric measures in a single-factor model
ρ≧ω≧α
When is ω equal to ρ?
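As a numerical sketch of this ordering, the population values of the three coefficients can be computed from loadings and unique variances. The formulas below are the standard single-factor expressions with factor variance 1 (an assumption, not copied from the slides), and the function name and example values are hypothetical.

```python
def reliability_coefficients(loadings, uniquenesses):
    """Return (alpha, omega, rho) for a single-factor model with Var(factor) = 1."""
    m = len(loadings)
    sum_loadings = sum(loadings)
    sum_unique = sum(uniquenesses)
    # Variance of the unweighted sum score.
    total_var = sum_loadings ** 2 + sum_unique
    # Sum of the individual item variances.
    item_var = sum(l * l + p for l, p in zip(loadings, uniquenesses))
    alpha = m / (m - 1) * (1 - item_var / total_var)
    omega = sum_loadings ** 2 / total_var
    # Maximal reliability, written through the summed ratio lambda^2 / psi.
    info = sum(l * l / p for l, p in zip(loadings, uniquenesses))
    rho = info / (1 + info)
    return alpha, omega, rho


# Congeneric items: unequal loadings, standardized so that psi = 1 - lambda^2.
alpha, omega, rho = reliability_coefficients([0.4, 0.6, 0.8], [0.84, 0.64, 0.36])
print(alpha < omega < rho)

# Unequal loadings but a constant ratio lambda / psi: omega and rho coincide.
alpha2, omega2, rho2 = reliability_coefficients([1.0, 2.0], [0.5, 1.0])
print(abs(omega2 - rho2) < 1e-12, alpha2 < omega2)
```

In this sketch ω equals ρ exactly when the ratio λ_j/ψ_j is the same for every item, which is the case in which the unweighted sum already uses the optimal weights; α additionally equals ω when all loadings are equal.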
![Page 13: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/13.jpg)
Reliability in IRT
• The variance of the MLE is (approximately) given by the inverse of the information.
• The variance of θ is 1 in MLE, in which
• The study uses information in a broader sense, equating it with the inverse of a variance even when the parameter estimate is not an MLE.
• So
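With the variance of θ fixed at 1 and the error variance taken as the inverse of the information, the step from information to reliability can be sketched as below. This assumes the information $I$ is a single constant and that the estimate is θ plus independent error; both are simplifying assumptions for illustration.

$$
\text{reliability} \;=\; \frac{\operatorname{Var}(\theta)}{\operatorname{Var}(\theta) + 1/I} \;=\; \frac{1}{1 + 1/I} \;=\; \frac{I}{1 + I}
$$

In IRT the information generally varies with θ, so a summary over the distribution of θ is needed before this expression can be applied; that averaging step is not specified in this sketch.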
![Page 14: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/14.jpg)
ω from an information perspective
![Page 15: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/15.jpg)
ρ from an information perspective
![Page 16: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/16.jpg)
ω & ρ from an information perspective
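One way to read both coefficients through information is to write each as $I/(1+I)$. The following sketch uses the standard single-factor expressions with loadings $\lambda_j$, unique variances $\psi_j$, and factor variance 1; this notation is assumed and is not taken from the slides.

$$
\begin{aligned}
\omega &= \frac{I_\omega}{1 + I_\omega}, & I_\omega &= \frac{\big(\sum_{j} \lambda_j\big)^{2}}{\sum_{j} \psi_j} \\
\rho &= \frac{I_\rho}{1 + I_\rho}, & I_\rho &= \sum_{j} \frac{\lambda_j^{2}}{\psi_j}
\end{aligned}
$$

By the Cauchy-Schwarz inequality, $\big(\sum_j \lambda_j\big)^2 \le \big(\sum_j \lambda_j^2/\psi_j\big)\big(\sum_j \psi_j\big)$, so $I_\omega \le I_\rho$ and therefore $\omega \le \rho$, with equality when $\lambda_j/\psi_j$ is constant across items.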
![Page 17: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/17.jpg)
Reliability in IRT
• With a single parameter, I, the information is defined as the negative expected value of the second derivative of the log likelihood function.
• The IRT models directly relate the discrete responses to an underlying latent factor.
• When θ is normally distributed, the normal ogive IRT models are equivalent to the item factor analysis model.
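The equivalence in the last bullet can be illustrated with the commonly used parameter conversion between the two models. This is a sketch under assumptions: the underlying continuous response is standardized to unit variance with loading λ_j and threshold τ_j, and the normal ogive item is written as Φ(a_j(θ - b_j)). The function names are hypothetical.

```python
import math


def ifa_to_normal_ogive(loading, threshold):
    """Convert a standardized loading and threshold to discrimination a and difficulty b."""
    if loading == 0.0 or not -1.0 < loading < 1.0:
        raise ValueError("loading must be nonzero and strictly between -1 and 1")
    # Residual standard deviation of the standardized underlying response.
    residual_sd = math.sqrt(1.0 - loading ** 2)
    a = loading / residual_sd
    b = threshold / loading
    return a, b


def normal_ogive_to_ifa(a, b):
    """Inverse conversion: discrimination and difficulty back to loading and threshold."""
    loading = a / math.sqrt(1.0 + a ** 2)
    threshold = b * loading
    return loading, threshold


a, b = ifa_to_normal_ogive(0.6, 0.3)
print(round(a, 4), round(b, 4))
print(tuple(round(v, 4) for v in normal_ogive_to_ifa(a, b)))
```

The conversion follows from P(x* > τ) = Φ((λθ - τ)/√(1 - λ²)) when the residual of the underlying response is normal with variance 1 - λ².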
![Page 18: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/18.jpg)
Reliability in IRT
• For binary response
Where is the response and
Approximately
![Page 19: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/19.jpg)
Reliability in IRT
• For binary response
![Page 20: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/20.jpg)
Reliability in IRT
• For binary response
The information is defined as the negative expected value of the second derivative of the log likelihood function:
For each item
For the test
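For a binary item, the negative expected second derivative of the log likelihood reduces to the squared slope of the response function divided by P(1 - P), and the test information is the sum over items. The following is a minimal sketch, assuming a normal ogive item Φ(a(θ - b)); the parameter names `a` and `b` and the function names are illustrative.

```python
import math


def normal_cdf(z):
    # erfc keeps precision in the tails better than 1 - cdf.
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def item_information(theta, a, b):
    """Fisher information of one binary normal ogive item at ability theta."""
    z = a * (theta - b)
    p = normal_cdf(z)       # probability of a correct (1) response
    q = normal_cdf(-z)      # probability of an incorrect (0) response
    slope = a * normal_pdf(z)
    return slope ** 2 / (p * q)


def total_information(theta, items):
    """Test information: sum of item informations; items is a list of (a, b) pairs."""
    return sum(item_information(theta, a, b) for a, b in items)


# Hypothetical three-item test evaluated at theta = 0.
items = [(1.0, -0.5), (0.8, 0.0), (1.2, 0.5)]
print(round(total_information(0.0, items), 4))
```

At θ = b the item information equals 2a²/π, which gives a quick check of an implementation. For extremely large |a(θ - b)| the product p·q can underflow to zero, so a production version would need to guard that division.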
![Page 21: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/21.jpg)
Reliability in IRT
• For binary response, the reliability is
and (the derivation is given in the appendix)
![Page 22: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/22.jpg)
Reliability in IRT
• For responses in ordered categories, suppose the continuous response to item j is discretized by g thresholds.
• The information of the jth item is given by
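The item information for ordered categories can be sketched as the sum over categories of the squared derivative of each category probability divided by that probability. The code below assumes a normal ogive graded response item with a common discrimination `a` and g thresholds; the function and parameter names are hypothetical, and the formula is the standard Fisher information for a multinomial response rather than a transcription of the slide.

```python
import math


def normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def graded_item_information(theta, a, thresholds):
    """Fisher information at theta for an item with g thresholds (g + 1 categories)."""
    z = [a * (theta - b) for b in sorted(thresholds)]
    # Cumulative probabilities P(X >= k), padded with 1 (k = 0) and 0 (k = g + 1).
    cum = [1.0] + [normal_cdf(v) for v in z] + [0.0]
    # Derivatives of the cumulative probabilities with respect to theta.
    dcum = [0.0] + [a * normal_pdf(v) for v in z] + [0.0]
    info = 0.0
    for k in range(len(cum) - 1):
        p = cum[k] - cum[k + 1]      # probability of category k
        dp = dcum[k] - dcum[k + 1]   # its derivative
        if p > 0.0:                  # skip categories with zero probability
            info += dp * dp / p
    return info


binary = graded_item_information(0.0, 1.0, [0.0])
four_categories = graded_item_information(0.0, 1.0, [-1.0, 0.0, 1.0])
print(binary < four_categories)
```

With a single threshold the expression reduces to the binary case, and adding thresholds to an existing set cannot lower the information, which is consistent with the idea that more response categories retain more of the underlying continuous response.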
![Page 23: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/23.jpg)
The relationship
• ρ≧ω≧α
• It is expected that
• There is no dominant relationship between π(2)
• Simulation demonstrated that, as the number of response categories increases, π can exceed ω in practice.
![Page 24: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/24.jpg)
Conclusion
• Keep as many response categories as possible and use the ML factor score.
• However, after having a certain number of response options, it may not be worth adding more.
![Page 25: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/25.jpg)
Discussion
• Only graded response (ordered categories) models are studied; other types of polytomous IRT models are not compared.
• Only unidimensional models are studied.
![Page 26: Presented by Zhu Jinxin](https://reader036.vdocuments.site/reader036/viewer/2022062305/56815bbd550346895dc9bcee/html5/thumbnails/26.jpg)
Thank you!