Latent Rank Theory: Test Theory for Making Can-Do Chart

SHOJIMA Kojiro

The National Center for University Entrance Examinations, Japan


Accuracy

• Scale (Weighing machine)

– A1 weighs 73 kg

– fW(A1)=73

• fW(A1)≠74

• fW(A1)≠72

• Academic test

– B1 scores 73 points

– fT(B1)=73

• fT(B1)≠74 ?

• fT(B1)≠72 ?

Discriminating Power

• Scale (Weighing machine)

– A1 weighs 73 kg

– A2 weighs 75 kg

• fW(A1)<fW(A2)

• Academic test

– B1 scores 73 points

– B2 scores 75 points

• fT(B1)<fT(B2) ?

Resolution

• Scale (Weighing machine)

– A1 weighs 73 kg

– A2 weighs 75 kg

– A3 weighs ...

• Academic test

– B1 scores 73 points

– B2 scores 75 points

– B3 scores ...

Test Limitations

• Precise measurement is almost impossible

– CTT reliabilities: 10% measurement error

• A test is at best capable of classifying academic ability into 5–20 levels

• Why continuous scale?

– Classical Test Theory: Continuous Scale

– Item Response Theory: Continuous Scale

• Common European Framework of Reference for Languages (CEFR)

– 6 levels: A1, A2, B1, B2, C1, C2

Graded evaluation

↓

Accountability

↓

Qualification test

Ordinal academic ability evaluation scale based on Neural Test Theory

Because individual abilities also change in stages, it is easy to explain the relationship between scores and abilities. This increases the test's accountability.

Continuous academic ability evaluation scale based on IRT or CTT

It is difficult to explain the relationship between scores and abilities because individual abilities also change continuously.

Latent Rank Theory (or Neural Test Theory)

• A test theory

– Ordinal scale (not continuous scale)

– Self-organizing map (SOM) or generative topographic mapping (GTM) mechanism

• Shojima, K. (2009) Neural test theory. K. Shigemasu et al. (Eds.) New Trends in Psychometrics, Universal Academy Press, Inc., pp. 417-426.

• Shojima, K. (2011) Local dependence model in latent rank theory. Jpn J of Applied Statistics, 40, 141-156.

Statistical Learning Framework in LRT

・ For (t=1; t ≤ T; t = t + 1)

　・ U^(t) ← Randomly sort row vectors of U

　・ For (h=1; h ≤ N; h = h + 1)

　　・ Obtain z_h^(t) from u_h^(t)

　　・ Select winner rank for u_h^(t) (Point 1)

　　・ Obtain V^(t,h) by updating V^(t,h−1) (Point 2)

　・ V^(t+1,0) ← V^(t,N)

LRT Mechanism (SOM)

[Figure: a binary (0/1) response vector is given as input to the latent rank scale, and Point 1 and Point 2 are applied in turn for each input. Axis labels: Latent Rank Scale, Number of Items.]

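Point 1: Winner Rank Selection

Likelihood

$$\ln p\left(\mathbf{u}_h^{(t)} \mid \mathbf{v}_q^{(t,h-1)}\right) = \sum_{j=1}^{n} z_{hj}^{(t)} \left\{ u_{hj}^{(t)} \ln v_{qj}^{(t,h-1)} + \left(1 - u_{hj}^{(t)}\right) \ln\left(1 - v_{qj}^{(t,h-1)}\right) \right\}$$

Here $\mathbf{v}_q^{(t,h-1)}$ is the reference vector of rank $q$ in $\mathbf{V}^{(t,h-1)}$, and $v_{qj}^{(t,h-1)}$ is its element for item $j$.

ML

$$R_w^{(ML)}: \; w = \arg\max_{q \in Q} \ln p\left(\mathbf{u}_h^{(t)} \mid \mathbf{v}_q^{(t,h-1)}\right)$$

Bayes

$$R_w^{(MAP)}: \; w = \arg\max_{q \in Q} \left\{ \ln p\left(\mathbf{u}_h^{(t)} \mid \mathbf{v}_q^{(t,h-1)}\right) + \ln p(f_q) \right\}$$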
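The winner rank selection can be sketched in Python as follows. This is a minimal illustration, not the author's implementation: the function names `log_likelihood` and `select_winner` are hypothetical, ranks are 0-based, the reference matrix `V` is assumed to be stored as an items-by-ranks array, and `z` is assumed to be an indicator that is 1 for an answered item and 0 for a missing one.

```python
import numpy as np

def log_likelihood(u, z, V, eps=1e-12):
    """Log-likelihood of one response vector under every latent rank.

    u: (n,) binary responses (assumed 0 where missing)
    z: (n,) indicator, assumed 1 = answered, 0 = missing
    V: (n, Q) reference matrix; column q holds the correct-response
       probabilities of rank q
    Returns an array of shape (Q,).
    """
    Vc = np.clip(V, eps, 1.0 - eps)  # avoid log(0)
    per_item = u[:, None] * np.log(Vc) + (1 - u[:, None]) * np.log(1 - Vc)
    return (z[:, None] * per_item).sum(axis=0)

def select_winner(u, z, V, log_prior=None):
    """Return the 0-based winner rank: ML if log_prior is None, else MAP."""
    score = log_likelihood(u, z, V)
    if log_prior is not None:
        score = score + log_prior  # add the log prior of each rank
    return int(np.argmax(score))
```

Passing `log_prior=None` gives the ML winner; passing the log of a prior over ranks gives the MAP (Bayes) winner, so the two rules differ only in the added prior term.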

Point 2: Update the Reference Vectors

• The nodes of the ranks nearer to the winner are updated to become closer to the input data

• h: tension

• α: size of tension

• σ: region size of learning propagation

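$$\mathbf{V}^{(t,h)} = \mathbf{V}^{(t,h-1)} + \left(\mathbf{1}_n \mathbf{h}^{(t)\prime}\right) \odot \left(\mathbf{z}_h^{(t)} \mathbf{1}_Q^{\prime}\right) \odot \left(\mathbf{u}_h^{(t)} \mathbf{1}_Q^{\prime} - \mathbf{V}^{(t,h-1)}\right)$$

$$\mathbf{h}^{(t)} = \left\{ h_{qw}^{(t)} \right\} \; (Q \times 1), \qquad h_{qw}^{(t)} = \frac{\alpha_t Q}{N} \exp\left\{ -\frac{(q-w)^2}{2 Q^2 \sigma_t^2} \right\}$$

$$\alpha_t = \frac{(T-t)\,\alpha_1 + (t-1)\,\alpha_T}{T-1}, \qquad \sigma_t = \frac{(T-t)\,\sigma_1 + (t-1)\,\sigma_T}{T-1}$$

Here $\odot$ denotes the elementwise product and $w$ is the winner rank.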
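To make the two points concrete, here is a minimal Python sketch of the whole learning loop under stated assumptions. It assumes a Gaussian-shaped tension around the winner rank and linearly decaying α and σ, as read from this slide. The names `schedule`, `tension`, `update`, and `fit_lrt`, the initial values of `V`, and the default values of `alpha` and `sigma` are illustrative assumptions, not taken from the slides. Ranks are 0-based, `V` is items-by-ranks, and `Z` is assumed to mark answered items with 1.

```python
import numpy as np

def schedule(t, T, first, last):
    """Linear interpolation from `first` (t=1) to `last` (t=T); needs T > 1."""
    return ((T - t) * first + (t - 1) * last) / (T - 1)

def tension(w, Q, N, alpha_t, sigma_t):
    """Tension vector (length Q), largest at the 0-based winner rank w."""
    q = np.arange(Q)
    return alpha_t * Q / N * np.exp(-((q - w) ** 2) / (2 * Q**2 * sigma_t**2))

def update(V, u, z, h):
    """One step V(t,h-1) -> V(t,h): pull each rank's column toward u."""
    return V + h[None, :] * z[:, None] * (u[:, None] - V)

def fit_lrt(U, Z, Q, T=20, alpha=(1.0, 0.01), sigma=(1.0, 0.12), seed=0):
    """Return an (n, Q) reference matrix learned from binary data U (N, n)."""
    rng = np.random.default_rng(seed)
    N, n = U.shape
    # Assumed initial values: identical for all items, increasing in rank.
    V = np.tile(np.linspace(0.2, 0.8, Q), (n, 1))
    for t in range(1, T + 1):
        a_t = schedule(t, T, *alpha)
        s_t = schedule(t, T, *sigma)
        for i in rng.permutation(N):          # randomly sorted rows of U
            u, z = U[i], Z[i]
            Vc = np.clip(V, 1e-12, 1 - 1e-12)
            ll = (z[:, None] * (u[:, None] * np.log(Vc)
                                + (1 - u[:, None]) * np.log(1 - Vc))).sum(axis=0)
            w = int(np.argmax(ll))            # Point 1: ML winner rank
            V = update(V, u, z, tension(w, Q, N, a_t, s_t))  # Point 2
    return V
```

Because each update moves an entry of `V` a fraction of the way toward a 0/1 response, the entries stay within [0, 1] as long as the tension does not exceed 1, which holds here whenever α·Q ≤ N.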

Example

• A geography test of the NCT

N       5000
n       35
Median  17
Max     35
Min     2
Range   33
Mean    16.911
Sd      4.976
Skew    0.313
Kurt    -0.074
Alpha   0.704

[Figure: histogram of test scores; x-axis: SCORE (0 to 35), y-axis: FREQUENCY (0 to 500)]

Fit Indices and N of Ranks

[Figures: fit indices when the N of ranks is 10 and when the N of ranks is 5]

Item Reference Profile (IRP)

• A monotonic increasing constraint can be imposed.

Test Reference Profile (TRP)

• (Weighted) sum of IRPs

• Expected score at each latent rank

• Strongly ordinal alignment condition (SOAC)

– All IRPs increase monotonically ⇒ TRP also increases monotonically

• Weakly ordinal alignment condition (WOAC)

– TRP increases monotonically, but not all IRPs increase monotonically

• For the scale to be ordinal, at least the WOAC must be satisfied.
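The TRP and the two alignment conditions can be checked mechanically once the IRPs are available. The sketch below is illustrative only: the names `trp`, `is_increasing`, and `alignment` are hypothetical, IRPs are assumed to be stored as an items-by-ranks array, weights are assumed to be nonnegative, and "increases monotonically" is read here as non-decreasing.

```python
import numpy as np

def trp(irp, weights=None):
    """(Weighted) sum of IRPs: expected score at each latent rank.

    irp: (n, Q) array; row j is the IRP of item j.
    weights: optional (n,) nonnegative item weights (default: all 1).
    """
    irp = np.asarray(irp, dtype=float)
    w = np.ones(irp.shape[0]) if weights is None else np.asarray(weights, dtype=float)
    return w @ irp

def is_increasing(x):
    """True if x never decreases from one rank to the next (assumed reading)."""
    return bool(np.all(np.diff(x) >= 0))

def alignment(irp, weights=None):
    """Return 'SOAC', 'WOAC', or 'none' for a set of IRPs."""
    irp = np.asarray(irp, dtype=float)
    if all(is_increasing(row) for row in irp):
        return "SOAC"   # every IRP increases, so the TRP does too
    if is_increasing(trp(irp, weights)):
        return "WOAC"   # only the TRP increases
    return "none"       # the scale cannot be treated as ordinal
```

For example, `alignment([[0.2, 0.5, 0.8], [0.5, 0.4, 0.6]])` returns `'WOAC'`: the second IRP dips at the middle rank, but the summed profile still rises.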

Rank Membership Profile (RMP)

• Posterior distribution of the latent rank to which each examinee belongs

$$\text{RMP:} \quad p_{iq} = \frac{p(\mathbf{u}_i \mid \mathbf{v}_q)\, p(f_q)}{\sum_{q'=1}^{Q} p(\mathbf{u}_i \mid \mathbf{v}_{q'})\, p(f_{q'})}$$
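As a rough illustration of how an RMP might be computed, the sketch below applies Bayes' rule to a binary data matrix. The function name `rank_membership_profile`, the items-by-ranks layout of `V`, the uniform default prior, and the reading of `Z` as an answered-item indicator are all assumptions made for this example.

```python
import numpy as np

def rank_membership_profile(U, Z, V, prior=None):
    """Posterior probability of each latent rank for each examinee.

    U: (N, n) binary responses; Z: (N, n) indicator, assumed 1 = answered
    V: (n, Q) reference matrix; prior: optional (Q,) prior over ranks
    Returns an (N, Q) array whose rows sum to 1.
    """
    Q = V.shape[1]
    prior = np.full(Q, 1.0 / Q) if prior is None else np.asarray(prior, dtype=float)
    Vc = np.clip(V, 1e-12, 1 - 1e-12)  # avoid log(0)
    # log p(u_i | v_q) for every examinee i and rank q
    loglik = (Z * U) @ np.log(Vc) + (Z * (1 - U)) @ np.log(1 - Vc)
    logpost = loglik + np.log(prior)
    logpost -= logpost.max(axis=1, keepdims=True)  # stabilise before exp
    post = np.exp(logpost)
    return post / post.sum(axis=1, keepdims=True)
```

Working in log space and subtracting the row maximum before exponentiating is a design choice to avoid underflow when the number of items is large.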

Examples of RMP

[Figure: RMPs of Examinees 1 to 15, one panel per examinee; x-axis: LATENT RANK (ticks 2 to 10), y-axis: PROBABILITY (0 to 1)]

Can-Do Chart (Example)

http://antlers.rd.dnc.ac.jp/~shojima/ntt/Can-DoChart.pdf

Extended Models

• Graded LRT Model (RN07-03)

– LRT model for ordinal polytomous data

• Nominal LRT Model (RN07-21)

– LRT model for nominal polytomous data

• Continuous LRT Model

• Multidimensional LRT Model

Graded LRT Model: Boundary Category Reference Profiles

[Figure: six panels of boundary category reference profiles for categories 0 to 3; x-axis: LATENT RANK (ticks 2 to 10), y-axis: PROBABILITY (0.0 to 1.0)]
Graded LRT Model: Item Category Reference Profile

[Figure: six panels of item category reference profiles for categories 0 to 3; x-axis: LATENT RANK (ticks 2 to 10), y-axis: PROBABILITY (0.0 to 1.0)]
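The slides do not state how the two kinds of profile are related. One common reading, by analogy with graded response models, is that a boundary profile gives the probability of responding in category c or higher at each rank, so an item category profile is the difference between adjacent boundary profiles. The sketch below rests on that assumption; the function name `icrp_from_bcrp` and the array layout are hypothetical.

```python
import numpy as np

def icrp_from_bcrp(bcrp):
    """Item category profiles from boundary category profiles.

    bcrp: (C, Q) array; row c is assumed to be P(response >= c | rank q),
          so row 0 is all ones.
    Returns a (C, Q) array; row c is P(response == c | rank q).
    """
    bcrp = np.asarray(bcrp, dtype=float)
    # The boundary above the highest category is taken to be 0.
    upper = np.vstack([bcrp[1:], np.zeros((1, bcrp.shape[1]))])
    return bcrp - upper
```

Under this reading the category probabilities at each rank sum to 1, which matches the plotted boundary profile of category 0 staying at the top of the panel.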

Nominal LRT Model: Item Category Reference Profile

* Correct selection; x: combined categories selected less than 10% of the time

[Figure: nine panels of item category reference profiles for response categories 1 to 4 and the combined category x; x-axis: LATENT RANK (ticks 2 to 10), y-axis: PROBABILITY (0.0 to 1.0)]
