EPSY 546: LECTURE 3 GENERALIZABILITY THEORY AND VALIDITY George Karabatsos

Post on 04-Jan-2016

EPSY 546: LECTURE 3

GENERALIZABILITY THEORY AND

VALIDITY

George Karabatsos

GENERALIZABILITY THEORY

TRUE SCORE MODEL

• Recall the true score model:

$X_n = T_n + e_n$

$X_n$: Observed test score of person n

$T_n$: True test score (unknown)

$e_n$: Random error (unknown)

TRUE SCORE MODEL

• Recall the true score model:

$X_n = T_n + e_n$

• One may view the true score model as defining error narrowly.

– One-variable, simple ANOVA:

Between (true score) variance + Within (random error) variance.
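The decomposition above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the normal distributions, the mean of 50, the standard deviations, and the function name `simulate_true_score_model` are all hypothetical choices made for the example. The ratio of true score variance to observed score variance is the classical reliability coefficient.

```python
import random
from statistics import pvariance


def simulate_true_score_model(n_persons, true_sd, error_sd, seed=0):
    """Draw true scores T_n and errors e_n, and form X_n = T_n + e_n."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    true = [rng.gauss(50.0, true_sd) for _ in range(n_persons)]
    error = [rng.gauss(0.0, error_sd) for _ in range(n_persons)]
    observed = [t + e for t, e in zip(true, error)]
    return observed, true, error


observed, true, error = simulate_true_score_model(5000, true_sd=10.0, error_sd=5.0)

# Between (true score) variance relative to total observed variance.
reliability = pvariance(true) / pvariance(observed)
print(f"var(T) = {pvariance(true):.1f}, var(e) = {pvariance(error):.1f}")
print(f"var(X) = {pvariance(observed):.1f}, var(T)/var(X) = {reliability:.3f}")
```

With these assumed standard deviations the population ratio is 100 / (100 + 25) = 0.8; the simulated value will differ slightly because of sampling.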

GENERALIZABILITY THEORY

• Generalizability Theory extends the true score model by acknowledging that multiple factors affect the measurement variance.

– Multivariable ANOVA:

The observed test response is a function of 2 or more variables, their interactions, and random measurement error.

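G-THEORY MODEL (example)

$X_{njt} = \mu$ (Grand mean)

$+ \, (\mu_n - \mu)$ (Person n’s effect)

$+ \, (\mu_j - \mu)$ (Item j’s effect)

$+ \, (\mu_t - \mu)$ (Time t’s effect)

$+ \, (\mu_{nt} - \mu_n - \mu_t + \mu)$ (Person × Time effect)

$+ \, (\mu_{nj} - \mu_n - \mu_j + \mu)$ (Person × Item effect)

$+ \, (\mu_{tj} - \mu_t - \mu_j + \mu)$ (Time × Item effect)

$+ \, \text{residual}$ (Three-way interaction, and error)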

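G-THEORY VARIANCE PARTITION

Systematic:

– Persons: $\sigma^2_P$

Measurement error (facet contributions):

– Items: $\sigma^2_I$

– Time: $\sigma^2_T$

– Person × Time: $\sigma^2_{PT}$

– Person × Item: $\sigma^2_{PI}$

– Time × Item: $\sigma^2_{TI}$

– Three-way interaction + error: $\sigma^2_{PIT,error}$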
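One common way to estimate the components listed above is to solve the ANOVA expected-mean-square equations. The sketch below does this for a fully crossed person × item × time design with one score per cell, treating all effects as random. The function name `variance_components`, the dictionary keys, and the toy data are assumptions made for this example, not part of the lecture. Each facet needs at least two levels, and the estimates can come out negative in small samples.

```python
from statistics import mean


def variance_components(x):
    """ANOVA estimates for a crossed person x item x time design.

    x[n][j][t] is the score of person n on item j at time t.
    """
    n_p, n_i, n_t = len(x), len(x[0]), len(x[0][0])
    P, I, T = range(n_p), range(n_i), range(n_t)

    # Grand mean and marginal means.
    grand = mean(x[n][j][t] for n in P for j in I for t in T)
    m_p = [mean(x[n][j][t] for j in I for t in T) for n in P]
    m_i = [mean(x[n][j][t] for n in P for t in T) for j in I]
    m_t = [mean(x[n][j][t] for n in P for j in I) for t in T]
    m_pi = [[mean(x[n][j][t] for t in T) for j in I] for n in P]
    m_pt = [[mean(x[n][j][t] for j in I) for t in T] for n in P]
    m_it = [[mean(x[n][j][t] for n in P) for t in T] for j in I]

    # Mean squares for the main effects.
    ms_p = n_i * n_t * sum((m - grand) ** 2 for m in m_p) / (n_p - 1)
    ms_i = n_p * n_t * sum((m - grand) ** 2 for m in m_i) / (n_i - 1)
    ms_t = n_p * n_i * sum((m - grand) ** 2 for m in m_t) / (n_t - 1)

    # Mean squares for the two-way interactions.
    ms_pi = n_t * sum((m_pi[n][j] - m_p[n] - m_i[j] + grand) ** 2
                      for n in P for j in I) / ((n_p - 1) * (n_i - 1))
    ms_pt = n_i * sum((m_pt[n][t] - m_p[n] - m_t[t] + grand) ** 2
                      for n in P for t in T) / ((n_p - 1) * (n_t - 1))
    ms_it = n_p * sum((m_it[j][t] - m_i[j] - m_t[t] + grand) ** 2
                      for j in I for t in T) / ((n_i - 1) * (n_t - 1))

    # Residual: three-way interaction confounded with random error.
    ms_res = sum((x[n][j][t] - m_pi[n][j] - m_pt[n][t] - m_it[j][t]
                  + m_p[n] + m_i[j] + m_t[t] - grand) ** 2
                 for n in P for j in I for t in T
                 ) / ((n_p - 1) * (n_i - 1) * (n_t - 1))

    # Solve the expected-mean-square equations for each component.
    return {
        "p": (ms_p - ms_pi - ms_pt + ms_res) / (n_i * n_t),
        "i": (ms_i - ms_pi - ms_it + ms_res) / (n_p * n_t),
        "t": (ms_t - ms_pt - ms_it + ms_res) / (n_p * n_i),
        "pi": (ms_pi - ms_res) / n_t,
        "pt": (ms_pt - ms_res) / n_i,
        "it": (ms_it - ms_res) / n_p,
        "pit,e": ms_res,
    }


# Toy data (hypothetical): purely additive person, item, and time effects,
# so every interaction component should be estimated as zero.
person = [1.0, 2.0, 3.0, 4.0]
item = [0.0, 1.0, 2.0]
time = [0.0, 2.0]
scores = [[[a + b + c for c in time] for b in item] for a in person]
components = variance_components(scores)
for name, value in components.items():
    print(f"{name}: {value:.3f}")
```

In practice, negative estimates are often set to zero before they are used in later calculations.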

G-THEORY OF DECISIONS

• Relative decisions: Decisions based on the rank ordering of persons (e.g., college admission, pass-fail testing).

• Variance contributing to measurement error for relative decisions:

$\sigma^2_{Relat} = \sigma^2_{PI} + \sigma^2_{PT} + \sigma^2_{PIT,error}$

(all variance components that involve an interaction with persons)

G-THEORY OF DECISIONS

• Absolute decisions: Decisions based on the level of the observed score, without regard to the performance of others (e.g., driver’s license).

• Variance contributing to measurement error for absolute decisions:

$\sigma^2_{Abs} = \sigma^2_{T} + \sigma^2_{I} + \sigma^2_{PI} + \sigma^2_{PT} + \sigma^2_{IT} + \sigma^2_{PIT,error}$

(all variance components associated with the facets, which introduce “constant” effects to absolute decisions)

GENERALIZABILITY COEFFICIENT

• Indicates how accurately the observed test scores allow us to generalize about persons’ behavior in a designed universe of situations (Cronbach, 1972).

$E\rho^2 = \dfrac{\sigma^2_P}{\sigma^2_P + \sigma^2_{Decision}}$,

with: $\sigma^2_{Decision} = \sigma^2_{Relat}$ or $\sigma^2_{Abs}$
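The coefficient is simple to compute once the variance components are known. The sketch below uses made-up component values purely for illustration; the function names and dictionary keys are assumptions for this example. In the literature, the coefficient for absolute decisions is often reported separately as an index of dependability.

```python
def relative_error_variance(vc):
    """Components that involve an interaction with persons."""
    return vc["pi"] + vc["pt"] + vc["pit,e"]


def absolute_error_variance(vc):
    """All facet components: main effects and every interaction."""
    return (vc["i"] + vc["t"] + vc["pi"] + vc["pt"]
            + vc["it"] + vc["pit,e"])


def g_coefficient(vc, decision="relative"):
    """Person variance divided by person variance plus decision error."""
    if decision == "relative":
        error = relative_error_variance(vc)
    elif decision == "absolute":
        error = absolute_error_variance(vc)
    else:
        raise ValueError("decision must be 'relative' or 'absolute'")
    return vc["p"] / (vc["p"] + error)


# Hypothetical variance components (not taken from any real study).
example_vc = {"p": 4.0, "i": 1.0, "t": 0.5,
              "pi": 1.5, "pt": 0.5, "it": 0.25, "pit,e": 2.0}

g_rel = g_coefficient(example_vc, "relative")
g_abs = g_coefficient(example_vc, "absolute")
print(f"relative: {g_rel:.3f}, absolute: {g_abs:.3f}")
```

Because the absolute error variance contains every term of the relative error variance plus the facet main effects and their interaction, the coefficient for absolute decisions can never exceed the one for relative decisions.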

STUDIES

• G-Study (Generalizability Study):

Aims to estimate the variance components underlying a measurement process by defining the universe of admissible observations as broadly as possible.

STUDIES

• D-Study (Design Study):

Uses G-study results to address “what if” questions about variations in the measurement design (Thompson & Melancon, 1987).

This helps pinpoint the main sources of error and specify the protocol modifications needed to obtain the desired level of generalizability.
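A D-study “what if” question can be sketched as follows. When a person's score is averaged over several items and occasions, the standard D-study formulas divide each error component by the number of conditions it is averaged over. That division is not shown on the slides, so treat it as an assumption of this sketch, along with the function name `d_study_coefficient` and the made-up G-study components.

```python
def d_study_coefficient(vc, n_items, n_times, decision="relative"):
    """Projected generalizability for a design that averages scores
    over n_items items and n_times occasions (crossed design)."""
    # Error components that involve persons shrink with more conditions.
    error = (vc["pi"] / n_items
             + vc["pt"] / n_times
             + vc["pit,e"] / (n_items * n_times))
    if decision == "absolute":
        # Facet main effects and their interaction also count as error.
        error += (vc["i"] / n_items
                  + vc["t"] / n_times
                  + vc["it"] / (n_items * n_times))
    return vc["p"] / (vc["p"] + error)


# Hypothetical G-study estimates (not taken from any real study).
g_study = {"p": 4.0, "i": 1.0, "t": 0.5,
           "pi": 1.5, "pt": 0.5, "it": 0.25, "pit,e": 2.0}

# "What if" we used more items or more testing occasions?
for n_items, n_times in [(1, 1), (5, 1), (10, 1), (10, 2), (20, 2)]:
    rel = d_study_coefficient(g_study, n_items, n_times, "relative")
    ab = d_study_coefficient(g_study, n_items, n_times, "absolute")
    print(f"items={n_items:2d} times={n_times}  "
          f"relative={rel:.3f}  absolute={ab:.3f}")
```

Comparing rows shows which facet is worth expanding: if adding items barely moves the coefficient while adding occasions raises it sharply, the protocol change should target occasions.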

EXAMPLES OF G-THEORY

• Nice illustrations are offered in:

Webb, Rowley, & Shavelson (1988)

and

Crowley, Thompson, & Worchel (1994)

VALIDITY

TEST VALIDITY

• VALIDITY: A test is valid if it measures what it claims to measure.

• Types: Face, Content, Concurrent, Predictive, Construct.

TEST VALIDITY

• Face validity: When the test items appear to measure what the test claims to measure.

• Content validity: When the content of the test items, according to domain experts, adequately represents the latent trait that the test intends to measure.

TEST VALIDITY

• Concurrent validity: When the test, which intends to measure a particular latent trait, correlates highly with another test that measures that trait.

• Predictive validity: When the scores of the test predict some meaningful criterion.
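Concurrent and predictive validity are usually summarized with a correlation between the test and another measure. The sketch below computes a Pearson correlation by hand; the function name `pearson_r` and the score lists are hypothetical and only illustrate the calculation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores of six persons on a new test and on an
# established test of the same trait (concurrent validity), or on
# a later criterion measure (predictive validity).
new_test = [12, 15, 9, 20, 17, 11]
other_measure = [30, 34, 25, 41, 39, 27]
print(f"validity coefficient r = {pearson_r(new_test, other_measure):.3f}")
```

How high the correlation must be to count as evidence of validity is a judgment that depends on the application.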

TEST VALIDITY

• Construct validity: A test has construct validity when the results of using the test fit hypotheses concerning the theoretical nature of the latent trait. The higher the fit, the higher the construct validity.

MESSICK’S UNIFIED CONSTRUCT VALIDITY

– Content: Item content relevance, representativeness, and technical quality (includes face).

– Substantive: Theoretical rationales for the observed consistencies in the test responses.

– Structural: Fidelity of scoring structure to the structure of the content domain.

– Generalizability: The extent to which the score properties and interpretations generalize over population groups, settings, and tasks.

– External: Concurrent/convergent, discriminant, and predictive.

– Consequential: Refers to the (potential and actual) consequences of test use.