dr. kline references slide details · 3 kline, r. b. (2013). exploratory and confirmatory factor...

Post on 25-Aug-2020


1

PSYC 426 Dr. Kline

References

Slide details

2

References

Binet, A., & Simon, T. (1916). New methods for the diagnosis of the intellectual level of subnormals. In E. S. Kite (Trans.),

The development of intelligence in children. Vineland, NJ: Publications of the Training School at Vineland.

Spearman, C. (1904a). General intelligence, objectively determined and measured. American Journal of

Psychology, 15, 201–293.

Spearman, C. (1904b). The proof and measurement of the association between two things. American Journal of

Psychology, 15, 72–101.

American Educational Research Association, American Psychological Association, & National Council on

Measurement in Education. (2014). Standards for educational and psychological testing. Washington, DC: Author.

Appelbaum, M., Cooper, H., Kline, R. B., Mayo-Wilson, E., Nezu, A. M., & Rao, S. M. (2018). Journal article reporting

standards for quantitative research in psychology: The APA Publications and Communications Board Task

Force report. American Psychologist, 73, 3–25.

Aiken, L. S., West, S., & Millsap, R. E. (2008). Doctoral training in statistics, measurement, and methodology in

psychology: Replication and extension of Aiken, West, Sechrest, and Reno’s (1990) survey of PhD programs in

North America. American Psychologist, 63, 32–50.

Vacha-Haase, T., & Thompson, B. (2011). Score reliability: A retrospective look back at 12 years of reliability

generalization. Measurement and Evaluation in Counseling and Development, 44, 159–168.

Thompson, B. (Ed.). (2003). Score reliability: Contemporary thinking on reliability issues. Thousand Oaks, CA: Sage.

Mulsant, B. H., Kastango, K. B., Rosen, J., Stone, R. A., Mazumdar, S., & Pollock, B. G. (2002). Interrater reliability in

clinical trials of depressive disorders. American Journal of Psychiatry, 159, 1598–1600.

Streiner, D. L. (2003). Starting at the beginning: An introduction to coefficient alpha and internal consistency. Journal

of Personality Assessment, 80, 99–103.

Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50, 1–

73.

Reeves, T. D., & Marbach-Ad, G. (2016). Contemporary test validity in theory and practice: A primer for discipline-

based education researchers. CBE Life Sciences Education, 15(1), 1–9.

3

Kline, R. B. (2013). Exploratory and confirmatory factor analysis. In Y. Petscher & C. Schatschneider (Eds.), Applied

quantitative analysis in the social sciences (pp. 171–207). New York: Routledge.

Costello, A. B., & Osborne, J. W. (2005). Best practices in exploratory factor analysis: Four recommendations for

getting the most from your analysis. Practical Assessment, Research & Evaluation, 10(7). Retrieved from

http://pareonline.net/

Huck, S. W. (2016). Statistical misconceptions (Classic ed.). New York: Routledge.

Downing, S. M., & Haladyna, T. M. (1997). Test item development: Validity evidence from quality assurance

procedures. Applied Measurement in Education, 10, 61–82.

First, M. B., Williams, J. B. W., Karg, R. S., & Spitzer, R. L. (2016). Structured Clinical Interview for DSM-5 Disorders,

Clinician Version (SCID-5-CV). Arlington, VA: American Psychiatric Association.

Schwarz, N., Knäuper, B., Hippler, H.-J., Noelle-Neumann, E., & Clark, L. (1991). Rating scales: Numeric values may

change the meaning of scale labels. Public Opinion Quarterly, 55, 570–582.

Baron, H. (1996). Strengths and limitations of ipsative measurement. Journal of Occupational and Organizational

Psychology, 69, 49–56.

Kelley, T. L. (1939). The selection of upper and lower groups for the validation of test items. Journal of Educational

Psychology, 30, 17–24.

Ellingsen, K. M. (2016). Standardized assessment of cognitive development: Instruments and issues. In A. Garro (Ed.),

Early childhood assessment in school and clinical child psychology (pp. 25–49). New York: Springer.

Gardner, H. (1993). Multiple intelligences: The theory in practice. New York: Basic.

Cormier, D. C., Kennedy, K. E., & Aquilina, A. M. (2016). Test review of the Wechsler Intelligence Scale for Children,

Fifth Edition: Canadian (WISC-VCDN). Canadian Journal of School Psychology, 31, 322–334.

van Widenfelt, B. M., Treffers, P. D. A., de Beurs, E., Siebelink, B. M., & Koudijs, E. (2005). Translation and cross-cultural

adaptation of assessment instruments used in psychological research with children and families. Clinical Child

and Family Psychology Review, 8, 135–147.

Van de Vijver, F., & Hambleton, R. K. (1996). Translating tests: Some practical guidelines. European Psychologist, 1,

89–99.

Ercikan, K., Gierl, M. J., McCreith, T., Puhan, G., & Koh, K. (2004). Comparability of bilingual versions of assessments:

Sources of incomparability of English and French versions of Canada's national achievement tests. Applied

Measurement in Education, 17, 301–321.

Reynolds, C. R., & Suzuki, L. A. (2013). Bias in psychological assessment: An empirical review and recommendations.

In J. R. Graham, J. A. Naglieri, & I. B. Weiner (Eds.), Handbook of psychology: Assessment psychology (pp. 82–

113). Hoboken, NJ: Wiley.

4

Tierney, R. D. (2016). Fairness in educational assessment. In M. A. Peters (Ed.), Encyclopedia of educational

philosophy and theory (pp. 793–798). Singapore: Springer.

Messick, S. (1995). Validation of inferences from persons’ responses and performances as scientific inquiry into score

meaning. American Psychologist, 50, 741–749.

Karami, H., & Mok, M. M. C. (Eds.). (2013). Fairness issues in educational assessment [Special issue]. Educational

Research and Evaluation, 19(2–3).

Laundra, K., & Sutton, T. (2008). You think you know ghetto? Contemporizing the Dove “Black IQ Test.” Teaching

Sociology, 36, 366–377.

Kaufman, A. S. (1990). Assessing adolescent and adult intelligence. Boston: Allyn & Bacon.

Kline, R. B. (1998). [Review of the software Kaufman WISC-III Integrated Interpretive System (K-WIIS, Version 1.00), by

A. S. Kaufman, N. L. Kaufman, E. H. Dougherty, & K. S. C. Tuttle]. Journal of Psychoeducational Assessment, 16,

365–384.

Hunsley, J., Lee, C. M., Wood, J. M., & Taylor, W. (2015). Controversial and questionable assessment techniques. In S.

O. Lilienfeld, S. J. Lynn, & J. M. Lohr (Eds.), Science and pseudoscience in clinical psychology (2nd ed.) (pp.

42–82). New York: Guilford.

Wood, J. M., Nezworski, M. T., Lilienfeld, S. O., & Garb, H. N. (2003). What's wrong with the Rorschach: Science

confronts the controversial inkblot test. San Francisco: Jossey-Bass.

5

Slide details

Introduction

CTT

Scores

Reliability

Validity

Items

Ability

Translation

Bias

Ethics

6

Introduction

E.g., WISC-V

7

8

9

10

11

12

13

14

16

CTT

17

18

http://legisquebec.gouv.qc.ca/en/

19

20

Scores

$z = \frac{X - M}{SD}$

E.g., $z = \frac{112 - 100.0}{15.0} = .8$
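The standardization on this slide can be sketched in a few lines of Python. The function name `z_score` is a hypothetical choice for illustration, not something from the slides.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Return the standardized score z = (X - M) / SD."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return (x - mean) / sd


# Slide example: X = 112, M = 100.0, SD = 15.0
print(round(z_score(112, 100.0, 15.0), 2))  # 0.8
```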

21

22

http://onlinestatbook.com/2/calculators/normal_dist.html

23

Grade M

2.0 15.0

3.0 25.0

4.0 30.0

24

Grade M Interpolation

2.0 15

2.1 16

2.2 17

2.3 18

2.4 19

2.5 20

2.6 21

2.7 22

2.8 23

2.9 24

3.0 25

25

Grade M Interpolation

3.0 25

3.1 25.5

3.2 26

3.3 26.5

3.4 27

3.5 27.5

3.6 28

3.7 28.5

3.8 29

3.9 29.5

4.0 30
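The interpolated grade-equivalent tables above follow from straight-line interpolation between the tabled grade means. A minimal sketch, assuming linear interpolation between adjacent grades; the names `interpolate_mean` and `norms` are hypothetical.

```python
def interpolate_mean(grade: float, norms: dict) -> float:
    """Linearly interpolate the mean raw score between tabled grade levels."""
    points = sorted(norms.items())
    for (g0, m0), (g1, m1) in zip(points, points[1:]):
        if g0 <= grade <= g1:
            # Fraction of the way from g0 to g1, applied to the score gap
            return m0 + (grade - g0) / (g1 - g0) * (m1 - m0)
    raise ValueError("grade is outside the tabled range")


# Tabled means from the slide: grade -> mean raw score
norms = {2.0: 15.0, 3.0: 25.0, 4.0: 30.0}

print(interpolate_mean(2.5, norms))  # 20.0
print(interpolate_mean(3.5, norms))  # 27.5
```

Between grades 2 and 3 the mean rises 1 point per tenth of a grade, but only 0.5 point per tenth between grades 3 and 4, so equal raw-score gains do not correspond to equal grade-equivalent gains.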

26

$RIQ = \frac{MA}{CA} \times 100$

E.g., CA = 10.0, MA = 12.5

$RIQ = \frac{12.5}{10.0} \times 100 = 125$
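As a quick check of the ratio IQ arithmetic, here is a small Python sketch. The function name `ratio_iq` is hypothetical.

```python
def ratio_iq(mental_age: float, chronological_age: float) -> float:
    """Ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological_age must be positive")
    return mental_age / chronological_age * 100


# Slide example: CA = 10.0, MA = 12.5
print(ratio_iq(12.5, 10.0))  # 125.0
```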

27

28

29

30

Example: z = .5

Type M SD

IQ (W) 100 15

IQ (SB) 100 16

Subtest 10 3

Type M SD

T 50 10

CEEB 500 100

Stanine 5 2

Sten 5.5 2

NCE 50 21.06
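Each metric in the table is a linear transformation of z, so the slide's example of z = .5 can be converted with one function. This is a sketch under that assumption; the names `SCALES` and `from_z` are hypothetical, and the code does not apply the rounding or range limits that operational stanine and sten scores use.

```python
# (mean, SD) pairs copied from the slide's table
SCALES = {
    "IQ (W)": (100, 15),
    "IQ (SB)": (100, 16),
    "Subtest": (10, 3),
    "T": (50, 10),
    "CEEB": (500, 100),
    "Stanine": (5, 2),
    "Sten": (5.5, 2),
    "NCE": (50, 21.06),
}


def from_z(z: float, mean: float, sd: float) -> float:
    """Convert a z score to a standard score with the given mean and SD."""
    return mean + z * sd


z = 0.5
for name, (mean, sd) in SCALES.items():
    print(f"{name}: {from_z(z, mean, sd):.2f}")
```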

31

32

Case

VIQ = 115, 84.1 %tile

RC = 95, 36.9 %tile
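The two percentiles in the case can be reproduced from the normal curve. This sketch assumes that both VIQ and RC are standard scores with M = 100 and SD = 15, which the slide does not state explicitly; `percentile_rank` is a hypothetical name.

```python
from statistics import NormalDist


def percentile_rank(score: float, mean: float = 100.0, sd: float = 15.0) -> float:
    """Percent of a normal distribution falling below the given score."""
    return NormalDist(mu=mean, sigma=sd).cdf(score) * 100


print(round(percentile_rank(115), 1))  # 84.1
print(round(percentile_rank(95), 1))   # 36.9
```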

SB5

Fluid

Knowledge

Quantitative

Visual-Spatial

Working Memory

Bender-Gestalt Visual-Motor Test

33

Reliability

34

35

X = t + e

t

36

X = t + e

e > 0, X > t

e < 0, X < t

e = 0, X = t

$s_X^2 = s_t^2 + s_e^2$

$r_{XX} = \frac{s_t^2}{s_X^2}$

$r_{XX} = 1 - \frac{s_e^2}{s_X^2}$

$\sqrt{r_{XX}} = r_{Xt}$

37

Test-retest rtt

Interrater (interscorer) r

38

Mulsant et al. (2002)

Reported (%)…

No. raters, 17; Training, 10; Interrater, 14–22; Drift, 5

Alternate-form r11

39

12-item test

1st half–2nd half

tot1/2 = ∑ (i1 – i6)

tot2/2 = ∑ (i7 – i12)

Odd–even

toto = ∑ (i1, i3, i5, i7, i9, i11)

tote = ∑ (i2, i4, i6, i8, i10, i12)

$\frac{1}{2} \cdot \frac{12!}{6!\,6!} = 462$

$\frac{1}{2} \cdot \frac{30!}{15!\,15!} = 77{,}558{,}760$
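The counts of distinct split halves above can be verified directly. A minimal sketch; `n_split_halves` is a hypothetical name.

```python
from math import comb


def n_split_halves(n_items: int) -> int:
    """Number of distinct ways to split an even-length test into two equal halves."""
    if n_items <= 0 or n_items % 2:
        raise ValueError("n_items must be a positive even number")
    # Choosing one half fixes the other, so each split is counted twice by comb()
    return comb(n_items, n_items // 2) // 2


print(n_split_halves(12))  # 462
print(n_split_halves(30))  # 77558760
```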

40

Split-half rhh

$r_{\text{S-B}} = \frac{2\,r_{hh}}{1 + r_{hh}}$

E.g., rhh = .70: $r_{\text{S-B}} = \frac{2(.70)}{1 + .70} = .82$
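The Spearman-Brown step-up for a split-half correlation is a one-line calculation. A sketch in Python; `split_half_corrected` is a hypothetical name.

```python
def split_half_corrected(r_hh: float) -> float:
    """Spearman-Brown correction of a half-test correlation to full length."""
    return 2 * r_hh / (1 + r_hh)


# Slide example: r_hh = .70
print(round(split_half_corrected(0.70), 2))  # 0.82
```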

41

α, rK-R20

[Equation residue: $\hat{\alpha}$ and $\bar{r}_{ij}$ compared with 0; relational operators lost in extraction]

Reverse coding

0 = disagree, 1 = uncertain, 2 = agree

1. My general health is good.

2. I often feel unhealthy.

3. I worry little about my health.

42

0 = disagree, 1 = uncertain, 2 = agree

1. My general health is good.

3. I worry little about my health.

2 = disagree, 1 = uncertain, 0 = agree

2. I often feel unhealthy.
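Reverse coding of the negatively worded item can be sketched as below, assuming the 0 to 2 response scale shown on the slide. The function name and the sample responses are hypothetical.

```python
def reverse_code(score: int, low: int = 0, high: int = 2) -> int:
    """Flip a response so that high values always mean the same direction."""
    if not low <= score <= high:
        raise ValueError("score is outside the response scale")
    return low + high - score


# Hypothetical responses to items 1-3 (0 = disagree, 1 = uncertain, 2 = agree)
responses = {1: 2, 2: 0, 3: 2}
reversed_items = {2}  # item 2 ("I often feel unhealthy") is negatively worded

scored = {
    item: reverse_code(value) if item in reversed_items else value
    for item, value in responses.items()
}
print(scored)  # {1: 2, 2: 2, 3: 2}
```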

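$n$, $\bar{r}_{ij}$

$\alpha = \frac{n}{n - 1} \left( \frac{s_t^2 - \sum s_i^2}{s_t^2} \right)$

$r_{\text{K-R20}} = \frac{n}{n - 1} \left( \frac{s_t^2 - \sum p_i q_i}{s_t^2} \right)$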
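The two internal consistency formulas can be compared on a small data set. This is a sketch with hypothetical data and function names; it uses population variances (dividing by the number of cases) so that, for items scored 0 or 1, each item variance equals p times q and alpha equals K-R20.

```python
from statistics import pvariance


def cronbach_alpha(rows):
    """Coefficient alpha from a cases-by-items score matrix."""
    n = len(rows[0])  # number of items
    item_vars = [pvariance([row[i] for row in rows]) for i in range(n)]
    total_var = pvariance([sum(row) for row in rows])
    return n / (n - 1) * (total_var - sum(item_vars)) / total_var


def kr20(rows):
    """K-R20 for dichotomous (0/1) items."""
    n = len(rows[0])
    n_cases = len(rows)
    p = [sum(row[i] for row in rows) / n_cases for i in range(n)]
    total_var = pvariance([sum(row) for row in rows])
    return n / (n - 1) * (total_var - sum(pi * (1 - pi) for pi in p)) / total_var


# Hypothetical responses: 5 cases by 4 dichotomous items
data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(data), 2), round(kr20(data), 2))
```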

43

BDI-II Example

44

45

46

WAIS-IV Vocabulary

rtt = .89, 1 − .89 = .11

α = .94, 1 − .94 = .06

r = .95, 1 − .95 = .05

∑ error = .22, 1 − .22 = .78

S-B revisited

$r_{\text{S-B}} = \frac{k\,r_{XX}}{1 + (k - 1)\,r_{XX}}$

$k = \frac{n_{new}}{n_{old}}$

47

E.g., n = 80, rXX = .75

Reduce to 5

k = 5/80 = .0625

$r_{\text{S-B}} = \frac{.0625\,(.75)}{1 + (.0625 - 1)(.75)} = .16$

E.g., n = 10, rXX = .55

Double length

k = 20/10 = 2

$r_{\text{S-B}} = \frac{2\,(.55)}{1 + (2 - 1)(.55)} = .71$
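Both examples use the general Spearman-Brown prophecy formula with k as the ratio of new to old test length. A short Python sketch; `spearman_brown` is a hypothetical name.

```python
def spearman_brown(r_xx: float, k: float) -> float:
    """Predicted reliability when test length is multiplied by a factor k."""
    return k * r_xx / (1 + (k - 1) * r_xx)


# Shorten an 80-item test (r_XX = .75) to 5 items
print(round(spearman_brown(0.75, 5 / 80), 2))  # 0.16

# Double a 10-item test (r_XX = .55) to 20 items
print(round(spearman_brown(0.55, 20 / 10), 2))  # 0.71
```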

48

E.g., rXX = .35, rS-B = .90

Find k

$.90 = \frac{k\,(.35)}{1 + (k - 1)(.35)}$

$.90\,[1 + (k - 1)(.35)] = .35k$

$.90\,(1 + .35k - .35) = .35k$

$.90 + .315k - .315 = .35k$

$.90 - .315 = .35k - .315k$

$.585 = .035k$

$k = .585 / .035$

k = 16.71
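Solving the prophecy formula for k in general gives k = r_target (1 − r_XX) / [r_XX (1 − r_target)], which reproduces the result above. A sketch; `length_factor` is a hypothetical name.

```python
def length_factor(r_xx: float, r_target: float) -> float:
    """Lengthening factor k needed to move reliability from r_xx to r_target."""
    if not (0 < r_xx < 1 and 0 < r_target < 1):
        raise ValueError("reliabilities must be strictly between 0 and 1")
    return r_target * (1 - r_xx) / (r_xx * (1 - r_target))


# Slide example: r_XX = .35, target r_S-B = .90
print(round(length_factor(0.35, 0.90), 2))  # 16.71
```

A test with reliability .35 would therefore need to be almost 17 times longer, with comparable items, to reach .90.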

49

+SDe

X = t + e

t

SEM

50

$SEM = SD_t \sqrt{1 - r_{XX}}$

If rXX = 1, then

SEM = 0

If rXX = 0, then

SEM = SDt

E.g., rXX = .80, SDt = 15.0

$SEM = 15.0 \sqrt{1 - .80} = 6.71$

E.g., X = 92

95% CI

X ± SEM (z.05)

z.05 = 1.96

92 ± 6.71 (1.96)

92 ± 13.15, [78.85, 105.15]
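The standard error of measurement and the interval around the observed score can be computed as follows. This is a sketch with hypothetical function names; it carries the unrounded SEM through the calculation, which gives the same limits to two decimals.

```python
from math import sqrt


def sem(sd: float, r_xx: float) -> float:
    """Standard error of measurement: SD * sqrt(1 - r_XX)."""
    return sd * sqrt(1 - r_xx)


def confidence_interval(center: float, se: float, z: float = 1.96):
    """Symmetric interval: center plus or minus z standard errors."""
    margin = z * se
    return center - margin, center + margin


se = sem(15.0, 0.80)
low, high = confidence_interval(92, se)
print(round(se, 2))                   # 6.71
print(round(low, 2), round(high, 2))  # 78.85 105.15
```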

51

T′ = rXX (X – M) + M

E.g., X = 92,

rXX = .80, M = 100.0

T′ = .80 (92 – 100.0) + 100.0

= 93.6

93.6 ± 6.71 (1.96)

[80.45, 106.75]
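Centering the interval on the estimated true score instead of the observed score shifts it toward the mean. A sketch with hypothetical names, reusing the same SEM and z as in the slide's example:

```python
from math import sqrt


def estimated_true_score(x: float, mean: float, r_xx: float) -> float:
    """Regress the observed score toward the mean: T' = r_XX (X - M) + M."""
    return r_xx * (x - mean) + mean


t_prime = estimated_true_score(92, 100.0, 0.80)
margin = 1.96 * 15.0 * sqrt(1 - 0.80)  # z times SEM
print(round(t_prime, 1))                                     # 93.6
print(round(t_prime - margin, 2), round(t_prime + margin, 2))  # 80.45 106.75
```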

52

Validity

Kane (2013)

1. Context of use

2. Score interpretation

3. Evidence needed

Evidence

1. Test content

2. Internal structure

3. Covariance

4. Response process

53

Test specifications

Input (content)

Operations

Output

Number of items

Time limits

Difficulty

E.g., real estate law

54

E.g., Grade 6 biology

55

E.g., Revised SAT

https://collegereadiness.collegeboard.org/pdf/test-specifications-redesigned-sat-1.pdf

56

57

58

E.g., KABC-I

8 subtests

2 factors

Sequential scale

Hand Movements, HM

Number Recall, NR

Word Order, WO

Simultaneous scale

Gestalt Closure, GC

Triangles, Tr

Spatial Memory, SM

Matrix Analogies, MA

Photo Series, PS

59

60

61

62

[Figure: scatterplot of Y (80 to 130) against X (12 to 22)]

63

$\hat{Y} = 2.5X + 40.0$

Given X = 17, no Y

$\hat{Y} = 82.5$

$\hat{Y} \pm SE_{est}\,(1.96)$

[Figure: scatterplot of Y (80 to 130) against X (12 to 22)]

64

$SE_{est} = SD_Y \sqrt{1 - r_{XY}^2}$

If rXY = 1, then

SEest = 0

If rXY = 0, then

SEest = SDY

E.g., rXY = .60, SDY = 7.5, $\hat{Y} = 82.5$

$SE_{est} = 7.5 \sqrt{1 - .60^2} = 6.0$

95% CI

65

Ŷ ± SEest (z.05)

82.5 ± 6.0 (1.96)

82.5 ± 11.76

[70.74, 94.26]
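The prediction and its interval can be traced step by step in Python. This sketch uses the regression equation and values from the slides (slope 2.5, intercept 40.0, r_XY = .60, SD_Y = 7.5); the function names are hypothetical.

```python
from math import sqrt


def predict(x: float, slope: float = 2.5, intercept: float = 40.0) -> float:
    """Predicted criterion score from the regression line."""
    return slope * x + intercept


def se_estimate(sd_y: float, r_xy: float) -> float:
    """Standard error of estimate: SD_Y * sqrt(1 - r_XY squared)."""
    return sd_y * sqrt(1 - r_xy ** 2)


y_hat = predict(17)
se = se_estimate(7.5, 0.60)
low, high = y_hat - 1.96 * se, y_hat + 1.96 * se
print(y_hat, round(se, 2))            # 82.5 6.0
print(round(low, 2), round(high, 2))  # 70.74 94.26
```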

E.g., rXX = .70, rYY = .10

$\max |r_{XY}| = \sqrt{.70 \times .10} = .26$

−.26 ≤ rXY ≤ .26
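The ceiling that unreliability places on an observed validity coefficient is the square root of the product of the two reliabilities. A sketch; `max_validity` is a hypothetical name.

```python
from math import sqrt


def max_validity(r_xx: float, r_yy: float) -> float:
    """Largest possible absolute correlation between X and Y given their reliabilities."""
    return sqrt(r_xx * r_yy)


# Slide example: r_XX = .70, r_YY = .10
print(round(max_validity(0.70, 0.10), 2))  # 0.26
```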

66

Jingle

Same name, must be same

Jangle

Different name, must be different

X1, X2

Same trait, same method

r12 = .60…?

X1, X2

Same trait, Δ methods

r12 = .10

67

But … X1, X2

Same trait, “leadership”

Δ methods

r12 = .75

Both measure leadership (no)

X1, X2

Δ traits, same method

r12 = .60…?

X1, X2

Δ traits, Δ methods

r12 = .10

68

69

70

71

72

Items

Rate your success in life

0 1 2 3 4 5 6 7 8 9 10

Not at all

successful

Extremely

successful

−5 −4 −3 −2 −1 0 1 2 3 4 5

Not at all

successful

Extremely

successful

73

Schwarz et al. (1991)

0 to 10

0 to 5: 30%

Unipolar (S degree)

−5 to 5

−5 to 0: 15%

Bipolar (F to S)

Which of the following describes you best?

a. I am outgoing

b. I work hard

74

Item statistics

Difficulty (p)

Discrimination (D)

Item-total, rit

$s_i^2 = p(1 - p)$

max $s_i^2$, p = .5

75

76

D = p (U) – p (L)

E.g., p (U) = .65, p (L) = .25

D = .65 − .25 = .40
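The item statistics above can be illustrated with a few helper functions. This is a sketch with hypothetical names; items are assumed to be scored 0 (incorrect) or 1 (correct).

```python
def item_difficulty(responses) -> float:
    """Proportion correct (p) for a 0/1-scored item."""
    return sum(responses) / len(responses)


def item_variance(p: float) -> float:
    """Variance of a dichotomous item, p(1 - p); largest when p = .5."""
    return p * (1 - p)


def discrimination_index(p_upper: float, p_lower: float) -> float:
    """D: proportion correct in the upper group minus that in the lower group."""
    return p_upper - p_lower


print(item_variance(0.5))                            # 0.25
print(round(discrimination_index(0.65, 0.25), 2))    # 0.4
```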

Kelley (1939)

U: ↑ 27%

L: ↓ 27%

max $s_D^2$

Item-total, rit

Keep rit > 0

Drop rit < 0

rit > 0

↑ $\bar{r}_{ij}$, ↑ α

rit < 0

↓ $\bar{r}_{ij}$, ↓ α, maybe < 0

77

78

79

80

81

82

83

84

85

86

ψ 315 Outcome

Test score OK Not

80–100% 88% 12%

40–79 77 23

0–39 55 45

87

Guessing

Difficulty

ICC

Discrimination

[Figure: item characteristic curve (ICC); x-axis: Latent Ability (θ), −3.0 to 3.0; y-axis: Probability of Correct Response, 0 to 1.0]
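The three labels on the ICC slide (guessing, difficulty, discrimination) match the parameters of a three-parameter logistic model. The sketch below assumes that model, which the slide does not name; the function name, the parameter letters a, b, c, and the example values are all hypothetical.

```python
import math


def icc_3pl(theta: float, a: float, b: float, c: float) -> float:
    """Probability of a correct response under a three-parameter logistic model.

    a = discrimination (slope), b = difficulty (location), c = guessing (lower asymptote).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical item: a = 1.5, b = 0.0, c = 0.20
for theta in (-3.0, -1.0, 0.0, 1.0, 3.0):
    print(theta, round(icc_3pl(theta, a=1.5, b=0.0, c=0.20), 3))
```

At θ equal to the difficulty b, the probability is halfway between the guessing level c and 1.0; for very low θ the curve flattens out near c.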

88

Ability

89

Ontario

A learning disorder evident in both academic and social situations that involves one or

more of the processes necessary for the proper use of spoken language or the symbols of

communication, and that is characterized by a condition that

a. is not primarily the result of

impairment of vision;

impairment of hearing;

physical handicap;

mental retardation;

primary emotional disturbance; or

cultural difference; and

b. results in a significant discrepancy between academic achievement and assessed

intellectual ability, with defects in one or more of:

receptive language (i.e., listening, reading);

language processing (i.e., thinking, conceptualizing, integrating);

expressive language (i.e., talking, spelling, writing);

mathematical computations; and

90

c. may be associated with one or more conditions diagnosed as:

a perceptual handicap;

a brain injury;

minimal brain dysfunction;

dyslexia; or

developmental aphasia.

91

≥ 130 Very superior

120–129 Superior

110–119 High average

90–109 Average

80–89 Low average

70–79 Low (Borderline)

< 70 Very low

92

MI IQ

Visual-spatial, Linguistic, Logical-mathematical, Bodily-kinesthetic, Musical, Interpersonal, Intrapersonal, Naturalistic

93

Got Got Got Got

Hero Hero

Aid Aid Aid Aid

× 3

94

WISC-V

6.0–16.11 yrs.

Full Scale (Short)

Primary Index (5)

Ancillary Index (5)

Complementary Index (3)

95

Full Scale (Short)

96

Primary (5)

Verbal Comprehension

Fluid Reasoning

Visual-Spatial

Working Memory

Processing Speed

97

Ancillary (5)

Quantitative Reasoning

Auditory Working Memory

Nonverbal

General Ability

Cognitive Proficiency

98

Complementary Index (3)

Naming Speed

Symbol Translation

Storage and Retrieval

99

Cormier et al. (2016)

N = 880 (2,200)

80, 11 age grps.

6-0 to 16-11 yrs.

English

QC, E areas only

WISC-VCDN-F

https://www.pearsonclinical.ca/content/dam/school/global/clinical/canada/programs/WISC5-FR/WISC-V-CDN-FR_FAQ.pdf

100

KABC-I, 1984, 2.5 to 12.5 yrs.

Sequential

Simultaneous

MPC

Achievement

101

Translation

Ercikan et al. (2004)

DIF, 18–36% items

40%, translation

30%, curriculum

102

Bias

https://news.slashdot.org/story/19/05/16/2138210/sat-to-add-adversity-score-that-rates-students-hardships

104

Bias types

Content

Predictive

Construct

Language

Wording

Idioms, slang

105

Laundra & Sutton (2008)

1) Translate this phrase: “Jet to the Jects.”

a. Run home

b. Walk to the store

c. Go to the house of your significant other

d. Go to the projects

5) A “blunthead” is a:

a. Brother or male cousin

b. Person who is mentally ill

c. Pencil or pen

d. Person who smokes a lot of marijuana

6) What is “cakin’ it” ?

a. Arguing

b. Making cornbread

c. Being lovey-dovey with your boyfriend or girlfriend

d. Making pancakes

106

8) One of these things is not like the other. Which word is out of place?

a. Shawdy

b. Ma

c. Shorty

d. Boss

9) Who were the rappers involved in the first and most famous Rap rival?

a. Jay-Z and Nas

b. Ja Rule and DMX

c. Biggie and Tupac

d. Eminem and Benzino

10) What is “gwap”?

a. A term used to refer to money

b. A term used to refer to male genitalia

c. A term used to refer to nice shoes

d. Another name for a college or university

13) Being “boo’d up” means that you are:

a. Cool

b. Spending time with your boyfriend or girlfriend

c. Constipated

d. Being ridiculed in public

107

108

Ethics

Kaufman (1990)

Agree with model (%)

Experienced, 32

Inexperienced, 35

Range for IQ = 110

Experienced, 107–115

Inexperienced, 108–117

109

K-WIIS report sections

1. Referral & background

2. Physical observations

3. Test behaviors

4. WISC-III scaled scores

5. WISC-III IQ, Index scores

6. IQ-ACH Δs

Observed Δ

IQ – YACH

Predicted Δ

YACH, rYY

YACH - ACH

Y

110

Example of a K-WIIS Report

WISC-III Interpretive Report

Name: Tony Date of Evaluation: [omitted]

Date of Birth: [omitted] Grade: First

Chronological Age: 7 yrs. 6 mos. Examiner: Kline

Referral Information

Tony was referred for evaluation by his teacher because of learning problems in

school and attention and concentration difficulties. The main goals of this evaluation

were to answer the following questions: Is Tony in an appropriate classroom setting? Are

special education services recommended for Tony? Should Tony be monitored for future

developments?

Tony is a Caucasian male, age 7 years and 6 months. The background information

presented here about Tony is primarily based on reports from his mother and also from his

teacher. Tony's parents immigrated to this country. His parents are bilingual and

Italian is spoken in Tony's home. He lives with his biological parents and is the only

child in his residence. His family economic status is working class. As a child, his

home environment was average, that is, neither impoverished nor enriched. Cultural

opportunities at home (e.g., availability of books, family trips to museums) are average,

neither inadequate nor excellent. Both Tony's mother and his father graduated from high

school.

111

WISC-III Psychometric Summary

         IQ            90%          Factor    Index          90%
IQs      Score  %ile   C.I.         Indexes   Score  %ile    C.I.
---------------------------------   ----------------------------------
VIQ       88     21    83- 94       VC         92     30     87- 98
PIQ      121     92   112-126       PO        130     98    120-134
FSIQ     104     61    99-109       FD         72      3     68- 83
                                    PS         83     13     77- 94

                Scaled                                    Scaled
Verbal          Score   %ile        Performance           Score   %ile
---------------------------------   ----------------------------------
Information       9      37         Picture Completion     11      63
Similarities      8      25         Coding                  7-W    16
Arithmetic        5       5         Picture Arrangement    18-S   >99
Vocabulary        7      16         Block Design           13      84
Comprehension    10      50         Object Assembly        17-S    99
(Digit Span)      5       5         (Symbol Search)         6-W     9
---------------------------------   ----------------------------------

Tony was given the WISC-III, a test that evaluates the present level of intellectual functioning of children and adolescents. He scored in the Average range of intelligence, earning a Full Scale IQ of 104. Tony's overall performance on the WISC-III ranks him at the 61st percentile relative to other 7-year-olds. The chances are very good (about 19 out of 20) that Tony's true Full Scale IQ is likely to fall between 99 and 109. For Tony, however, the Full Scale IQ is not meaningful because he displayed a striking discrepancy between his verbal and nonverbal intelligence.

Tony's Performance IQ of 121 is significantly and strikingly higher than his Verbal

IQ of 88. His High Average to Superior PIQ (92nd percentile), when compared to his Low Average to Average VIQ (21st percentile), suggests that his intelligence on these two scales is inconsistent.

Tony's strikingly lower verbal abilities may be related to his referral for a

possible learning problem, bilingual parents, reported weakness in vocabulary and verbal expression, and delayed social development.

112

Recommendations

Tony has been referred for school learning problems and may require remediation. If so, then the following suggestions may prove beneficial for Tony:

A. Individualize each area of instruction so that Tony is taught at the appropriate readiness level for each different skill.

B. Teach to Tony's tolerance level and avoid pushing beyond. (For example, help teacher pinpoint his threshold level and stay at it.)

C. Begin new tasks only when you know Tony is not tired and is "ready" to learn.

Tony had a weakness in auditory and visual short term memory. To help Tony with his memory problem:

A. Employ distributed review: space out the demands for practice of new skills. This will avoid boredom and fatigue, and provide for overlearning and the development of automatic skills. (This is especially useful for math difficulties and word recognition skills.) Suggest ten 3 minute sessions if a half hour's work is necessary.

B. Provide Tony with a written list of reminders. Tony can check each one that is completed off the list as he goes along.

C. Follow a predictable routine, so Tony won't have to learn new formats for completing work successfully. This will help preset expectations and reduce memory load.

D. Don't accept the excuse of "I forgot" to allow Tony to avoid assigned homework or chores. Have him complete the assignment when reminded.

113

http://www.quackwatch.com

114

MMPI-2

L, F, K

Hypochondriasis, Hs

Depression, D

Hysteria, Hy

Psychopath. Deviate, Pd

Masc./Fem., Mf

Paranoia, Pa

Psychasthenia, Pt

Schizophrenia, Sc

Hypomania, Ma

Social Introversion, Si

115

116

Assessment in Education

Practical Assessment, Research & Evaluation

Assessment & Evaluation in Higher Education

Educational Assessment

Educational Assessment, Evaluation and Accountability

Educational Evaluation and Policy Analysis

Educational Measurement: Issues and Practice

Journal of Educational Measurement

International Journal of Educational and Psychological

Assessment

Educational and Psychological Measurement

Applied Psychological Measurement

Psychometrika

Journal of Technology, Learning and Assessment

Assessing Writing

Language Testing

Language Assessment Quarterly
