reliability, validity, generalizability and the use of multi-item scales


Page 1: Reliability, validity, generalizability and the use of multi-item scales

Reliability, validity, generalizability and the use of multi-item scales

Edward Shiu (Dept of Marketing)

[email protected]

Reliable? Valid? Generalizable?

Page 4: Reliability, validity, generalizability and the use of multi-item scales

How to use a questionnaire from published work

• Appendix with items

• Methodology section

Page 5: Reliability, validity, generalizability and the use of multi-item scales

Existing multi-item scales

• Used by many

• Reliability and validity may be known

• Good starting block

• Basis to compare / contrast results

Page 6: Reliability, validity, generalizability and the use of multi-item scales

Development of a Multi-item Scale (Doing it the HARD way!! See Malhotra & Birks, 2007)

1. Develop Theory

2. Generate Initial Pool of Items: Theory, Secondary Data, and Qualitative Research

3. Select a Reduced Set of Items Based on Qualitative Judgment

4. Collect Data from a Large Pretest Sample

5. Statistical Analysis

6. Develop Purified Scale

7. Collect More Data from a Different Sample

8. Evaluate Scale Reliability, Validity, and Generalizability

9. Final Scale

Page 7: Reliability, validity, generalizability and the use of multi-item scales

Example of Scale Development

• See Richins & Dawson (1992) “A Consumer Values Orientation for Materialism and its Measurement: Scale Development and Validation,” Journal of Consumer Research, 19 (December), 303-316.

• Materialism scale (7 items)

– Marketing Scales Handbook (Vol IV) p. 352.

1. It is important to me to have really nice things.

2. I would like to be rich enough to buy anything I want.

3. I'd be happier if I could afford to buy more things.

4. ......

• Note: published scales are not always perfect!

Page 8: Reliability, validity, generalizability and the use of multi-item scales

Scale Evaluation (See Malhotra & Birks, 2007)

• Reliability

– Test/Retest

– Alternative Forms

– Internal Consistency

• Validity

– Content

– Criterion

– Construct: Convergent, Discriminant, Nomological

• Generalizability

Page 9: Reliability, validity, generalizability and the use of multi-item scales

Reliability & Validity

• Reliability: the extent to which a measuring procedure yields consistent results on repeated administrations of the scale

• Validity: the degree to which a measuring procedure accurately reflects, assesses or captures the specific concept that the researcher is attempting to measure


Page 10: Reliability, validity, generalizability and the use of multi-item scales

Reliability

• Internal consistency reliability: do the items in the scale gel well together?

• Split-half reliability: the items on the scale are divided into two halves and the resulting half scores are correlated

• Cronbach alpha (α)

– average of all possible 'split-half' correlation coefficients resulting from different ways of splitting the scale items

– value normally varies from 0 to 1 (it can turn negative if the items are, on average, negatively correlated)

– α < 0.6 indicates unsatisfactory internal consistency reliability (see Malhotra & Birks, 2007, p.358)

– Note: alpha tends to increase with an increase in the number of items in the scale
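
The two internal-consistency measures above can be sketched in Python, assuming NumPy is available. The function names and the small response matrix are hypothetical and only for illustration. Alpha is computed with the usual variance formula, α = k/(k−1) × (1 − sum of item variances / variance of the summed score), and the split-half function uses a simple first-half/second-half split, which is only one of many possible splits.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) array of scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                          # number of items in the scale
    item_vars = items.var(axis=0, ddof=1)       # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed scale score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def split_half_correlation(items):
    """Correlate summed scores on the first half of the items with the second half."""
    items = np.asarray(items, dtype=float)
    half = items.shape[1] // 2
    first = items[:, :half].sum(axis=1)
    second = items[:, half:].sum(axis=1)
    return np.corrcoef(first, second)[0, 1]

# Hypothetical data: 6 respondents x 4 items on a 1-7 scale (not from COCB.sav)
scores = np.array([
    [7, 6, 7, 6],
    [5, 5, 4, 5],
    [3, 2, 3, 3],
    [6, 6, 5, 6],
    [2, 3, 2, 2],
    [4, 4, 5, 4],
])

alpha = cronbach_alpha(scores)
r_half = split_half_correlation(scores)
print(f"alpha = {alpha:.3f}, split-half r = {r_half:.3f}")
```

Because the made-up items move closely together, both values come out high; real survey data will usually be noisier.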

Page 11: Reliability, validity, generalizability and the use of multi-item scales

• Test-retest reliability

– identical scale items are administered at two different times to the same set of respondents

– assess (via correlation) if respondents give similar answers

• Alternative-forms reliability

– two equivalent forms of the scale are constructed

– the same respondents are measured at two different times, with a different form being used each time

– assess (via correlation) if respondents give similar answers

– Note: hardly ever practical
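
As a minimal sketch of the "assess via correlation" step, assuming NumPy and two hypothetical sets of summed scale scores for the same respondents, the Pearson correlation between the two administrations can be computed as follows. The same calculation would apply to alternative forms, with the second array holding scores from the second form.

```python
import numpy as np

# Hypothetical summed scale scores for the same 8 respondents at two points in time
time1 = np.array([22, 18, 25, 12, 30, 15, 20, 27])
time2 = np.array([21, 19, 24, 14, 29, 13, 22, 26])

# Pearson correlation between the two administrations
r_test_retest = np.corrcoef(time1, time2)[0, 1]
print(f"test-retest r = {r_test_retest:.3f}")
```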

Page 12: Reliability, validity, generalizability and the use of multi-item scales

Construct Validity

• Construct validity is evidenced if we can establish convergent validity, discriminant validity and nomological validity

• Convergent validity: the extent to which the scale correlates positively with other measures of the same construct

• Discriminant validity: the extent to which the scale does not correlate with measures of other, conceptually distinct constructs

• Nomological validity: the extent to which the scale correlates in theoretically predicted ways with measures of other distinct but related constructs

• Also read Malhotra & Birks (2007, pp. 358-359) on content (or face) validity and criterion (concurrent & predictive) validity

Page 13: Reliability, validity, generalizability and the use of multi-item scales

Generalizability

• Refers to the extent to which you can generalise from your specific observations beyond your limited study, situation, items used, method of administration, context.....

• Hardly ever possible!!!

Page 14: Reliability, validity, generalizability and the use of multi-item scales

Fun time

• Now onto the data (COCB.sav)!

• Read my forthcoming JBR article for background on COCB and the scale

• First, SPSS and Cronbach alpha

• Next, Amos and CFA

• Followed by Excel to calculate composite/construct reliability and AVE, as well as establish discriminant validity

Page 15: Reliability, validity, generalizability and the use of multi-item scales

Cronbach alpha (α)

• SPSS (Analyze…Scale…Reliability Analysis)

• α < 0.6 indicates unsatisfactory internal consistency reliability (see Malhotra & Birks, 2007, p.358)

• α > 0.7 indicates satisfactory internal consistency reliability (Nunnally & Bernstein, 1994)

Ref: Nunnally JC & Bernstein IH. (1994) Psychometric Theory. New York: McGraw-Hill.

Page 17: Reliability, validity, generalizability and the use of multi-item scales

SPSS output for α

Alpha value for the Credibility dimension = 0.894 > 0.7, hence satisfactory

Page 18: Reliability, validity, generalizability and the use of multi-item scales

SPSS further output for α

• We note that the alpha value for the Credibility dimension would increase (from 0.894 to 0.902) if item cred4 were removed.

• However, unless the improvement is dramatic AND there are separate reasons (e.g. similar findings from other studies), we should leave the item as part of the dimension.
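
The "alpha if item deleted" figures that SPSS reports can be approximated by recomputing alpha with each item left out in turn. The sketch below assumes NumPy; the function names and the data are hypothetical (they are not the cred1 to cred4 responses), with the last item deliberately made to track the others less closely.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) array of scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

def alpha_if_item_deleted(items):
    """Alpha recomputed after dropping each item (column) in turn."""
    items = np.asarray(items, dtype=float)
    return [cronbach_alpha(np.delete(items, j, axis=1))
            for j in range(items.shape[1])]

# Hypothetical data: 6 respondents x 4 items; item 4 is the weakest
scores = np.array([
    [7, 6, 7, 4],
    [5, 5, 4, 5],
    [3, 2, 3, 4],
    [6, 6, 5, 3],
    [2, 3, 2, 3],
    [4, 4, 5, 5],
])

overall = cronbach_alpha(scores)
deleted = alpha_if_item_deleted(scores)
for j, a in enumerate(deleted, start=1):
    print(f"item {j} removed: alpha = {a:.3f} (overall {overall:.3f})")
```

In this made-up example, dropping item 4 raises alpha noticeably; whether such an item should actually be dropped remains the judgment call described above.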

Page 19: Reliability, validity, generalizability and the use of multi-item scales

Limitations for Cronbach alpha

• We should employ multiple measures of reliability (Cronbach alpha, composite/construct reliability CR & Average Variance Extracted AVE)

– Alpha and CR values are often very similar, but AVEs can differ much more from alpha values

– AVEs are also used to assess construct discriminant validity

Page 20: Reliability, validity, generalizability and the use of multi-item scales

Composite/Construct Reliability

• CR = (sum of standardized loadings)² / [(sum of standardized loadings)² + (sum of indicator measurement errors)]

$$\mathrm{CR} = \frac{\left(\sum_i \lambda_i\right)^2}{\left(\sum_i \lambda_i\right)^2 + \sum_i \delta_i}$$

• AVE = Average Variance Extracted = Variance Extracted = [sum of (standardized loadings squared)] / {[sum of (standardized loadings squared)] + (sum of indicator measurement errors)}

$$\mathrm{AVE} = \frac{\sum_i \lambda_i^2}{\sum_i \lambda_i^2 + \sum_i \delta_i}$$

where λ_i is the standardized loading of indicator i and δ_i is its measurement error.

• Note: recommended thresholds are CR > 0.6 and AVE > 0.5; if both are met, construct internal consistency is evidenced (Fornell & Larcker, 1981).

Ref: Fornell, Claes and David G. Larcker (1981). “Evaluating Structural Equation Models with Unobservable Variables and Measurement Error,” Journal of Marketing Research, 18(1, February): 39-50.
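
The slides do this calculation in Excel; the Python sketch below, assuming NumPy, is only an equivalent illustration of the CR and AVE formulas above. The loadings are hypothetical, and the code assumes each indicator's measurement error equals 1 minus its squared standardized loading, which holds for standardized indicators but should be replaced by the error estimates from your own CFA output where these are reported.

```python
import numpy as np

def composite_reliability(loadings):
    """CR from standardized loadings, assuming error_i = 1 - loading_i**2."""
    lam = np.asarray(loadings, dtype=float)
    errors = 1 - lam**2                      # assumption: standardized indicators
    return lam.sum()**2 / (lam.sum()**2 + errors.sum())

def average_variance_extracted(loadings):
    """AVE from standardized loadings, assuming error_i = 1 - loading_i**2."""
    lam = np.asarray(loadings, dtype=float)
    errors = 1 - lam**2                      # assumption: standardized indicators
    return (lam**2).sum() / ((lam**2).sum() + errors.sum())

# Hypothetical standardized loadings for one four-indicator construct
loadings = [0.80, 0.85, 0.75, 0.70]

cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
print(f"CR = {cr:.3f}, AVE = {ave:.3f}")
print("meets thresholds:", cr > 0.6 and ave > 0.5)
```

Under this assumption the AVE denominator equals the number of indicators, so AVE is simply the mean of the squared loadings.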

Page 21: Reliability, validity, generalizability and the use of multi-item scales

Discriminant validity

• Discriminant validity is assessed by comparing the shared variance (squared correlation) between each pair of constructs against the minimum of the AVEs for these two constructs.

• If, within each possible pair of constructs, the shared variance observed is lower than the minimum of their AVEs, then discriminant validity is evidenced (Fornell and Larcker, 1981).
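
The pairwise comparison described above can be sketched as follows, assuming NumPy. The construct names "A", "B" and "C", their AVEs and their inter-construct correlations are all hypothetical; in practice they would come from the CFA output.

```python
from itertools import combinations

import numpy as np

def discriminant_validity(ave, corr):
    """For each pair of constructs, check squared correlation < min of the two AVEs.

    ave  : dict mapping construct name -> AVE (order must match rows of corr)
    corr : square matrix of inter-construct correlations
    """
    names = list(ave)
    results = {}
    for i, j in combinations(range(len(names)), 2):
        shared_variance = corr[i][j] ** 2
        threshold = min(ave[names[i]], ave[names[j]])
        results[(names[i], names[j])] = bool(shared_variance < threshold)
    return results

# Hypothetical AVEs and inter-construct correlations
ave = {"A": 0.62, "B": 0.55, "C": 0.70}
corr = np.array([
    [1.00, 0.60, 0.40],
    [0.60, 1.00, 0.80],
    [0.40, 0.80, 1.00],
])

checks = discriminant_validity(ave, corr)
for pair, ok in checks.items():
    print(pair, "discriminant validity evidenced" if ok else "NOT evidenced")
```

Here the pair (B, C) fails because its shared variance (0.80² = 0.64) exceeds the smaller AVE (0.55), while the other two pairs pass.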

Page 22: Reliability, validity, generalizability and the use of multi-item scales

Amos (Analysis of Moment Structures)

[Path diagram: CFA measurement model with five latent factors, their indicators and error terms]

• Comm: comm1 (e3), comm2 (e2), ….

• Bene: bene1 (e6), bene2 (e5), bene3 (e4)

• Cred: cred1 (e10), cred2 (e9), cred3 (e8), cred4 (e11)

• COCB: ave_SSI (e12), ave_POC (e13), ave_Voice (e14), ave_wom (e15), ave_BAoSF (e16), ave_DoRA (e17), ave_Flex (e18), ave_PiFA (e19)

• Loyalty: loy1 (e22), loy2 (e23), loy3 (e24)

Legend

• Rectangles = observed variables

• Ellipses = unobserved variables

• loy1; loy2; loy3; comm1; comm2;….; cred1; …. bene1;....;ave_PiFA = SPSS variables

• e1 to e24 = error variances = uniqueness

• Loyalty; Comm; Cred; Bene; COCB = latent factors = unobserved factors

Page 23: Reliability, validity, generalizability and the use of multi-item scales

CFA and goodness of fit

• See Hair et al.'s book

• E.g.,

• The CFA resulted in an acceptable overall fit (GFI=.90, CFI=.94, TLI=.92, RMSEA=.068, and χ²=524.64, df=160, p<.001). All indicators load significantly (p<.001) and substantively (standardized coef >.5) onto their respective constructs, thus providing evidence of convergent validity.
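
To show how a fit index such as RMSEA relates to the reported χ² and df, here is a small sketch using one common formula, RMSEA = sqrt(max(χ² − df, 0) / (df × (N − 1))). The slide does not report the sample size, so the value of `n` below is purely hypothetical and the result is not expected to reproduce the reported .068 exactly; some software also uses N rather than N − 1 in the denominator.

```python
import math

def rmsea(chi2, df, n):
    """RMSEA from the model chi-square, degrees of freedom and sample size n."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

chi2, df = 524.64, 160   # values reported on the slide
n = 500                  # hypothetical sample size (not given in the slides)

value = rmsea(chi2, df, n)
print(f"RMSEA with n={n}: {value:.3f}")
```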

Page 24: Reliability, validity, generalizability and the use of multi-item scales

Refs

• Baumgartner H, Homburg C. (1996). “Applications of structural equation modeling in marketing and consumer research: a review,” International Journal of Research in Marketing, 13(2): 139-61.

• Churchill, Gilbert A., Jr. (1979). “A Paradigm for Developing Better Measures of Marketing Constructs,” Journal of Marketing Research, 16(1, February): 64-73.

• Fornell, Claes and David G. Larcker (1981). “Evaluating Structural Equation Models with Unobservable Variables and Measurement Error,” Journal of Marketing Research, 18(1, February): 39-50.

• Hair, Joseph F., Jr., Rolph E. Anderson, Ronald L. Tatham, and William C. Black (1998). Multivariate Data Analysis. 5th ed. Englewood Cliffs, NJ: Prentice Hall.

• Nunnally JC & Bernstein IH. (1994). Psychometric Theory. New York: McGraw-Hill.