reliability, validity, generalizability and the use of multi-item scales
Post on 01-Nov-2014
Reliability, validity,
generalizability and the use of
multi-item scales
Edward Shiu (Dept of Marketing)
edward.shiu@strath.ac.uk
Reliable? Valid?
Generalizable?
Multi-item scales
How to use a questionnaire from
published work
• Appendix with items
• Methodology section
Existing multi-item scales
• Used by many
• Reliability and validity may be known
• Good starting block
• Basis to compare / contrast results
Development of a Multi-item Scale (Doing it the HARD way!! See Malhotra & Birks, 2007)
1. Develop Theory
2. Generate Initial Pool of Items: Theory, Secondary Data, and Qualitative Research
3. Select a Reduced Set of Items Based on Qualitative Judgment
4. Collect Data from a Large Pretest Sample
5. Statistical Analysis
6. Develop Purified Scale
7. Collect More Data from a Different Sample
8. Evaluate Scale Reliability, Validity, and Generalizability
9. Final Scale
Example of Scale Development
• See Richins & Dawson (1992) “A Consumer
Values Orientation for Materialism and its
Measurement: Scale Development and
Validation,” Journal of Consumer Research, 19
(December), 303-316.
• Materialism scale (7 items)
– Marketing Scales Handbook (Vol IV) p. 352.
1. It is important to me to have really nice things.
2. I would like to be rich enough to buy anything I want.
3. I'd be happier if I could afford to buy more things.
4. ......
• Note: published scales are not always perfect!!!
Scale Evaluation (See Malhotra & Birks, 2007)
• Reliability
– Test/Retest
– Alternative Forms
– Internal Consistency
• Validity
– Content
– Criterion
– Construct (Convergent, Discriminant, Nomological)
• Generalizability
Scale Evaluation
Reliability & Validity
• Reliability: the extent to which a measuring procedure yields consistent results on repeated administrations of the scale
• Validity: the degree to which a measuring procedure accurately reflects, assesses or captures the specific concept that the researcher is attempting to measure
[Figure: Reliable vs. Valid]
Reliability
• Internal consistency reliability
DO THE ITEMS IN THE SCALE GEL WELL TOGETHER?
• Split-half reliability: the items on the scale are divided into two halves and the resulting half scores are correlated
• Cronbach alpha (α)
– average of all possible 'split-half' correlation coefficients resulting from different ways of splitting the scale items
– value normally varies from 0 to 1 (it can turn negative when the items are, on average, negatively correlated)
– α < 0.6 indicates unsatisfactory internal consistency reliability (see Malhotra & Birks, 2007, p.358)
– Note: alpha tends to increase with an increase in the number of items in the scale
• test-retest reliability – identical scale items administered at two different
times to same set of respondents
– assess (via correlation) if respondents give similar answers
• alternative-forms reliability – two equivalent forms of the scale are constructed
– same respondents are measured at two different times, with a different form being used each time
– assess (via correlation) if respondents give similar answers
– Note. Hardly ever practical
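The correlation-based checks above (split-half and test-retest) can be sketched in a few lines of Python. This is only an illustration: the response data are made up, the odd/even split is one of many possible splits, and the helper names `pearson` and `split_half` are hypothetical.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def split_half(responses):
    """Correlate the half scores from items 1, 3, ... and items 2, 4, ..."""
    first = [sum(row[0::2]) for row in responses]
    second = [sum(row[1::2]) for row in responses]
    return pearson(first, second)

# Hypothetical data: 6 respondents x 4 items, 1-7 scale, two administrations
time1 = [[6, 7, 6, 5], [2, 3, 2, 3], [5, 5, 6, 6],
         [3, 2, 3, 2], [7, 6, 7, 7], [4, 4, 3, 4]]
time2 = [[6, 6, 6, 5], [3, 3, 2, 2], [5, 6, 6, 5],
         [2, 2, 3, 3], [7, 7, 6, 7], [4, 3, 4, 4]]

split_half_r = split_half(time1)
# Test-retest: correlate total scale scores across the two administrations
test_retest_r = pearson([sum(r) for r in time1], [sum(r) for r in time2])
print(round(split_half_r, 3), round(test_retest_r, 3))
```

Alternative-forms reliability would use the same `pearson` call, with the second list holding scores from the equivalent form.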
Construct Validity
• Construct validity is evidenced if we can establish convergent validity, discriminant validity and nomological validity
• Convergent validity: the extent to which the scale correlates positively with other measures of the same construct
• Discriminant validity: the extent to which the scale does not correlate with other conceptually distinct constructs
• Nomological validity: the extent to which the scale correlates in theoretically predicted ways with other distinct but related constructs
• Also read Malhotra & Birks, 2007, 358-359 on content (or face) validity and criterion (concurrent & predictive) validity
Generalizability
• Refers to extent you can generalise from
your specific observations to beyond your
limited study, situation, items used,
method of administration, context.....
• Hardly ever possible!!!
Fun time
• Now onto the data (COCB.sav) !!!!!!
• Read my forthcoming JBR article for
background on COCB and the scale
• First, SPSS and Cronbach alpha
• Next, Amos and CFA
• Followed by Excel to calculate
composite/construct reliability and AVE, as
well as establish discriminant validity
Cronbach alpha (α)
• SPSS (Analyze…Scale…Reliability Analysis)
• α < 0.6 indicates unsatisfactory internal consistency reliability (see Malhotra & Birks, 2007, p.358)
• α > 0.7 indicates satisfactory internal consistency reliability (Nunnally & Bernstein, 1994)
Ref: Nunnally JC & Bernstein IH. (1994) Psychometric Theory. New York: McGraw-Hill.
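For readers without SPSS, alpha can also be computed directly. The sketch below uses the usual variance-based formula, α = k/(k−1) × (1 − sum of item variances / variance of total score), rather than averaging split-half coefficients. The data are hypothetical (not COCB.sav) and `cronbach_alpha` is an illustrative helper name.

```python
from statistics import variance

def cronbach_alpha(responses):
    """Cronbach alpha from a list of respondent rows (one value per item)."""
    k = len(responses[0])
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of the summed scale score
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical data: 6 respondents x 4 items on a 1-7 scale
data = [[6, 7, 6, 5], [2, 3, 2, 3], [5, 5, 6, 6],
        [3, 2, 3, 2], [7, 6, 7, 7], [4, 4, 3, 4]]

alpha = cronbach_alpha(data)
print(round(alpha, 3))
# Compare against the thresholds quoted above
print("satisfactory" if alpha > 0.7 else
      "unsatisfactory" if alpha < 0.6 else "borderline")
```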
SPSS output for α
Alpha value for dimension Credibility = 0.894 > 0.7 hence satisfactory
SPSS further output for α
• We note that alpha value for the Credibility dimension would increase in value (from 0.894 to 0.902) if item cred4 is removed.
• However, unless the improvement is dramatic AND there are separate reasons (e.g. similar findings from other studies), we should leave the item as part of the dimension.
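The "alpha if item removed" figures discussed above can be reproduced by recomputing alpha with each item dropped in turn. A minimal sketch, assuming made-up data and hypothetical helper names (the cred1 to cred4 values from the SPSS output are not reproduced here):

```python
from statistics import variance

def cronbach_alpha(responses):
    """Variance-based Cronbach alpha for a list of respondent rows."""
    k = len(responses[0])
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    total_var = variance([sum(row) for row in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def alpha_if_deleted(responses):
    """Alpha of the remaining items after dropping each item in turn."""
    k = len(responses[0])
    return [
        cronbach_alpha([row[:i] + row[i + 1:] for row in responses])
        for i in range(k)
    ]

# Hypothetical data: 6 respondents x 4 items on a 1-7 scale
data = [[6, 7, 6, 5], [2, 3, 2, 3], [5, 5, 6, 6],
        [3, 2, 3, 2], [7, 6, 7, 7], [4, 4, 3, 4]]

full_alpha = cronbach_alpha(data)
dropped = alpha_if_deleted(data)
for i, a in enumerate(dropped, start=1):
    # A positive change means alpha would rise if this item were removed
    print(f"item {i}: alpha if deleted = {a:.3f} (change {a - full_alpha:+.3f})")
```

A small positive change on its own is not a reason to drop an item, as the slide notes.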
Limitations for Cronbach alpha
• We should employ multiple measures of reliability (Cronbach alpha, composite/construct reliability CR & Average Variance Extracted AVE)
– Alpha and CR values are often very similar, but AVEs can vary much more from alpha values
– AVEs are also used to assess construct discriminant validity
Composite/Construct Reliability
• CR = (sum of standardized loadings)^2 / {(sum of standardized loadings)^2 + (sum of indicator measurement errors)}
• AVE = Average Variance Extracted = Variance Extracted = {sum of (standardized loadings squared)} / {[sum of (standardized loadings squared)] + (sum of indicator measurement errors)}
• Note: Recommended thresholds: if CR > 0.6 and AVE > 0.5, then construct internal consistency is evidenced (Fornell & Larcker, 1981).
Ref: Fornell, Claes and David G. Larcker (1981). “Evaluating Structural
Equation Models with Unobservable Variables and Measurement Error,” Journal of Marketing Research, 18(1, February): 39-50.
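The two formulas above are easy to check by hand or in code. The sketch below assumes standardized loadings, so each indicator's measurement error is taken as 1 minus its squared loading. The loadings are invented for illustration and the function names are hypothetical.

```python
def composite_reliability(loadings):
    """CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of errors)."""
    errors = [1 - l ** 2 for l in loadings]  # assumes standardized loadings
    squared_sum = sum(loadings) ** 2
    return squared_sum / (squared_sum + sum(errors))

def average_variance_extracted(loadings):
    """AVE = sum of squared loadings / (sum of squared loadings + sum of errors)."""
    errors = [1 - l ** 2 for l in loadings]  # assumes standardized loadings
    sum_squares = sum(l ** 2 for l in loadings)
    return sum_squares / (sum_squares + sum(errors))

# Hypothetical standardized loadings for a three-item construct
loadings = [0.8, 0.7, 0.9]

cr = composite_reliability(loadings)
ave = average_variance_extracted(loadings)
print(f"CR = {cr:.3f}, AVE = {ave:.3f}")
# Thresholds quoted in the text: CR > 0.6 and AVE > 0.5
print("internal consistency evidenced:", cr > 0.6 and ave > 0.5)
```

With standardized loadings the AVE denominator equals the number of items, so AVE reduces to the mean squared loading.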
Discriminant validity
• Discriminant validity is assessed by comparing the shared variance (squared correlation) between each pair of constructs against the minimum of the AVEs for these two constructs.
• If, for every possible pair of constructs, the shared variance observed is lower than the minimum of their AVEs, then discriminant validity is evidenced (Fornell and Larcker, 1981).
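The pairwise comparison described above can be sketched as follows. The AVEs and inter-construct correlations are invented purely to show one passing and one failing pair; they are not results from the COCB data, and `discriminant_check` is a hypothetical helper name.

```python
def discriminant_check(aves, correlations):
    """For each construct pair, test squared correlation < min of the two AVEs."""
    results = {}
    for (a, b), r in correlations.items():
        shared_variance = r ** 2
        results[(a, b)] = shared_variance < min(aves[a], aves[b])
    return results

# Hypothetical AVEs and inter-construct correlations
aves = {"Cred": 0.68, "Bene": 0.55, "Comm": 0.60}
correlations = {
    ("Cred", "Bene"): 0.62,
    ("Cred", "Comm"): 0.48,
    ("Bene", "Comm"): 0.80,
}

checks = discriminant_check(aves, correlations)
for pair, ok in checks.items():
    print(pair, "OK" if ok else "shared variance is not below the minimum AVE")
# Discriminant validity is evidenced only if every pair passes
print("discriminant validity evidenced:", all(checks.values()))
```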
Amos (Analysis of Moment Structures)
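[Figure: Amos path diagram of the CFA measurement model; latent factors with their indicators (error terms in brackets)]
Comm: comm2 (e2), comm1 (e3)
Bene: bene3 (e4), bene2 (e5), bene1 (e6)
Cred: cred3 (e8), cred2 (e9), cred1 (e10), cred4 (e11)
COCB: ave_SSI (e12), ave_POC (e13), ave_Voice (e14), ave_wom (e15), ave_BAoSF (e16), ave_DoRA (e17), ave_Flex (e18), ave_PiFA (e19)
Loyalty: loy1 (e22), loy2 (e23), loy3 (e24)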
Rectangles = observed variables
Ellipses = unobserved variables
loy1; loy2; loy3; comm1; comm2;….; cred1; …. bene1;....;ave_PiFA = SPSS variables
e1 to e24 = error variances = uniqueness
Loyalty; Comm; Cred; Bene; COCB = latent factors = unobserved factors
CFA and goodness of fit
• See Hair et al.'s book
• E.g.,
• The CFA resulted in an acceptable overall fit (GFI=.90, CFI=.94, TLI=.92, RMSEA=.068, and χ²=524.64, df=160, p<.001). All indicators load significantly (p<.001) and substantively (standardized coef >.5) on to their respective constructs, thus providing evidence of convergent validity.
Refs
• Baumgartner H, Homburg C. (1996). “Applications of structural
equation modeling in marketing and consumer research: a review,”
International Journal of Research in Marketing,13(2):139–61.
• Churchill, Gilbert A., Jr. (1979). “A Paradigm for Developing Better
Measures of Marketing Constructs,” Journal of Marketing Research,
16(1, February): 64-73.
• Fornell, Claes and David G. Larcker (1981). “Evaluating Structural
Equation Models with Unobservable Variables and Measurement
Error,” Journal of Marketing Research, 18(1, February): 39-50.
• Hair, Joseph F., Jr., Rolph E. Anderson, Ronald L. Tatham, and
William C. Black (1998), Multivariate Data Analysis. 5th ed.
Englewood Cliffs, NJ: Prentice Hall.
• Nunnally JC & Bernstein IH. (1994) Psychometric Theory. New York:
McGraw-Hill.