1979 churchill a paradigm for developing better constructs in marketing
Post on 20-Jul-2016
Embed Size (px)
Measure and Construct Validity Studies
GILBERT A. CHURCHILL, JR,*
A critical element in the evolution of a fundamental body of knowledgein marketing, as well as for improved marketing practice, is the developmentof better measures of the variables with which marketers work. In this articlean appraach is outlined by which this goal can be achieved and portions
of the approach are illustrated in terms of a job satisfaction measure.
A Paradigm for Developing Better Measuresof Marketing Constructs
In an article in the April 1978 issue of the Journalof Marketing, Jacoby placed much of the blame forthe poor quality of some of the marketing literatureon the measures marketers use to assess their variablesof interest (p. 91):
More stupefying than the sheer number of our measuresis the ease with which they are proposed and theuncritical manner in which they are accepted. In pointof fact, mostof our measures are only measures becausesomeone says that they are, not because they havebeen shown to satisfy standard measurement criteria(validity, reliability, and sensitivity). Stated somewhatdifferently, most of our measures are no more sophisti-cated than first asserting that the number of pebblesa person can count in a ten-minute period is a measureof that person's intelligence; next, conducting a studyand fmding that people who can count many pebblesin ten minutes also tend to eat more; and, finally,concluding from this: people with high intelligence tendto eat more.
Gilbert A. Churchill is Professor of Marketing, University ofWisconsin-Madison. The significant contributions of Michael Hous-ton, Shelby Hunt, John Nevin, and Michael Rothschild throughtheir comments on a draft of this article are gratefully acknowledged.as are the many helpful comments of the anonymous reviewers.
The AMA publications policy states: "No articie(s) will bepublished in the Journal of Marketing Research written by the Editoror the Vice President of Publications." The inclusion of this articlewas approved by the Board of Directors because: (1) the articlewas submitted before the author took over as Editor, (2) ihe authorplayed no part in its review, and (3) Michael Ray, who supervisedthe reviewing process for the special issue, formally requestedhe be allowed to publish it.
Burleigh Gardner, President of Social Research,Inc., makes a similar point with respect to attitudemeasurement in a recent issue of the Marketing News(May 5, 1978, p. 1):
Today the social scientists are enamored of numbersand counting. . .. Rarely do they stop and ask, "Whatlies behind the numbers?"When we talk about attitudes we are talking aboutconstructs of the mind as they are expressed in responseto our questions.But usually all we really know are the questions weask and the answers we get.
Marketers, indeed, seem to be choking on theirmeasures, as other articles in this issue attest. Theyseem to spend much effort and time operating bythe routine which computer technicians refer to asGIGOgarbage in, garbage out. As Jacoby so suc-cinctly puts it, "What does it mean if a finding issignificant or that the ultimate in statistical analyticaltechniques have been applied, if the data collectioninstrument generated invalid data at the outset?'' (1978,p. 90).
What accounts for this gap between the obviousneed for better measures and the lack of such mea-sures? The basic thesis of this article is that althoughthe desire may be there, the know-how is not. Thesituation in marketing seems to parallel the dilemmawhich psychologists faced more than 20 years ago,when Tryon (1957, p. 229) wrote:
If an investigator should invent a new psychologicaltest and then turn to any recent scholarly work for
Journal of Marketing ResearchVol. XVI (February 1979). 64-73
PARADIGM FOR DEVELOPING MEASURES OF MARKETING CONSTRUCTS 65
guidance on how to determine its reliability, he wouldconfront such an array of different formulations thathe would be unsure about how to proceed. After fiftyyears of psychological testing, the problem of discover-ing the degree to which an objective measure ofbehavior reliably differentiates individuals is still con-fused.
Psychology has made progress since that time.Attention has moved beyond simple questions ofreliability and now includes more "direct" assess-ments of validity. Unfortunately, the marketing litera-ture has been slow to reflect that progress. One ofthe main reasons is that the psychological literatureis scattered. The notions are available in many bitsand pieces in a variety of sources. There is nooverriding framework which the marketer can embraceto help organize the many definitions and measuresof reliability and validity into an integrated whole sothat the decision as to which to use and when isobvious.
This article is an attempt to provide such a frame-work. A procedure is suggested by which measuresof constructs of interest to marketers can be devel-oped. The emphasis is on developing measures whichhave desirable reliability and validity properties. Partof the article is devoted to clarifying these notions,particularly those related to validity; reliability notionsare well covered by Peter's article in this issue. Finally,the article contains suggestions about approaches onwhich marketers historically have relied in assessingthe quality of measures, but which they would dowell to consider abandoning in favor of some neweralternatives. The rationale as to why the newer al-ternatives are preferred is presented.
THE PROBLEM AND APPROACHTechnically, the process of measurement or opera-
tionalization involves "rules for assigning numbersto objects to represent quantities of attributes" (Nun-nally, 1967, p. 2). The definition involves two keynotions. First, it is the attributes of objects that aremeasured and not the objects themselves. Second,the definition does not specify the rules by whichthe numbers are assigned. However, the rigor withwhich the rules are specified and the skill with whichthey are applied determine whether the construct hasbeen captured by the measure.
Consider some arbitrary construct, C, such as cus-tomer satisfaction. One can conceive at any givenpoint in time that every customer has a "true" levelof satisfaction; call this level X,. Hopefully, eachmeasurement one makes will produce an observedscore. A",,, equal to the object's true score, Xj.Further, if there are differences between objects withrespect to their X,, scores, these differences wouldbe completely attributable to true differences in thecharacteristic one is attempting to measure, i.e., truedifferences in X,-. Rarely is the researcher so fortu-
nate. Much more typical is the measurement wherethe X^ score differences also reflect (Selltiz et al.,1976, p. 164-8):
1. True differences in other relatively stable charac-teristics which affect the score, e.g., a person swillingness to express his or her true feelings.
2. Differences due to transient personal factors, e.g.,a person's mood, state of fatigue.
3. Differences due to situational factors, e.g., whetherthe interview is conducted in the home or at a centralfacility.
4. Differences due to variations in administration, e.g.,interviewers who probe differently.
5. Differences due to sampling of items, e.g., thespecific items used on the questionnaire; if the itemsor the wording of those items were changed, theXg scores would also change.
6. Differences due to lack of clarity of measuringinstruments, e.g., vague or ambiguous questionswhich are interpreted differently by those respond-ing.
7. Differences due to mechanical factors, e.g., a checkmark in the wrong box or a response which is codedincorrectly.
Not all of these factors will be present in everymeasurement, nor are they limited to informationcollected by questionnaire in personal or telephoneinterviews. They arise also in studies in which self-ad-ministered questionnaires or observational techniquesare used. Although the impact of each factor on theA",, score varies with the approach, their impact ispredictable. They distort the observed scores awayfrom the true scores. Functionally, the relationshipcan be expressed as:
systematic sources of error such as stable char-acteristics of the object which affect its score,andrandom sources of error such as transient per-sonal factors which affect the object's score.
A measure is va//f/when the differences in observedscores reflect true differences on the characteristicone is attempting to measure and nothing else, thatis, X^ = Xj.. A measure is reliable to the extent thatindependent but comparable measures of the sametrait or construct of a given object agree. Reliabihtydepends on how much of the variation in scores isattributable to random or chance errors. If a measureis perfectly reliable, X^ = 0. Note that if a measureis valid, it is reliable, but that the converse is notnecessarily true because the observed score whenA"^ = 0 could still equal Xj- + X^. Thus it is oftensaid that reliability is a necessary but not a sufficientcondition for validity. Rehability only provides nega-tive evidence of the validity of the measure. However,the ease with which it can be computed helps explain
JOURNAL OF MARKETING RESEARCH, FEBRUARY 1979
its popularity. Reliability is much more routinelyreported than is evidence, which is much more difficultto secure but which relates more directly to the validityof the measure.
The fundamental objective in measurement is toproduce X^ scores which approximate Xj- scores asclosely as possible. Unfortunately, the researchernever knows for sure what the A*7. scores are. Rather,the measures are always inferences. The quality ofthese inferences depends directly on the proceduresthat are used to develop the measures and the evidencesupp