Letter from Mr. Patrick Marquis
Volume 4 • Number 4 • 2001 VALUE IN HEALTH
© ISPOR 1098-3015/01/$15.00/344 344–345
LETTERS TO THE EDITOR
To the Editor: Health-related quality of life (HRQL), health status, and other specific functional status assessments that are included under the umbrella of PRO (patient-reported outcomes) are increasingly used as efficacy end points in randomized controlled trials. It is now recognized that, although perceptual, PROs can be measured in reliable and valid ways. Indeed, evidence of the scientific soundness of the questionnaire should be provided. In that sense we fully support Paul Kind’s general statement “we need demonstrable rigor in our methods.” Nevertheless, Dr. Kind’s comments raise two major issues: the perspective taken for scaling, and the level of data reported in a single manuscript.
Scaling
It is common practice for multi-item scales in descriptive (psychometric) questionnaires to be scored using the method of summated ratings. Indeed, simple summing of scores over the individual items is the most rational index. This “linear model” approach works if items are measuring the same construct, the scaling assumption being based on a similar distribution of responses to items and similar item variances. In addition, the internal consistency reliability of the scale is estimated using Cronbach’s alpha coefficient. This provides an indication of the degree of convergence between different items hypothesized to represent the same construct. Classic references include Likert [1], Nunnally [2], and Streiner and Norman [3].
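For readers less familiar with these two quantities, the summated rating and Cronbach's alpha can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the response matrix is invented (it is not MSF-4 data), the function names are hypothetical, and alpha is computed with the standard formula k/(k-1) × (1 − sum of item variances / variance of the total score), using sample variances.

```python
from statistics import variance


def summated_scores(responses):
    """Unit-weighted (summated rating) score for each respondent."""
    return [sum(row) for row in responses]


def cronbach_alpha(responses):
    """Cronbach's alpha from a respondents-by-items matrix of ratings."""
    k = len(responses[0])                      # number of items
    items = list(zip(*responses))              # transpose: one tuple per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance(summated_scores(responses))
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical data: 5 respondents answering 4 items rated 0 to 4.
responses = [
    [0, 1, 0, 1],
    [1, 1, 2, 1],
    [2, 2, 2, 3],
    [3, 3, 4, 3],
    [4, 4, 3, 4],
]

scores = summated_scores(responses)
alpha = cronbach_alpha(responses)
print(scores, round(alpha, 3))
```

A high alpha on such data indicates only that the items converge; as the rest of this letter notes, it does not by itself establish that the scaling assumptions hold.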
This was the perspective we took for the MSF-4, bearing in mind that the MSF-4 was a descriptive questionnaire aimed at evaluating the sexual functional status of men with benign prostatic hypertrophy (BPH). We actually followed the psychometric criteria described and recommended by the Medical Outcomes Trust and its Scientific Advisory Committee [4], based on Likert’s [1] theory.
We did not introduce a valuation system (explicit weights) in the scoring algorithm of the MSF-4, given that this questionnaire is not a preference-based instrument and the introduction of differential weights in a one-domain, multi-item scale does not seem to provide a substantial advantage over using the unweighted score, particularly when item-total correlations are similar or when the reliability is acceptable [5,6]. Furthermore, improvement in the quality of the items and/or increases in the number of items are generally recommended ways of improving reliability, rather than the weighting of items. In addition, major issues related to weighting are still under discussion: Which method should be used? Whose value should be taken into account?
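The point about differential weights can be checked on any data set along the following lines. This is a minimal sketch, not the analysis performed for the MSF-4: the responses and the weights are invented for illustration, and the helper names are hypothetical. It computes corrected item-total correlations (each item against the sum of the remaining items) and then the correlation between the unweighted and an arbitrarily weighted total score.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(responses):
    """Correlation of each item with the sum of the other items."""
    totals = [sum(row) for row in responses]
    result = []
    for j in range(len(responses[0])):
        item = [row[j] for row in responses]
        rest = [t - v for t, v in zip(totals, item)]
        result.append(pearson(item, rest))
    return result


def weighted_scores(responses, weights):
    """Total score with an explicit weight applied to each item."""
    return [sum(w * v for w, v in zip(weights, row)) for row in responses]


# Hypothetical data: 5 respondents, 4 items rated 0 to 4.
responses = [
    [0, 1, 0, 1],
    [1, 1, 2, 1],
    [2, 2, 2, 3],
    [3, 3, 4, 3],
    [4, 4, 3, 4],
]
weights = [1.0, 1.5, 0.5, 2.0]  # arbitrary illustrative weights

item_total = corrected_item_total(responses)
unweighted = [sum(row) for row in responses]
weighted = weighted_scores(responses, weights)
agreement = pearson(unweighted, weighted)
print([round(r, 2) for r in item_total], round(agreement, 3))
```

When the item-total correlations are of similar size, the two total scores tend to rank respondents almost identically, which is the situation in which weighting adds little.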
Level of data displayed in a single manuscript
If the first issue raised by Dr. Kind can be considered as theoretical, or even philosophical, the second one is very practical. According to the perspective taken, the underlying theory, and the context, authors have to face a difficult choice. What is the minimum level of data that should be reported in a single manuscript, taking into account the type of journal and the numbers of words/tables recommended by the editor? How much evidence should be provided to demonstrate the appropriateness of the scoring system and the reliability and validity of the PRO instrument? One can easily note that, even though standards of validation are available, great variability exists in the types of data reported in manuscripts describing the development and use of PRO instruments. In particular, details supporting the scoring algorithm or the ordinality of item response categories are not commonly reported.
Following the usual practice, in our manuscript we decided to put the focus on the clinical validity of the MSF-4 questionnaire rather than report details on the scaling assumptions. A great deal more information is available in the analyses than was reported in the manuscript. Interested readers can contact the author for additional details on the MSF-4 instrument characteristics.
Again, we think the main issue is the absence of consensus regarding the type of data that should be shown in a manuscript to support the validation of a scale. In any case, as stated by Dr. Kind, we should go beyond Cronbach’s alpha.
Patrick Marquis, Mapi Values, Lyon, France.
References
1 Likert R. A technique for the measurement of attitudes. Arch Psychol 1932;140:5–55.
2 Nunnally JC. Psychometric Theory (2nd ed.). New York: McGraw-Hill, 1978.
3 Streiner DL, Norman GR. Health Measurement Scales. A Practical Guide to Their Development and Use. New York: Oxford University Press, 1989.
4 Lohr KN, Aaronson NK, Alonso J, et al. Evaluating Quality of Life and Health Status Instruments: development of scientific review criteria. Clin Therapeutics 1996;18:979–92.
5 Lei H, Skinner HA. A psychometric study of life events and social readjustment. J Psychosomatic Research 1980;24:57–65.
6 Edwards AL, Kenney KC. A comparison of the Thurstone and Likert techniques of attitude scale construction. J Appl Psychology 1946;30:72–83.