nutrition and pregnancy

test developer correlates the scores on one testing with scores on a second testing that uses the same instrument and the same group of people. This correlation results in a test-retest reliability coefficient. One problem with using the identical test for test-retest reliability is that the person taking the test may learn from the test administration and carry the learning over to the next test session. This carryover effect could invalidate the reliability estimate because "all other sources of inconsistency" would not have been held constant.
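As a rough illustration of the test-retest coefficient described above, the following sketch computes a Pearson correlation between two administrations of the same instrument. The score lists and the function name `pearson_r` are hypothetical and chosen only for this example; the text does not prescribe a particular computing procedure.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two score lists of equal length (at least 2 people)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for the same five people on two testings.
first_testing = [12, 15, 9, 20, 17]
second_testing = [13, 14, 10, 19, 18]

# The test-retest reliability coefficient is the correlation of the two sets.
test_retest_r = pearson_r(first_testing, second_testing)
print(round(test_retest_r, 2))
```

The same function would apply to scores from 2 parallel forms; only the source of the second score list changes.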

Reliability derived from equivalence compares the scores on 2 parallel test forms that purport to measure the same content domain or construct. The scores on one form of the test are correlated with the scores on the other form. Equivalent or parallel tests eliminate some of the problems, due to learning from the test, that may arise with test-retest on the same instrument. However, the validity of this approach relies heavily upon the extent to which the 2 forms are, in fact, parallel.

In split-half (e.g., odd-even, matched half) reliability, one constructs parallel forms within a single test and computes the correlation between the 2 parts.
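An odd-even split can be sketched as follows, assuming each respondent's answers are stored as a row of item scores (1 for correct, 0 for incorrect). The data and function names are hypothetical; a matched-half split would differ only in how items are assigned to the 2 parts.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def odd_even_half_scores(responses):
    """Return (odd_totals, even_totals) for a people-by-items score matrix."""
    odd = [sum(row[0::2]) for row in responses]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in responses]  # items 2, 4, 6, ...
    return odd, even

# Hypothetical data: 6 respondents, 6 items scored 1 (correct) or 0 (incorrect).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
]

odd_totals, even_totals = odd_even_half_scores(responses)
split_half_r = pearson_r(odd_totals, even_totals)
print(round(split_half_r, 2))
```

Note that this correlation describes 2 half-length tests, so it is not directly comparable to a coefficient for the full-length instrument.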

Closely related to the split-half reliability estimate is internal consistency reliability, which refers to the degree to which each item or cluster of items relates to the total test score. The Kuder-Richardson formulas, Hoyt's ANOVA, and Cronbach's alpha are among the approaches commonly used to determine internal consistency reliability. These formulas use data from a single test administration to estimate how highly the test would correlate with a parallel test.
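Of the approaches named above, Cronbach's alpha is perhaps the simplest to sketch. The version below uses the standard form, alpha = k/(k - 1) x (1 - sum of item variances / variance of total scores), where k is the number of items. The data are hypothetical, and the choice of population variances is an assumption of this example; with items scored 0 or 1 and population variances, the result coincides with Kuder-Richardson formula 20.

```python
from statistics import pvariance

def cronbach_alpha(responses):
    """Cronbach's alpha for a people-by-items matrix of item scores."""
    k = len(responses[0])
    if k < 2:
        raise ValueError("alpha needs at least 2 items")
    # Variance of each item (column) across respondents.
    item_variances = [pvariance(column) for column in zip(*responses)]
    # Variance of the respondents' total scores.
    total_variance = pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

# Hypothetical data: 6 respondents, 6 items scored 1 (correct) or 0 (incorrect).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 2))
```

All inputs come from a single test administration, which is what distinguishes this estimate from the test-retest and parallel-form coefficients.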

Several aspects of an instrument may influence reliability. In general, the greater the number of items in a test, the higher will be the reliability. Longer tests have a greater chance of comprehensively sampling the domain being tested. However, practical considerations such as time and expense will limit test expansion as a means of improving reliability. Subscores of a test usually have a lower reliability coefficient than does the test as a whole. Subscore reliabilities must be determined separately; they cannot be assumed from the reliability of the total test.
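The relationship between test length and reliability is often approximated with the Spearman-Brown formula, which the text does not name and which is introduced here only as an illustration. It assumes that any added items are comparable to the existing ones. The function name and the example values are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Projected reliability when test length is multiplied by length_factor.

    Assumes the added (or removed) items are parallel to the existing items.
    """
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Hypothetical case: a test with reliability 0.60 is doubled in length.
doubled = spearman_brown(0.60, 2)

# Hypothetical case: a subscore built from one quarter of the items
# of a test whose total-score reliability is 0.90.
subscore = spearman_brown(0.90, 0.25)

print(round(doubled, 2), round(subscore, 2))
```

A length factor below 1 projects a lower coefficient, which is consistent with the observation that subscores tend to be less reliable than the total score. The projection is no substitute for determining subscore reliability directly.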

Item analysis procedures often reveal reasons for poor reliability of an instrument. The difficulty index is the percentage of respondents who choose the correct response. An item with a very low difficulty index may be too complicated for the purpose of the test or may be poorly worded. The discrimination index and the item-to-total test correlation are approaches to measurement of the degree to which high-scoring respondents score higher on the item than do low-scoring respondents. High, positive values are desirable. Items that have negative discrimination indices are answered correctly more often by low-scoring than by high-scoring respondents. An item might have a low discrimination index because it is so easy or difficult that everyone scores similarly, because it is ambiguous or confusing, or because the key is incorrect. Revision or deletion of items with low or negative discrimination indices or item-to-total test correlations usually will improve reliability of the instrument.
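The 2 indices can be sketched as follows. The difficulty index is returned as a proportion (multiply by 100 for a percentage). The discrimination index is computed here by the upper-lower group method, that is, the proportion correct in the highest-scoring group minus the proportion correct in the lowest-scoring group. Splitting the respondents into halves is an assumption suited to this small hypothetical data set; other group fractions are also used in practice.

```python
def difficulty_index(responses, item):
    """Proportion of respondents who answered the item correctly."""
    return sum(row[item] for row in responses) / len(responses)

def discrimination_index(responses, item, fraction=0.5):
    """Proportion correct in the top-scoring group minus the bottom group."""
    # Rank respondents from highest to lowest total score.
    ranked = sorted(responses, key=sum, reverse=True)
    group_size = max(1, int(len(ranked) * fraction))
    upper = ranked[:group_size]
    lower = ranked[-group_size:]
    p_upper = sum(row[item] for row in upper) / group_size
    p_lower = sum(row[item] for row in lower) / group_size
    return p_upper - p_lower

# Hypothetical data: 6 respondents, 6 items scored 1 (correct) or 0 (incorrect).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
]

for item in range(len(responses[0])):
    p = difficulty_index(responses, item)
    d = discrimination_index(responses, item)
    print(f"item {item + 1}: difficulty {p:.2f}, discrimination {d:.2f}")
```

An item with a value near zero or below zero on the second index would be a candidate for revision or deletion.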

CONCLUSIONS

In considering the overall worth of an instrument, an adequate reliability is a necessary but not a sufficient condition to justify the instrument. Unless the instrument is valid for its specific purpose, no matter how high the reliability coefficient, the instrument has no value. For example, a food attitude instrument with a reliability of 0.90 cannot properly be used as a predictor of food choice behavior unless the predictive validity of attitude and behavior has been established. Furthermore, we cannot assume that because a test is valid and reliable for one group, it will be equally valid and reliable for all groups. Validity and reliability are dependent upon the characteristics of the test subjects and the purpose of the assessment.

The bottom line on the worth of a research study is the quality of the instruments used to obtain the data. The primary burden of responsibility for quality instruments rests with the test developer. However, this burden is shared by users and interpreters of the data, particularly with respect to understanding the limitations of instruments.

ACKNOWLEDGMENT

The authors would like to express their appreciation for the critical review and suggestions offered by Edward Haertel, Geneva Haertel, Judy Brun, Linda Junker, Peggy Uguroglu, Susan Levy, and Ernest Pascarella during various stages of development of this paper.

LITERATURE CITED

1 American Psychological Association. Joint Committee of the American Psychological Association, American Educational Research Association, and National Council on Measurement in Education. Standards for educational and psychological tests. Washington, D.C.: American Psychological Assn., 1974, 76 pp.

2 Buros, O. K., ed. The eighth mental measurements yearbook. 2 vols. Highland Park, N.J.: Gryphon Press, 1978, 2,182 pp.

3 Anastasi, A. Psychological testing. 4th ed. New York: Macmillan Publishing Co., 1976, 750 pp.

4 Thorndike, R. L., and E. Hagen. Measurement and evaluation in psychology and education. 4th ed. New York: John Wiley & Sons, 1977, 693 pp.

5 Popham, W. J., ed. Criterion-referenced measurement: An introduction. Englewood Cliffs, N.J.: Educational Technology Pubs., 1971, 108 pp.

6 Wolf, R. M. Evaluation in education: Foundations of competency assessment and program review. New York: Praeger Pubs., 1979, 217 pp.

7 Cronbach, L. J. Essentials of psychological testing. 3d ed. New York: Harper & Row Pubs., 1970, 752 pp.

NUTRITION AND PREGNANCY

A supplement to the American Journal of Clinical Nutrition [34(4):655-817, 1981] presents the proceedings of a workshop entitled "Nutrition of the Child: Maternal Nutritional Status and Fetal Outcome." Conference participants addressed 3 main topics: the assessment of the nutritional status of the mother and fetus, the relationship of maternal status to fetal outcome, and the impact of nutritional intervention. Many papers relate more closely to the specific research interests of the authors than to a broad comprehensive review of the various topics. Nonetheless, the proceedings gather together a wealth of information and opinion on nutrition and pregnancy. The workshop participants identified priority topics for further research in this area: examination of the usefulness of developmental standards, the limits and benefits of nutrition intervention, and the relationship of nutritional status to psychological development. S.M.O.

VOLUME 13 NUMBER 3 1981

JOURNAL OF NUTRITION EDUCATION 85
