
DOCUMENT RESUME

ED 137 418 TM 006 232

AUTHOR          Cannell, Charles F.; And Others

TITLE           A Summary of Studies of Interviewing Methodology. Vital and Health Statistics: Series 2, Data Evaluation and Methods Research; No. 69. DHEW Publication No. (HRA) 77-1343.

INSTITUTION     Michigan Univ., Ann Arbor. Survey Research Center.; National Center for Health Statistics (DHEW), Rockville, Md.

PUB DATE        Mar 77

AVAILABLE FROM  Superintendent of Documents, U.S. Government Printing Office, Washington, D.C. 20402 (Stock No. 017-022-00533-6, $1.45)

EDRS PRICE      MF-$0.83 HC-$4.67 Plus Postage.

DESCRIPTORS     Bias; *Data Collection; Feedback; *Interviews; Memory; Problems; *Public Health; Questioning Techniques; Questionnaires; Reinforcement; *Research Reviews (Publications); Response Mode; Sampling; Supervision; *Surveys; Training; Validity; Verbal Communication

ABSTRACT

In several studies of experimental interviewing techniques and their effect on reporting behavior described in this publication, an attempt is made to identify the elements of the interview process that are potential sources for improving data collection. Methodological studies designed to test the effectiveness of certain questionnaire designs and interviewing techniques used in the collection of data on health events in household interviews are presented. The role of behaviors, attitudes, perceptions, and information levels of both the respondent and the interviewer is investigated. (RC)

Documents acquired by ERIC include many informal unpublished materials not available from other sources. ERIC makes every effort to obtain the best copy available. Nevertheless, items of marginal reproducibility are often encountered and this affects the quality of the microfiche and hardcopy reproductions ERIC makes available via the ERIC Document Reproduction Service (EDRS). EDRS is not responsible for the quality of the original document. Reproductions supplied by EDRS are the best that can be made from the original.


Cataloging in Publication Data

Cannell, Charles F.

A summary of research studies of interviewing methodology, 1959-1970.

(Vital and health statistics: Series 2, Data evaluation and methods research; no. 69) (DHEW publication; no. (HRA) 77-1343)

Supt. of Docs. no.: HE 20.6209:2/69
Bibliography: p.
1. Health surveys. 2. Interviewing. 3. Medical history taking. I. Marquis, Kent H., joint author. II. Laurent, André, joint author. III. Title. IV. Series: United States. National Center for Health Statistics. Vital and health statistics: Series 2, Data evaluation and methods research; no. 69. V. Series: United States. Dept. of Health, Education, and Welfare. DHEW publication; no. (HRA) 77-1343. [DNLM: 1. Medical history taking. W2 A N148vb no. 69]
RA409.U45 no. 69 312'.07'23s [312'.07'23]
ISBN 0-8406-0062-3 75-619406

For sale by the Superintendent of Documents, U.S. Government Printing Office, Washington, D.C. 20402. Price $1.45

Stock No. 017-022-00533-6

DATA EVALUATION AND METHODS RESEARCH

A Summary of Studies of Interviewing Methodology

A summary of methodological studies designed to test the effectiveness of certain questionnaire designs and interviewing techniques used in the collection of data on health events in household interviews and to investigate the role of behaviors, attitudes, perceptions, and information levels of both the respondent and the interviewer.

DHEW Publication No. (HRA) 77-1343

Series 2
Number 69

U.S. DEPARTMENT OF HEALTH, EDUCATION, AND WELFARE
Public Health Service

Health Resources Administration
National Center for Health Statistics

Rockville, Md. March 1977

NATIONAL CENTER FOR HEALTH STATISTICS

DOROTHY P. RICE, Director

ROBERT A. ISRAEL, Deputy Director
JACOB J. FELDMAN, Ph.D., Associate Director for Analysis
GAIL F. FISHER, Associate Director for the Cooperative Health Statistics System
ELIJAH L. WHITE, Associate Director for Data Systems
ANDERS S. LUNDE, Ph.D., Associate Director for International Statistics
ROBERT C. HUBER, Associate Director for Management
MONROE G. SIRKEN, Ph.D., Associate Director for Mathematical Statistics
PETER L. HURLEY, Associate Director for Operations
JAMES M. ROBEY, Ph.D., Associate Director for Program Development
PAUL E. LEAVERTON, Ph.D., Associate Director for Statistical Research
ALICE HAYWOOD, Information Officer

DIVISION OF HEALTH INTERVIEW STATISTICS

ROBERT R. FUCHSBERG, Director
PETER RIES, Ph.D., Chief, Illness and Disability Statistics Branch
ROBERT A. WRIGHT, Acting Chief, Utilization and Expenditure Statistics Branch
CLINTON E. BURNHAM, Chief, Survey Planning and Development Branch

COOPERATION OF THE U.S. BUREAU OF THE CENSUS

In accordance with specifications established by the National Health Survey, the Bureau of the Census, under a contractual agreement, participated in the design and selection of the sample, and carried out the first stage of the field interviewing and certain parts of the statistical processing.

Vital and Health Statistics Series 2 - No. 69

DHEW Publication No. (HRA) 77-1343
Library of Congress Catalog Card Number 75-619406

PREFACE

For more than a decade the Survey Research Center of the University of Michigan and the Division of Health Interview Statistics of the National Center for Health Statistics (NCHS) have had a continuous contractual arrangement for the investigation of response problems in reporting health information in sampling surveys.

The contract program, which started in the late 1950's shortly after the initiation of the Health Interview Survey, began with a series of validity studies in which samples drawn from medical records were compared with data collected by interview. These studies were designed to identify patterns of response bias as a basis for developing procedures to improve reporting. These investigations of levels of underreporting and characteristics of response patterns were evaluated in terms of respondent status, the attitudes and behavior of the interviewer and the respondent, and nature of the events being reported. These studies are discussed in the sections, "Behavior in Interviews," and "Interviewer Performance Difference" of this publication.

The more recent studies, which have developed out of findings of the preceding research, involved experimental procedures designed to improve reporting. Investigated were such procedures as the use of verbal reinforcement of the respondent, probing as a method of improving memory and information retrieval, and varying the length of questions in an attempt to increase respondent participation in the interview. These studies are described in the sections "The Use of Verbal Reinforcement in Interviews and Its Data Accuracy," "Memory and Information Retrieval in the Interview," and "Question Length and Reporting Behavior in the Interview" of this publication.

All but one of the NCHS studies summarized in this report have appeared as complete research reports in series 2 of Vital and Health Statistics. The study by Cannell and Fowler (1963)¹ on validity of reporting visits to physicians was not published in the series. The second study of interviewer-respondent interactions by Marquis and Cannell² was done in 1969 under a contract with the Department of Labor, Manpower Administration. All others were contracted for by NCHS. In addition, two NCHS studies not conducted by the Survey Research Center are frequently referred to here because they had as their subject some of the same problems of reporting: one in 1967 by W. G. Madow, the Stanford Research Institute report published as Series 2-Number 23;³ and the other in 1961 by E. Balamuth, et al., the Health Insurance Plan study, most recently published as Series 2-Number 7.⁴

Because these studies have had considerable interest for methodologists and for survey researchers more generally, it was thought useful to review them in a single volume so that the sequence of the major lines of inquiry could be followed. This report does not include a review of literature nor does it attempt to integrate underlying theories. It does present the findings in such a way as to make apparent their consistencies or inconsistencies, and does discuss some underlying hypotheses. This compilation also allows more emphasis to be placed on interpretation and explanation than was possible in the individual presentations.

In the concluding sections of this report the findings of the several studies are synthesized, a model of reporting is developed, and a description is offered of how the research performed at the Survey Research Center (SRC) has been applied to collection procedures used in the Health Interview Survey (HIS) to improve the quality of the collected data.

Since these studies were completed, much additional methodological work has been conducted by the Survey Research Center focusing on experimental procedures for improving the validity of reporting. This newer research at times confirms findings in this report and provides further support for these hypotheses and, at times, runs counter to some of the conclusions. Some of these findings can be found in a forthcoming report, "Experiment in Interviewing Techniques," summarizing research conducted by SRC for the National Center for Health Services Research.

The contractual relationship between the SRC and the HIS does not consist solely of a financial arrangement; much of the research is the cooperative work of the two organizations. The Bureau of the Census has also been an active participant in several of the studies, both in the planning and data collection phases.

Charles F. Cannell
Program Director,
Survey Research Center,
University of Michigan

ACKNOWLEDGMENTS

In a cooperative and integrated research program it is difficult to acknowledge the contribution of all participants. This is especially true of this project because the studies have extended more than a decade and the research staff has changed during that time. In addition to the authors, others who have had major responsibility in one or more of the studies include Mr. Thomas Bakker, Professor Gordon Fisher, Floyd Fowler, Ph.D., and Thomas deKoning, Ph.D. Special assistance in the preparation of the manuscripts was provided by Linda Winter and Marion Wirick.

SYMBOLS USED IN TABLES

Data not available                                              ---

Category not applicable                                         ...

Quantity zero                                                   -

Quantity more than 0 but less than 0.05                         0.0

Figure does not meet standards of reliability or precision      *

CONTENTS

Preface

Acknowledgments

Introduction

Understanding the Interview Process
  Influence of the Interviewer
  Response Error
  Problems of Recall and Information Retrieval
  Bias Introduced Through Interviewer Feedback
  Research Needs

Studies of Underreporting of Health Events in the Household Interview
  Underreporting and Characteristics of Health Events
    Effect of Elapsed Time on Reporting
    Effect of Impact of the Event Upon Reporting
    Effect of Social and Personal Threat Upon Reporting
    Summary
  Underreporting and Characteristics of Respondents
    Age of Respondent
    Sex of Respondent
    Education of Respondent
    Family Income of Respondent
    Color of Respondent
    Reporting for Self Versus Reporting for Other Family Member
  Conclusions

Behavior in Interviews
  Health Interview Survey Observation Study
    Health Interview Survey Data
    Observation
    Interviewer Ratings of the Respondent
    Reinterview With the Respondent
    Interview With the Health Interviewer
    Results
  Urban Employment Survey Behavior Interaction Study
    New Coding Scheme
    Main Findings

Interviewer Performance Difference: Some Implications for Field Supervision and Training
  Conclusions and Discussion

The Use of Verbal Reinforcement in Interviews and Its Data Accuracy
  Research on Interviewer Reinforcement
    Operant Conditioning Studies With Verbal Reinforcement
    Effects of Feedback on Amount Reported
    Use of Feedback To Increase Accuracy
    Use of Feedback in a Nonexperimental Interview
  Discussion
  Three Kinds of Interviewer Verbal Reinforcement Effects
    Cognitive Effects of Reinforcement
    Conditioning Effects of Reinforcement
    Motivational Effects of Reinforcement
  Experimental Study

Memory and Information Retrieval in the Interview
  The Inadequate Search Hypothesis
  An Integrative Hypothesis: Cognitive Inadequacy of Stimuli Questions
  Design of an Experimental Interviewing Approach
    The Extensive Questionnaire
    Control Questionnaire
    Field Experiment
    Dependent Variables
    Hypothesis
    Results and Discussion
    Conclusions

Question Length and Reporting Behavior in the Interview: Preliminary Investigations
  Empirical Findings on Behavior Matching in the Interview
  Hypotheses About the Effects of Question Length on Reporting Behavior
  Experiment I: Effects of Question Length on Answer Duration and Reporting Frequency
    Questionnaire Procedures
    Field Procedures
    Dependent Variables
    Results and Discussion
  Experiment II: Effects of Question Length on Validity of Report
    Results and Discussion
  Conclusions

References

Appendix. Application of Survey Research Center Findings to Health Interview Survey Procedures
  Recall of Health Events
  Effective Probing for Health Events
  Interviewer-Respondent Communication
  Other Considerations

A SUMMARY OF STUDIES OF INTERVIEWING METHODOLOGY

Charles F. Cannell, Kent H. Marquis, and André Laurent, Survey Research Center, Institute for Social Research, The University of Michigan

INTRODUCTION

Survey interviewing as a technique of data collection has developed from early attempts to collect simple demographic information to the current more sophisticated inquiries concerning attitudes, motives, and a wide variety of factual information. Despite the increasingly complex demands on the survey interview, methodologies for question construction and interviewer behavior have not changed a great deal.

Much research on interview method has been directed to the general problems of underreporting, to inaccuracies in interview data due to interviewer bias or response error, and to the problems of recall and information retrieval. The inadequacies of the interview method have been well documented and the need for improved techniques in data collection is readily apparent. However, little has been done toward perfecting the interview procedure as a method of data collection.

One reason interviewing techniques have advanced slowly may be that interviewing has no comprehensive theory to draw upon for cause-and-effect relationships. Ideas about effective questioning must be drawn from fragments of psychological theory or, more often, from folklore, experience, and common sense. Before major advances can be made, it is necessary to learn more about what happens in the interview situation and to develop some theories about the cause-and-effect sequences that occur. In several studies of experimental interviewing techniques and their effect on reporting behavior described in this publication, an attempt has been made to identify the elements of the interview process that are potential sources for improving data collection.

UNDERSTANDING THE INTERVIEW PROCESS

Influence of the Interviewer

Early attempts to investigate inaccurate reporting in interview surveys were focused primarily on the interviewer. Results of these early studies suggested that the interviewer's attitudes, expectations, background, and physical characteristics introduced important sources of bias into the household interview.

In a 1929 pioneer study of interviewer effect reported by Stuart Rice,⁵ it was found that interviewers who were prohibitionists were likely to ascribe the sad plight of destitute respondents to the excessive use of alcohol, while socialist interviewers attributed indigency of their respondents to generally bad economic conditions.

The influence of interviewer expectations on the interview process was demonstrated in 1942 by Stanton and Baker.⁶ Respondents were shown 6 geometric designs and were later asked in an interview to select from 12 designs the 6 which they had seen previously. The interviewers were given "inside information" about which designs were originally shown to respondents, but were purposely told the wrong six designs. During the interview, the designs which the interviewers thought were the correct ones were identified more often than the designs originally shown to the respondents.

Another type of study demonstrated a less direct, but still powerful interviewer effect. Katz⁷ found in 1942 that interviewers from working-class backgrounds consistently obtained more radical opinions, both social and political, from respondents than did interviewers from the middle class. Robinson and Rohde⁸ conducted a study in 1946 on attitudes toward Jews in which interviewers in one group were Jewish in appearance and those in another group appeared to be non-Semitic. The Jewish-appearing interviewers obtained significantly fewer reports of anti-Semitic attitudes than did the interviewers who appeared to be non-Semitic.

Response Error

In a household interview, a respondent can be expected to provide information: (a) that pertains to items about which he is knowledgeable; (b) that he can remember at the time of the interview; and (c) that he is willing to report to an interviewer. Underreporting or inaccuracies in reporting on the part of a respondent may result from lapses in any or all of these three categories.

Myers⁹ published data in 1940 from the 1930 decennial census that showed a suspicious pattern of reported ages ending in zero (30, 40, 50, etc.). In the 1930's Twila Neely¹⁰ found that one out of every nine families receiving city relief failed to report this fact. Parry and Crossley¹¹ published data in 1950 showing that a comparison of interviews with agency records produced significant differences on such items as voting and registration, contributions to the Community Chest, age, and ownership of a library card.

Validity studies comparing data obtained from interviews with data obtained from objective records show discrepancies between the two sources of information for topics such as bank accounts;¹²,¹³ airplane trips;¹⁴ pediatric history;¹⁵ work history;¹⁶ and public health.⁴,¹⁷,¹⁸

One way to interpret underreporting on the part of the respondent is to consider it a consequence of poor memory. The disuse theory, described by Thorndike¹⁹ in 1913 in accordance with the findings in 1885 of Ebbinghaus,²⁰ suggests that events from the more distant past are more likely to be forgotten than are recent events. Thorndike assumed that the sheer passage of time brings about a weakening of the memory trace. Similarly, one could derive from the Gestalt theory²¹,²² a prediction that the respondent is highly likely to forget events of low impact, particularly with the passage of time.

There is, however, another theory of forgetting. According to McGeoch,²³ who first (1932) explicitly stated the basic ideas of the modern interference theory later (1961) expounded by Postman,²⁴ forgetting does not occur in an absolute sense. Information does not disappear from memory but may be more difficult to retrieve from storage because of competing associations or interferences. Only the accessibility of information declines, resulting in a "lessening probability of retrieval from the storehouse."²⁵ This would indicate that underreporting is a problem of retrieval, and that reporting can be improved by manipulating conditions that facilitate the recall of information.

Problems of Recall and Information Retrieval

There are two critical stages for a respondent who is asked to report information from memory. First, he has to search for and retrieve the requested information from his memory; then he has to transmit this information to an interviewer. While performance may vary according to the level of the respondent's general motivation or dedication to the role, it is useful to think of recalling and reporting as two specific variables that can affect the accuracy of data. For example, underreporting may result from failure of recall or from failure of communication. An example of the latter case is the tendency of the respondent to withhold threatening or embarrassing information.²⁶ A fertile field for study is the type of underreporting that results from the failure of the cognitive processes in searching for and retrieving information from memory.

The three major activities of the interviewer are: (a) question asking, (b) probing, and (c) giving feedback. If a question is not properly worded, the probability of obtaining accurate data is low. A question that is improperly worded, inserted out of context, or that conveys to the respondent the type of answer the interviewer wants can produce data that are biased. Probing refers to repetition or rephrasing of a question or the addition of a new question to obtain an adequate response when a previous response has not been adequate. The problem of introducing unwanted bias into the data through probing is solved by distinguishing between directive and nondirective probes. Interviewer feedback consists of evaluative statements that the interviewer makes after the respondent answers a question. These statements may consist of verbalizations indicating approval, attention, or understanding, varying from a simple "Um-hmm," acknowledging the successful communication of an answer, to an elaborate reinforcement of the respondent's behavior.

Several classic experimental studies have demonstrated that simple positive verbal reinforcement can have marked effects on adult performance. Taffel²⁷ in 1955 gave his experimental subjects a pack of 3-inch-by-5-inch cards each containing a single verb in the simple past tense as well as a list of six pronouns. He instructed each subject to form a sentence from each card beginning with any of the pronouns and using the verb. In the first part of the session, during which the experimenter remained silent, the subject showed a preference for using each of the pronouns on the card. In the second part of the session the task remained the same for the subjects, but the experimenter said "Um-hmm" or "All right" whenever the subjects used either the pronoun "I" or "we" in constructing a sentence. Consequently, the rate of using "I" or "we" increased significantly during the second part of the session.

Research by Greenspoon²⁸ showed that verbal reinforcement after a respondent mentioned a plural noun in a free-association test increased the rate at which plural nouns were named. Another method of demonstrating the effects of verbal reinforcement involves the occurrence of certain kinds of behavior in a casual conversation setting. Verplanck²⁹ in 1955 was apparently the first to publish results from this type of study. However, subsequent research³⁰ indicates that the conditioning effect obtained was probably due to several extraneous variables (primarily experimenter cheating, conscious or unconscious) in using the procedures or reporting the data. More recently, Centers³¹ has successfully shown that with great care, one can obtain an increase in the rate with which a person gives opinion statements in a conversation setting if such statements are reinforced by another person.

In 1954 Kanfer and McBrearty³² interviewed 32 female undergraduates about 4 predetermined topics. During the first part of the interview, in which the women were handed cards designating four topics and asked to talk about each, the experimenter remained silent. During the second part of the session the experimenter reinforced the respondent whenever she talked about a predetermined two of the four possible topics. Reinforcement consisted of a posture of interest, including smiles, and the phrases "I see," "Um-hmm," and "Yes." During the second phase of the experiment the students spent more time talking about the reinforced topics than about those that had not been reinforced.

Bias Introduced Through Interviewer Feedback

The foregoing studies indicate that interviewer feedback may have important effects on the amount of information reported, but they reveal very little about how different kinds of feedback procedures affect interview data.

Hildum and Brown³³ were the first investigators to show systematically, in a survey interview setting, that interviewer feedback can produce response bias. Two groups of 10 Harvard University students were telephoned and asked their opinions about Harvard University's philosophy of general education. One group was reinforced by the investigators (who used the word "Good") each time a favorable comment was made, and the other group was reinforced after each unfavorable comment. Responses of the group reinforced for positive opinions were significantly more favorable toward Harvard's philosophy of general education than those of the group receiving reinforcement for unfavorable comments. Interviewer feedback applied in this systematic way produced a major distortion in the overall attitude responses.

In 1957, Nuthman³⁴ asked two groups of college students a series of questions about themselves. In one group the experimenter said "Good" when the respondent answered a question in a way that indicated self-acceptance. The other group was given no reinforcement. Respondents who received reinforcement for self-acceptance responses gave more answers of this kind than did the group that was not reinforced.

A. W. Staats and his colleagues have done several studies in this general area. In a 1962 study³⁵ the experimenter said "Good," "Very good," or "That's fine" whenever the respondent scored in a positive direction on sociability items. Another group was given the same interview but no reinforcement. Staats et al. found that the group receiving the reinforcement scored significantly higher on the sociability scale than did the group not receiving reinforcement. Studies by Singer³⁶ and Insko³⁷ followed these general kinds of experimental procedures and obtained similar results.

Research Needs

One of the conclusions that can be drawn from this background information on the interview process is that research on the improvement of reporting can fruitfully be devoted to the cause-and-effect relationship between the occurrence of different kinds of behavior or patterns of behavior and the validity of data reported. The behavior that occurs during the interview situation includes not only that of the individual interviewer and respondent, but also that involved in the interaction between the two. Behavior may be motivated by controlled interviewer feedback, techniques designed to facilitate recall, verbal reinforcement, and an effective interviewing instrument, namely, the questionnaire.³⁸,³⁹

Obtaining good information in an interview is not simply a matter of asking many questions. More must be learned about the basic principles of memory and retrieval in order to provide a better understanding of the way in which information is stored and to devise more effective ways of retrieving that information.

STUDIES OF UNDERREPORTING OF HEALTH EVENTS IN THE HOUSEHOLD INTERVIEW

This section summarizes some major findings of validity studies about the reporting of health events and health-related behavior in the household interview. It focuses primarily on underreporting, since health events are more likely to be underreported than overreported. Estimates of the magnitude of bias in surveys and calculations of correction indexes for data analysis are not included in this discussion, since the studies show only underreporting bias, not net bias.

The five major studies discussed here were conducted for the National Center for Health Statistics. Their focus was not on the interviewer, but on the characteristics of the respondents and their reporting patterns. Particular attention was also paid to the nature of the information being reported. In the five studies similar questionnaires and comparable interviewing procedures were used, and the reports of respondents were compared with independent records assumed to be valid. The studies are identified as follows:

HIP: a study of the Health Insurance Plan of Greater New York, in which interview reports were compared with medical records;⁴

SRC: three studies¹,¹⁷,¹⁸ conducted by the Survey Research Center, in two of which reports of hospitalizations were compared with hospital discharge records,¹⁷,¹⁸ and a third in which reports of physician visits were compared with clinic records;¹

SRI: a study carried out by the Stanford Research Institute in which respondent reports were checked against physician records.³

Since the studies were designed to investigate validity of reporting and were directed toward problems of underreporting, they were based on samples of records of presumed high accuracy. Hospital discharge records, clinic records, and physicians' records were used as sample frames. Samples were usually weighted for certain characteristics and in some cases certain types of records were omitted from the sample. (For example, in the second SRC study of hospitalization reporting,¹⁸ normal deliveries were omitted and the sample was weighted with hospitalizations of more than 3 months' duration.)

In each of these studies, interviewers were given the family names and addresses of the respondents. Usually a dummy sample was also drawn from the phone book or city directory to help disguise the aims of the study. Interviewers were told that the study was special, but were not told its purpose. Formal inquiry conducted after the studies were completed showed that in no case had an interviewer guessed the study's true purpose.

Interviewers were either experienced in the Health Interview Survey of the National Health Survey, or were given a 2-week intensive training session. Standard interviewing techniques of the Health Interview Survey were used in these studies. While the questionnaires differed in some ways, they were all essentially the same as those used in the Health Interview Survey.

In four of the studies¹,¹⁷,¹⁸ the usual procedure of using proxy respondents was followed. The interviewer personally questioned all adults who were home at the time and used a proxy respondent for all adults not present and all children. One study (SRI) included only self-respondents.

ᵃA more recent report from this study is Vital and Health Statistics, Series 2, No. 57.

The analysis consisted of matching reports from the interview with information contained in the medical records. The first part of the analysis involved an examination of the relationship between the characteristics of the health events investigated and the patterns of underreporting. The second part of the analysis was confined to the relationship between characteristics of the respondents and patterns of underreporting.
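The matching step described above can be illustrated with a minimal sketch. It assumes each sampled record carries an identifier and a count of weeks elapsed before the interview, and that the set of events mentioned in the interview is already known. The field names, the 10-week grouping helper, and the data are hypothetical and are not taken from any of the studies.

```python
from collections import defaultdict


def percent_not_reported(records, reported_ids, bucket_of):
    """Tally sampled record events by bucket and compute the percent
    that never appeared in the interview reports."""
    totals = defaultdict(int)
    missed = defaultdict(int)
    for rec in records:
        bucket = bucket_of(rec)
        totals[bucket] += 1
        if rec["event_id"] not in reported_ids:
            missed[bucket] += 1
    return {b: (totals[b], 100.0 * missed[b] / totals[b]) for b in totals}


def weeks_bucket(rec, width=10):
    """Group weeks elapsed into 1-10, 11-20, ... (hypothetical grouping)."""
    low = ((rec["weeks_elapsed"] - 1) // width) * width + 1
    return f"{low}-{low + width - 1} weeks"


# Made-up illustrative data, not figures from the studies.
records = [
    {"event_id": "a", "weeks_elapsed": 3},
    {"event_id": "b", "weeks_elapsed": 8},
    {"event_id": "c", "weeks_elapsed": 15},
    {"event_id": "d", "weeks_elapsed": 18},
]
reported_ids = {"a", "b", "c"}  # events the respondent mentioned

result = percent_not_reported(records, reported_ids, weeks_bucket)
for bucket, (n, pct) in result.items():
    print(bucket, n, pct)
```

Passing a different `bucket_of` function (for example one keyed on respondent age or sex) would give the second kind of tabulation, by respondent characteristics, under the same assumptions.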

UNDERREPORTING AND CHARACTERISTICS OF HEALTH EVENTS

Effect of Elapsed Time on Reporting

Investigators have long been aware of the limited timespan over which a person gives accurate reports. However, few studies have had adequate data to demonstrate the extent to which this phenomenon occurs.

Table lb demonstrates the decrease inreporting of hospitalization that occurs as the

Table 1. Number of recorded hospital discharges and percent not reported in interviews, by time elapsed between discharge and interview: Survey Research Center

Ti e elapsed

1-10 weeks11-20 weeks21-30 weeks31-40 weeks41-50 weeks51-53 weeks . . . . , . . .. . .

Source: reference 17.

PercentRecorded

not re-discharges

ported

114426459339364

42

bMost tables are based on events (hospitalizations, visits to physicians, chronic conditions), not on persons. The person with two events thus has a weight of two. For hospitalizations, 90 percent are single events, and thus persons and events tend to be the same. For physician visits and chronic conditions, however, the concentration is higher.

In tables from the same study, the number of events differs somewhat. For clarity of presentation some irrelevant categories ("not ascertained," for example) are omitted. The full report of these studies gives complete data.

interval increases between the date of the event and the date of the interview. This appears to be a typical "forgetting" curve in which failure to report an event grows as time passes. The same curves are evident for both male and female respondents. The curve rises more slowly for self-reports than for reports given by another family member. Both SRC studies of hospitalization17,18 showed very similar patterns.

The HIP study4 also showed underreporting of hospitalizations (table 2). The numbers are not stable because of the small sample, but the rates of underreporting and the general pattern are similar to those in the SRC studies.17,18

Table 2. Number of recorded hospital admissions and percent not reported in interviews, by time elapsed between admission and interview: Health Insurance Plan of Greater New York

Recorded admissions

Percent not reported

of underreporting for the second week preceding the interview was twice that for the week immediately preceding the interview.

Time etaps wordedvisits

Percentnet: re-

1 week .2 weeks .

......... 2

6

In the SRI study on reporting of chronic conditions,3 similar patterns of higher rates of underreporting occurred with an increase in time elapsed since the last clinic visit (table 3).

Table 3. Number of recorded chronic conditions and percent not reported in interviews, by time elapsed between last clinic visit and interview: Stanford Research Institute

Lea than 1-2 months .

2-6 months6-8 months0-11 months10-11 months

Time elapsedRecordedconditions

1.7 days . .. -

8-14 days15-28 days2Sk56 days

-84 days12 days

-113-140 dales141-168 days169-224 days ....225-280 days . . . . -281-364 days . .

MB days or more . .

Source: reference 4.

The reader should consider that the interviewing took place over a period of roughly 2 months, from May 2 to July 6, 1958. If the dates of hospital admission are to be expressed as approximate intervals from date of hospital admission to date of household interview, there are overlaps in the classes, but rough equivalents are:

Date of admission to hospital    Approximate interval to household interview

Before July 1957
July-September 1957
October-December 1957
January-March 1958
April-June 1958

monthsonths

months275 months

than 1-2

In the SRC study on visits to physicians, respondents were asked to report visits over the 2-week period preceding the week in which the interview took place. As shown below, the rate

116218440683674513476356372

1 2321,078

71 ,


Percentnot re-ported

9213

24423742464857

Similar data for the reporting of chronic conditions for both checklist recognition questions and nonchecklist, free-response items from the HIP study are shown in table 4.

Although the phenomenon of increase in underreporting over time is evident for both hospitalizations and chronic conditions, the shapes of the curves differ. The curve for hospitalization underreporting increases slowly during the 6 months following the event, but increases sharply beyond that period. The curve for the underreporting of chronic conditions rises rapidly during the first few weeks after the visit to the clinic and then flattens out after a

Table 4. Percent of recorded chronic conditions, by checklist status, which were not reported in household interviews, by time elapsed between last clinic visit and interview: Health Insurance Plan of Greater New York

Time elapsed

Percent not reported

ConditionsConditions not onn checklist

checklistrecognition recognition

list list

Less than 2 weeks2 weeks-1 months4 months or more

3251

587984


few months. It is interesting to note that data from a feasibility study conducted in Chester, England; Smederevo, Yugoslavia; and Chittenden, Vt.411 showed that there were significantly fewer visits to physicians reported for the second week preceding the interview than for the first week (see table 5). Similar data on the underreporting of both medically attended and nonmedically attended illnesses over a 4-week reporting period were found in the California Health Survey (see table 6).41

Perhaps the best documented phenomenon of underreporting of health events, as well as of a wide variety of other types of events and behaviors, is the decrease in the reporting of events as time elapses. This is characteristic of studies of consumer purchases, reports of income, behavior of children as reported by parents, and so forth. Some investigators have hypothesized that this decrease in reporting is not a result of forgetting but is due to the tendency of the respondent to misplace the

Table 5. Percent distribution of reported physician visits, by week of occurrence reported in interview, in three selected areas

Reported occurrence    Chester,    Smederevo,    Chittenden,
                       England     Yugoslavia    Vermont (U.S.)

                                Percent distribution
Last week                 57           53             57
2 weeks ago               43           47             43

Table 6, lllnewaS reported for a 4-week recall period expresset :as a percentage of the number reported in the --;;;;;t week ofthe recall period: California Health Survey,

Illnesseswith

Illnesseswithout

Reported occurrence Total activity activityof _illness illnesses restraints

or medicalattendance

restraintsor medicalattendance

s -_ k . . 100

Percent reported

2 weeks ago 603 weeks ago 40 68 284 weeks ago 39 66 ,


event in time and recall it as being outside the reference period. This explanation is especially relevant to the sharp increase in underreporting of events of the very last (earliest) weeks of the reporting period. While these studies do not provide a conclusive answer, some of the findings strongly suggest that misplacement in time does not explain a significant amount of underreporting.

In the SRC study of the reporting of physician visits,1 respondents were asked to report visits made during the 2 weeks preceding the week of the interview. However, the sample was drawn to include persons who had had visits within 4 weeks of the interview. If the underreporting in this study were due to random misplacement of the event in time, one would expect compensatory overreporting of physician visits from the third and fourth week to be reported as having taken place in the first and second week preceding the interview. This did not occur. Telescoping into a more recent time period accounted for only a small amount of the reporting error. In one experimental study in which the usual 12-month reporting period for hospitalizations was lengthened to 18 months, the data were compared to see whether known events were inaccurately reported as having occurred in the 12-18 months prior to the interview. This was not the case.
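The logic of this check can be made concrete with a short sketch. It is only an illustration under stated assumptions: the function name `misplacement_summary`, the (recorded week, reported week) pair layout, and the six sample visits are invented here and are not taken from the SRC study. Weeks are counted back from the interview, and a reported week of `None` stands for a visit that was never mentioned.

```python
def misplacement_summary(visits, period_weeks=2):
    """Classify recorded visits against a reference period of `period_weeks`.

    Each visit is a (recorded_week, reported_week) pair. Week 1 is the most
    recent week; reported_week is None when the visit was not mentioned.
    """
    counts = {
        "reported_in_period": 0,   # occurred in the period and reported there
        "missed": 0,               # occurred in the period, not reported there
        "telescoped_in": 0,        # occurred earlier, reported inside the period
        "correctly_excluded": 0,   # occurred earlier, not reported in the period
    }
    for recorded_week, reported_week in visits:
        occurred_in_period = recorded_week <= period_weeks
        claimed_in_period = (
            reported_week is not None and reported_week <= period_weeks
        )
        if occurred_in_period and claimed_in_period:
            counts["reported_in_period"] += 1
        elif occurred_in_period:
            counts["missed"] += 1
        elif claimed_in_period:
            counts["telescoped_in"] += 1
        else:
            counts["correctly_excluded"] += 1
    return counts


# Invented data: visits recorded in weeks 1-4 before the interview.
visits = [(1, 1), (2, None), (2, 2), (3, 2), (4, None), (3, None)]
summary = misplacement_summary(visits)
```

If random misplacement in time were the main source of error, `telescoped_in` would be expected to be roughly as large as `missed`. A small `telescoped_in` count alongside a large `missed` count corresponds to the pattern the text describes.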

When respondents who did report their hospitalizations were asked for the month of discharge, 82 percent correctly stated the month and only 3 percent were in error by more than 1 month in either direction. For those who misplaced the month of the hospitalization there was no predominant pattern of reporting the event as having occurred earlier or later. Furthermore, respondents were as accurate in reporting the month of discharge for hospitalizations that occurred between 45 and 52 weeks prior to the interview as they were in reporting those of the most recent weeks. For visits to physicians, over three-quarters of the reported visits were accurately dated to within a day.
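A dating-accuracy tabulation of this kind can be sketched as follows. The function name `dating_accuracy`, the use of integer month indexes, and the five sample pairs are assumptions made for illustration; they are not the study's data or method.

```python
def dating_accuracy(pairs):
    """Summarize errors in the reported month of an event.

    `pairs` holds (recorded_month, reported_month) as integer month indexes
    for events that were reported; a negative error means the respondent
    placed the event earlier than the record shows.
    """
    errors = [reported - recorded for recorded, reported in pairs]
    total = len(errors)
    return {
        "percent_exact": 100.0 * sum(e == 0 for e in errors) / total,
        "percent_off_by_more_than_1": 100.0 * sum(abs(e) > 1 for e in errors) / total,
        "placed_earlier": sum(e < 0 for e in errors),
        "placed_later": sum(e > 0 for e in errors),
    }


# Invented data: five reported hospitalizations.
result = dating_accuracy([(3, 3), (5, 5), (7, 8), (10, 8), (12, 12)])
```

Comparing `placed_earlier` with `placed_later` is one simple way to look for a predominant direction of misplacement; roughly equal counts would match the absence of any such pattern noted in the text.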

These findings present strong evidence that the increase in underreporting as time elapses is not primarily a function of the respondent's inability to place the event in time. One must look to other sources for an adequate explanation.

Effect of Impact of the Event Upon Reporting

Since early studies of memory, it has been recognized that the greater the impact of the event upon the person, the more readily it is recalled. Impact is a term that is poorly defined but generally refers to personal importance or significance of the event. Psychologically, it suggests that certain events occupy a greater part of one's psychic life, having greater relevance than other events for one's present life. In this section some indexes of impact and their relation to underreporting are examined.

Both SRC studies of reporting of hospitalizations17,18 clearly demonstrate that the longer the duration of the hospitalization, the lower the rate of underreporting. Table 7 shows the results of one of these studies.

Table 7. Number of recorded hospital discharges and percent not reported in interviews, by recorded duration of hospitalization (excluding overreports): Survey Research Center

Recorded durationof hospitalization

Recordeddischarges

1 day2-4 days5-7 days8-1 4 day s15-21 days .22-30 days .31 days or more

. 7 .se

456352111

5946


2614101052

According to the third SRC study,1 the level of underreporting of physician visits was lower when two or more such visits had occurred within the 2 weeks prior to the interview:

Recorded individual visits within 2weeks prior to interview

ported

23 or more . .. ..... .

19711096

In the SRI study, a similar decrease in the underreporting of chronic conditions was noted as the number of clinic visits relating to the condition increased (table 8).3

Table 9 demonstrates another index of impact. Reporting of automobile accidents was

Table 8. Number of recorded chronic conditions and percent not reported in interviews, by number of individual visits to clinic: Stanford Research Institute

Individual vi Recordedconditions


2 . . .. = -473 354 939 265 or more ... ..... 496 14

Source: reference

Table 9. Number of recorded automobile accidents, both involving personal injury and not, and percent not reported in interviews, by time elapsed between accident and interview

Time elapsed

Less than3 months .

months6-9

months942

months . .


Accidents with nopersonal injury

Recordednumbers

Percentnot re-,ported

Recordednumbers _

71

141

71


49 94

complete, regardless of the interval since the accident, if personal injury was involved.

Other evidence of the relationship between impact and reporting can be summarized briefly.

Hospitalizations that included surgical procedures were more completely reported than those not involving surgery. Conditions are more likely to be reported if the respondent says he has pain and discomfort, is limited in activity, takes medicines or treatment, or is concerned about his health.

Tables 10 (HIP) and 11 (SRC) show the effects of both elapsed time and impact on reporting. The cell totals for the chronic conditions in table 10 are small and the results show some instability, but the previously noted pattern can be observed.

Table 10. Number of recorded service visits and percent of nonchecklist chronic conditions not reported in interviews, by time elapsed between last visit and interview: Health Insurance Plan of Greater New York

Recorded visits

Time elapsed5-9

10 ormore

Less than 2 weeks .

2 weeks-4 months4 months or more . . . .

Pe cent conditions notrepimed

70 71 2583 79 6089 85 59 19


Table 11. Recorded duration of hospitalizations and percent of discharges not reported in interviews, by time elapsed between discharge and interview: Survey Research Center

Time elapsed

Duration ofhospitalizations

These data suggest that impact level of the event is clearly related to the adequacy of report. Furthermore, there is an interactive effect of impact and time elapsed between the event and the interview. Neither of these relationships is new or surprising; they conform to earlier findings and to theory. What is surprising is the rapidity with which the curve of underreporting rises, especially for chronic conditions, and the strong effect of impact on mediating the effects of time on reporting.

Effect of Social and Personal Threat Upon Reporting

Another factor that affects accuracy of reporting is the level of threat or embarrassment that the requested information holds for the respondent. Much research by social psychologists emphasizes the effectiveness of group norms in bringing about and maintaining approved behavior among group members. Also, one's perceived self-image tends to censor communications so that the image is maintained. The study of hospitalization has some findings on this issue.

A "threat scale" was created for the hospitalization study. The diagnostic classification was a 3-point scale which, in the judgment of the researchers, described the threat or embarrassment involved with the diagnosis. All diagnostic classifications that, in the opinion of the raters, would be very embarrassing or threatening were placed in rank 1. Rank 3 included the groups judged neither embarrassing nor threatening. Rank 2 contained a mixture of categories that were thought to be somewhat threatening or that might be threatening to some persons but not to others. Thus ranks 1 and 3 were kept as pure as possible, and rank 2 contained the uncertain categories. The results of this threat scale, shown in table 12, indicate

Table 12. Number of hospital discharges and percent not reported in interviews, by diagnostic threat rating: Survey Research Center

Diagnostic threat rating    Hospital discharges    Percent not reported

Very threatening                    235
Somewhat threatening                421
Not threatening                   1,164

Table 13. Number of recorded hospitalizations and percent not reported in interviews, by length of stay, time elapsed between discharge and interview, and diagnostic threat rating: Survey Research Center

Len h of s ay and time elapsed since dischargeRecordedhospitali-zations

Diagnostic threat rating

ostthreatening

Somewhatthreatening

Leastthreatening

ay o days

223

Percent

7

not

Discharged:1-20 weeks ego21-40 weeks ago 355 26 1641-53 weeks ago 219 27 27

Stay of 5 days or more

Discharged:1-20 weeks ago 021-40 weeks ago 2 541-63 weeks ago 273 22 17


that highly threatening or embarrassing information is reported significantly less often than is nonthreatening information.

From table 13, which shows a three-way effect of threat, impact, and time elapsed since hospitalization, it can be seen that there is a low-level relationship between the threat level and completeness of reporting in the case of most recent events. For less recent hospitalizations the three factors combine to produce marked differences.

By matching diagnoses from SRC interviews with hospital records two sources of reporting error were found: (a) complete failure to report the hospitalization; and (b) reporting the hospitalizations but misreporting the diagnoses. A few diagnostic categories showing extreme differences between interview data and medical records are examined in table 14. As predicted, those with the lowest reporting levels contain a high proportion of probably threatening diagnoses. There are, of course, reasons other than embarrassment for not reporting a diagnosis accurately; for example, the respondent may not know the diagnosis. However, it is likely that the differences between the two groups are due to differences in threat rather than to other factors.

Undergraduates at The University of Michigan were asked about their hypothetical willingness to report each of a group of diagnostic

Table 14. Number of diagnoses reported in interviews and percent of reported diagnoses compared with hospital records, by selected grouped diagnoses: Survey Research Center

Benign end unspecified neoplasmsInfectious and parasitic diseases .Ulcer of stomach and duodenum .Diseases of the gall bladder . . .Other digestive system conditionsFemale breast and genital disordersDiseases of nervous system and sense

organsMental and personality disorders .

Recorddiagnos

Percentreported

comparedwith

hospitalrecords

872336467252

47

+51+45+12+1037

67

Coded according to Manual of the International Statistical Classification of Diseases, Injuries, and Causes of Death, 1955 revision (World Health Organization, 1957).


conditions.26 In table 15 these data are compared with what respondents actually reported in the HIP study. The diagnostic categories that the students were most willing to report were surprisingly similar to those actually reported best in the HIP study.

Table 15. Hypothetical willingness of students to report certain medical conditions and percent of actual interview reports of these conditions, by medical condition

Percentwilling

to report(79 students)

Percentvalid

reports inhouseholdinterview

(HIP)

More serious conditions:Asthma . . -

Heart diseaseHernia . . . . . .

Malignant neoplasmMental diseaseGenitourinary disease . .

s serious conditions:SinusitisIndigestionHypertension . . . .

Varicose veinsHemorrhoids

84585531

1914

8988836521

71

6054

2522

41

464238

Summary

The data cited here present consistent patterns of reporting; there is a predictable and significant relationship between some characteristics of the information sought and the respondents' reporting behavior. Survey reports are easily susceptible to serious biases in the reporting of health events, and differential reporting bias can result in misleading conclusions. By understanding the problems involved in underreporting and distortion, one can design studies to improve reporting in the interview survey.

UNDERREPORTING AND CHARACTERISTICS OF RESPONDENTS

Table 16. Numbe

In the remainder of this section some relationships between reporting and respondent characteristics are analyzed. Are particular respondents most likely to underreport health events? If poor reporting is characteristic of some respondent groups, then the reasons for differential reporting can be examined and experiments can be designed to discover ways to improve reporting. The variables selected for study were those found to differentiate attitudes and behaviors in other studies and which might, therefore, be expected to show differences in the reporting of health events.

Age of Respondent

Data in table 16 suggest that there is an age effect in the reporting of hospitalizations:

hospitalization, including and excluding deliveries, and percent not reported in interviews, by age and type of respondent: Survey Research Center

Age of respondentRezordedhospitali-zations

Type of respondent

Allrespond-

ants

Proxychil-

dren1

Proxyadults

Self-respond-

ents

All hospitalizations Percent not reported in intervie

18-34 years 792 i- 435-54 years 691 10 ll- 1155 years and over 50 1_ 22 10

Hospitalizations excluding deliveries

18-34 years 7 12 .16 1635-54 years .. . . .............. .... 11 1 1255 years and Over 9 22 10

1Defined by relationship to head of the household, not by age.

Source: reference 17.

2 2"--_ 11

younger respondents showed less underreporting. The apparently large difference in reporting hospitalizations for children (defined by relationship to the head of the household, not by age) is not meaningful since the 35-to-54-year age group reported so few.

Self-respondents tended to be predominantly female and those who had proxy respondents were predominantly male. Since the best reporting was for younger self-respondents, it was hypothesized that this superiority might have been due to the fact that hospitalizations of these respondents might have been heavily weighted with normal birth of babies, a category almost perfectly reported. Deliveries accounted for nearly one-quarter of all the hospitalizations in this first SRC study.17 The lower part of table 16 shows the underreporting exclusive of normal deliveries.

The overall trend for increased underreporting of hospitalizations with age disappeared when deliveries were excluded. Self-respondents under 35 still showed superior reporting and adults over 55 with proxy respondents showed less accurate reporting. A second study18 of hospitalization revealed similar age patterns: younger self-respondents showed less underreporting and older persons with proxy respondents more underreporting.

From the third SRC study,1 a high rate of nonreporting of visits to physicians by respondents 55 years of age and over is shown below:

Age of respondent    Recorded visits    Percent not reported

18-34 years               121                  20
35-54 years               200                  20
55-74 years                79                  34

In the SRI study,3 in which all persons reported for themselves, chronic conditions were reported more accurately by older respondents (65 years and over) than by younger respondents. This is true for both male and female respondents (see table 17). A second study of the reporting of chronic conditions (HIP) confirms this pattern.4

These contrasts in response patterns suggest that the problem of underreporting is not one of memory, which usually deteriorates with age.

Table 17. Percent of recorded chronic conditions not reported in interviews, by age and sex of respondent: Stanford Research Institute

Age of respondentoth

sexesle Female

Total, all ages

Percent

45

not repor

17-24 years 48 35 725-34 years 4535-44 years 4 48 4845-54 years 4 40 4955-64 years 48 49 4765-74 years 36 40 3275-89 years 37 41 31


The seriousness or impact of the condition, the number of conditions, or the frequency of physician visits may influence the reporting level.

The SRI study demonstrated that the number of visits made to physicians during the year was highly correlated with the probability that a known chronic condition would be reported.3 The nature of the task is another factor which may explain some of the demographic relationships with differential reporting of hospitalization and chronic conditions. For hospitalizations, the respondent was asked whether or not any member of the family had been in the hospital at any time during the past 12 months. For conditions, the respondent was given a list and asked whether or not he had had any of the listed conditions at any time within the past 12 months. Since the conditions were chronic, the probability is high that if a respondent had had any condition at any time during the past year, he would still have had it on the day of the interview. Thus, whereas both questions appear to ask for recall, the average elapsed timespan was actually much longer for hospitalizations than for chronic conditions. This may explain why the data show a decrease in reporting of hospitalizations over the years, but no similar effect for the reporting of chronic conditions. While these patterns also reflect the effects of other variables, it seems clear that a respondent's age in itself will be a predictor of whether or not health events will be reported.

Sex of Respondent

It has been suggested by some investigators that illness is perceived as some sort of weakness and is more appropriate to the female than to the male role. Admitting to illness may threaten a man's self-image and, therefore, he may underreport illnesses.

Maintenance of the family health is perceived as the role of wife and mother. It can be argued, then, that if one is to use a single respondent to report about the family's health, the wife should be chosen. (This assumes that illness of other family members is not perceived by the woman as a failure in her role performance, which would lead to the prediction of greater underreporting on her part.)

On the reporting of chronic conditions the SRI study3 showed that males failed to report 44 percent of their own conditions, while females failed to report 46 percent. Similarly, in the HIP study,4 which compared male and female respondents reporting for themselves or for spouse and children, the reporting difference never exceeded 2 percent.

Male and female respondents showed almost no difference in their reporting of hospitalizations, reporting either about themselves or about other adults or children in the family. Any slight differences were in the direction opposite from that predicted. Similarly, there was no difference between male and female reports of physician visits in either the SRC or the HIP study. In the HIP study, male and female respondents reported as accurately for proxy respondents as for themselves.

One should not make too many generalizations from these results. It must be remembered that the interviewer queried all adults who were at home when she called. A proxy respondent reported for those not at home. Since interviewers usually worked during the day, only people at home during the day were likely to be interviewed. The usual persons at home were housewives, retired or unemployed men, or men who were at home because of illness. The strong possibility existed that these men would be better reporters of health events both for themselves and for others in the family than males who were not interviewed, that is, those who were neither retired nor sick.

Education of Respondent

Since it has been found in some research that persons with more years of education are better respondents, reporting patterns were examined by educational status. The first SRC study of hospitalization showed an interesting pattern: the best reporters were high school and college graduates (table 18). The second hospital study showed the same pattern for high school graduates but the sample size was too small to allow separate consideration of college graduates. Respondents who attended college but did not graduate were poorer reporters than were those in lower educational groups. Whether this pattern is meaningful or is a chance phenomenon is unknown. One could hypothesize that persons who were diligent enough to complete successfully their college educations may also be more diligent in fulfilling demands of other tasks, and thus would be better respondents. Neither study shows a particularly strong tendency for higher educated respondents to report more accurately than respondents with less education.

Table 18. Number of recorded hospitalizations and percent not reported in interviews, by education of respondent: Survey Research Center

Education of respondentRecordedhospitali-zations


Less than high school graduateHigh school graduateSome collegeCollege graduate

646180155

13 _

715


In the SRC study of reporting physician visits,1 the high school graduate group did not show the same pattern. As seen in table 19, the college group showed much lower underreporting than did other groups, but the sample was not large enough to warrant any conclusion.

Table 20 shows the underreporting of chronic conditions in the SRI study3 by educational group. In this study education was reported for the head of the household rather than for the respondent. In this table the pattern of superior reporting with increased education is not apparent. There is no indication here of less


Table 19. Number of recorded physician visits and percent not reported in interviews, by education of respondent: Survey Research Center


0-8 years . . . . . . . . . . .

1-3 years high school4 years high school . .

1 year college or more


12113211333

Table 20. Number of recorded chronic conditions and percent not reported in interviews, by education of head of household: Stanford Research Institute

Education of respondent Recordedconditions


Less than college

0 years1-4 years6-8 years9-12 years

College

1-2 years3-4 years ... .

5 years .. . j . .

2.608

2,040

4051

41

43

851640549

47


underreporting by high school and college graduates as was found in the hospitalization study.

The one conclusion from these SRI data is that respondents with less than college-level education reported more of their chronic conditions than did those who attended college. In contrast, the HIP study showed no consistent pattern of reporting by educational level.

Data from these studies point to no definite conclusion. One cannot generalize that respondents with more education are better at overall reporting than are those with less education. Why the patterns differ for the studies of hospitalizations and doctor visits from those found in the reporting of chronic conditions is not apparent.


In 1965 Fowler made an intensive analysis of reporting by educational groups in the Health Interview Survey. Based on systematic observation of interviewer and respondent behavior he concluded that less highly educated respondents needed more help from the interviewer to perform adequately. They were less skilled at the respondent role. There was also the tendency for the less educated to have less information about the purpose of the survey and what was being sought in the interview. Interviewers tended to be more active in interviews with less educated respondents, helping them to perform more adequately. Fowler considers that the effect of education may be in the skill level it represents. However, why respondents would show greater skill in reporting hospitalizations than chronic conditions is unclear.

Family Income of Respondent

In other research it has often been found that family income level is a better predictor than either age or education, since income frequently reflects both these variables as well as additional motivational components.

In the first SRC study, hospitalizations were better reported as family income increased (see table 21); in the second SRC study of hospitalization reporting, the same pattern was suggested. The reporting of visits to physicians showed no such trend; although the best reporters seemed to have annual family incomes of $10,000 or more, the sample size was too small to yield firm conclusions.

Table 21. Number of recorded hospitalizations and percent not reported in interviews, by annual family income: Survey Research Center

In the SRI study on the reporting of chronic conditions, the best reporters were in the lowest income group, with no other pattern apparent (see table 22). The HIP study of chronic condition reporting also showed that persons in families with annual incomes of less than $4,000 were the best reporters and here again, no other pattern was observable. As with education, differences in reporting by income groups are not consistent among studies.

Table 22. Number of recorded chronic conditions and percent not reported in interviews, by annual family income: Stanford Research Institute

PercentRecordedconditions

not re-ported

_Less than $3,000S3,000-$4,999$5,000-$6.999$7,000-$9,999$10,000 or more

Source: .referenee

46484647

Color of Respondent

The most consistent finding on characteristics of respondents, one which shows up in three out of the four studies, is that white respondents reported significantly better than those of other races (see table 23). This was true whether the respondent was reporting for himself or for other family members. None of the studies involved enough non-white respondents to permit intragroup analysis. One can only hypothesize about the reasons for the finding. On the surface the differences are too large to reflect educational or income factors; rather, they seem to reflect differences in behavior by color.

Table 23. Percent of recorded hospitalizations, chronic conditions, and physician visits not reported in interviews, by color of respondent

                          Hospitalizations         Chronic       Physician
Color of respondent       SRC          SRC         conditions    visits
                          study1       study2      SRI3          SRC study4

                                     Percent not reported
White                       10           16           45            24
All other                   16           27           50            22

1Reference 17.
2Reference 18.
3Reference 3.
4Reference 1.

A similar pattern did not appear in the SRC study of visits to physicians. However, since the sample for that study came from participants in a voluntary health plan, respondents of colors other than white who participated in that plan would be expected to differ in several respects from a random sample.

There is no ready explanation for these reporting differences. It may be that the white interviewer provokes suspicions in respondents of other colors. It may be part of the present cultural pattern for these respondents (especially Negroes) to be unwilling to divulge information. The answers await further experimentation.

Reporting for Self Versus Reporting for Other Family Member

In the Health Interview Survey each person at home when the interviewer calls is interviewed for himself, and a "responsible selected adult" reports for persons not at home and for children under 17. As one might expect, the completeness of reporting depends on whom the respondent is talking about. It is tempting, in terms of time and cost, to use proxy respondents. However, the data suggest that the practice has some real dangers in terms of quality of responses.

Table 24, covering results of the first SRC study of hospitalization reporting, shows clearly that the more distant the relationship of the respondent to the person about whom information was being reported, the poorer the reporting. The increase in underreporting about children as compared with "self" or "spouse" may be due to the nature of children's hospitalizations, which are generally shorter and involve less serious conditions than those of adults. Data in table 25 from the HIP study on reporting of chronic conditions show a similar pattern of reporting for children.

Table 24. Number of recorded hospitalizations and percent not reported in interviews, by relationship of respondent to sample person: Survey Research Center

Respondent relationship      Recorded             Percent
to sample person             hospitalizations     not reported

Self-respondent              1,092                 7
Spouse                         275                10
Parent                         386                14
Other relation                  78                22
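The rows of table 24 can be pooled into a single overall underreporting rate by weighting each percentage by its number of recorded hospitalizations. The following is a minimal sketch of that arithmetic, not a computation taken from the report; the function and variable names are illustrative, and because the published percentages are rounded, the pooled figure is only approximate.

```python
# Rows of table 24: (relationship, recorded hospitalizations, percent not reported).
TABLE_24 = [
    ("Self-respondent", 1092, 7),
    ("Spouse", 275, 10),
    ("Parent", 386, 14),
    ("Other relation", 78, 22),
]


def pooled_percent_not_reported(rows):
    """Weight each group's percent by its count of recorded hospitalizations."""
    total_recorded = sum(count for _, count, _ in rows)
    estimated_missed = sum(count * pct / 100 for _, count, pct in rows)
    return 100 * estimated_missed / total_recorded


overall = pooled_percent_not_reported(TABLE_24)
print(f"Pooled percent not reported: {overall:.1f}")
```

Because self-respondents account for most of the recorded hospitalizations, the pooled rate sits much closer to their figure than to the rate for more distant relations.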


Table 25. Percent of recorded conditions, by checklist status, which were not reported in interviews, by relationship of respondent to sample person: Health Insurance Plan of Greater New York

                             Percent not reported
Respondent relationship      Conditions        Conditions not
to sample person             on checklist      on checklist

Self-respondent              57                79
Spouse                       62                79
Parent                       72                82
Other relation               68                72


Reporting of visits to physicians¹ was better for children than for self-respondents, and was about the same for self-respondents and adults with proxy respondents (table 26). The recall period in this study was only 2 weeks long, and an adult usually accompanied a child to the office; these factors may have accounted for the relatively good reporting of children's visits.ᶜ

ᶜA methodological investigation of the impact of the use of proxy respondents in the Health Interview Survey, conducted after the completion of this report, is presented in Kovar, M. G., and Wright, R. A., "An Experiment With Alternate Respondent Rules in the National Health Interview Survey," 1973 Social Statistics Section, Proceedings of the American Statistical Association, pp. 311-316, and Kovar, M. G., and Wilson, R. W., "Perceived Health Status: How Good is Proxy Reporting," 1976 Social Statistics Section, Proceedings of the American Statistical Association, Vol. II, pp. 495-500.

Table 26. Number of recorded physician visits and percent not reported in interviews, by relationship of respondent to sample person: Survey Research Center

Respondent relationship to sample person      Recorded visits

Self-respondent
Parent
Other relation


Conclusions

One cannot leave these findings on reporting characteristics of respondents without attempting some explanations. The general picture that emerges from these data is that characteristics of the respondent are not nearly as consistent, nor as strong in their influence on underreporting, as are characteristics of the event.

One finds effects of age, education, and income which are not strong, but which are consistent in the reporting of hospitalizations. The patterns also tend to be consistent for the reporting of chronic conditions. The striking and puzzling fact is the divergent nature of the patterns: persons with higher education, higher income, and of lower ages are better reporters of hospitalizations and poorer reporters of chronic conditions. Although many of the differences are not significant when viewed in isolation, the total impression is that the differences are meaningful and cannot merely be attributed to random error.

It is likely that these patterns reflect the effects of other variables, as has been hypothesized here. The Lansing, Ginsberg, and Bratten study¹² of underreporting of cash loans from loan companies shows a marked income effect, with higher income respondents being poor reporters of their loans. This finding can be understood in terms of social acceptability. Higher income people probably perceive making loans at small loan offices as contrary to the norms of their group. Weiss found that mothers in lower socioeconomic groups are more likely to report that their children were forced to repeat a grade in school than are mothers in higher socioeconomic groups. Again the report may be made to be consistent with behavior perceived as acceptable. Another explanation may be that lower socioeconomic groups have more sickness; therefore, it has greater impact and is reported better. Hospitalizations, on the other hand, tend to be single events and thus may be more difficult to recall. That the task requirements differ in terms of recall and motivation level is another tenable hypothesis. Research is needed to explain these phenomena.

In the studies presented here there is no indication that special groups are characteristically poor reporters, with the exception of persons of races other than white, who are sufficiently consistent in showing high underreporting to suggest that special research be devoted to them.

The general conclusion from these studies is that research on improving reporting can most fruitfully be devoted to the nature of events and the factors underlying the characteristics of events. Problems of elapsed time, impact, and threat or embarrassment appear to be the most significant issues for research.

BEHAVIOR IN INTERVIEWS

Before effective theories about the cause-and-effect sequences in the interview situation can be developed, there must be accurate descriptions and classifications of the material reported in interviews. It is to help meet this need that the Survey Research Center has continued studies which describe the basic nature of the verbal interaction between interviewer and respondent. Since both SRC observation studies discussed here are available in full report form, this discussion will eliminate many of the methodological details and concentrate on the major findings and their possible implications.

HEALTH INTERVIEW SURVEY OBSERVATION STUDY

The first SRC observation study⁴⁶ by Cannell, Fowler, and Marquis was carried out in cooperation with the National Center for Health Statistics.ᵈ Five kinds of measurements were taken for each respondent:

a. Information about respondent demographic characteristics and family health in the regular health interview;

b. A detailed account of the interviewer and respondent behavior as recorded by a third person observing the interview;

c. The interviewer's rating of the respondent following each interview;

d. A reinterview with the respondent conducted by a second interviewer within 2 days following the original health interview; and

e. A staff interview with each health interviewer following the completion of her assignment.

Complete data are available for 412 respondents from a cross section sample of the area east of the Mississippi (excluding the extreme Northeast). About four-fifths of the respondents were women, and about half of the respondents had less than a high school education. Experienced female interviewers employed by the U.S. Bureau of the Census conducted the health interviewing. Another group of women, also employed by the U.S. Bureau of the Census, carried out the behavior observation. The reinterview with respondents was conducted by a Survey Research Center interviewer, rather than the original health interviewer. The staff interview with the health interviewer was also conducted by a Survey Research Center interviewer.

Health Interview Survey Data

The information that the respondent furnished about his own health during the regular health interview was used in this study to create a dependent variable. The dependent variable

ᵈA report of this study may be obtained from the National Center for Health Statistics, Vital and Health Statistics, PHS Pub. No. 1000-Series 2-No. 26.

was the number of chronic and acute conditions the respondent reported for himself, with adjustment for gross differences in actual sickness which could be predicted from knowing the respondent's age. The previously cited full report of the study details the rationale for the choice of this particular dependent variable. Evidence is presented which suggests that the number of chronic and acute conditions the respondent reports for himself is an indication of the accuracy of other health data reported by him.

Observation

During the health interview an observer recorded what the interviewer and respondent said and did. A wide range of behavior classified in small segments of easily identifiable acts was recorded for both interviewer and respondent. In order to record different kinds of behavior, the interview was divided into segments, each containing a specific set of questions. For each segment several particular kinds of behavior were observed and recorded. In this way, a wide variety of behavior could be recorded while the task was kept within the observer's capabilities.

While the interviewer was still at the door, the observer recorded such things as: the time of day, how long the interviewer had to wait for the respondent to open the door, what the interviewer said as she introduced herself and the study, how many questions the respondent asked, and who took what kind of initiative to get the interview started. The observer also made two ratings about how receptive the respondent had been to this point in the interview. After the actual interviewing started, the observer recorded the occurrence of different kinds of behavior at different points in the interview. Special attention was paid to irrelevant behavior which departed from the task of asking and answering the questions on the questionnaire. Among the categories used to classify this irrelevant behavior were: talking about the other person (such as giving praise), asking irrelevant questions, and giving suggestions. Conversation about the respondent or his family, friends, etc., was also considered irrelevant when it was not directed to the specific question asked. Another major category of irrelevant behavior was humor, consisting of laughter, jokes, and other means of relieving tension. The observer also recorded the reaction which the other person had to each instance of irrelevant behavior. Reactions were rated on a 3-point scale, from "very encouraging" to "very discouraging." Throughout the interview the observer kept track of the kinds of potential distractions present (children, other adults, TV, radio).

During three separate parts of the interview the observer concentrated on the question-answer interaction between the interviewer and the respondent. Seven types of behavior were recorded for the respondent:

a. Adequacy of answer;

b. Elaborateness of response;

c. Inadequacy of answer;

d. Need for clarification or repetition;

e. Checking with another person or with records;

f. Reference to calendar; and

g. Doubt about the adequacy of an answer.

Five specific kinds of interviewer behavior were also counted. They were:

a. Repeating the question from the questionnaire;

b. Asking a question, not on the questionnaire, which did not suggest an answer (nondirective probe);

c. Asking a question, not from schedule, which might have suggested a specific answer, or asking respondent if she agreed with a specific answer (directive probe);

d. Clarifying the meaning of the question; and

e. Suggesting that records, calendar, or other people be consulted.

Several other attempts were made to examine different aspects of task-oriented behavior. In one section the interviewer counted the number of times the respondent paused before giving an answer, the number of times the respondent asked for clarification or elaborated on an answer, and the number of times the interviewer asked additional questions. During one particularly difficult part of the interview, special attention was given to the frequency with which the respondent had to ask for help, to the interviewer's behavior when the respondent made such a request, and to the effort made by the respondent during this difficult part of the interview. Between sections of the interview where the observer recorded task-relevant behavior, she recorded her impressions of the respondent's reactions. For example, she rated the respondent's attitude (enthusiastic, bored, irritated), his understanding of the question, and the smoothness of the interaction between interviewer and respondent.

At the end of the interview the observer recorded the length of time spent in conversation after the last question was asked and tried to determine whether the interviewer or the respondent was more willing to continue this conversation. After the interview was completed, the observer filled out two pages of ratings on the respondent and recorded her own impressions of the interview.

Interviewer Ratings of the Respondent

After the health interview, the interviewer rated the respondent by describing her own perceptions of respondent attitude and her own attitude toward the respondent. The rating scales used by the interviewer were similar to the ones used by the observer at the end of each health interview.

Reinterview With the Respondent

A major attempt was made to ascertain the respondent's reactions to being interviewed by conducting a second interview within 2 days following the original health interview. The questionnaire used in this reinterview focused on the respondent's feelings and attitudes about the interview and interviewer: his level of information about surveys in general and this one in particular, his motives for cooperating with the interviewer, and his feelings about the questions and about his role as a respondent. A special attempt was made, using semiprojective techniques, to uncover any negative feelings that the respondent had about the original interview that might be difficult to express directly to another interviewer.

Interview With the Health Interviewer

After all her observed interviews were completed, each of the 35 health interviewers was in turn interviewed by an SRC interviewer. The health interviewer was questioned about her attitudes toward her job, her feelings concerning the interviewing of different kinds of respondents, her reactions to specific aspects of her work, and her reactions to the questions on the health interview schedule.

Results

It was originally hypothesized that several kinds of respondent psychological variables would have a major effect on the quality of data reported. Specifically, it was felt that reporting accuracy would depend on the amount of information which the respondent had about the interview and its sponsors in combination with the respondent's general attitudes, motivation patterns, and particular perceptions of the interview. It was expected that the information level, attitude, motivation, and perception characteristics of the respondent would also be reflected in the behavior observed in the original interview.

This attitude-based interpretation of the causes of accurate and inaccurate reporting is not new. Experience has been accumulated over many years (both from the psychological laboratory and from the world of advertising and marketing) which would enable the researcher to design techniques to change respondent attitudes, motivations, and perceptions and to supply information or correct misinformation. Based on the assumption that poor reporting was due to such variables as low levels of information and inappropriate attitudes, new studies were designed and testing was started on some attitude- and information-level-change techniques. However, as the main data became available, it became clear that the hypotheses concerning the causes of poor reporting were unsupported. There was practically no correlation between the dependent variable (a measure of reporting quality) and the complex indexes of information level, attitudes, motivation, and perception. (See Mueller, Schuessler, and Costner⁴⁷ for further information.) These cognitive variables were for the most part also unrelated to behavior. Some of these data are reproduced in table 27.

Table 27. Relationship of respondent cognitive variables to reporting index and behavior indexes

                                           Gamma coefficients of association¹
                                           Reporting      Behavior index
Cognitive variables                        index          Task-oriented      Irrelevant

General feelings toward interview
  Direct questions
  Semiprojective questions
Citizen's duty
Desire to talk
Personal benefit
Opportunity for a break in routine
Concern about health
Stated reasons for not ...
  Concern about time
  Concern about the questions
Perceptions of the respondent task
  Interviewer wanted exact answers
  Interviewer wanted everything to be reported
  Reason for collection of information (statistical

-.01  .03  .08  .05  .13  .08  .07  .01  .03  .02  .21  .14  .07  .12  .14
.01  .15  .18  .10  .22  .12  .07  .13  .02

¹None of the coefficients of association is significantly different from zero at the 5-percent level. Gamma is a nonparametric measure of association based on rank order that ranges from -1 to +1. Near zero, it indicates little association between the variables being tested. For further information see reference 47.

As more and more of the data were analyzed, it became apparent that the actual behavior in the interview was the main variable that correlated with the index of reporting quality. Thus it appeared that if changes were to be made in the accuracy with which respondents furnished data about their health, behavior patterns in the conduct of the interview would have to be altered; some kinds of behavior might be more conducive to good reporting than other kinds. To test this, correlations were obtained between the reporting index and the frequency of various kinds of behavior which took place during the interview. The preliminary results suggested that the kinds of behavior normally considered task oriented (asking for clarification, giving elaborations upon answers, and consulting records, on the part of the respondent, and probing by the interviewer) were more highly correlated with the dependent variable than the kinds of behavior which are considered to be less relevant to task performance, such as talking about self or joking. To illustrate, the relationship between respondent behavior and reporting is:

Respondent behavior index      Gamma, reporting index¹

Task-oriented                  .20
Interpersonal                  .20

¹Both coefficients are significantly different from zero (p < .05).
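The gamma coefficient used in these tables is a rank-order measure: it compares the number of concordant pairs of observations (pairs ordered the same way on both variables) with the number of discordant pairs, ignoring ties. The sketch below shows one common way to compute Goodman and Kruskal's gamma; it is an illustration under that standard definition, not the authors' own procedure, and the sample scores are hypothetical.

```python
from itertools import combinations


def goodman_kruskal_gamma(x, y):
    """Return (C - D) / (C + D), where C and D count concordant and
    discordant pairs; pairs tied on either variable are ignored."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (concordant - discordant) / (concordant + discordant)


# Hypothetical scores for four respondents: a behavior index and a
# reporting index, each ranked from low (1) to high (4).
behavior_index = [1, 2, 3, 4]
reporting_index = [1, 3, 2, 4]
gamma = goodman_kruskal_gamma(behavior_index, reporting_index)
print(f"gamma = {gamma:.2f}")
```

A value near zero means concordant and discordant pairs roughly cancel, which is how the coefficients in table 27 should be read; a value such as .20 indicates a modest excess of concordant pairs.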

Further investigation revealed, however, that the frequency with which any one category of behavior occurred in the interview was highly correlated with the frequency with which any other kind of behavior occurred. Thus it was not found that some interviews were predominantly task oriented and others predominantly interpersonally oriented. What was found was a general behavior activity level characterizing a particular interview. The higher the behavior level (the more frequently each kind of behavior occurred for both interviewer and respondent), the higher the score on the reporting index.

Since it was impossible to determine which of the behavior categories, if any, determined this general activity level, it appeared logical to ascertain who was responsible for setting the activity levels. Since the data are correlational, it is difficult to determine directly whether the respondent or the interviewer had major responsibility for determining the amount or level of behavior in the interview. However, it was determined that interviewers themselves did not have a characteristic behavior level for all interviews. The data also indicated that there was an extremely high correlation between the level of behavior of the interviewer and the level of behavior of the respondent:

Interviewer behavior index      Respondent behavior index¹

Task-oriented
Interpersonal

¹Both coefficients are significantly different from zero (p < .05).

Thus it appeared that the amount of behavior in the interview tended toward some sort of balance. If the interviewer engaged in a high level of behavior, so did the respondent, and vice versa. It was also noted that the balance was most likely to occur when the behavior levels of the interviewer and respondent were either especially high or especially low. A special statistical treatment of the behavior index data which shows the high probability of balance at extreme behavior levels is given in the original research report by Cannell, Fowler, and Marquis,⁴⁶ pp. 26-27.

The major conclusion from this observation study was that the original hypotheses about the effects of such variables as information and attitudes on the quality of reporting were probably wrong. The major causes of good and bad reporting are probably to be found within the interview itself, particularly in the behavior of the participants. It was unclear which variables determined the behavior of participants. The interviewer could be responsible for setting the behavior level, the respondent could have primary responsibility, both could share equal responsibility, or the behavior level could be determined by some other variable or variables. The data led to speculation on a procedure referred to as a "cue search" model of interview interaction. It may be that the household interview is a rather unique experience for the respondent and that he really does not have a set of predetermined behavior patterns for it. The newness of the interview situation might make it difficult to generalize his associated feelings, attitudes, and expectations. The respondent must look to the interviewer or to some other source for cues about expected behavior. On the other hand, the interviewer may be in somewhat the same situation. She has learned from experience that respondents are different: some will enjoy the interview and others will be annoyed by it, some will have trouble with certain sections of the questionnaire while others will not. Therefore, the interviewer will be attentive to subtle cues from the respondent to help her arrive at a strategy for dealing with each particular interview.

This hypothesis implies that both interviewer and respondent search for cues from each other about appropriate kinds of behavior. This cue-searching process could account for the strong tendency of interviewer and respondent to behave at the same level of activity in the interview. This heavy reliance on cues from the other person to set the behavior pattern may also account for the fact that cognitive orientations measured in this study were not predictive of behavior or reporting level. In addition, the reciprocal cue-searching process may explain why this research did not determine whether one person sets the behavior activity level and the other follows.

Subsequent research has shown that changing the characteristics of interviewer behavior can have marked effects on both the amount and the accuracy of health data reported by respondents. These studies, while quite limited, support the general interpretation of the findings of the first observation study: namely, that changes in response accuracy are most likely to be achieved by changing the interaction process in the interview itself. These studies also show that changes in interviewer behavior will often be accompanied by changes in both respondent behavior and reporting accuracy. They do not rule out the possibility that data accuracy may be significantly affected by the respondent and other sources of variation, but they show that the interviewer can have at least some beneficial effect independent of other possible influences.

URBAN EMPLOYMENT SURVEY BEHAVIOR INTERACTION STUDY

With the cooperation of the U.S. Department of Labor, Urban Employment Survey, the Survey Research Center conducted another study² of the behavior of the interviewer and respondent in the household interview. This study by Marquis and Cannell differed in a number of ways from the observation study described above. Data on the verbal behavior of the interviewer and respondent were obtained through a tape recording of the interview rather than through the recording of impressions by a third-person observer. The tape-recording procedure substantially reduces data collection costs and allows a much more detailed and refined coding of the verbal behavior that occurs during the interview.

In this study, a cross-sectional sample of 181 employed male respondents residing within the city limits of Detroit was interviewed. There were four interviewers, all of whom were white, female, middle-aged, and residents of the suburbs.

New Coding Scheme

The study employed a revised coding scheme for interviewer and respondent verbal behavior. The coding scheme omitted all nonverbal behavior. It also included more code categories for task-related behavior: several categories reflected the way in which the question was asked, and more detailed codes recorded the way the respondent answered questions. Because of the research on the effects of interviewer reinforcement which had taken place between the first and second observation studies, a code for interviewer feedback and respondent feedback was included in the new scheme. Several codes dealing with irrelevant conversation were deleted since they had not proved useful in the previous study. A summary of the new coding scheme is presented in table 28. Additional items derived from more recent studies end the table.
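A coding scheme of this kind turns a tape-recorded interview into a sequence of coded acts that can then be counted. The sketch below illustrates that idea with a handful of code names borrowed from table 28; the event layout (question number, speaker, code), the speaker labels "I" and "R", and the sample interview are assumptions made for illustration and are not taken from the study.

```python
from collections import Counter

# Hypothetical coded transcript: (question number, speaker, code).
# "I" marks the interviewer and "R" the respondent.
coded_events = [
    (1, "I", "correct question"),
    (1, "R", "adequate answer"),
    (2, "I", "correct question"),
    (2, "R", "inadequate answer"),
    (2, "I", "nondirective probe"),
    (2, "R", "adequate answer"),
    (2, "I", "feedback"),
]


def tally_by_speaker(events):
    """Count how often each code occurs, separately for each speaker."""
    counts = {"I": Counter(), "R": Counter()}
    for _question, speaker, code in events:
        counts[speaker][code] += 1
    return counts


tallies = tally_by_speaker(coded_events)
for speaker, counter in tallies.items():
    print(speaker, dict(counter))
```

Keeping the question number with each act makes it possible to express behavior levels per question rather than per interview.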

Main Findings

Behavior balance. One of the main findings from the original observation study was that the behavior of the interviewer and the respondent was best described in terms of a "balance" model. That is, if one person was engaging in a great deal of behavior, so was the other. This is in contrast to another pattern which might be expected: namely, that a low level of respondent behavior would be compensated for by a high level of interviewer behavior, and conversely, a high level of respondent behavior would be accompanied by a low level of interviewer behavior. In this second observation study it was possible to control, both in the questionnaire design and in the statistical treatment of the data, the number of questions asked over the entire interview. Thus it was possible to compute an index of interviewer behavior level and respondent behavior level per question. The ability to control the number of questions made it possible to remove one source of variation which might have accounted for the behavior balance phenomenon observed in the first observation study. The correlation coefficient between the amount of interviewer behavior per question and the amount of respondent behavior per question was .77. This demonstrates again the strong interdependence of interviewer and respondent behavior during the interview. It also indicates that the variables or parameters which have a causal effect on reporting quality are probably to be found in the behavior interaction within the interview rather than in the personal characteristics (e.g., attitudes) of either of the participants.
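The per-question indexes described above can be illustrated with a short calculation: divide each participant's count of coded acts by the number of questions asked in that interview, then correlate the two rates across interviews. The sketch below assumes a Pearson product-moment correlation, which the text does not specify, and uses made-up counts; it does not reproduce the study's data or its .77 result.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical interviews: (interviewer acts, respondent acts, questions asked).
interviews = [(120, 150, 60), (90, 100, 60), (200, 230, 80), (70, 95, 50)]

# Dividing by the number of questions removes interview length as a
# shared source of variation before the two levels are compared.
interviewer_rate = [i_acts / questions for i_acts, _, questions in interviews]
respondent_rate = [r_acts / questions for _, r_acts, questions in interviews]

r = pearson_r(interviewer_rate, respondent_rate)
print(f"r = {r:.2f}")
```

Without the per-question adjustment, longer interviews would raise both counts together and could produce an apparent balance even if the two participants' activity levels were unrelated.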

Question asking and probing. Because of the expanded coding scheme and because the interactions were tape recorded rather than coded during the interview, the second study provided a much more detailed description of the kinds of verbal behavior that occurred during the interview. The descriptive data confirmed that those interviewing procedures for which the interviewers were trained, such as question asking and probing, were carried out effectively in accordance with accepted procedures. Interviewers asked the question in the correct manner more than 90 percent of the time. Respondents gave many answers which did not meet the objectives of the question, and interviewers used many probes in attempts to get adequate information. The probes used were generally nondirective; that is, they did not

Table 28. Summary of revised coding scheme for interviewer and respondent verbal behavior used in the urban employment subehavior interaction study

Code item Description

Correct question

Inca elm question

Inappropriate question

Incorrect question ..

Repast question

Omitted question

Skip part

Nondirective probal

on previously

Directive probel

Gives clarification. . ......

Volunteers information

Interviewer behavior

Question asked essentially as written on the questionnaire

Part of a question correct as far as it goes

Question that was asked but should have been skipped due to a skip pattern

Question in which meaningful wordisi helve been altered or omitted

Question thin has already been asked and 's asked correctly again

Question omitted by mistake, contrary to the questionnaire instructions, and for which therelevant information has not bean obtained by means Of a preceding question

Question omitted because sufficient information to coda an adequate response has previouslybeen volunteered by the respondent in answer to a prior question

Question omitted because of skip pattern prescribed by the questionnaire instructions

A probe that neither suggests a specific answer or class of answers nor restricts the frame ofreference of the original question

A probe that suggests possible responses or implies that some answers are morethan others. It restricu the frame of reference of the original question.

Gives clarification upon request of the respondent regardless of whether the information SOP-plied is correct or Incorrect. Includes also rephrasing or explanations of questions.

Volunteers Infor ition relevant to the topic of the question or interview. Includes transitiOnstatements.

Adequate answer

Lane enswe

Don't know an e

Refuses answer

Other -answer

Respondent behavior

An adequate response to a correctly asked question that meets the objectives of the questionas stated in the Interviewer's Manual. Incorrect clarification does not rule out the occur.ranee of an adequate answer. May also occur as the result of a probe, provided the responsemeets the question objective.

An inadequate response to a properly asked question that does not meattives as stated in the Interviewer's Manual

ion objee-

Response to a correct question that indicates that the respondent does not know, only If nofollowed by an attempt to answer the question

Verbal refusal to answer question

Response Ito an incomplete or Incorrect question or a responsemeet the question objective

ElaboratiOn

prObel that does nOt

Gives reason fore response or supplies More informand Is relevant to the question topic

ion that required for en adequate answer__

Requests clarification of a question or question objective

Table 28. Sum ary of revised coding scheme for interviewer and respondent verbal behavior used in the urban employment surveybehavior interaction studyCon.

Feedback: Behavior that indicates attention, approval, understanding, or how well the other person is doing, only if not a response to a question or a probe (excluding "Thank you")

Ongoing feedback: Ongoing feedback that indicates attention, approval, understanding, or a desire to interrupt while the other is talking without successfully interrupting the speech of the other person

Repeats answer: Repetition of response either exactly or as a summary, or utilization of previous response as a transition to a new topic or for asking a question or a probe

Irrelevant conversation: Statements unrelated to the question or general field of the inquiry. Generally rapport-building or personal rather than task-oriented behaviors.

Gives suggestion: Suggests new kind of behavior that will enhance, interrupt, or resume task behavior

Polite behavior: Polite behavior or socially expected courtesies not specifically related to task and not included in the printed question on the questionnaire (e.g., "Please," "Thank you")

Interruption: Successful interruption. The other person must stop talking. Blocks can't occur at the end of a sentence or at a pause which might be considered the end of a question.

Laugh: Audible laugh, chuckle, or snicker that may indicate humor, tension, or ridicule

Other: Any significant behavior not elsewhere coded or unintelligible verbal behavior

Extraneous interaction: Interaction of either interviewer or respondent with a third person during the interview

ADDITIONAL CATEGORIES²

Modified question: Question worded essentially as written but with unimportant modifications

Alternatives incomplete: Fails to read all or some of response alternatives because interrupted by respondent

Infers answer: Omits question because interviewer can infer the answer even though it has never been stated explicitly by respondent

"Anything else" probe: Special case of nondirective probe

Invents task-oriented question: Invents new question to gain or confirm information necessary to follow skip instructions

Respondent behavior

Additional response: Given for each adequate answer beyond the first one for open-ended questions

Closure: Respondent indicates no more information to give on open-ended question

Simultaneous answer: Respondent answers while interviewer is talking

¹A probe is a question or statement used by the interviewer to elicit further information. It is a creation of the interviewer and not on the questionnaire.

²Additional categories have been derived from more recent studies.


suggest to the respondent any particular answers. This kind of probing, according to the theory, helps to avoid the introduction of interviewer bias. Furthermore, the data suggest that the probing theory, with its emphasis on nondirective as opposed to directive probing, is correct. In this study interviewers were much more successful in getting adequate answers after nondirective probes than after directive probes. These results are tentative, however, since it is possible that interviewers used nondirective probes when they expected that the respondent would have no trouble in giving an adequate answer and used directive probes only when they anticipated a great deal of difficulty in getting an adequate answer. Natural observation studies of this type are subject to this limitation on inference. Experimental studies are needed for a more refined analysis of the actual cause-and-effect relationships.

Feedback and nonprogramed behavior. Another major finding from these data is that a large proportion of interviewer and respondent behavior is nonprogramed behavior. That is, much that goes on during a household interview is not considered in typical interviewer training. Two sets of data illustrate this point.

One discovery was that interviewer feedback, the interviewer's verbal reaction to the respondent's answer (such as "O.K.," "I see," "Good,"), occurs very frequently. In fact, in this study interviewer feedback accounted for about 23 percent of all interviewer behavior coded. The effect which interviewer feedback may have on the accuracy of these data is discussed in detail in the section "Use of Feedback To Increase Accuracy," of this report. These observational data indicate that feedback is very frequent and is probably used in a way that is nonproductive, or even counterproductive, of good data. Specifically, the data indicate that positive feedback statements occurred just as frequently after inadequate answers as after adequate answers. Most surprising was the finding that positive feedback statements were used over half the time when the respondent refused to answer a question. The probability of the interviewer using a feedback statement after nine different kinds of respondent behavior is shown in table 29.

Table 29. Probability of interviewer feedback following respondent behavior, by category of respondent behavior

Category of respondent behavior              Probability that interviewer feedback follows

Adequate answer                              .28
Inadequate answer                            .14
"Don't know" answer                          .18
Refusal to answer                            .55
Other answer                                 .34
Elaboration                                  .30
Repeats answer                               .32
Gives suggestion                             .33
Other behavior (not classified elsewhere)
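Probabilities of the kind shown in table 29 can be illustrated with a short computation. The following is a minimal sketch only, assuming each interview has been reduced to an ordered list of (actor, code) pairs; the function name, the code labels, and the sample sequence are hypothetical and are not taken from the study.

```python
from collections import Counter

def feedback_probabilities(events):
    """events: ordered (actor, code) pairs, actor "I" (interviewer) or "R" (respondent).
    Returns {respondent code: share of occurrences immediately followed by
    interviewer feedback}. Assumes "follows" means the very next coded event."""
    followed = Counter()
    total = Counter()
    for (actor, code), nxt in zip(events, events[1:] + [None]):
        if actor != "R":
            continue
        total[code] += 1
        if nxt == ("I", "feedback"):
            followed[code] += 1
    return {code: followed[code] / total[code] for code in total}

# Hypothetical coded sequence, for illustration only
events = [
    ("I", "question"), ("R", "adequate"), ("I", "feedback"),
    ("I", "question"), ("R", "refusal"), ("I", "feedback"),
    ("I", "question"), ("R", "adequate"),
    ("I", "question"), ("R", "inadequate"), ("I", "probe"),
    ("R", "adequate"), ("I", "feedback"),
    ("I", "question"), ("R", "adequate"), ("I", "question"),
]
probs = feedback_probabilities(events)
```

In this sketch each respondent code is counted once per occurrence, so a category that occurs rarely (such as a refusal) yields an unstable estimate; the study's own tabulation rules may have differed.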

The significance of the pattern of feedback use demonstrated in this study is not entirely clear, but results from this and other studies lead to some hypotheses. In another section of this report several experimental studies are described in which different ways of using positive feedback statements were tested (see "Use of Feedback in a Nonexperimental Interview," in this report). These studies show that if positive feedback statements were used by the interviewer only after the respondent had given an adequate answer, more accurate data were reported than when no feedback statements were used. The data in table 29 indicate that the interviewers used feedback in a random fashion or when they felt some tension was developing or about to develop. Neither the effects of random feedback contingencies nor of tension-reduction contingencies have been evaluated in the household interview setting. An educated guess at this point is that these strategies are less productive of accurate reporting than the usual laboratory strategy which provides verbal reinforcement only after desired respondent behavior. Further research is planned in this area.

Another technique for determining if there is more to the personal interview than asking questions and giving answers is to divide the behavior data into two parts: (a) the average amount of behavior needed to get an adequate answer to a question, and (b) the average amount of behavior which occurs after an adequate answer and before the next question. Data in this study show that, on the average, one-third of all behavior occurred after an adequate answer and before the beginning of the next question. Computer programs are being modified to explore in greater detail the kind of behavior that takes place after an adequate answer. Although these results are not yet available, it is possible that this "extra" behavior may represent a large potential either for bias in the data or for cues which lead to even more accurate information.
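The two-part division described above can be sketched as follows. This is an illustration under stated assumptions, not the study's program: the behavior codes for each question are assumed to be stored as a list running from the question to just before the next question, the adequate answer itself is counted in part (a), and all names and sample data are hypothetical.

```python
def split_behavior(question_codes):
    """question_codes: one list of behavior codes per question asked.
    Returns (mean codes up to and including the first adequate answer,
             mean codes after it, share of all codes that fall after it)."""
    before = after = 0
    for codes in question_codes:
        if "adequate" in codes:
            cut = codes.index("adequate") + 1
        else:
            cut = len(codes)  # no adequate answer: nothing counts as "after"
        before += cut
        after += len(codes) - cut
    n = len(question_codes)
    return before / n, after / n, after / (before + after)

# Hypothetical codes for three questions
sample = [
    ["question", "adequate", "feedback"],
    ["question", "inadequate", "probe", "adequate", "feedback", "elaboration"],
    ["question", "adequate"],
]
mean_before, mean_after, share_after = split_behavior(sample)
```

Here 3 of the 11 hypothetical codes fall after an adequate answer; the report's figure of one-third refers to the actual study data, not to this example.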

Effect of type of question on behavior. The behavioral data from this study make it possible to explore how different kinds of questions result in different kinds of behavior patterns. In his doctoral thesis, Thomas deKoning⁴⁸ classified questions on two independent dimensions: open-closed and fact-attitude. An open question was defined as any question to which the respondent must formulate his own answer, while a closed question was defined as one to which the respondent might either answer "Yes" or "No," or respond according to the alternatives supplied in the question. DeKoning's closed question was similar to what others call a forced-choice question. The dependent variable was the average number of behavior codes assigned per question. Results which are summarized in table 30 indicate that, as might be expected, there was more behavior recorded for open questions than for closed questions. When these data are split into specific behavior categories, it appears that interviewers probe about three times as often and provide about twice as much feedback for open as compared with closed questions. On the other hand, the respondent is about six times as likely to give an unacceptable answer to an open question as to a closed question. This pattern of results suggests that open questions present the respondent with more difficulty in meeting the objectives of the question. This conclusion is supported by data presented in "Question Length and Reporting Behavior," in another section of this report, and by the results of an experimental study by Marshall, Marquis, and Oskamp.⁴⁹ Open questions require the respondent to retrieve information from memory with a minimum of stimulating cues. On the other hand, closed questions involve only recognition memory.

Table 30. Rate of recorded behavior for open and closed questions, by type of behavior

                                    Rate of behavior per question
Type of behavior                    Open questions    Closed questions

Interviewer behavior
  Total                             3.57              2.13
  Correct question                  0.91              0.91
  Probing                           1.05              0.32
  Feedback                          0.86              0.46
  Other behavior                    0.75              0.44

Respondent behavior
  Total ........ 2.76
  Adequate answer ........
  Unacceptable answer (inadequate, don't know, refusal) ........ .36
  Other answer ........ .63
  Elaboration ........ .39
  Other behavior ........ .37

That is, all the respondent is required to do in response to a closed question is to decide whether the stated characteristic is true or false, good or bad. While such questions do have advantages of clarity and ease of recall, it is often necessary to ask many of them in order to cover the same material as is covered by one open-ended question.
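The open-to-closed comparisons for interviewer behavior can be checked directly from the rates in table 30. The sketch below uses only the interviewer rows of that table; the variable names are illustrative.

```python
# Interviewer behavior rates per question from table 30: (open, closed)
rates = {
    "Correct question": (0.91, 0.91),
    "Probing": (1.05, 0.32),
    "Feedback": (0.86, 0.46),
    "Other behavior": (0.75, 0.44),
}

# Column totals, which should reproduce the "Total" row (3.57 and 2.13)
open_total = round(sum(o for o, _ in rates.values()), 2)
closed_total = round(sum(c for _, c in rates.values()), 2)

# Ratio of the open-question rate to the closed-question rate
ratios = {name: round(o / c, 2) for name, (o, c) in rates.items()}
```

The probing ratio (about 3.3) and the feedback ratio (about 1.9) correspond to the statements that interviewers probe about three times as often and give about twice as much feedback for open questions.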

DeKoning⁴⁸ also showed that there is a higher behavioral level in getting an answer to an attitude question than to a question of fact. However, the differences are not quite so large as for open and closed questions. For attitude questions interviewers are more likely to probe, to give feedback, and to engage in irrelevant behavior and laughter than for fact questions. Respondents are more likely to give unacceptable answers, to ask for clarification, and to elaborate upon their answers when responding to attitude questions. These data suggest that attitude questions are somewhat more difficult and cumbersome to handle than are fact questions, but this difference is small and may be due to other variables confounded with the attitude-fact distinction.

Diagnosing specific question problems. One of the intriguing motives for using the behavior coding technique is to arrive at a systematic evaluation of the adequacy of specific questions as they appear on the interview schedule. This kind of evaluation procedure may be very useful in the pretest phase of questionnaire construction. Social science strives to be a scientific discipline, but the procedures used by social scientists to develop and validate questions and questionnaires are generally crude. One usually sends a group of interviewers into the field with the questionnaire developed in the office. There is then a meeting (or series of meetings) during which interviewers and the researcher discuss the questionnaire. One hears familiar statements such as "This question seems to work well," or "This question seems to do what we want it to because we have the distribution of responses." The interviewer might say, "I don't think the respondents really understood this question," or "This question irritates people." It is on the basis of such highly subjective evaluations that questionnaires are developed.

In preliminary work done for the U.S. Department of Labor, the behavior-coding method has shown considerable promise for evaluating some aspects of pretest interviewing.

There seem to be three kinds of problems with questions: (a) those attributable to the interviewer, (b) those which reflect respondent difficulty, such as failure to understand the question or trouble in recalling information, and (c) problems caused by the questions themselves, such as poor syntax, "skip" instructions that are difficult to follow, or their placement on the interview schedule. A small-scale attempt was made to trace question problems to one or more of these sources using a small number of the codes obtained in the behavior observation study.

For example, those questions that were asked incorrectly by the interviewers at least 15 percent of the time were identified. A question was coded as "incorrectly asked" if important words or phrases were changed or omitted.ᵉ The list of incorrectly asked questions revealed several items which contained parenthetical phrases, others which contained difficult syntax, and still others which were extremely cumbersome to handle in verbal form. The first set of questions pointed to the fact that during training, interviewers had been given inconsistent rules about handling parenthetical phrases. The second group of questions, containing awkward syntax, pointed to a problem that has been overlooked by many questionnaire designers: When questions are extremely long and complex, respondents often interrupt at the end of a clause to answer without allowing the interviewer to finish the question.

ᵉMore stringent criteria were used in the last section of the questionnaire, which contained mostly [illegible] questions.

Another set of 39 questions was identified as having been answered inadequately more than 14 percent of the time even though they were asked correctly. From this list there appeared to be two reasons why a question would receive an inadequate answer code a high percentage of the time: (a) the respondent was unable to answer because the required information was not easily accessible from memory, and (b) the interviewer could not discriminate between an adequate and inadequate answer, and therefore mistakenly accepted the inadequate answer as meeting the objective of the question.

Another analysis of question problems was tried in which two kinds of interviewer omission codes were combined to show different kinds of question problems. The logic of that analysis was as follows:

Nature of problem                   Code (N)¹    Code ( )²

Interviewer error                   High         High
Questionnaire redundancy            High         Low
Skip pattern or format problem      Low          High
No problem                          Low          Low

¹Code N indicates a question was omitted because the answer was already given.

²Code ( ) indicates that the question was omitted by mistake.

NOTES: "High" indicates the question was omitted 10 percent of the time or more.

"Low" indicates the question was omitted less than 10 percent of the time.
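The decision logic in the tabulation above can be written as a small function. This is a sketch for illustration: the function and argument names are hypothetical, and omission rates are assumed to be expressed as fractions of the times a question should have been asked.

```python
HIGH = 0.10  # "High" means omitted 10 percent of the time or more

def diagnose(rate_answer_given, rate_mistake):
    """Classify a question from its two omission rates.
    rate_answer_given: omitted because the answer was already given (code N).
    rate_mistake: omitted by mistake (the second omission code)."""
    n_high = rate_answer_given >= HIGH
    m_high = rate_mistake >= HIGH
    if n_high and m_high:
        return "Interviewer error"
    if n_high:
        return "Questionnaire redundancy"
    if m_high:
        return "Skip pattern or format problem"
    return "No problem"

# Hypothetical rates for four questions
examples = [diagnose(0.12, 0.15), diagnose(0.12, 0.02),
            diagnose(0.03, 0.20), diagnose(0.03, 0.09)]
```

A question omitted exactly 10 percent of the time is treated as "High," following the note to the tabulation.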

There were 27 questions identified as being omitted many times, either because of error or because the interviewer thought the answer had already been given. The data suggest that omission problems like these might be overcome if interviewers received better instruction as to what constitutes an adequate answer. Thirteen questions were identified as belonging to group three, questions which were omitted often by mistake but were not skipped because an answer had already been obtained. This finding also suggests the need for better interviewer training since these questions are often skipped because the interviewers assume they have been answered adequately through previous questions when indeed they have not.

On the other hand, better interviewer training concerning the objectives of each question may not entirely solve the omission problem. Other data indicate that omission rates are above 10 percent only when a question is to be asked of a subsample of respondents. Questions which must be asked of all respondents are almost never subject to high omission rates. Of the 71 questions in this interview which were to be asked of all respondents, only 1 was omitted more than 10 percent of the time, while of the 102 questions to be asked of subsamples of respondents, 55 (54 percent) were skipped more than 10 percent of the time, either by mistake or because an adequate answer had already been obtained. Thus, omission problems may be traced to skip instructions and other subsampling techniques. While these procedures are often necessary, the questionnaire designer should be aware of the potential for interviewer omission error whenever subsampling techniques are used. The subsampling omission bias may be especially acute in an interview such as the one tested, in which skip patterns occur frequently.
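The contrast between the two groups of questions can be restated as simple proportions. The counts below are those given in the text; the variable names are illustrative.

```python
# Questions omitted more than 10 percent of the time, by who was to be asked
all_respondents = {"high_omission": 1, "questions": 71}
subsamples = {"high_omission": 55, "questions": 102}

def percent_high(group):
    """Percent of questions in the group with a high omission rate."""
    return 100 * group["high_omission"] / group["questions"]

pct_all = round(percent_high(all_respondents), 1)   # questions asked of everyone
pct_sub = round(percent_high(subsamples))           # questions asked of subsamples
```

About 1.4 percent of the questions asked of all respondents had high omission rates, against the 54 percent reported for questions asked of subsamples.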

Other procedures to identify question problems have been or will be tried. The possibility of obtaining systematic data on question problems in the pretest phase of a survey study remains intriguing. Much work is still to be done in devising procedures relating to question design. The Survey Research Center methodology program is working to develop additional kinds of logical analysis of question problems, as well as to reduce the cost and time involved in obtaining such data. When some of these latter problems are solved, other survey research agencies should be encouraged to experiment with these appropriate procedures. There are enough experimental and observation studies now in the literature to indicate that question variables such as structure and content are important determinants of data accuracy in survey interviews. While many factors may potentially affect data accuracy, the question variables probably have a much greater potential effect on overall accuracy and completeness than does any other single class of variables. A thorough understanding of how question construction and question content affect data accuracy should do a great deal to advance the usefulness of the survey interview for research purposes.

Effect of respondent age and race on behavior. Social scientists and those responsible for the conduct of cross-sectional sample surveys often hypothesize that the demographic characteristics of the interviewer and the respondent, such as age, race, education, and income, will have some effect on survey data accuracy. For example, earlier research has shown that white respondents are reluctant to admit prejudice toward blacks when the interviewer is black. Such results are interpreted as being reasonable in terms of cultural norms concerning prejudice.

The question remains, however, whether the results of these studies are applicable to the reporting of all information in all survey interviews. For example, there is evidence to suggest that the accuracy of data obtained about such things as physician visits or financial facts does not differ by race of respondent. This second SRC observation study included an experimental design which provided for the investigation of interaction patterns by respondent age and race. White middle-class female interviewers interviewed employed male respondents. There were four experimental groups of respondents: (a) white, 18-34 years (N=47); (b) black, 18-34 years (N=44); (c) white, 35-64 years (N=43); and (d) black, 35-64 years (N=47).

With respect to the effect of age on behavior during the interview, the data confirm the results of the first observation study. Older respondents were much more likely to engage in a large amount and wide variety of behavior during the interview. The proportion of their behavior devoted to good task performance was much lower than that of younger respondents. In addition, when interviewing older respondents, the interviewers displayed high frequencies of a variety of behavior. Thus, an interview with a younger respondent was quite different from an interview with an older respondent. The former appeared to be task oriented while the latter was characterized by a great deal of extra behavior which may have resulted in keeping the interaction at a relatively tension-free level.

The effects of respondent race on the kind of behavior shown in the interview were not clear. Two attempts have been made to analyze and explain these data,²,⁴⁸ and both produced a somewhat similar set of inferences based for the most part on nonsignificant statistical trends. The overriding conclusion is that when age was controlled, the effect of respondent race on the kind of behavior in an interview was not marked for the kind of information contained in the interview. However, the data do suggest that if there was a race effect, it was in the areas of interviewer probing behavior and respondent inadequate answering behavior. Although the differences were not always statistically significant, it appeared that the proportion of inadequate answers (answers which normally require probing) was higher among black than among white respondents, and that interviewers probed more with black than with white respondents. Also, black respondents tended to give more "don't know" answers, repeat more of their answers, and ask for clarification more frequently than did their white counterparts. These racial differences are very small but may indicate slight differences in difficulty with the questions. The pattern viewed as a whole does not indicate any active resistance or lack of motivation to cooperate.

There was a slight tendency for white respondents to exhibit more ability to give adequate answers than did their black counterparts, and interactions with the female interviewers seemed to be less task oriented among white respondents. For example, white male respondents engaged in slightly more polite behavior, feedback, and elaborations than did black males. White males also seemed to show more resistance or more dominance while performing their task, as indicated by a slightly higher percentage of refusals to answer, suggestions to the interviewer, and unsuccessful attempts to interrupt the interviewer.

In summary, the age effect was found to be fairly reliable. It was quantitative rather than qualitative. Older respondents in comparison to younger respondents engaged in a higher percentage of almost every kind of behavior except providing adequate answers and making requests for clarification. The race effect was much smaller than was the age effect. Black respondents showed a pattern of behavior characteristic of well-motivated performance on a difficult task. White respondents seemed to have an easier time at the task, interacted more smoothly with the interviewer, and showed a slightly greater tendency toward dominance or resistance. It seems likely that whatever differences exist may reflect variables such as educational background of respondents rather than race as such.

Interpretation problems. The difficulty in trying to interpret the nonsignificant differences between the two racial groups points to an apparent problem or handicap in the current behavior observation scheme. The problem seems to be that the readily coded behavior categories such as "asks question correctly," "refuses to answer," and "laughs" are difficult to define in an abstract sense. Social scientists are accustomed to dealing with abstract concepts about human interaction, such as: "shows hostility," "is annoyed," "interacts smoothly," or "is having conceptual difficulty." At an even higher level of abstraction these concepts might be: "is ingratiating," "shows lack of rapport," "enjoys the interview," or "is motivated." A point to be made in defense of the observation technique and its scheme of categorizing behavior into small units is that extrapolating behavior codes to a higher level of abstraction does not really provide much more meaning to the data. The theories of human interaction to which some have attempted to fit the existing data have not themselves been validated to any great extent. Interpreting the present data in these frames of reference will probably not yield any greater understanding or predictive power.

It is suggested, however, that the problem of attributing meaning to the observational data may be approached in a different way. If it is recognized that the problem of the survey researcher is to obtain complete and valid data, the strategy for assigning meaning to the various behavior observation codes becomes fairly clear. Empirical research is needed to establish the cause-and-effect relationship between the occurrence of different kinds of behavior or patterns of behavior and the validity of data reported.

Thus, the next research steps might include several kinds of studies which record and code the behavior that takes place in an interview and which, in addition, obtain independent verification of the accuracy and completeness of the data the respondent has reported in his answers. Correlations between the accuracy measures and the behavior measures would then yield significant insights into the meaning of the behavior codes or combinations of behavior codes.
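One way such a correlation might be computed is with a rank-order (Spearman) coefficient, which requires only that the two measures can be ranked. The sketch below is an assumption about how the proposed analysis could be carried out, not a description of any analysis in this report; the per-interview measures are hypothetical.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Rank-order correlation: the product-moment correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical measures for five interviews
feedback_rate = [0.1, 0.4, 0.2, 0.5, 0.3]     # a behavior measure
accuracy = [0.70, 0.85, 0.75, 0.95, 0.80]     # an independent accuracy measure
rho = spearman(feedback_rate, accuracy)
```

A coefficient near +1 or -1 would indicate that the behavior measure orders interviews in nearly the same (or the reverse) way as the accuracy measure; it would not by itself establish cause and effect.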

INTERVIEWER PERFORMANCE DIFFERENCE:

SOME IMPLICATIONS FOR FIELD SUPERVISION AND TRAINING

Most survey researchers believe the adage that practice makes perfect or at least makes for improvement. Thus, they expect seasoned interviewers when compared with less experienced ones to be more skilled at adapting to various interviewing situations, more at ease in interacting with respondents from various social classes, and more proficient in using nonbiasing procedures. Interviews conducted by an experienced staff usually present fewer problems in coding; the responses are clearer and more adequate to the objectives, contingencies are followed more strictly, and, with a minimum of omitted questions, there are fewer noncodable replies.

Data from methodological studies which were recently examined raise questions about the positive effects of experience in interviewing. Since the analysis of interviewer behavior was not planned as part of these studies, the research designs are inadequate to produce conclusive findings but are sufficiently intriguing to encourage further study. As an incidental analysis in one study, the data on failure to report known hospitalizations were tabulated for each item separately. These data showed a surprising trend: The larger the number of interviews taken by a single interviewer, the fewer the hospitalizations reported by the respondents. Although random assignment of interviewers was not made, and the results therefore might reflect a difference in types of respondents interviewed by each interviewer, the finding was sufficiently interesting to raise questions about the positive effects of experience in interviewing and to lead to an analysis of interviewer performance in other studies.

The first SRC study¹⁷ was designed to investigate the underreporting of hospitalizations. A sample of approximately 2,000 hospital records was selected from patients who were discharged from a hospital within the 12 months preceding the interview. The sample was taken from hospitals located in counties that were a part of the Health Interview Survey (HIS) regular national sample. The family name and address of the person discharged were given to the Bureau of the Census interviewers who regularly conducted the interviews for the interview survey in that county. Twenty-seven experienced Bureau of the Census interviewers were included in this study. All interviewers working in the areas in which sample hospitals were located did some interviewing. Because the number of sampled discharges varied considerably by county, interviewers received varying numbers of sample addresses each week.

The questionnaire and interviewing procedures used were identical to those of the regular Health Interview Survey. Interviewers were informed by the census field office that they were to undertake a special study and were given special sampling instructions. They were told that the regular questionnaire and interviewing procedures were to be followed. Interviews were conducted with each adult member of the family found at home at the time of the visit. For those not at home and for all children, a proxy "responsible adult" was interviewed.

For details of the sample, the procedures used, and the findings of the Health Interview Survey, see reference 1,7

While the interviewers were aware that this was a special study, the purpose was not divulged. In order to attenuate the number of hospitalizations reported, a sample of names and addresses drawn from a telephone directory was added. However, interviews at these addresses were not used in the analysis.

Following the usual procedure, each interviewer was given a weekly assignment which she was to complete during that week. The interviewing extended over 3 months, and the interview reports were matched with the hospital discharge records. Table 31 shows the rate at which hospitalizations were underreportedᵍ by the number of interviews taken by groups of interviewers. The data show a tendency for the rate of underreporting of hospitalization to increase as the number of interviews increases. The rank order correlation of the number of interviews taken and the failure to report the hospitalization is very high.

ᵍThe reader is cautioned not to interpret the figures in this paper as a measure of net reporting bias, since only underreporting is included. An estimate of the rate of overreporting is not possible with this sample, as it requires a different research design.

Table 31. Number of recorded hospital discharges, median number of reported discharges per interviewer, and rate of underreporting, by number of interviews per interviewer: Survey Research Center

Rank order of interviewers      Recorded      Median number of         Rate of underreporting
by number of interviews         discharges    discharges reported      discharges per interviewer
taken (lowest to highest)                     per interviewer          Median        Mean

1-5                             54                                     4             .4
6-10                            237           47                       10            10.5
11-15                           330           65                       12            12.4
16-20                           506           100                      15            15.6
21-25                           697           137                      12            ¹12.5

¹Difference between first five and last five interviewers significant at p ...

Source: reference 17.

Attempts to understand these results by looking for differences in characteristics of hospitalizations were not fruitful. The overall response rate for this study was 95 percent; thus differences could not be attributed to low response. Interviewers were given a weekly assignment depending on the number of sample discharges in their county or that part of the county in which they worked; thus they had no choice in the number of interviews to be conducted.

The most tenable hypothesis to explain these findings is that interviewers lost interest and enthusiasm for the work. It may also be that interviewers had some feeling that an aim of the study was to check on their performance, although much reassurance was given that this was not the case and they gave no indication of such concern in interviews conducted with them after the completion of their interviewing. The data indicate that, whatever the motivation, interviewers behaved differently in the earlier interviews than in later ones.

Data from other validity studies of hospitalizations and physician visits were then analyzed to see whether the pattern was replicated and to gain greater understanding of the finding. Data were available from another study on underreporting of visits to physicians during the 2 weeks preceding the interview. A sample was drawn from the records of a large subscription medical care plan in the Detroit area. A systematic sample of visits to clinic physicians was drawn weekly for 8 weeks. With each week comprising a random sample of those visiting the clinic during that particular week, a total of 275 interviews were conducted. Since many respondents had multiple visits, these interviews accounted for a total of 403 visits for the 2 weeks preceding the week of interview.

Ten interviewers were hired by the Survey Research Center for this study. All were comparatively inexperienced. Some had worked briefly on a methodological study; others had worked for a short time on the U.S. decennial census, but most had no previous interviewing experience.

Special training manuals and material were prepared and a supervisor with several years of interviewer training experience conducted the training, assisted by two other experienced field supervisors. Three weeks of training were completed before the actual interviewing started. Classroom training and field assignments were conducted during the first week, and during the second week each interviewer was observed as she conducted interviews at nonsample addresses. The third week consisted of interviewing assignments which interviewers thought were part of the regular study, but which were actually addresses from the telephone directory. During the fieldwork, questionnaires were reviewed in the office and errors were discussed with interviewers.

The questionnaire was nearly identical to that used in the Health Interview Survey. The questions about visits to physicians were as follows: "Last week or the week before, did anyone in the family talk to a doctor or go to a doctor's office or clinic?" Three probe questions were added for specific types of visits which respondents might consider to be outside the scope of the question: (a) "At the time of this visit was the doctor asked for any medical advice for any other member of the family?" (b) "Did anyone in the family get any medical advice from a doctor over the telephone last week or the week before?" (c) "Did anyone in the family see a nurse or technician for shots, X-rays, or other treatment last week or the week before?"

In this study, although interviewers were not given random assignments, the total sample for each week was an independent random sample of visits. It is, therefore, possible to make comparisons of underreporting rates for all interviewers for each of the 5 weeks in which interviewing took place (see table 32).

Table 32 shows a significant increase in underreporting over the 5-week period of the study. The difference between week one and week five is significant at the 5-percent level. There is a decrease in validity of reporting over time. This finding tends to confirm the results shown in table 31. In the first study reporting got progressively worse as the number of interviews taken by an interviewer increased, and in this study underreporting increased as time progressed.

Table 32. Number of recorded physician visits and percent not reported in interviews, by week of interview: Survey Research Center

Interview week

7810295

Source: reference 17

However, even though underreporting increased as time progressed, the rate of underreporting was actually lower in the second study when a comparison was made of the rates of underreporting by the number of interviews taken by each interviewer (see table 33). This finding contradicts the conclusions drawn from the data of tables 31 and 32. An explanation may be found in the reasons why interviewers conducted more or fewer interviews in the two studies and in the way the assignments were carried out. In the hospitalization study, interviewers were given weekly assignments by the office and had little to say about the number given because each interviewer worked only in one geographic area and had to take all interviews in that area. Therefore, some had a heavier weekly load than others, regardless of their wishes. In the physician visits study, all interviews were taken in a single area, and each interviewer could take interviews in any part of the area with little increase in cost. Within limits, interviewers were permitted to choose the number of interviews they wished to take each week. Thus, the choice rested with the interviewers. Their choices may reflect a greater involvement or interest in participating in the study, or, since interviewers were paid on an hourly basis, it may reflect a desire to earn more money. Thus, the difference in underreporting rates between interviewers in the two studies may reflect a difference in their motivation. Since interviewers were generally consistent in their choice of a large or small number of interviews each week, the original finding of an increase in underreporting is understandable again in terms of interviewer interest and enthusiasm. Those with low motivation for the job lost interest early, and even the enthusiasm and reporting accuracy of more highly motivated interviewers waned over time in the course of the fieldwork.

Table 33. Number of recorded physician visits and percent not reported in interviews, by number of weekly interviews conducted per interviewer: Survey Research Center

[Table body is not legible in the source. Column headings: individual number of interviews per week (rows labeled 1 to 10); recorded visits; percent not reported.]

Source: reference 17.

Another validity study of hospitalization reports by Marquis and Cannell50 utilized three different field procedures and thus permits comparison of interviewer performance. For this study, which differs in several respects from the hospitalization study reported earlier, a sample of discharges was selected from 18 hospitals in the Detroit metropolitan area. The design consisted of four orthogonal randomized Latin squares. The major sources of variance were five interviewing weeks; five regions of the city; and three field procedures, one control and two experimental. This design provides a base for comparison of interviewer performance that is less confounded with other variables than that of previous studies.
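The notion of orthogonal Latin squares used in this design can be illustrated with a short sketch. The report does not say how the squares were constructed or randomized, nor how the three procedures were laid out over five weeks and five regions, so the construction below (a standard one for prime orders, with hypothetical function names) is an assumption for illustration only.

```python
from itertools import combinations, product

ORDER = 5  # five interviewing weeks and five regions, as stated in the text


def latin_square(multiplier, order=ORDER):
    """Build a square whose cell (row, col) holds (row + multiplier * col) mod order.

    For a prime order and a multiplier in 1..order-1 this is a Latin square.
    This construction is an assumption, not the method used in the study.
    """
    return [[(row + multiplier * col) % order for col in range(order)]
            for row in range(order)]


def is_latin(square):
    """Every symbol appears exactly once in each row and each column."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


def are_orthogonal(a, b):
    """Superimposing the two squares yields every ordered pair exactly once."""
    n = len(a)
    pairs = {(a[r][c], b[r][c]) for r, c in product(range(n), repeat=2)}
    return len(pairs) == n * n


# Four mutually orthogonal squares of order 5 (multipliers 1 through 4).
squares = [latin_square(m) for m in range(1, ORDER)]
all_latin = all(is_latin(s) for s in squares)
all_orthogonal = all(are_orthogonal(a, b) for a, b in combinations(squares, 2))
```

A randomized version would, in addition, shuffle rows, columns, or symbol labels; that step is omitted here.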

The three field procedures were as follows:

Procedure A.-This was applied to the control group. This procedure involved essentially the same standard Health Interview Survey questionnaire that was used in the other two studies.

Procedure B.-The questions and procedures were the same as in procedure A, except for hospitalizations. The reference period for questions on hospitalizations was a year and a half instead of a year. Probe questions were added and special introductory statements were included.[h]

Procedure C.-The interview questionnaire was identical to that used in procedure A except that no questions on hospitalizations were asked during the interview. A form to be filled out by the family was left with the respondent. Nonresponses were followed up by mail and telephone.

Each interviewer was assigned to two procedures. One group used procedures A and C; the other group used procedures B and C. Twenty interviewers were employed (most of whom had very limited interviewing experience). Inexperienced interviewers were used so that they would not know that the various procedures were different from those customarily used in the Health Interview Survey. The training was conducted by experienced trainers using standard interview methods. There was one full week of training and practice interviews followed by field observation of each interviewer. A comprehensive manual was prepared that specified all techniques. The training was conducted in two groups and by two trainers, one for those using procedures A and C and the other for those using procedures B and C.

[h] This procedure also utilized a mail followup questionnaire designed to pick up further hospitalization reports. The data presented here do not include results of the followup and consist only of reports given in the interview.

Table 34 shows the percentage of underreporting of hospitalizations for each of the 5 weeks for each of the three procedures. Procedure A (control group) shows a pattern of underreporting much like that found in the study of physician visits. Reporting was poorer as the fieldwork progressed. Procedure B (experimental interview) showed a similar pattern with small differences. It may be that the effect of the experimental procedure was to diminish or eliminate the effects of time on interviewer performance. Two factors, the additional probes and the supplementary statements to respondents about the study, may account for this effect. Procedure C (the self-administered form) does not show this pattern and would not be expected to since the interviewer merely handed the hospitalization form to the respondent, asking that it be completed and mailed in. Because the design called for approximately equal numbers of interviews per interviewer, it is not possible to compare reporting rate by number of interviews conducted.

Table 34. Percent of recorded hospitalizations not reported in interviews when procedures A, B, and C were used, by week of interview: Survey Research Center

Week of the       Hospitalization interviewing procedure
interview         A            B            C
                  (Percent of hospitalizations not reported)
1                 7 [?]        8.3          14.4
2                 11.0         8.6          16.0
3                 16.8         9.2          21.2
4                 22.1         8.7          10.5
5                 23.7         0.0 [?]      16.1

[?] marks a value that is only partly legible in the source. The assignment of columns to procedures follows the order of the values in the source and the description in the text.

Source: reference 18.

There is another bit of evidence in this study which supports a motivation hypothesis. Each interviewer used two procedures: procedure C, which involved the self-administered report of hospitalizations left with respondents to complete and mail, and either procedure A or B, both of which obtained a report of hospitalizations in the interview. The average underreporting rate obtained for each interviewer can be compared with the average underreporting rate based on the mailed return. The product moment correlation for the reporting rate by interviewer for the control procedure and mailed response procedure is .65; that between the experimental procedure and mailed procedure is .56.

These relationships are surprising, particularly because one reason for using a self-administered procedure was to avoid the interviewer's influence. The relationship is, of course, based on the performance of only 10 interviewers in each group. One interviewer who was singularly unsuccessful in obtaining reports of hospitalizations in either procedure contributed disproportionately to one of the correlations. However, these data indicate, as do previous results, that interviewer behavior varies and that this behavior affects responses. Interviewers who were more successful in stimulating respondents to report hospitalizations during the interview were also more successful in stimulating respondents to better performance in filling out a self-administered form and mailing it to the office.
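The caution about a single interviewer contributing disproportionately to a correlation based on only 10 interviewers can be illustrated with a small sketch. The rates below are hypothetical, not data from the study; they are chosen only to show how one extreme pair of values can dominate a product moment correlation.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product moment (Pearson) correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical underreporting rates (percent) for 10 interviewers.
# The last interviewer does poorly under both procedures.
interview_rates = [8, 10, 12, 9, 11, 13, 10, 12, 9, 40]
mailed_rates = [15, 12, 16, 14, 11, 15, 13, 12, 16, 45]

r_all = pearson_r(interview_rates, mailed_rates)
r_without_outlier = pearson_r(interview_rates[:-1], mailed_rates[:-1])

print(f"all 10 interviewers:     r = {r_all:.2f}")
print(f"without the 10th:        r = {r_without_outlier:.2f}")
```

With these illustrative numbers the correlation is high when all 10 interviewers are included and close to zero when the extreme interviewer is dropped, which is the kind of sensitivity the text warns about.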

Another interesting behavior pattern was discovered. The questionnaire for the study of physician visits contained one main question and three followup probe questions designed to make sure that the respondent's concept of physician visits was the same as the interviewer's and to stimulate recall of easily forgotten events. An overall 7-percentage-point improvement in reporting was achieved through use of these followup probe questions, but the data demonstrate that some interviewers used the probe questions quite differently than did others. Not only are there large differences in the amount by which the probe questions improved reporting for different interviewers, but meaningful patterns are present.

The rates of underreporting for the 10 interviewers according to the total number of interviews conducted are shown in table 35. The improvement in reporting through use of the probe questions is shown in the last column of this table. When only the main question was used there was not a clear relationship between the number of interviews taken and the underreporting rate (rank order correlation of .52). When the probe questions were used there was more of a tendency for interviewers taking an increased number of interviews to have lower underreporting rates (rank order correlation of .83). The most interesting finding was related to the effect of probe questions. Only one of the five interviewers recording fewer than 40 sampled visits improved the reporting by using probe questions, but all of those who recorded over 40 visits showed improvement in reporting when the probes were used. The median rate of underreporting is within 1 percentage point for the main question when the top five interviewers are compared with the bottom five interviewers. When all the probe questions were used, there was no change in reporting for the first five interviewers, and a 6-percentage-point improvement in reporting for the five interviewers having the greatest number of reported visits.

Table 35. Number of recorded physician visits reported in interviews, percent not reported when procedures A and B were used, and percent of improvement in reporting rate, by individual interviewer

Interviewer   Number of recorded   Percent not reported                 Percent improvement
number        physician visits     Main question    Main question       in reporting rate
              reported             only (A)         plus probes (B)     (A - B)
1             26                   [illegible]      6                   39
2             26                   38               38                  0
3             28                   29               29                  0
4             35                   20               20                  0
5             39                   26               26                  0
6             43                   28               26                  [illegible]
7             44                   23               11                  [illegible]
8             49                   33               20                  13
9             49                   22               20                  [illegible]
10            [illegible]          20               14                  [illegible]

NOTE: Data previously unpublished.
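The comparison of the two groups of five interviewers can be retraced from the values of table 35 that are legible in the source. The sketch below assumes the reading of the table in which the improvement column is the difference A - B, and it takes the 39-point improvement for interviewer 1 directly from the table because that interviewer's main-question rate is not legible; the variable names are illustrative.

```python
from statistics import median

# (interviewer number, percent not reported with main question only (A),
#  percent not reported with main question plus probes (B)),
# as read from the legible cells of table 35.
legible_rows = [
    (2, 38, 38), (3, 29, 29), (4, 20, 20), (5, 26, 26),
    (6, 28, 26), (7, 23, 11), (8, 33, 20), (9, 22, 20), (10, 20, 14),
]

# Improvement in percentage points, computed as A - B.
improvement = {number: a - b for number, a, b in legible_rows}
improvement[1] = 39  # stated in the table; A for interviewer 1 is illegible

fewest_visits = [improvement[n] for n in range(1, 6)]   # interviewers 1-5
most_visits = [improvement[n] for n in range(6, 11)]    # interviewers 6-10

median_fewest = median(fewest_visits)
median_most = median(most_visits)
no_gain_count = sum(1 for value in improvement.values() if value == 0)

print(median_fewest, median_most, no_gain_count)
```

Under this reading the medians agree with the statement that there was no change for the first five interviewers and a 6-percentage-point improvement for the five with the greatest number of reported visits.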

Two conclusions are suggested by these data. The first is that interviewers differ in the way they use questions and probes. Some apparently make little use of the probe questions, either failing to ask them or asking them in an incidental manner. For four interviewers the probes failed to elicit any additional information. In contrast, interviewer number 1, who seemed to have placed most of her reliance on probe questions, experienced a 39-percent improvement in reporting when she used the probes. The training and supervision failed to obtain uniformity in the use of the probe questions.

The second conclusion is that, except for the unusual behavior of interviewer number 1, the interviewers who had the largest number of sampled visits (who may have been more interested and motivated) generally made better use of the probe questions. A major interviewer difference between this study and the hospitalization study is the amount of experience in interviewing and in using the Health Interview Survey questionnaire. For the experienced interviewers there was a drop over time in the reporting of known events, suggesting that additional experience did not make for improvement in skill, since skill level was already high. The drop can be accounted for in motivational terms; the novelty of the studies wore off. In the physician visit study, the interviewers were inexperienced. Those who conducted a large number of interviews improved, suggesting that they did profit from longer experience and did gain in skill. This finding plus the earlier motivation hypothesis suggest that interviewers who are motivated improve their performance with added experience. For the less highly motivated interviewers this relationship is not as strong. This interpretation, based only on speculation, would help explain the conflicting results in the three studies.

One final study is relevant. In this study, which did not have validity measures, the dependent variable was the number of health conditions and behaviors reported. Since previous studies suggested that the major problem in reporting health events is underreporting rather than overreporting, the working hypothesis in this study was that high reports are likely to be more accurate than low ones. The study had two experimental interviewing procedures and a control group. The control procedure was fairly similar to the questionnaire used in the Health Interview Survey. Only the control group was used for this analysis. Because the experimental procedures required interviewers to follow rigid rules of interviewing techniques, it was necessary to institute special field supervisory procedures. Each interviewer was observed during her interviewing every week during most of the fieldwork. Attention was focused on interviewing techniques and each interview was discussed with the interviewer.

Average reports for the number of conditions, symptoms, and physician visits respondents reported for themselves and for other family members are shown in table 36. In contrast to data of other studies, these data show higher reporting of events in the second half of the fieldwork, for all but one item. Self-reports for chronic and acute conditions and symptoms were significantly higher during the last 3 weeks of the study than during the first 3 weeks.

Again a motivational hypothesis explains these findings. In this study, in contrast to the preceding ones, attention was paid to the interviewer's performance in interviewing. In previous studies interviewers were rated on the quality of the completed questionnaire, while for this study the reward was for good interviewing performance. It seems likely that both different criteria and greater personal attention led to greater skill and increased motivation in performing the interviewing task.

Table 36. Average frequency of health reports during first 3 weeks and second 3 weeks of the interviewing period using procedures A and B, and improvement in reporting during second 3-week period, by respondent status and health event

Respondent status                    Average events reported           Improvement during
and health event                     First 3        Second 3           second 3 weeks
                                     weeks (A)      weeks (B)          (B - A)
Respondent reporting for self
  Chronic and acute conditions       1.87           2.63               [illegible]
  Symptoms                           4.88           5.83               [illegible]
  Physician visits                   0.34           0.60               .26
Respondent reporting for other
family members
  Chronic and acute conditions       [illegible]    [illegible]        [illegible]
  Physician visits                   [illegible]    [illegible]        [illegible]

[Further values legible in the source (0.14, 1.26, 0.26, .24, .12) cannot be assigned to cells with confidence.]

CONCLUSIONS AND DISCUSSION

The evidence from these data indicates that the interviewers generally did not improve their performance as they gained experience, at least not during a single study. In fact, performance began to deteriorate almost immediately after training and, at least in some cases, dropped significantly within a few weeks. Contrary to expectations, this deterioration occurred among both experienced and inexperienced interviewers.

The fragmentary data are susceptible to various interpretations. One explanation is that reporting bias is related to the interviewer's inability or lack of incentive to encourage enthusiastic performance by the respondent and that, over time, interviewers become less conscientious in their use of interviewing techniques. Because performance dropped as the interviewing period progressed, it would seem that performance is related more to motivational level than to ability to perform the task. Even though the nature of the behavioral change in the interviewer is not known, there is evidence that the manner in which the questions are asked may change over time. Whatever the mechanisms are, the interviewer's behavior apparently has an effect on the respondent's behavior, both in the reporting during the interview itself and in the respondent's subsequent performance in filling out and mailing a self-administered form.

It is, of course, no surprise that lack of incentive or interest in performing the interviewing task results in poor performance, but it is surprising that the interviewer's performance dropped as rapidly as these findings indicate.

It appears that interviewers were being reinforced for part of their role performance, and that the reinforcement brought about improvement. However, adequate performance in stimulating respondent reporting behavior and in proper use of interviewing techniques was not reinforced and, accordingly, deteriorated. Performance in interviewing was not reinforced because observation of interviewers was not continued after training, and good or poor performance could not be identified by a review of the completed questionnaire. (It is instructive to note that in one of the studies reported here the interviewer who was among the poorest in obtaining known hospitalizations and whose underreporting rate rose sharply over the interviewing period was selected as the best interviewer and was promoted to an office supervisory job because of her "excellent" work.)

There are two major implications in these findings. The first is that research is needed to identify explicitly those interviewer behaviors that relate to adequate reporting. Studies that concentrate on interviewer-respondent interaction, especially through analysis of recorded interviews, can provide information for better understanding of some of these factors. A second implication is the need for training and supervisory methods that will reinforce adequate interviewer performance in the application of interviewing techniques.

Studies frequently show a relationship between the information reported in interviews and the attitude of, or expectations of, the interviewer. Interviewer bias studies abound, but all relate characteristics of the interviewer (background characteristics, attitudes, perceptions, and expectations) to the reporting of respondents. The data presented here point to the importance of the interviewer's continued motivation to do a conscientious job of interviewing. The suggestion is that as fieldwork progresses, interviewers become less careful or conscientious in using the techniques they were trained to use. Furthermore, there is evidence that interviewers who performed well in the interview also inspired their respondents to perform well, as was shown in the adequacy of respondent reporting of hospitalizations in a self-administered procedure. Good interaction not only improves the technical aspects of the interview, but also stimulates a high level of respondent role performance which extends in time beyond the interaction itself.

Interviewer training needs to include devices for building the interviewer's enthusiasm for the job as well as procedures for training in the proper use of techniques. Attention to performance cannot stop when interviewer training is completed. Training needs to be evaluated and good performance reinforced continually during the period of production interviewing.

THE USE OF VERBAL REINFORCEMENT IN INTERVIEWS AND ITS DATA ACCURACY

The purpose of this section is to describe interviewer verbal reinforcement as it occurs in interviews, to conceptualize it in the same manner as other major interviewer behaviors, and to look at some of the relevant research studies that provide information about how verbal reinforcement influences the accuracy of interview data.

The major category of interviewer behavior of concern here is interviewer feedback. Interviewer feedback consists of evaluation statements that the interviewer makes after the respondent says something. The subject of feedback may be approached through two basic questions: "How can interviewer feedback be used to improve data?" and "How can it be arranged so as not to introduce unwanted bias?" The following discussion will be limited to interviewer verbal reinforcing statements which represent only one kind of interviewer feedback. Probes and statements that are intended to build rapport are not treated directly. While this restriction is undesirable, it is necessitated by the fact that research conducted at the Survey Research Center and elsewhere has dealt mainly with task-oriented verbal reinforcing statements.

RESEARCH ON INTERVIEWER REINFORCEMENT

Since Pavlov's famous experiments, there have been numerous studies of the effects of reward and reinforcement on learning and performance. Much of this research has been carried out with animals and occasionally with children. It has been assumed that the principles of reinforcement derived from these laboratory studies could be projected to adult human beings. This assumption has been made explicitly by many writers, most notably Dollard and Miller51 and Skinner.52

Operant Conditioning Studies With Verbal Reinforcement

Outside the psychology laboratory two important kinds of practices, programed instruction (or teaching machines) and behavior modification therapy, demonstrate the powerful effects of properly scheduled reinforcement on human performance. Some practitioners in the fields of education and psychotherapy have felt that traditional methods of producing learning or behavior change (e.g., lecture, introspection) are inefficient. They have met some success with new techniques based on tightly controlled operant conditioning techniques where reinforcement is contingent on the respondent's actual behavior.

These studies point to the importance of feedback on human performance under a variety of conditions. However, the experimental situations mentioned above (sentence construction, free association, and casual conversation) are not like a typical survey interview. Several studies have been conducted in interview settings. The first set of studies shows that interviewer reinforcement has great effects on the amount of respondent behavior. The second group of studies shows that interviewer reinforcement can produce an interviewer bias effect. Finally, the study by Marquis, Cannell, and Laurent53 indicates that, under certain conditions, interviewer feedback can be used to increase the accuracy of information given by respondents.

Effects of Feedback on Amount Reported

A study by Marquis and Cannell50 shows that the presence or absence of interviewer verbal feedback has a significant effect on the amount of health information reported by a respondent in a household interview about family health. In this study a probability sample of moderate-income females between the ages of 17 and 65 and living within the city limits of Detroit was interviewed. They were asked questions about their own health, about their use of different kinds of medical services, and about the health of another person in the household. Although three experimental groups were used in this study, two groups of respondents are of interest here: (a) the group of respondents who were reinforced every time they reported a symptom or health condition either for themselves or for the other person in the household for whom they were reporting; and (b) the group of respondents who were interviewed using the same questions but who received no reinforcement or feedback from the interviewer.[i]

For the first experimental group the kinds of reinforcing statements which the interviewer used are outlined here. Words in parentheses could be used, omitted, or interchanged by the interviewer:

(Yes.) That's the kind of information we need.
(Um-hmm.) We're interested in that.
(I see.) You have (name of condition)

There were some other small differences in the interviewing procedures for the two groups and these are described in detail elsewhere.50

[i] The third group was also interviewed without reinforcement. A list of symptoms, which appeared at the beginning of the other two kinds of interviews, was inserted near the end of the third treatment interview. The purpose of this procedure was to test the sensitizing effects of the symptoms list on later reporting. No main effect was found on the amount reported later in the interview, although it may be necessary to begin a reinforcement interview with a symptoms list which allows all respondents to report some sort of sickness and to receive reinforcement for their reporting.

The main dependent variable in the study was the number of chronic and acute conditions which the respondent reported for herself. Following the logic outlined by Cannell, Fowler, and Marquis (appendix 11), it was assumed that the more chronic and acute conditions reported, the more valid the overall data. The group of respondents receiving reinforcement reported an average of 2.74 conditions per person. The nonreinforced group reported an average of 2.20 conditions. The data indicate that the reinforced group of respondents reported about 25 percent more chronic and acute conditions than did the nonreinforced group. The difference is significant at the .02 level of confidence. The same magnitude of effect was obtained for reporting chronic and acute conditions for the other person in the household. The number of conditions reported by proxy averaged 1.88 in the experimental interview and 1.43 in the control interview. The experimental or reinforcement interview obtained about 24 percent more chronic and acute conditions reported by the proxy respondent and the difference is significant at the .01 level of confidence. Other results indicated that the reinforcement effect was general rather than limited to one kind of condition. For example, the reinforced respondents reported about 25 percent more medically attended conditions for themselves and about 55 percent more of the highly embarrassing symptoms than did the nonreinforced group. The group that received feedback from the interviewer did not report more visits to physicians than did the nonreinforced group. This finding is discussed in more detail in the original report by Marquis and Cannell.50

Kanfer and McBrearty32 interviewed 32 female undergraduates about 4 predetermined topics. During the first part of the interview the women were handed cards listing each of the topics and were asked to talk about each one. The experimenter sat in the room but kept completely silent. During the second part of the session the experimenter reinforced the respondent whenever she talked about a predetermined two of the four possible topics. Reinforcement consisted of a "posture of interest" including smiles and the phrases "I see," "Um-hmm," and "Yes." When the women talked about the remaining two topics, the experimenter said and did nothing. The subjects spent more time talking about the topics that were reinforced than about those topics that were not reinforced.

These two studies indicate that interviewer feedback can have a great effect on how much is said during the questioning. It should be noted that both of these studies involved maximum contrasts in interviewer feedback; that is, in one group or at one time interviewers used verbal reinforcement and for another group or at another time interviewer feedback was completely nonexistent. While these studies show that interviewer feedback can have important effects on the amount reported, they indicate little about the nature of the effects or what kinds of feedback procedures are most effective.

Use of Feedback To Increase Accuracy

The study to be discussed here is one in which the data indicate that interviewer feedback procedures can have beneficial effects on survey data accuracy. It should be remembered that this is a single study dealing with fact responses, and that the results are tentative and possibly of limited generalization. This study by Marquis, Cannell, and Laurent53 was carried out under contract with the National Center for Health Statistics. In an unpublished study by Marquis and Laurent, a number of interviewing variables were tested but the discussion that follows is restricted to the effects of reinforcement on initial interviews and reinterviews using short (nonredundant) questions. The sample respondents were white females between the ages of 17 and 65, who were residents of the greater metropolitan Detroit area. The respondents were selected on a weighted basis from persons who had come into a prepaid health clinic during a 6-month period prior to interviewing. During the time each patient was in the clinic a physician filled out a checklist indicating which of 13 chronic conditions the patient definitely or probably had, and about which of the checklist conditions the physician had no information. The physician obtained information about the chronic condition from the patient's record and from his own knowledge of the patient's medical condition. He did not interview the patient directly about the specific conditions listed on the form.

The dependent variable used in the study was the accuracy with which respondents, in a series of subsequent household interviews, reported having had or not having had each of the 13 aforementioned chronic conditions. Two types of respondent error were (a) underreporting, in which case the respondent failed to mention a condition which the doctor said she had; and (b) overreporting, where the respondent mentioned a condition which the doctor said she did not have.
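The two error types can be made concrete with a short sketch. The report does not give the formula behind its index scores, so the rates below (errors divided by the number of conditions the physician classified as present or as absent) are an assumption, and the condition names and the function `error_rates` are hypothetical.

```python
def error_rates(physician_checklist, reported):
    """Return (underreporting rate, overreporting rate) for one respondent.

    physician_checklist maps each condition to True (patient has it),
    False (patient does not have it), or None (physician had no information).
    reported is the set of conditions the respondent mentioned.
    Conditions with no physician information are left out of both rates.
    """
    present = [c for c, status in physician_checklist.items() if status is True]
    absent = [c for c, status in physician_checklist.items() if status is False]

    missed = [c for c in present if c not in reported]      # underreporting
    invented = [c for c in absent if c in reported]         # overreporting

    under = len(missed) / len(present) if present else None
    over = len(invented) / len(absent) if absent else None
    return under, over


# Hypothetical checklist for one respondent (not data from the study).
checklist = {
    "arthritis": True,
    "asthma": True,
    "hay fever": False,
    "diabetes": False,
    "hypertension": None,
}
respondent_report = {"arthritis", "hay fever"}

under_rate, over_rate = error_rates(checklist, respondent_report)
print(under_rate, over_rate)
```

In this example the respondent misses one of two conditions the physician recorded and mentions one of two conditions the physician ruled out, so both rates are one half.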

It was not expected that the respondent and physician would agree entirely on the state of the respondent's health. However, what was expected, for the reporting of these 13 very common chronic conditions, was that an experimental interviewing procedure that obtained low average rates of respondent-physician disagreement obtained more valid data than an interview procedure yielding high rates of disagreement. The study included a reinforced group of respondents and a nonreinforced group. All respondents received the same kind of questionnaire composed of straightforward short questions. The verbal reinforcement used in the reinforced group was similar to that described in connection with the study by Marquis and Cannell.50 Half of the respondents were reinterviewed approximately 2 weeks after the initial interview. Respondents in the experimental group received reinforcement for reporting all kinds of health events, not just symptoms and conditions as in the 1971 study just cited. The results obtained in the analysis of the 1972 study data were not those initially anticipated.[j] They indicated that the effects of different ways of conducting interviews were mediated by the level of education of the respondent. Procedures that increased accuracy and reduced bias in the low education group (had not completed the 12th grade) had opposite effects in the high education group. The reader should consult the full report of this study for a more detailed treatment of all data and all experimental techniques.

[j] Copies of this study are available from the National Center for Health Statistics, Vital and Health Statistics. PHS Pub. No. 1000-Series 2-No. 45.

The discussion of the results of this study is limited to four groups of respondents, those reinforced and those not reinforced in each of the two education groups, all of whom were interviewed with short questions.[k]

The underreporting index scores shown in table 37 indicate that, for respondents who had not finished high school, the use of reinforcement in interviews significantly reduced the underreporting of chronic conditions in the original interview. However, reinforcement significantly increased underreporting error for the more educated group. The drop from .387 to .225 in underreporting brought about by reinforcement in the less educated group represents a decrease of about 42 percent; the percentage increase of underreporting in the higher education group due to reinforcement (from .443 to .575) was approximately 30 percent. Both of these differences are significant at the 5-percent level.
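The two percentages follow from the index scores quoted in the text. A minimal sketch of the arithmetic, with the nonreinforced score taken as the base (the function name is illustrative):

```python
def percent_change(base, new):
    """Relative change from base to new, in percent (negative means a decrease)."""
    return (new - base) / base * 100


# Underreporting index scores quoted in the text (nonreinforced -> reinforced).
low_education_change = percent_change(0.387, 0.225)
high_education_change = percent_change(0.443, 0.575)

print(f"low education:  {low_education_change:+.1f} percent")
print(f"high education: {high_education_change:+.1f} percent")
```

Rounded to whole percentages, these are the decrease of about 42 percent and the increase of approximately 30 percent reported above.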

The overreporting errors were generally low (7 to 14 percent) and were less affected by reinforcement. The slight trends in the data suggest that reinforcement reduced overreporting in both education groups. One hypothesis in this study was that reinforcement by the interviewer each time the respondent reported an illness would result in the respondent "inventing" sicknesses in order to gain further approval of the interviewer. Since the reinforcement interview produced fewer false positive responses than the interview that did not include reinforcement, the trend was counter to that expected.

The total error index scores shown in table 37 represent the unweighted sum of the individual underreporting and overreporting scores. Since underreporting scores had a higher mean and variance than overreporting scores, the total error score reflects the effects of underreporting due to reinforcement and education level. For the less educated group, reinforcement significantly reduced the total error; for the more educated group, reinforcement significantly increased the overall error.

k The reinforcement contrast was also tested using a questionnaire which contained a mixture of long and short questions. For reasons not discussed here, the effect of reinforcement with long questions was ambiguous. It is clear, however, that the combination of reinforcement with long questions does not result in marked improvement in data accuracy.

Table 37. Error rates for chronic condition reporting in original interviews and reinterviews, by use of reinforcement and education level of respondent

                                    Original interview            Reinterview
Respondent's education level     Reinforced  Not reinforced   Reinforced  Not reinforced

Low education level
  Number of respondents              --           --              25           20
  Total error                       .337          --              --           --
  Underreporting                    .225         .387            .230         .400
  Overreporting                     .113         .14             .131          --

High education level
  Number of respondents              --           --              20           27
  Total error                        --           --              --          .615
  Underreporting                    .575         .443            .521         .495
  Overreporting                      --           --             .082         .120

NOTE: -- marks an entry that is not legible in the source copy.

The trends in the reinterview data were in general similar to those observed in the data as collected originally. The questions were approximately the same as those asked the first time. Respondents who were in the reinforced group originally were also reinforced during the reinterview. Similarly, respondents initially in the nonreinforced group did not receive reinforcement the second time. Thus, the reinterview data reflect the cumulative effects of either reinforcement or nonreinforcement.

While the results of this study are not as clear as one might wish, they do indicate that certain kinds of reinforcement procedures can have a marked effect on the accuracy of the estimates made for a population from survey data. It would appear that less educated respondents rely on interviewer cues to direct their reporting performance. Thus, appropriate use of reinforcement, probes, and other feedback by the interviewer can aid recall and reporting accuracy. Perhaps because of the perceived social status discrepancy, feedback is accepted as appropriate and actually welcomed. Performance is generally good in an appropriately conducted initial interview, but there is little additional benefit from a reinterview.

More educated respondents carry out the reporting task according to their own understanding of it rather than relying on cues from the interviewer. Reinforcement under such circumstances may be perceived by the respondent as inappropriate, unnecessary, or even condescending. The more educated respondent appears to have a tendency to underreport chronic conditions, possibly due to a stronger conservatism or social desirability bias. While these findings are somewhat speculative, they do stress the importance of such "human" characteristics as memory, recall, cognition, social status, and intellectual ability that are often overlooked in methodological studies.

The studies cited to this point indicate that:

a. Interviewer feedback makes a difference in respondent performance.

b. Interviewer feedback can bias respondent answers.

c. Interviewer feedback can increase response validity.

However, these findings are based on experimental interview studies with unorthodox ways of programing feedback and do not indicate how feedback is used in the normal household interview.

Use of Feedback in a Nonexperimental Interview

Marquis and Cannell2 obtained tape recordings of 181 interviews with employed male respondents about employment-related topics. The respondents included both black and white persons ranging in age from 18 to 64. The four interviewers were white middle-class female residents of suburban Detroit.

The tape recordings of each interview were analyzed by identifying each instance of verbal behavior on the part of the interviewer or the respondent and assigning it 1 of 36 possible behavior codes. Thus, each interview was transformed into a series of code symbols which represented the kinds of behavior in which the interviewer and respondent engaged. The data indicated that 23 percent of all coded interviewer behavior consisted of feedback. Feedback, as a category of interviewer behavior, occurred as frequently as probing. The only kind of interviewer behavior that occurred more frequently than feedback was question-asking (this accounted for about 37 percent of all interviewer behavior). These data indicate that interviewer feedback is, indeed, a large part of interviewer behavior.
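The share of interviewer behavior falling in each category can be tallied directly from a coded interview. The sketch below assumes a simplified coding: the four category labels and the example sequence are hypothetical stand-ins, not the 36 behavior codes or any data from the study.

```python
from collections import Counter


def behavior_shares(codes):
    """Return each behavior category's share of all coded interviewer acts."""
    counts = Counter(codes)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


# Hypothetical coded sequence of interviewer behavior for one interview.
coded_interview = [
    "question", "feedback", "probe", "question", "feedback",
    "question", "probe", "other", "question", "feedback",
]

shares = behavior_shares(coded_interview)
for category, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{category:<10}{share:.0%}")
```

Pooling the counts over all interviews before dividing would give study-level proportions of the kind reported in the text.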

During press, results from a small pilot observation study (N=23), conducted on a different kind of interview (urban problems) and with more experienced interviewers, indicate that the proportion of interviewer behavior devoted to short, positive feedback statements was 10 to 15 percent. That this percentage range is considerably lower than the proportion obtained in the labor force study suggests the need for further research.

Data presented in table 29 are the basis of discussion of the use of reinforcement by interviewers. The table shows a very surprising finding: The probability of feedback after an inadequate answer was almost the same as the probability of feedback after an adequate answer. Even more surprising is that interviewers on the average reinforced over half of the respondents' refusals to answer. This kind of pattern discloses that interviewer feedback may be used in an inappropriate manner. These data indicate that interviewers not only gave positive evaluation responses when the respondent replied adequately, but they also said positive things in situations where they may have been under some tension or may have felt the need to build rapport in the interview. The rapport hypothesis about the use of feedback is supported by the finding that feedback was given after refusals to answer and after answers that did not meet the objectives of the question.
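The comparison described above is a set of conditional probabilities: the chance of interviewer feedback given the kind of respondent behavior that preceded it. A minimal sketch of how such rates could be estimated from coded exchanges is given here; the event list is invented for illustration and the three answer types are an assumed simplification of the study's coding scheme.

```python
from collections import defaultdict


def feedback_rate_by_answer(events):
    """Estimate P(feedback | answer type) from (answer_type, got_feedback) pairs."""
    totals = defaultdict(int)
    with_feedback = defaultdict(int)
    for answer_type, got_feedback in events:
        totals[answer_type] += 1
        if got_feedback:
            with_feedback[answer_type] += 1
    return {kind: with_feedback[kind] / totals[kind] for kind in totals}


# Hypothetical coded exchanges: (type of respondent behavior,
# whether the interviewer's next act was positive feedback).
events = [
    ("adequate", True), ("adequate", False),
    ("adequate", True), ("adequate", False),
    ("inadequate", True), ("inadequate", False),
    ("refusal", True), ("refusal", True), ("refusal", False),
]

rates = feedback_rate_by_answer(events)
for kind, rate in rates.items():
    print(f"P(feedback | {kind}) = {rate:.2f}")
```

Feedback that is contingent on adequate answers would show a clearly higher rate for the adequate category than for the others; near-equal rates correspond to the noncontingent pattern the authors found surprising.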

The authors have never systematically discussed these data with interviewers, but were this to be done, it might be expected that interviewers would insist that the difficulty of their job makes it necessary for them to build rapport with respondents by using positive reinforcing statements when tension is felt during the interview. The authors would probably reply that at least one experimental study has shown that validity can be improved by reinforcing only adequate answers and that this pattern should be followed. Possibly these two divergent hypotheses about the effective use of interviewer reinforcing statements can be tested experimentally. At this point there is only limited empirical support for the authors' position and intuitive, common-sense support for the interviewers' position.

Discussion

The data indicate that the survey researcher should be concerned about the feedback styles used by interviewers. The remainder of this section discusses ideas and research concerned with reinforcement effects in the personal interview. Before going further into the hypothesized effects of verbal reinforcement on respondent performance, a schematic diagram of variables that may influence reporting quality might be helpful.

[Diagram: four respondent variables lead to ADEQUACY OF REPORTING: respondent knowledge of what is expected of him; respondent willingness to respond adequately; respondent ability or skill in responding adequately; respondent knowledge about the adequacy of responses.]

The diagram implies that there are four hypothetical personal characteristics (other than having correct answers available in memory) that will affect the accuracy and completeness of reporting. These consist of two knowledge (cognitive) variables (knowledge of what is expected and knowledge of how well one's responses are meeting these expectations), a skill variable, and a motivational variable.

This diagram implies that if the respondent knows what is expected of him, has the ability to do what is expected, can tell how adequate his responses are vis-a-vis the task requirements, and wants to do well, then the data he gives will be accurate and complete. The purpose of the discussion that follows is to explore how verbal reinforcement procedures and alternative procedures affect each of these hypothetical variables.

The tentative conclusion reached is that positive verbal reinforcement contingent on adequate answers provides a wide variety of desirable effects on these intervening constructs, while other procedures that might be used tend to have more limited effects. The effects of verbal reinforcement at any particular stage are not totally clear, and derivations from theory suggest reinforcement may also have some counterproductive effects in certain situations.

THREE KINDS OF INTERVIEWER VERBAL REINFORCEMENT EFFECTS

The effects of verbal reinforcement on respondents can be divided into three categories: a cognitive effect, a conditioning effect, and a motivational effect. There can be a great deal of overlap among these categories but, for ease of presentation, the three kinds of effects will be treated as if they are independent.

Interviewer verbal reinforcement has a cognitive effect on respondents because it supplies information about the expectations of the interviewer and how adequately the respondent is meeting these expectations by his answering behavior. For example, if the interviewer asks whether the respondent has ever had a headache, the respondent says "Yes," and the interviewer uses a reinforcing statement such as "Good, that's the kind of information we need," the interviewer has indirectly told the respondent that she expects him to report his minor illnesses and that the response he just gave was a good one in terms of meeting the objectives of the health interview. This information-giving aspect of verbal reinforcing statements which changes (or maintains) the respondent's intellectual understanding of the respondent role requirements and the adequacy of his responses is described as a cognitive effect.

The second effect that interviewer feedback may have is referred to here as "instrumental conditioning." This effect is important in many studies of the psychology of learning. In the simplified model of the interview just described, the interviewer's evaluation comes immediately after the respondent has given a response. This sort of pairing of an evaluation with a response can have reinforcing properties, that is, it can alter the frequency of the behavior that immediately preceded it. Thus, the evaluation process has the potential of strengthening or weakening the probability of eliciting that behavior or similar behavior on subsequent trials. Also, through the process of differential reinforcement of successive approximations, the reinforcement procedure can establish a new kind of response class or behavior in which the respondent would not normally engage on his own.

The third possible effect of interviewer feedback is motivational. In this case feedback affects the intensity or psychological effort which the respondent allocates to his reporting task and to other behavior that may interfere with the adequate performance of his respondent task.

Cognitive Effects of Reinforcement

Two kinds of knowing or understanding are thought to influence respondent performance: understanding what is expected (e.g., knowledge of the proper respondent role) and knowing when answers meet and do not meet those expectations. Two ideas are hypothesized: (a) such knowledge is sometimes very helpful but is neither necessary nor sufficient for good performance in some conditions; and (b) reinforcement plays an important role in helping the respondent to acquire both kinds of knowledge and is a more effective procedure when compared to more conventional teaching techniques.


Before exploring the effects of reinforcement on knowledge, it should be noted that many people maintain that the most effective way to increase a person's knowledge and understanding is to teach him by direct (e.g., to lecture) rather than by indirect (e.g., using feedback) methods. Direct approaches to teaching which are not dependent on respondent performance are not uncommon in survey interview settings. For example, before an interviewer arrives at the doorstep, the potential respondent has often received a letter or a brochure that explains the survey and describes what is wanted from the respondent. Often at the beginning of the interview, an explanation of the goals of the research is given, accompanied by specific appeals for accuracy, completeness, or candor. Little is known about the effects of this common, direct approach to teaching a respondent. However, if it were possible to extrapolate from the lecture analogy and to research these effects, it might be found that often the "students" had not learned or understood or would not be able to verbalize what they had been expected to learn. Two studies obtained some relevant data. In a study by Cannell, Fowler, and Marquis55 respondents were reinterviewed 1 to 3 days after an initial health interview. Two questions were asked of 412 respondents to find out how well they had understood their role. The answers were distributed as follows:

Q. 26. Did the interviewer want you to be exact in the answers you gave or were general ideas good enough?

Respondent's answer              Percent distribution
"Exact"                                  --
"Some of each"                           --
"General"                                --
Not ascertained                          --

Q. 27. Did she (interviewer) want everything, no matter how small it was, or was she interested only in fairly important things?

Respondent's answer              Percent distribution
"Everything"                             76
"Important things"                       --
Not ascertained                           5

NOTE: -- marks an entry that is not legible in the source copy.

A similar question was asked of 428 respondents in another study by Marquis and Cannell50 at the end of the health interview rather than at a followup interview:

Q. 17. Will people think we want them to report all their illnesses or only the important ones?

Respondent's answer              Percent distribution
"Report all illnesses":
  Gave correct reasons n                 --
  Gave incorrect reasons                 --
"Report only important ones"             --

n These reasons indicated that reporting all illnesses was what the survey, interviewer, or Government wanted or that reporting all illnesses would result in good data.

The data indicate that despite introductory letters and brochures, interviewer explanations, and the actual experience of the interview, between 20 and 40 percent of respondents in these health studies clearly did not understand the respondent role correctly.

Some data are available to indicate what techniques are effective in changing respondent understanding. In an unpublished Survey Research Center study by Cannell, Fowler, and Marquis,56 different kinds of brochures and introductions were used. The effect on respondent understanding of an unattractive but informative (control) brochure was compared with that of:

a. A brochure indicating the kinds of questions to be asked, the interviewer's role, the respondent's role, and the importance of accuracy in reporting;

b. A brochure mentioning uses of data and stressing the benefits that might result from the survey;

c. A calendar on which the respondent might write family sickness information for two weeks prior to the health interview; and

d. A set of actual questions to be asked during the interview, asking the respondent to think about them and consult records and family members for accurate information.

Results of pretests were so disappointing that full scale evaluations of the effects of these communication/teaching devices were not undertaken. The experimental materials, whether used singly or in combination, made no difference in respondent knowledge or reporting performance.

It is currently believed that teaching may be more effective if methods of programed instruction are used. In contrast to the brochure or lecture techniques, one essential feature of the method of programed instruction is immediate feedback about the adequacy of each answer the student gives. One characteristic of reinforcement is that it can provide immediate information to the respondent about the adequacy of his answer. The first SRC reinforcement study mentioned above55 gave positive verbal reinforcement for adequate answers (see report of that study for feedback procedures and definition of adequate answers) to one group and not to another group. Both groups received the same advance letter and explanation of the study by the interviewer. The following data indicate that the reinforced group had a slightly better understanding of the respondent's task than did the group not receiving feedback for adequate answers:

Q. 17. Will people think we want them to report all their illnesses or only the important ones?

                                    Reinforced         Not reinforced
Respondent's answer              (151 respondents)    (277 respondents)

                                        Percent distribution

Total                                  100                  100

"Report all illnesses":
  Gave correct reasons                  37                   27
  Gave incorrect reasons                28                   28
"Report only important ones"            --                   45

NOTE: -- marks an entry that is not legible in the source copy.

Therefore, a reinforcement procedure does have at least a small potential for teaching a respondent. However, reinforcement effects appear to be stronger in the area of eliciting adequate responses than in teaching what is expected, since the effects of reinforcement on the actual amount reported are greater than the effects on understanding the task requirements.

Two remaining points about verbal reinforcement and knowledge effects on reporting performance need to be discussed: (a) certain types of reinforcement procedures used in the laboratory experiments mentioned above may be less effective than the procedures used in the 1967 study; and (b) the effects of the verbal reinforcement procedures used in the 1967 study did not appear to be mediated by the degree of the respondent's knowledge of expectancies (awareness of respondent task requirements).

There is a lively controversy over whether or not the respondent must know what is expected of him in order to perform well. In other words, is complete awareness of what is expected a necessary, although possibly insufficient, condition for accurate reporting? Despite the finding that reinforcement appeared to increase understanding of the response requirements, the data from the 1967 study mentioned above show no correlation between reporting and awareness as measured by question 17. This is true for the reinforced group as well as for the nonreinforced group. Yet the reinforced group showed superior performance in reporting their health conditions. Cannell, Fowler, and Marquis reported a slightly positive, but not statistically significant, relationship between awareness and good reporting performance.46 Fowler reanalyzed the latter data and found that the relationship between awareness (measured by one of the two questions) and the amount of information reported was large for respondents with at least a high school education. That awareness, however, did not predict reporting behavior for respondents with less education, nor did data from the second awareness question predict behavior. Possibly the reinforcement effect, if it is cognitively mediated, is produced by letting the respondent know when his responses are adequate rather than by giving him knowledge of general task requirements. This implies that the respondent need not understand exactly what is expected of him to respond well.

Recent experiments in social psychology have tried to explore the relative effects of awareness of task requirements and reinforcement of performance. A detailed treatment of these findings is beyond the scope of this report, but some of the main issues and findings are presented at the end of this section where further experimentation is discussed. It appears that awareness can be helpful when the respondent (a) can tell when his single responses are meeting task requirements, (b) has the skill to perform what is requested, and (c) has the will (motivation) to perform.

Feedback used in the 1967 studies may have provided two types of information, and achieved the increase in reporting frequencies because of this double-barreled effect. In some of the laboratory experiments mentioned earlier, reinforcing statements contingent on correct answers consisted of "Yes," "Mm-hmm," "O.K.," or "Good." The 1967 studies used verbal reinforcements which contained more information, statements such as: "Mm-hmm, we need to know that," or "I see, that's the kind of information we need." Probably the two types of statement convey different amounts and kinds of information. The simple statements apparently convey information about response adequacy, leaving it to the respondent to infer the interviewer's goals. The longer statements possibly make it easier for the respondent to arrive at some knowledge of interviewer expectations. This hypothesis would be relatively easy to test in the laboratory or in field interviews.

These conceptions of how reinforcement teaches the respondent to understand his proper role point to the desirability of using more efficient techniques which do not rely so heavily on the respondent's ability to figure out what is expected. Further research seems to be needed in order to test whether the respondent's understanding of his role is necessary for good reporting or whether knowledge of the adequacy of his responses is, in itself, sufficient. If knowledge of expectations is found to be important, it might be effective to develop a method that informs the respondent of his expected performance quality and at the same time, through reinforcement procedures, gives him immediate information about the adequacy of his answers.

The effects of reinforcement on two kinds of respondent knowledge and the role of knowledge in producing good survey data have been touched upon above. Other variables can also influence data quality. In the following sections reinforcement is discussed in the context of two other variables, skill level and motivation.

Conditioning Effects of Reinforcement

Experiments cited earlier show that giving reinforcement immediately after some behavior increases the probability that such behavior will recur. This pattern is defined here as an operant conditioning effect, or more simply a conditioning effect. In the tradition of B. F. Skinner, the arrangement of reinforcement contingencies to produce the conditioning effect minimizes consideration of the intervening variables that might help explain the obtained results. Thus, the following paragraphs omit speculation about how interviewer reinforcement may affect cognitions and motivations to achieve a performance effect. They relate to such problems as identifying the correct behavior to reinforce and the circumstances under which a conditioning procedure seems appropriate.

When to use a conditioning procedure. The following discussion is based on the premise that conditioning effects of reinforcement should be used (a) when respondents do not have a clear understanding of how to perform effectively and cannot be given this understanding by mere explanation or appeals for good performance, (b) when the respondent does not possess the ability or skill for good reporting regardless of whether or not he understands what is wanted, or (c) when the respondent is performing in a manner which is less than optimal (for example, he wants to do something besides answer questions accurately and completely). The implication is that if the respondent understands what he is supposed to do, has the ability to do it, knows when he is doing it, and wants to do it, there is no reason to introduce a conditioning procedure into the interview. However, all of these conditions are seldom met in any personal interview situation.
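The premise above amounts to a simple decision rule: a conditioning procedure is unnecessary only when all four respondent conditions hold at once. A minimal sketch of that rule follows; the class and field names are hypothetical labels for the four conditions named in the text, and treating each condition as a yes/no value is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class RespondentState:
    """Hypothetical yes/no summary of the four conditions named in the text."""
    understands_task: bool      # knows what he is supposed to do
    has_ability: bool           # has the ability or skill to do it
    knows_when_adequate: bool   # knows when he is doing it
    wants_to_do_well: bool      # wants to do it


def conditioning_indicated(state: RespondentState) -> bool:
    """Return True unless every one of the four conditions is met."""
    return not (
        state.understands_task
        and state.has_ability
        and state.knows_when_adequate
        and state.wants_to_do_well
    )


ideal = RespondentState(True, True, True, True)
unsure_of_task = RespondentState(False, True, True, True)

print(conditioning_indicated(ideal))           # False
print(conditioning_indicated(unsure_of_task))  # True
```

Because a single unmet condition is enough to return True, the rule reflects the closing remark that all of the conditions are seldom met in a personal interview.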

The general conditioning pattern suggested for personal interviews is the scheduling of reinforcing statements only after adequate answers. In this way the frequency of adequate answers can be expected to increase. For example, an adequate answer appears fairly easy to elicit with increasing frequency when the following instruction is given: "When I ask an open-ended question, please tell me all you can think of, using specific examples." This is accomplished by an interviewer using a reinforcing statement every time the respondent gives an acceptable idea or example in response to an open-ended question. The arrangements to produce an increase in valid answers, especially to forced-choice questions, are not that easily conceptualized. Methods that change the reinforcement contingency rules to reward responses that come closer and closer to approximating some goal have been shown to be effective and, theoretically, could be used in the personal interview setting.

Another use of conditioning procedures is to maintain an acceptable level of respondent performance. Once the respondent has been taught appropriate response behavior, it is desirable for him to continue to answer acceptably. According to the operant conditioning literature, a previously reinforced response (especially if it has been reinforced every time it has occurred) will be given less and less frequently if it is no longer reinforced. Thus, another function of a verbal reinforcement procedure is to prevent extinction of appropriate behavior once it has been established. A schedule of reinforcement that uses positive statements after at least some of the desired behavior is probably adequate to serve the response maintenance function. Procedures that omit reinforcing statements or use them in a random way probably cannot maintain any performance pattern that has been established earlier.

Problems of conditioning procedures for teaching. A conditioning effect of reinforcement involves the respondent learning how to perform or improve his performance on some task. He may or may not be aware of his resulting increase in ability. For a feedback technique to be effective, the interviewer (or study designer) must be aware of what is to be taught or improved, be able to recognize when a respondent performs the desired behavior, and be able to give appropriate verbal reinforcement immediately after the respondent behavior occurs. These are very stringent conditions, and in the usual interview situation they cannot be met. One major obstacle is the difficulty of determining immediately whether a particular respondent behavior is desirable or not. Without knowing which responses to reinforce, a proper conditioning effect cannot be obtained.

The dangers of trying to teach one concept but actually teaching something else can be illustrated by a recent series of experiments. Cannell and Marquis originally diagnosed the problem of error in reporting chronic and acute health conditions as an underreporting or "failure to report" problem. Subsequently, in their 1967 study,50 they reinforced every instance where the respondent reported a symptom or a chronic or acute health condition, and obtained a 25-percent increase in the number of symptoms and conditions reported. A subsequent look at unpublished data about chronic condition reporting from other sources suggested that overreporting might be more of a problem than underreporting for this kind of health information. Therefore, the conditioning procedure used may have decreased the accuracy of the data in that study because it led to increased overreporting errors. Consequently, Marquis, Cannell, and Laurent53 carried out a second study using independent records from physicians as indicators of the presence or absence of chronic conditions for particular respondents. Respondents received approximately the same kind of reinforcement as in the earlier study. The results mentioned earlier in connection with the discussion of table 42 did not show the expected biased conditioning effect. If the conditioning effect were the only effect operating, it would be expected that overreporting (reporting of nonexistent chronic conditions) would increase. The data indicated just the opposite. The short-question reinforcement interview had its major effect in reducing overall reporting error by reducing the amount of overreporting. Exactly why this happened is uncertain. Possibly no conditioning effect occurred, and the reduction in error was due to increased knowledge or motivation effects. Another possibility is that a concept other than a positive or sickness-reporting response set was taught. The main point, however, is the difficulty in predicting what sort of consequences will arise when interviewers use reinforcement in a particular contingency schedule. While the above example shows that reinforcement procedures had a beneficial effect on the validity of health reporting, the effect may not have been for the reason originally hypothesized.

Substitute for conditioning procedures when the problem is one of low respondent ability. Probably one of the biggest problems in survey interviewing concerns the respondent's ability to recall factual information accurately. The most dramatic example of a memory problem is shown by Cannell and Fowler1 in the reporting of visits to a physician during the 2 weeks preceding the interview. Using a standard question designed to find out how many times the respondent had consulted a physician during this time, 21 percent of the visits known to have occurred within the last week were not reported and 38 percent of the visits known to have occurred during the second week prior to the interview were not mentioned. Inability to remember increased over 80 percent from the first to the second week preceding the interview.
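The "over 80 percent" figure is the relative increase in the nonreporting rate between the two weeks. A minimal sketch of the arithmetic, using only the two rates quoted in the text (the variable names are illustrative):

```python
# Share of known physician visits that went unreported, by week before interview.
missed_last_week = 0.21
missed_second_week = 0.38

# Relative increase in the nonreporting rate from the first week to the second.
relative_increase = (missed_second_week - missed_last_week) / missed_last_week

print(f"{relative_increase:.0%}")  # 81%
```

The absolute gap is 17 percentage points; expressed relative to the first-week rate of 21 percent it is about 81 percent.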

It has been hypothesized that a conditioning procedure might help the respondent learn to distinguish correct from incorrect memory representations of an event such as a physician visit. In theory, this kind of teaching seems possible but it has not yet been demonstrated as feasible in the interview setting.

The most likely explanation for the data was how the reinforcement variable was originally introduced to respondents. The interviewer began by asking the respondent if he had ever had any of each of 17 very common symptoms. The probability that the respondent had experienced any one of these symptoms at least once in his lifetime was very high. Therefore, a "Yes" answer was most likely a true answer and was reinforced, and a "No" answer was most likely an underreport and was not reinforced. Thus, the concept initially taught was probably, "Give valid answers and don't underreport." This concept may have been maintained by the reinforcement schedule used in the remainder of the interview and was possibly still in effect for questions about chronic conditions.

A nonreinforcement interviewing approach might be considered when ability to remember correctly presents problems for survey data accuracy. The recommended approach involves:

a. Making the recall task simpler so that memory ability is less important for validity; and

b. Using repeated trials so that an initially faulty recall decision may be corrected.

Evidence for the possible efficacy of this task simplification and repeated trials approach is presented in another section of this report, "Memory and Information Retrieval in the Interview." It should be remembered, however, that if response error is thought to be due to variables other than memory failure, e.g., misunderstanding or fear of embarrassment, a reinforcement procedure might be more appropriate.

Motivational Effects of Reinforcement

The use of feedback statements, regardless of the rate or the contingency rule for their use, may have motivational consequences. The motivational effects are probably not reflected directly in the awareness or consciousness of the respondent, and hence may be difficult to detect by questioning.

One motivational consequence has been mentioned previously. In standard interviews it was observed that feedback statements were used often in situations where there may have been tension or negative feelings. Implicit here is the hypothesis that positive statements by the interviewer can reduce the respondent's tension or hostile feelings about the interview. This hypothesis should be tested.

Psychological theory suggests at least three possible motivational effects of positive feedback statements. These statements might:

a. Affect the general level of motivation drive;b. Strengthen or weaken levels of specific

motives which facilitate or inhibit reportingperformance; or

c. Affect the degree of approval the respond-ent has for the interviewer. The effect ofapproval on performance is ambiguous andis discussed briefly below.

General motivation. Some theories of motivation include the idea of "general drive" or general arousal. This refers to a hypothetical concept about a nonspecific or general level of motivation which multiplies the strength of any behavior tendency in a person. For example, if a person is running a 100-yard dash, his speed will be faster if his general drive level is high and slower if his general drive level is low. His speed is affected by other things, of course, such as his ability as a runner and his specific desires to win. It is hypothesized that reinforcing statements in the interview increase the respondent's general motivation and thereby accentuate whatever response tendencies exist, regardless of whether the response tendencies are the right ones or not.

A complete presentation of drive theory and supporting empirical evidence is beyond the scope of this section, but one fairly complicated aspect of the empirical evidence is important enough to mention. (For further discussion, the reader should consult Zajonc.57) There is a complex relationship between drive and behavior. If a task is easy, high levels of drive will result in good performance. If a task is difficult, high levels of drive interfere with efficient performance while low levels of drive are accompanied by good performance. For tasks of intermediate difficulty, the level of drive shows a curvilinear relationship to good performance, with moderate levels of drive producing good results and high and low levels accompanied by poor performance.

It may be that high levels of respondent drive reduce the underreporting problem in survey interviews and that this has been one effect of reinforcement procedures in the studies already discussed. There is some evidence that leads one to suspect, however, that respondent drive levels may have been too high in some of the experimental interviews. A schematic representation of data from Marquis, Cannell, and Laurent is given below. Possibly, the addition of several experimental techniques aimed at producing good performance on the moderately difficult recall task (remembering one's chronic conditions accurately) caused performance to suffer in the way the Yerkes-Dodson law predicts.
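The three verbal statements about drive and task difficulty can be illustrated with a toy model. This is only a sketch under stated assumptions: the function name `expected_performance`, the 0-to-1 scale for drive, and the linear and quadratic curve shapes are hypothetical choices made to reproduce the qualitative pattern described in the text. They are not taken from the studies cited.

```python
def expected_performance(drive: float, difficulty: str) -> float:
    """Toy performance score in [0, 1] for a drive level in [0, 1].

    Hypothetical shapes, chosen only to match the verbal description:
      easy         -> performance rises with drive
      difficult    -> performance falls with drive
      intermediate -> inverted U, best at moderate drive
    """
    if not 0.0 <= drive <= 1.0:
        raise ValueError("drive must be between 0 and 1")
    if difficulty == "easy":
        return drive
    if difficulty == "difficult":
        return 1.0 - drive
    if difficulty == "intermediate":
        # Parabola peaking at drive = 0.5 (an assumed midpoint).
        return 1.0 - 4.0 * (drive - 0.5) ** 2
    raise ValueError(f"unknown difficulty: {difficulty!r}")


if __name__ == "__main__":
    for task in ("easy", "intermediate", "difficult"):
        scores = [round(expected_performance(d, task), 2) for d in (0.1, 0.5, 0.9)]
        print(task, scores)
```

Under these assumed curves, adding drive to a respondent who is already highly motivated helps only on an easy task. On a moderately difficult recall task it moves the respondent past the peak, which is the possibility raised in the paragraph above.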

If an extensive interview, such as the one described by Laurent, had been used in combination with reinforcement instead of long questions, the respondent task would have been easier and accuracy would have been increased.

Effects on specific motives. Specific motives, such as achievement or social approval, may be affected by reinforcement. The tendency to achieve or seek approval, in addition to being influenced by general drive, can also be heightened or dampened by the amount of social approval given at any particular time. This might be made clearer by the following analogy: If a person is hungry and eats a reasonable quantity of food, he soon behaves as if he is no longer hungry. It is said that the food consumption reduced the desire for additional food. If, on the other hand, a person is not particularly hungry but is allowed to eat a small quantity of food (e.g., one potato chip), he often behaves as if he has become hungrier. Similarly, social reinforcement may increase the tendency to seek social approval (or avoid social disapproval) and task-oriented reinforcement may increase or decrease the desire to do well.

In the Marquis and Cannell study,50 some evidence was obtained suggesting that the respondent's tendency to seek social approval (or avoid social disapproval) was reduced as a result of receiving reinforcement. These data are far from conclusive, but if the social approval tendency of respondents can be reduced, they have important implications for response accuracy. According to Edwards, people tend to err by responding in a socially desirable direction. If there is some way to make people less concerned about social approval or disapproval in the interview, interview data would presumably be more valid since respondents would be less reluctant to report socially disapproved information about themselves.

It may be that positive reinforcement, which represents social approval given by an interviewer, actually reduces the respondent's tendencies to fear social disapproval and thereby reduces his reluctance to report socially disapproved information. Marquis and Cannell also found that reinforced respondents report about one and one half times as many highly embarrassing symptoms for themselves as do nonreinforced respondents. This finding is consistent with the specific motive reduction hypothesis discussed above.

On the other hand, it may be that an interview using only a few reinforcing statements may fall victim to the "one-potato-chip effect." It may be that fear of interviewer disapproval could be accentuated by having only a small number of positive feedback statements programed into the conversation.

Feedback and establishing a relationship. It is often stated that the success of an interview depends on the degree to which the interviewer can establish rapport with the respondent. Operational definitions of rapport differ and are usually unstated. The concept often seems to refer to a relationship of personal understanding and approval between the interviewer and respondent which is thought to facilitate response accuracy.

The evidence is reasonably clear that feedback or other positive statements which do not necessarily follow any particular contingency schedule result in the respondent approving of the interviewer. What is not clear, however, is whether the resulting positive feeling has anything to do with obtaining valid data.

The relationship between approval (produced by several types of feedback) and performance is found to be nonexistent. Marshall, Marquis, and Oskamp49 showed that respondents tended to like interviewers who made positive comments about their performance more than they liked interviewers who made only negative comments. However, interviewer comment style was unrelated to accuracy of recall even in these extreme conditions.

Bales' data59 suggest that, for long-term interactions, feedback statements indicating solidarity, agreement, acceptance, attention, and satisfaction, or promoting the release of tension serve a "maintenance" function. That is, they serve to keep a team or group together and working on a task. Presumably, without this kind of interaction, task-oriented groups would break up without finishing the job at hand. It is not clear, however, that a short-term interview interaction requires this socioemotional kind of interaction to be successful.

Finally, Hyman et al. have questioned the assumption that rapport is desirable in the interview. They point to the possibility that the social relationship implied in a high-rapport interview may prevent the full disclosure of socially unacceptable information. Cannell has proposed an experimental test of this hypothesis.

Thus, while positive feedback in the interview seems to create a good relationship, this relationship per se may have either no effect or negative effects on reporting accuracy. On the other hand, it may be that the relationship somehow interacts with other variables (such as a conditioning procedure) to influence response accuracy. Further research is needed in this area.

EXPERIMENTAL STUDY°

The social reinforcement effect for adult humans is fairly well established. In a review of the literature, Krasner pointed out that reinforcement effects have been demonstrated under numerous settings with a variety of reinforcing statements, with many kinds of responses, and with different types of people. Currently, attention has turned to a consideration of other variables which may be important in producing reinforcement effects on performance.

One of the major concerns in the preceding discussion was the separation of the instructional or cognitive effects of reinforcement from the more automatic conditioning effects. This issue is very important to survey interview planners for several reasons and has some major implications for theories of human behavior. For example, if the reinforcement techniques mentioned above achieve their effects merely by informing the respondent about what the interviewer wants him to do, it would seem that there are better ways of passing this information along to the respondent. Using only reinforcing statements to convey information about what the interviewer wants has the distinct disadvantage that the respondent can misinterpret the message or possibly never get it at all. On the other hand, the reinforcement procedures might be producing better reporting which could not be produced by other means. If this is true, reinforcement procedures should be considered seriously for all personal interviews, as a technique for improving reporting.

°This section was written in collaboration with Ms. Linda Wood, who has undertaken the major responsibilities of design and execution of the experiment described here.

This issue of whether the reinforcement effect is purely cognitive or more than that is a major concern of experimental psychologists. These researchers have given a great deal of attention to a process they call "awareness," the respondent's knowledge that reinforcement is being used and that it is used only after he gives certain kinds of answers. A number of researchers63-65 maintain that the reinforcement effect can be obtained for a human subject only when the subject is "aware" of the response-reinforcement contingency or, in terms used here, is aware of what the interviewer expects him to do. The mere fact that a reinforcing statement follows a particular response is not in itself sufficient to increase the probability of occurrence of that response. An increase can be obtained only when the respondent understands the relationship between his answers and the reinforcing statements. The implication of this position is that one may obtain high levels of respondent performance right at the start of an interview merely by giving clear instructions to the respondent rather than by using reinforcement. Furthermore, since reinforcement takes a longer time to achieve its effects, some respondents may never become aware of what is wanted.

Another group of researchers33,37,66 maintain that reinforcement can have a direct conditioning effect. They do not deny that awareness or cognitive effects of reinforcement may contribute to the reinforcement effect, but say that these are not necessary in order for the reinforcement effect to occur. The major thrust of the research of the latter group is that reinforcement effects can be produced without the respondent becoming aware of the expectations of the experimenter or interviewer. These studies do not show that reinforcement procedures are more efficient than direct instruction procedures. They do indicate, however, that human behavior can be changed without the necessity of the respondent having to think about the change.

Earlier it was shown that innovative instruction procedures such as brochures, calendars, and informing the respondent about key questions prior to interviewing were not effective in producing better data. Thus, establishing awareness through these initial procedures was not sufficient to produce desired performance, as Spielberger and others might imply.

However, in view of the large number of studies supporting each point of view, it may be that both interpretations are correct but that each is true under different circumstances. It would seem that there is some other variable that helps to determine whether simple knowledge of what the interviewer expects is sufficient for good responding or whether more complicated conditioning procedures are necessary. A comparison of the two sets of studies suggests that this variable is "task difficulty." Those studies that suggest that awareness is sufficient in itself usually involve a simple task, for example, the selection of a first-person pronoun.27 Those that suggest that awareness may be unnecessary involve a somewhat more difficult task, such as giving "self-acceptance" responses.34 The difference in difficulty between these two types of tasks has been obscured by the fact that as long as the interviewer's expectations are unstated, both types of task seem difficult.

It must be hypothesized, therefore, that the conditioning effect of reinforcement brings about changes in respondent ability level or skill in responding adequately (see "Dependent Variables," under "Design of an Experimental Interviewing Approach," this report) and that, to the extent that a respondent's skills are low, the conditioning effects of reinforcement will be greater. That is, awareness, however obtained, is relevant to knowledge of expectations, but only reinforcement can change skill. When skill needs to be improved, a conditioning procedure is necessary. On the other hand, when the task does not require a high level of skill, direct instructions should produce maximum performance and a reinforcement procedure will not produce any further improvement.

Some of the basic questions that arise in the consideration of reinforcement effects should be clarified in order to improve the quality of reporting. For example,

a. How important is respondent knowledge of the interviewer's expectations?

b. If this knowledge is essential, is it conveyed better by interviewer instruction, by reinforcement, or by some combination of the two?

c. Is the recall skill of the respondent improved by reinforcement and does the amount of this improvement depend on the difficulty of the task?

d. Does reinforcement provide information to the respondent about the adequacy of his responses and does the importance of this information vary with the respondent's skill?

MEMORY AND INFORMATION RETRIEVAL IN THE INTERVIEW

The purpose of this section is to present some hypotheses about underreporting and to describe an experiment designed to reduce underreporting in a field survey. One of the principal causes of underreporting is the failure of recall; information is not reported because it is not retrieved from memory. Work in interviewing methodology tends to support the assumptions of McGeoch25 and modern interference theory24 that information does not disappear from memory but may become difficult to recall because of interfering associations. Only the accessibility of information declines, resulting in a "lessening probability of retrieval from the storehouse."25 Thus, underreporting is essentially a problem of retrieval, and reporting may be improved by manipulating conditions under which retrieval occurs.

THE INADEQUATE SEARCH HYPOTHESIS

Interviewing methodology indicates that the conditions of recall have a crucial effect upon the outcome of an interview. Some information may be unreported simply because the questions do not convey to the respondent an accurate notion of what information to search for. For example, 19 percent of a sample of respondents interviewed in the Health Interview Survey declared in a later interview that they thought the interviewer was interested only in "fairly important" things.55 By itself this observation provides an alternative hypothesis to interpret the underreporting of low-impact items. They may be underreported only because the respondent does not consider them to be relevant and therefore does not even search for them.

Several early experimental studies had suggested that the nonreported material had not been repressed or deeply suppressed, nor had it vanished from memory, but often was simply not elicited by the usual questioning procedures. It seemed likely that the use of different sets of questions and techniques by interviewers could significantly decrease underreporting. For instance, over half the hospitalizations not reported in a first interview were reported in a second interview. An experimental procedure which included a few extra questions, more explanation of purpose to respondents, and a mail followup also resulted in a significant increase in the reporting of hospitalizations. Adding probes to major questions regarding visits to physicians reduced the underreporting by 7 percentage points, from 30 to 23 percent. Another study by Balamuth, et al.4 indicated that checklists also seemed to reduce the underreporting of chronic conditions. A laboratory study showed that the form of the questions had a great effect on the accuracy of reporting items from a movie.67 Finally, an experimental study by Marquis, Cannell, and Laurent65 demonstrated a substantial effect of mere question length upon the validity of the reporting of chronic conditions.
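The physician-visit figures above are stated as a drop in percentage points. As a small worked example, the same change can also be expressed relative to the original underreporting rate. The helper name `reduction` is hypothetical; only the 30 and 23 percent figures come from the text.

```python
def reduction(before_pct: float, after_pct: float) -> tuple[float, float]:
    """Return (absolute drop in percentage points, relative drop as a fraction)."""
    absolute = before_pct - after_pct
    relative = absolute / before_pct
    return absolute, relative


# Underreporting of physician visits without and with added probes.
absolute, relative = reduction(30.0, 23.0)
print(f"absolute: {absolute:.0f} points, relative: {relative:.1%}")
```

On these figures the added probes removed 7 of the 30 percentage points, a little under a quarter of the underreporting observed with the major questions alone.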

A major contribution of these studies is the demonstration that experimental questioning strategies can improve reporting by changing the conditions under which the respondent is invited to search for past events. While these studies suggest promising avenues for methodological progress, they are not concerned directly with the cognitive processes involved in recall.

AN INTEGRATIVE HYPOTHESIS: COGNITIVE INADEQUACY OF STIMULI QUESTIONS

Although one can hypothesize that failure to recall information is an important reason for underreporting and also demonstrate that recall can be improved, it is not immediately apparent what steps one should take to improve reporting in the survey interview. Why do customary questioning procedures fail to obtain adequate reporting? Why, for example, if a respondent is asked to report his dental visits of the past 6 months, is he likely not to report them all? A tautologous answer is that for some reason the question was not adequate to stimulate retrieval of information located in memory. The inadequacy of a single question to elicit information suggests a consideration of how information is cognitively organized in memory.

When a person experiences an event, the event is not merely recorded in its original form in the manner of a computer tape, but rather it becomes organized in a perceptual field. As in the old illustration of a blind man describing an elephant, the meaning of an event depends upon how it is perceived and with what other events it becomes associated in memory. Thus, what is a simple, single-dimension variable for the researcher may not be a simple item for the respondent. The dental visit the researcher sees as a simple item of information may be organized in one of several frames of reference by the respondent. The respondent may think of the visit in terms of cost, pain, or loss of time from work. From this point of view the single question about the occurrence of the dental visit may be ineffective; it may fail as a stimulus to recall the event.

As shown in figure 1, the respondent's cognitive organization and the researcher's design of a questionnaire can be viewed as two diverging paths by which information is processed independently and quite differently before the interview. These two paths lead to two independent informational states, memory trace and stimulus question, whose interaction in the interview is expected to produce the retrieval of the initial information. This model indicates that the probability of proper recall is a function of the ability of the stimuli questions to interact adequately with the respondent's cognitive organization. The appropriateness of the stimuli questions is a function of the researcher's ability to comprehend the nature of the respondent's cognitive path and to utilize this knowledge in the framing of the questions.

[Figure 1. Information processing in the interview. Panels: "Respondent's information processing" and "Researcher's design of the questionnaire," converging on the interaction between stimulus questions and the respondent's cognitive organization (information retrieval).]

The methodological objective becomes one of redesigning questionnaires to facilitate recall. One way of doing this is to incorporate into the design of questionnaires some of the things already known about how people learn, store, and retrieve information. The experimentᵖ described below is an exploratory attempt to ascertain the validity as well as the feasibility of this cognitive approach for improving reporting in the household interview.

DESIGN OF AN EXPERIMENTAL INTERVIEWING APPROACH

The study was designed to increase the reporting rate of acute and chronic illnesses, usually underreported in household interviews. The strategy consisted of an experimental questioning procedure which would provide stimuli relevant to the respondent's cognitive processing of health information. This procedure was expected to improve retrieval from memory and to increase the probability of reporting.

To find out about the sicknesses of a person during the month preceding an interview, one may ask a standard question such as, "Were you sick at any time last month?" If the person had had influenza and had stored this experience in memory as "being sick," there is some chance that it would be reported in response to this question.

ᵖA detailed presentation and discussion of this experiment appears in a study by Laurent, Cannell, and Marquis.

If, among other illnesses, the researcher is particularly interested in collecting data about influenza, he may also want to use a simple question such as: "Did you have the flu last month?" This question will be a powerful stimulus only if the sickness has been experienced as the flu or influenza and was conceptualized as such. There are clear limitations to a straight application of this recognition technique in the interview. Aside from the old issue of suggestibility, an exhaustive list of all potentially relevant items of information would not usually be feasible. Furthermore, the recognition technique relies much more than other retrieval techniques upon the assumption that researcher and respondent share the same concepts. This is clearly shown in a medical interview. A physician contends that his patients usually report only major surgery when asked about previous operations, whereas they tend to report both major and minor surgery in answer to a question about stitches. The recognition principle is basic to the recall process,68 but its application to interviewing procedures is valuable only if appropriate stimuli of recognition can be designed.

All of this experience indicates that an event may be stored in memory under various informational states so distant from the initial informational state that a stimulus merely traced from the original event or from its straight conceptualization might not elicit the stored information. Last month's influenza may have been stored in memory in many different ways and not necessarily as being sick or having the flu. There is enough evidence from experiments, as well as from everyday experience, to show that memory is not a simple recording device but rather a complex process in which information is transformed and organized. Thus, influenza may no longer exist in memory as a sickness, but may be organized around a number of other possible traces. For example, the interference of serious chronic illnesses may cause the respondent to fail to consider the episode of influenza as a sickness and no longer to store it as such. On the other hand, this minor flu may have prevented the respondent from going to work on a cold day, thus reducing his income in a manner significant enough to have impact on his budget. It becomes clear that this respondent may be less likely to report the flu in answer to a single question about sickness than to a question such as: "Have you lost any income because of any sickness during the past month?" If the flu has caused this person to be feverish, a symptom that he very seldom experiences, another relevant stimulus might be: "Have you had a fever during the past month?"

Memory can process an illness in such a way that it gets transformed in storage and becomes organized around concepts such as pain, incapacity, cost, visits to doctors, hospitalizations, medication or treatment, symptoms, or more generally around other causal, circumstantial, or consequential events. The initial perception may be distorted to an association with another perception in order to fit into some structure. If one asks a broad question such as, "Tell me about your illnesses," the respondent has to make the effort to review a multitude of cognitive structures in order to recall properly. He has to invent appropriate frames of reference to guide his search; he has to create his own cues to reactivate traces of possibly weak salience. Altogether, the task is enormous and complex and the motivation to invest substantial effort in it cannot be expected to be spontaneously high, especially in the context of an information-getting household interview that has no immediate benefit for the respondent. Within this framework, underreporting is predictable; the broad question is not an adequate stimulus to the relevant frame of reference and the respondent is not going to work hard to recall information.

This framework, however, provides prospects for an experimental methodology. Instead of asking one standard question essentially traced from a simple conceptualization of the event to be recalled, several questions may be asked that are traced from hypothesized states of information after memory processing. In other words, instead of requesting from the respondent the difficult task of building up his own cues and frames of reference, the researcher should attempt to create these recall aids and to build them into the questionnaire. If the researcher can be successful in predicting and designing the relevant cues and frames of reference, then the respondent's recall process should be significantly facilitated and the availability of information correspondingly increased.

The Extensive Questionnaire

The strategy of an extensive health interview consisted of designing a questionnaire containing a large number of questions that would provide the respondent with multiple and overlapping frames of reference and cues (see Laurent, Cannell, and Marquis69). Medical information was asked within classical conceptual frameworks as well as in the language of the layman, and through standard questioning as well as through multiple behavioral cues or direct recognition of items. Transitions between sections were also used to bring some relief in the questioning style and to instill a deliberately relaxed pace in the interviewing.

The questionnaire started with questions about symptoms, such as, "Do you have pains in the abdomen?" "Have you had any pain or soreness in your joints?" "Have you had trouble breathing?" Every time the respondent gave a "Yes" answer the interviewer used the probe, "Do you have any idea what causes it?" in an attempt to obtain the report of an underlying health condition. Other frames of reference and cues were used, such as asking for a medical history by means of queries related to childhood, adulthood, 6 or 12 months previous, last week, or the week before last. Then specific behavioral implications of illnesses such as diet, food sensitivity, restrictions of activity, medications taken, and visits to physicians were all used to provide assistance in the retrieval process. Finally, a checklist of chronic conditions implemented a direct items-recognition approach.

Control Questionnaire

A control questionnaire was used in the collection of information on the same major items of health information as the extensive form, but it consisted of single direct questions for each variable. This procedure contained standard questions comparable to those used in the Health Interview Survey and the same chronic conditions checklist that was used in the extensive form.


Field Experiment

The study was designed to compare the effectiveness of the two questionnaires on two main dependent variables: (a) the number of health conditions reported, and (b) the impact of these conditions on the respondent.

To decrease the variance from factors other than those purposely introduced by the experimental design, the sample population was homogeneous: a restricted segment of English-speaking, native-born, white females between 18 and 65 years of age, all of whom were residents of the city of Detroit and were of low-middle and middle socioeconomic status. Two clusters of three dwelling units were chosen at random from each of 110 blocks, selected with probability proportionate to size in 16 census tracts. Only one person in each dwelling was interviewed; the wife in the household was the first choice. Six female interviewers were employed in the study and were assigned to geographically convenient sections of the city. The assignment of experimental questionnaires to households was random within each sample block.
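The selection and assignment steps described above can be sketched in code. This is a minimal illustration, not the study's procedure: the function name `assign_forms`, the identifier format, and the balanced shuffle-and-alternate rule are all assumptions, since the report says only that assignment was random within each block. The design as described implies 110 x 2 x 3 = 660 selected dwelling units; the report does not say how many of these contained an eligible respondent.

```python
import random

FORMS = ("extensive", "control")


def assign_forms(blocks: dict[str, list[str]], seed: int = 0) -> dict[str, str]:
    """Assign a questionnaire form to every dwelling, randomizing within each block.

    Assumption: forms are balanced within a block by shuffling its dwellings
    and alternating forms. The report only states that assignment was random
    within each sample block.
    """
    rng = random.Random(seed)
    assignment = {}
    for block_id in sorted(blocks):
        dwellings = list(blocks[block_id])
        rng.shuffle(dwellings)
        for i, dwelling in enumerate(dwellings):
            assignment[dwelling] = FORMS[i % 2]
    return assignment


# Hypothetical frame: 110 blocks, each with 2 clusters of 3 dwelling units.
blocks = {
    f"block{b:03d}": [f"block{b:03d}-c{c}-d{d}" for c in range(2) for d in range(3)]
    for b in range(110)
}
assignment = assign_forms(blocks)
print(len(assignment), "dwellings assigned")
```

Randomizing within blocks, rather than across the whole sample, keeps neighborhood differences from being confounded with the questionnaire form.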

A total of 204 respondents were interviewed: 105 with the extensive questionnaire and 99 with the control questionnaire. Aside from the variations in questions, the interviewing techniques were kept constant for all subjects (e.g., introduction of the survey, getting demographic information, probing procedures, interviewing style). All respondent demographic characteristics were similar, as were the response rates (87 percent in both interviews). On the average, the duration of the extensive interview was 74 minutes and of the control interview, 40 minutes.

Dependent Variables

Eligible health condition. To ensure the comparability of the data collected through the two interviewing procedures, precise criteria were established to determine the eligibility of any reported health problem as a condition. Every time any health problem was mentioned in either of the two procedures, standard structured probing was used to ascertain its eligibility as a condition. In order to be an eligible condition for data analysis, an illness had to present either acute characteristics (that is, to have started within the 14 days preceding the interview) or chronic characteristics (to have chronic implications for the person's health). All conditions were screened by the interviewer and later by the coder to include only those meeting the criteria of the Health Interview Survey. The number of eligible conditions reported was the major dependent variable analyzed.

Condition impact. As suggested earlier, health events get transformed and organized in memory within various clusters or frames of reference. It is assumed that the number of such categories under which a health condition is organized is a mark of its salience or impact and a predictor of its accessibility for retrieval. A condition of low impact that is organized under only a few categories is likely to be missed by a single question. Therefore, the hypothesis in this study was that the extensive interview would pick up more conditions of low impact.

To create a measure of impact in both techniques, every eligible condition reported was probed according to a standard procedure. Additional information was asked about the existence of health behaviors associated with each condition occurring during the past 2 weeks (visits to a physician, medicine, treatment, special diet, days in bed, days of restricted activity, amount of pain or discomfort). An index of impact was created for each reported condition on the basis of this additional information.

The two dependent variables were selected upon the empirical evidence that underreporting represents a major problem in the health interview. As medical records were not available for this research, a working assumption built into the design of the study was that the more information reported, the better the overall validity.

Hypothesis

The major hypothesis that the study attempted to test was the following: By providing a broad aid to the respondent's recall process through the use of multiple frames of reference and cues, the extensive procedure is expected to increase the overall number of reported eligible conditions as compared with the number obtained in the control procedure. Since events of low impact are more likely to be underreported than those of high impact, a significant increase was particularly expected in the reporting of chronic conditions of low impact within the extensive interview. As a consequence, the overall impact level of the reported conditions was expected to be lower in the extensive than in the control procedure.

Results and Discussion

As shown in table 38, the data clearly supported the major hypothesis. A significantly larger number of health conditions was elicited by the extensive interview than by the control interview. Respondents reported an average of 7.88 eligible conditions in the former and 4.42 in the latter. The increase in reporting was significant for both chronic and acute conditions.

As predicted, the increase in reporting was obtained by eliciting conditions of low, but not trivial, impact that were not reported in the control interview. As shown in table 39, the mean level of impact for all conditions was significantly lower in the extensive procedure (2.03) than in the control procedure (2.64), and this was especially true for chronic conditions. It was further verified that the reporting of higher impact conditions was similar for both techniques.

Several analyses have been conducted in an attempt to identify the main sources of improved reporting in the extensive questionnaire. It was found, for instance, that 61 percent of all conditions reported in the extensive interview were elicited by the initial symptoms questions. These questions dealt with the presence of specific symptoms, aches, or pains and with the health conditions underlying them. The average number of conditions reported under these initial questions in the extensive interview was larger than the total average number of conditions reported in the entire control interview. This observation indicates that a cue-giving approach using symptomatic manifestations of illnesses as a frame of reference is more productive in eliciting the report of illnesses than are the standard questions actually designed to serve this purpose.

Table 38. Mean number of health conditions reported per person and difference between means, by type of condition and questionnaire procedure

                                       Mean number of conditions reported
Reporting variable                     Extensive procedure   Control procedure   Difference       p¹
                                       (105 respondents)     (99 respondents)    between means
All eligible health conditions         7.88                  4.42                3.46             [illegible]
  Chronic conditions                   6.29                  3.99                2.30             .001
  Acute conditions (in last 14 days)   0.82                  0.33                0.49             .001

¹Significance level of difference. Values were computed on the basis of the one-tailed t-statistic.

Table 39. Number and mean impact level of health conditions reported, by type of condition and questionnaire procedure

                                       Number of health conditions   Mean impact level
Reporting variable                     Extensive     Control         Extensive     Control       Difference       p¹
                                       procedure     procedure       procedure     procedure     between means
All eligible health conditions         661           399             2.03          2.64          0.61             .001
  Chronic conditions                   588           370             1.87          2.46          0.59             .001
  Acute conditions (in last 14 days)   73            29              3.34          4.93          1.59             .025

¹Significance level of difference. Values were computed on the basis of the one-tailed t-statistic.
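The subgroup figures in table 39 can be checked against the overall means quoted in the text (2.03 and 2.64). The sketch below is only an illustration: it assumes the chronic and acute counts read as 588 and 73 for the extensive procedure and 370 and 29 for the control procedure, which is a reading of a poorly printed table, and the function name `pooled_mean` is hypothetical.

```python
def pooled_mean(groups):
    """Weighted mean of subgroup means; groups is a list of (count, mean) pairs."""
    total = sum(count for count, _ in groups)
    return sum(count * mean for count, mean in groups) / total


# (number of conditions, mean impact level) for chronic and acute conditions.
# Counts are an assumed reading of table 39, not confirmed values.
extensive = pooled_mean([(588, 1.87), (73, 3.34)])
control = pooled_mean([(370, 2.46), (29, 4.93)])

print(round(extensive, 2), round(control, 2))
```

Under these assumptions the pooled means agree with the overall impact levels reported for the two procedures, which supports that reading of the subgroup counts.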

Another observation of this kind concerns the effectiveness of recognition checklists. At the end of both interview procedures, a standard checklist of 41 chronic conditions was utilized and the relative effectiveness of this device was evaluated within each procedure. It was found that this last section was highly productive in the control questionnaire (57 percent of all conditions were first reported there) and appreciably less productive in the extensive questionnaire (where it yielded a total of 16 percent of first reports). The high figure obtained under the control procedure (57 percent) confirms to some extent the effectiveness of an item-recognition approach as a basic tool for eliciting information in a standard type of interview. The lower but still substantial figure obtained under the extensive procedure (16 percent) demonstrates the adequacy of this recognition approach for gathering information not already reported through the various cues provided during the course of an extensive interview.

A comparison of the differential effect of this recognition approach under the extensive and control interviews for the reporting of chronic conditions is shown in table 40. In the extensive interview, 67 percent of the chronic conditions in the final standard checklist were first elicited before the final checklist, as compared to only 26 percent in the control interview. In both techniques, the high-impact conditions were reported first, the low-impact ones last. This is especially typical in the control technique where the mean impact level varies from 4.36 for chronic conditions reported prior to the recognition list to 1.61 for those reported on the recognition list. An analysis of variance carried out on the impact level of all eligible conditions reported by chronological sections of the questionnaires in the two procedures led to the same observation, showing a significant lowering of impact from the earlier sections of the questionnaires to the later sections.

The same finding was replicated on a question basis in the extensive interview. The extensive interview made some use of primary or standard questions followed by additional or cue-giving questions. The average impact level of eligible conditions reported for each of these two types of questions is shown in table 41. An analysis of variance based on these data showed that the average impact score varies significantly (p < .05) according to whether the condition was reported in the primary or additional questions. Table 41 demonstrates the tendency for the average impact to be lower for conditions reported on the additional questions (2.25) than on the primary questions (3.66). This table also illustrates clearly the power of additional cue-giving questions to elicit information not reported on standard questions. In this case the additional questions are providing as many and even more eligible conditions (36) than the primary ones (32).
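The analysis of variance mentioned above can be approximated from summary statistics alone. The following is a minimal sketch, not the authors' computation: it assumes that the two values printed beside the question types in table 41 (2.65 and 1.87) are sample standard deviations, and the names `Group` and `one_way_f` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Group:
    n: int        # number of eligible conditions
    mean: float   # mean impact level
    sd: float     # sample standard deviation (assumed reading of table 41)


def one_way_f(groups):
    """One-way ANOVA F ratio computed from group sizes, means, and SDs."""
    total_n = sum(g.n for g in groups)
    grand_mean = sum(g.n * g.mean for g in groups) / total_n
    ss_between = sum(g.n * (g.mean - grand_mean) ** 2 for g in groups)
    ss_within = sum((g.n - 1) * g.sd ** 2 for g in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


primary = Group(n=32, mean=3.66, sd=2.65)
additional = Group(n=36, mean=2.25, sd=1.87)
f_ratio = one_way_f([primary, additional])
print(round(f_ratio, 2))
```

Under these assumptions the ratio comes out at roughly 6.5, close to the F value printed in the note to table 41; the small difference is what one would expect from rounding in the published means and standard deviations.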

This general tendency for high-impact conditions to be reported earlier and for low-impact conditions to be reported later (within an entire questionnaire or within given questions) confirms the idea that high-impact events are more easily recalled and reported than are low-impact events. Since they are easier to recall, they are reported first; low-impact items appear harder to obtain from the respondent and, as such, require stronger stimuli (recognition lists or cues) and more time. Thus the recent behavioral implications, or impact, of an event strongly affect the likelihood of its retrieval. This likelihood seems to increase as the amount of behavioral implications increases, and vice versa.

The inadequacy of standard interviewing to elicit reports of lower impact events appears, then, as a major cause of underreporting and incomplete data. The analysis presented above has pointed to the ability of some interviewing devices to improve the prospects of reporting for events whose behavioral implications are weak. It is interesting to note that in spite of its low-impact character, the additional information collected through the extensive technique was still significant in terms of the criteria of the Health Interview Survey, as it involved, for example, restricted activity or use of medical services.

Table 40. Percent of chronic conditions reported and mean impact level of listed chronic conditions, by whether first reported prior to or in response to recognition list, by questionnaire procedure

                           Listed chronic conditions
                           First reported prior        First reported in response   Total listed chronic
                           to recognition list         to recognition list          conditions reported
Questionnaire procedure    Percent   Mean level        Percent       Mean level     Percent       Mean level
                                     of impact                       of impact                    of impact
Extensive                  67        2.4[?]            [illegible]   [illegible]    [illegible]   2.02
Control                    26        4.36              74            1.61           [illegible]   2.32

Table 41. Number and impact level of eligible conditions, by whether first reported in primary or additional questions of the extensive questionnaire

Type of question           Number of conditions   Mean impact level   Standard deviation
Both types of questions    [illegible]            [illegible]         [illegible]
  Primary questions        32                     3.66                2.65
  Additional questions     36                     2.25                1.87

NOTE: F = 6.50 (p < .05).

Conclusions

The results of this study emphasize the important effect of question content and question strategy on survey data and suggest several methods of reducing underreporting bias. The methodological objective was to demonstrate that the completeness of the reported information can be changed significantly by programing the respondent's recall task more efficiently. Standard questions may not represent the most adequate stimuli to activate respondent recall because they may ignore the way in which information is organized in memory. Thus an attempt was made to pattern an experimental questioning procedure after the processes that the respondent was expected to use in acquiring, storing, and retrieving information. This was accomplished by the use of multiple frames of reference and multiple cues integrated into a questionnaire. A substantial increase in information was obtained through these procedures in the areas of information where underreporting is traditionally observed. This improvement is interpreted as the result of a greater correspondence between the questioning procedures and the manner in which respondents organize health information in memory.

Several uncertainties remain, however. While there is evidence from the reported data that the additional information elicited in the extensive procedure is not trivial, the validity of this information needs to be ascertained by a study which would check the respondent's reporting against valid records. The amount of overreporting obtained with the experimental technique should be evaluated and compared with the amount obtained using a control technique.

On a more theoretical level, although the results of this study supported the hypotheses, it is not possible to infer from these data any satisfactory statement of causality. Indeed, the interactions between cognitive and motivational factors involved in the interview situation were not controlled in this study. Even though the approach was cognitive, it seems that motivational changes also occurred that may have been instrumental in determining the outcome of the experimental treatment. For instance, as the extensive procedure made the recall task easier, it also reduced the amount of effort required to perform it and thus reduced the motivational requirements of the task as well. Also, since the interviewer had to devote more time and display more behavior in administering the extensive questionnaire than in administering the control procedure, this increased activity may have conveyed the idea that the task was important, and may have thus heightened the respondent's motivation to perform. Too, the respondent may have modeled his behavior after that of the interviewer (the interviewer did more talking in the extensive interview and the respondent may have followed his lead by being more active and reporting more information). It is unclear whether increases in reporting have been obtained through direct cognitive facilitation, reduction of motivational requirements, indirect motivational stimulation, or a combination of these factors. The major outcome was a pragmatic one: techniques designed in a cognitive framework to facilitate recall have proved effective in increasing reported information.

Finally, two main implications deserve particular attention. First, it seems that the complexity involved in retrieving adequate information in an interview tends to be underestimated. Even data that appear relatively simple to obtain may require the design of more elaborate interviewing techniques. Efforts have been made in past research to overcome various biases in interview responses. However, in most cases they have been avoidance strategies rather than constructive approaches. Evidence from data such as those presented above suggests that probably very little is known about the asking of appropriate questions, so that reporting errors may often be the result of questioning errors. In view of the wide utilization of survey methods to create new knowledge and to guide policies, there is great need for more basic research on the interviewing tool itself. A major objective for these research efforts is simply improvement in the design of questions. The present research has attempted to demonstrate that progress can be made in this direction by framing questions in accordance with the respondent's cognitive processing of the initial information.

A second implication of this research is that the experimental survey research interview could provide a new approach for investigations in the field of cognitive psychology. If hypotheses about the cognitive process can be introduced into the design of interviewing experiments, then the interview setting can serve as a laboratory for the study of human memory and recall. It is interesting to note that most of the knowledge related to the memory process has been developed in classical experiments by manipulating the input or learning conditions and evaluating the resulting output or recall. From this experimental design, inferences are made about cognitive processing and memory. Thus, in the laboratory the focus is most often on input or learning conditions. Little attention is given to recall, which is usually considered as an end result variable or as a test of learning and retention after memory processing. Textbooks are more likely to discuss the psychology of learning than the psychology of recall. A reversed strategy was attempted in the present research; the learning conditions or input were kept constant or controlled by experimental survey design and conditions of recall were manipulated. Recall was no longer considered only a test of learning but was viewed as a powerful intervening process itself, one which mediates the effects of learning and memory processing on survey interview reporting.

"The more questions one asks about a topic, the more information one obtains," is a statement made frequently by survey researchers. "Recognition-list questions will obtain more information than those requiring free recall," is commonly stated by survey practitioners. The present study helps to provide some understanding of the phenomena underlying these statements. It is not a simple matter of asking more questions; nor are recognition-list questions necessarily better than other kinds of questions. Some basic principles of memory and retrieval can be used to improve reporting. This study suggests that further research will yield a better understanding of the way in which information is stored and will invent more effective methods of retrieving that information.

QUESTION LENGTH AND REPORTING BEHAVIOR IN THE INTERVIEW: PRELIMINARY INVESTIGATIONS

This section reports on two exploratory studies of the effects of question length on the amount and validity of information reported in household interviews. Both of these studies are described in more detail in Vital and Health Statistics, Series 2-Number 45,53 and the findings of the second one have been discussed previously in this report in relation to the effects of reinforcement on household interviews in the section, "The Use of Verbal Reinforcement in Interviews and Its Data Accuracy."

Statements such as the following can be found in questionnaire methodology manuals: "Make the questions as concise as possible. The length alone makes it practically impossible to carry the question as a whole in mind."79 "It is generally best to keep questions short, preferably not more than 20 words."71 On the basis of these statements it was assumed that lengthy questions were an obstacle to clear communication. However, some empirical findings showing the presence of a verbal behavioral balance in the household interview65 along with other findings repeatedly demonstrated a matching effect between interviewer and respondent speech duration.72 From this finding it was concluded that long questions might have some value not previously anticipated.

Empirical Findings on Behavior Matching in the Interview

The Survey Research Center study by Cannell, Fowler, and Marquis [g] showed a clear, positive association between the overall amount of behavioral activity of the respondent and the number of items reported. Furthermore, a very high correlation was found between the behavior activity level of the interviewer and that of the respondent. These findings led to speculation that interviewer and respondent sought and perceived cues from each other about the degree of effort to put into their respective roles. If this cue-search process causes the respondent to model his behavior after the behavior of the interviewer, inferences can be made about question length. Logically, short questions should elicit short answers and long questions should yield long responses.

This inference is supported by a series of studies on interview speech behavior conducted by Matarazzo et al. Briefly stated, these studies demonstrated that some formal measures of the interview process unrelated to content (namely, interviewer and respondent speech duration and silences) are remarkably reliable, valid, and consistent. These studies have shown very explicitly and repeatedly in employment interviews that an increase in interviewer average speech duration resulted in a significant increase in respondent average speech duration. For instance, in a 45-minute interview divided into three 15-minute periods where the interviewers spoke in utterances averaging 5.0, 15.2, and 5.5 seconds, the respondents' utterances averaged 30.9, 64.5, and 31.9 seconds, respectively. In other experiments of this series, the researchers varied the schedule of the interviewer sequence of utterances both in range and direction. They also controlled for number and type of questions, for topics discussed, and for interviewer differences. In all cases they consistently obtained changes of approximately 100 percent in respondent speech duration. The changes were always in the direction of the patterns shown by the interviewers. [r]

[g] This study is described in greater detail in an earlier section of this report, "Behavior in Interviews." For a full report of the study, see reference 46.

The results of these studies of behavior matching in the interview demonstrate that increases in interviewer verbal activity produce increases in respondent verbalization. The probability that a long response will contain more information is an interesting hypothesis to be tested. The research described here is devoted to an investigation of this problem.

Hypotheses About the Effects of Question Length on Reporting Behavior

A major contribution of the work by Matarazzo and his associates was the focus on the measures of speech behavior in the interview not related to content, particularly interviewer and respondent speech duration. Survey researchers are interested in the potential effect of these variables (question length or speech duration) upon the content variables (amount and validity of reported information). Ways of inducing greater respondent verbalization are of particular interest because it might be conducive to improving reporting accuracy.

[r] Replications of this finding have been obtained under other interviewing situations, such as the astronaut-ground communicator conversation,73 the Kennedy news conference,74 and the experiment on interviewer style by Heller, Davis, and Myers.

One needs to ascertain if the speech-matching effect is present in survey interviews. Furthermore, it is necessary to find out whether the increases in respondent speech duration reported in other studies are a direct result of increases in interviewer speech duration or the result of greater information demands made upon the respondent by more elaborate questions. If person A is asked the short question "Tell me a little bit about your job," and person B is asked the long question "Tell me everything you can think of about your job. I am interested in as many details as you can provide to describe it," one expects person B to talk longer than person A simply because of differences in the demand characteristics of the question. In order to find out whether changes in respondent speech duration are a function of changes in interviewer speech duration and independent of changes in the information demands of the question, one needs to modify the length of the question while holding constant the amount of information it requests. Such a design was used in the first experiment to be described in this section.

A second crucial task is to find out whether increases in question length, without changes in the information demanded, have any effect on the answer content, independent of variations in respondent speech duration. A mere increase in question length might result in an increase in amount of reported information, with or without a corresponding increase in respondent speech duration. This hypothesis is based upon the cue-search model of the interview described above, in which it is assumed that the respondent looks to the interviewer as a source of cues. A long question might provide the respondent with cognitive and motivational cues conveying the idea that a full report is desired. It also gives the respondent more time to work on recall while the long question is being asked. The final reporting performance might change, although this change might not necessarily be reflected in the length of the answer.

Finally, it is important to ascertain the effect of question length upon the validity of the reported information. Validity might be improved by the use of long questions, since the interviewer cues might transmit a request for completeness and accuracy of report. For this step of the research, the respondent's report will be checked against independent records.

EXPERIMENT 1: EFFECTS OF QUESTION LENGTH ON ANSWER DURATION AND REPORTING FREQUENCY

A pilot field experiment was designed to test the effects of interviewer speech duration upon respondent speech duration and reporting frequency.53 In order to increase control over all aspects of interviewer verbal behavior, variation in speech duration was created through the use of questionnaires with short and long questions. Furthermore, the lengthening of short-form questions was implemented in such a way that the information requested in short and long questions was kept constant. This strategy was used to rule out the possibility of obtaining longer answers to longer questions only because these questions explicitly asked for more information.

Questionnaire Procedures

An interview containing 28 questions was created according to customary methodological principles. An average of 14 words was used in each question. Information was requested on various health events (e.g., acute illnesses, injuries or accidents, and chronic conditions) and health-related behaviors (e.g., medicines taken and doctor visits) which occurred during various periods of time (e.g., last 2 weeks, 4 weeks, 6 months). The questions were primarily about the respondent herself (24 questions) and secondarily about another selected member of the household (4 questions). The types and numbers of questions used were as follows:

a. Open-ended type, leading to free response (8);

b. "How many . . ." type, seeking information on frequency of specific events (8); and

c. Closed, forced-choice, checklist type, dealing with presence or absence of specific chronic conditions (12).

Then, each question was written in a long form according to the following procedure. Each long question consisted of three sentences:

a. An introductory statement giving a partial description of the topic of the question, including the same terms as used in the short question, but in a different grammatical structure, possibly using a cliché;

b. An intermediary statement conveying information already contained in the short question but not presented in the introductory statement, and usually introduced by another cliché; or a filler, introducing some extraneous information of obvious and inconsequential nature about the survey but unlikely to affect the meaning of the question; and,

c. The question itself in its short form.

The two following examples provide an illustration of this question-writing procedure:

Q. 4. Short Form: Would you tell me what accidents or injuries you may have had during the last six months? (17 words)

Q. 4. Long Form: We would like to go next to a question about accidents and injuries. In this survey we ask everybody to report their accidents and injuries for the last six months. Would you tell me what accidents or injuries you may have had during the last six months? (47 words)

Q. 17. Short Form: Have you ever had any trouble hearing? (7 words)

Q. 17. Long Form: Trouble hearing is the last item of this list. We are looking for some information about it. Have you ever had any trouble hearing? (24 words)

Thus, length was added to questions by introducing redundancy, clichés, and extraneous information. It was assumed that this procedure did not alter the objective or meaning of the question. The short form of the question always appeared with identical wording in the last part of the long question. Long questions contained an average of 38 words each. They were 2.7 times longer than the average short question. This length ratio was assumed to be large enough to ensure a substantial variation in interviewer speech duration, despite expected variations in the speed of reading. [s]
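The question-writing procedure and the word counts quoted for the examples can be reproduced in a few lines. This is a small sketch for illustration only: the helper names `word_count` and `build_long_form` are hypothetical, and counting whitespace-separated tokens is an assumption about how the authors counted words.

```python
def word_count(text):
    """Count whitespace-separated words (assumed counting rule)."""
    return len(text.split())


def build_long_form(introduction, intermediary, short_form):
    """Join the three sentences; the short form always comes last, unchanged."""
    return " ".join([introduction, intermediary, short_form])


short_q17 = "Have you ever had any trouble hearing?"
long_q17 = build_long_form(
    "Trouble hearing is the last item of this list.",
    "We are looking for some information about it.",
    short_q17,
)

# Ratio of average long-question length to average short-question length.
length_ratio = round(38 / 14, 1)

print(word_count(short_q17), word_count(long_q17), length_ratio)
```

With this counting rule the short and long forms of question 17 come to 7 and 24 words, matching the counts given with the examples, and the averages of 38 and 14 words give the stated ratio of 2.7.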

Three questionnaire procedures (A, B, and C) were designed from a pool of 28 questions, all written in both short and long forms. The 28 questions were used in the same order in all 3 procedures. Questionnaire C (control) consisted of short-form questions only. Questionnaires A and B consisted of blocks of long-form and blocks of short-form questions alternated in such a way that each block of questions asked in the long form in procedure A was asked in the short form in procedure B and vice versa. This particular arrangement was used for two reasons. First, it was assumed that a questionnaire employing only lengthy questions might be detrimental to useful respondent performance. [t] Second, this particular design allowed for an investigation of potential carryover effect of length from blocks of long questions to blocks of short questions. [u] Finally, this design allowed for a comparison of answers to all questions in short form in treatment C and long form in either treatment A or B (with a mixture totaling about one-third short questions and two-thirds long questions). Thus, the three experimental procedures presented the following question-length composition:

Field Procedures

Two female interviewers on the Survey Research Center staff each received about 8 hours of training. They were instructed to read the questions exactly as worded on the questionnaires, to avoid any obtrusive speech behavior, and to provide clarification by repeating only the relevant part of the question and only when absolutely necessary. They were also told to omit probing, to accept the respondent's answer, and to eliminate any kind of verbal or nonverbal feedback. Furthermore, they were trained to adopt a regular speech rhythm, consistent for questions of varying length. They were told that the purpose of the study was to experiment with various types of question structure.

[s] Ray and Webb's research on Kennedy news conferences74 used the number of lines of transcripted speech as the unit of speech duration analysis. That measure has been found by Matarazzo72 and others to be highly correlated with standard methods of time recording (over .90) and highly reliable.

[t] Current Survey Research Center work tends to discard this assumption, at least for interviews of moderate total length (about 15 minutes).

[u] The carryover effect was not found by Matarazzo, but was suggested by the Ray and Webb study of Kennedy news conferences.

Four city blocks were selected at random from two census tracts in Jackson, Michigan. The tracts contained white families of moderate income, with a high proportion of native-born citizens and a low proportion of persons over 65 years of age. Two blocks were assigned to each interviewer and interviews were taken according to a random selection of dwelling unit numbers. In order to be eligible, respondents had to be white, 18 to 64 years of age, married, female, able to respond adequately, and fluent in English.

A total of 27 interviews were taken (9A, 9B, and 9C). Questionnaire forms were randomized among the four blocks, both in terms of number and administration order. Interviewer 1 took 4A, 5B, and 5C forms; interviewer 2 took 5A, 4B, and 4C forms. All interviews were tape recorded with cassette-type machines.

Dependent Variables

Two dependent variables, assumed to be affected by the length of the question asked, were measured:

a. The duration of respondent answers to each question. This measure was defined as the number of seconds from the end of the question to the end of the response minus any irrelevant interruption or any additional interviewer verbal intervention. [v]

b. The percentage of questions in which one or more items of the requested health information were reported.

[v] Answer duration was timed from the tapes by a single coder using electronic timers.
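The two dependent variables defined above can be expressed as simple computations over coded tape records. The sketch below is illustrative only: the record layout (`AnswerRecord` and its fields) and the sample values are hypothetical and do not come from the study.

```python
from dataclasses import dataclass


@dataclass
class AnswerRecord:
    question_end: float      # tape time (seconds) at which the question ends
    response_end: float      # tape time (seconds) at which the response ends
    excluded_seconds: float  # interruptions plus extra interviewer speech
    items_reported: int      # requested health items mentioned in the answer


def answer_duration(rec):
    """Seconds from end of question to end of response, minus excluded time."""
    return rec.response_end - rec.question_end - rec.excluded_seconds


def percent_with_information(records):
    """Percentage of questions in which one or more items were reported."""
    if not records:
        return 0.0
    hits = sum(1 for rec in records if rec.items_reported >= 1)
    return 100.0 * hits / len(records)


# Hypothetical coded records for four questions from one interview.
records = [
    AnswerRecord(12.0, 18.5, 1.0, 2),
    AnswerRecord(30.0, 33.0, 0.0, 0),
    AnswerRecord(47.5, 55.0, 2.0, 1),
    AnswerRecord(70.0, 74.0, 0.0, 0),
]

durations = [answer_duration(rec) for rec in records]
mean_duration = sum(durations) / len(durations)
reporting_rate = percent_with_information(records)
print(mean_duration, reporting_rate)
```

Keeping the excluded time as a separate field mirrors the definition in the text, where interruptions and extra interviewer interventions are subtracted from the raw interval rather than timed out of it.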

Results and Discussion

Answer duration. The average number of seconds the respondent took to answer a question in relation to question length is shown in table 42. Two converging findings emerge. First, it is clear that within interviews containing both short and long questions (A and B) the long questions did not elicit any longer answers than the short questions did (5.6 seconds against 5.7 seconds). Second, interviews with only short questions (C) did not elicit substantially shorter answers than long questions did in the other interviews (A and B). The average answer length is slightly lower for short questions used alone (5.3 seconds) than for long ones used in combination (5.6 seconds), but the difference is not statistically significant and may be considered inconsequential.

Table 42. Number of respondents, number of questions asked, and average duration of answers per question, by question length

Question length                                   Number of     Total number of   Average duration of response
                                                  respondents   questions asked   per question (in seconds)
Long questions in interviews using both long
  and short questions (questionnaires A and B)    18            270               5.6
Short questions in interviews using both long
  and short questions (questionnaires A and B)    18            270               ¹5.7
Short questions in interviews using short
  questions only (questionnaire C)                9             270               5.3

¹Differences between figures are not statistically significant (two-tailed t).

These results are far from the approximately 100-percent increase in answer duration repeatedly obtained in other research where comparable lengthening of interviewer speech had been used. In the limited exploration undertaken, the matching effect between interviewer and respondent speech duration did not appear.
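The size of the gap between these results and the earlier findings is easy to make concrete. The sketch below uses only figures quoted in this section (mean answer durations of 5.6, 5.7, and 5.3 seconds, and the respondent utterance averages of 30.9 and 64.5 seconds from the Matarazzo example); the helper name `percent_change` is hypothetical.

```python
def percent_change(baseline, value):
    """Change from baseline to value, expressed as a percentage of baseline."""
    return 100.0 * (value - baseline) / baseline


# Mean answer durations in seconds, as quoted for table 42.
long_in_mixed = 5.6    # long questions, questionnaires A and B
short_in_mixed = 5.7   # short questions, questionnaires A and B
short_only = 5.3       # short questions, questionnaire C

within_interviews = percent_change(short_in_mixed, long_in_mixed)
between_interviews = percent_change(short_only, long_in_mixed)

# Respondent utterance averages from the employment-interview example.
matarazzo_example = percent_change(30.9, 64.5)

print(f"{within_interviews:.1f} {between_interviews:.1f} {matarazzo_example:.1f}")
```

On these figures, lengthening the questions changed answer duration by only a few percent in either comparison, against a change of roughly 100 percent in the employment-interview example.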

It could be argued that in contrast to other types of interviews which usually include many open questions, a survey interview containing a substantial number of closed questions does not provide a chance for respondents to do much talking; therefore, the matching effect does not emerge.

To provide some understanding of this issue, a further analysis of answer duration was conducted which controlled for question type (open or closed). While the average answer duration was longer for open questions than for closed questions, the matching effect of question length was not more apparent for the former than for the latter. These results do not replicate the Matarazzo findings described earlier.

In contrast with other research, the major change introduced in this question-type experiment was a strict control on the information load and demands of the questions, so that a long question would not transmit or ask for more information than would a short question. In view of the results and pending replication, it is very possible that this control was responsible for the discrepancy between the present data and the Matarazzo findings. The absence of a length-matching effect under the new experimental conditions supports the authors' interpretation of the findings by Matarazzo and others, proposed earlier as a hypothesis. Variations in respondent speech duration may be quite independent of variations in interviewer speech duration, resulting instead from changes in information content and demand of a question. When information level is held constant across short and long questions, there is no longer any clear indication of a matching effect of speech duration.

Reporting frequency. The effect of question length upon the probability that responses contain one or more items of requested health information is shown in table 43. It is apparent from the table that within interviews containing both short and long questions, the length of a question does not seem to predict the frequency of report: Thirty-eight percent of the short-form questions elicited information, compared to 40 percent for the same questions written in a long form. The difference is inconsequential. On the other hand, in interviews using only short questions, only 29 percent of the questions elicited some health report. This proportion differs significantly (p < .05) from the proportions obtained when both short and long questions were used in the same interview.

Table 43. Number of respondents, number of questions asked, and percent of questions for which any requested health information was reported, by question length

Question length                                   Number of     Total number of    Percent of questions with
                                                  respondents   questions asked    information reported

Long questions in interviews using both long
and short questions (questionnaires A and B)      18            252                ¹40

Short questions in interviews using both long
and short questions (questionnaires A and B)      18            252                ¹38

Short questions in interviews using short
questions only (questionnaire C)                  9             252                29

¹Both of these proportions are significantly different from the proportion obtained in the third group (p < .05, one-tailed, based on Z).

Thus, the lengthening of half the questions in questionnaires A and B is responsible for a significant increase in the frequency of report, compared with the frequency obtained in the control questionnaire C. Further analysis of the data by question type showed this effect to be present in answers to both closed and open questions.
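The comparison of reporting frequencies can be illustrated with a pooled two-proportion Z test. The sketch below is only an approximation under stated assumptions: it treats each of the 252 questions per group as an independent observation and uses the rounded percentages quoted in the text (40, 38, and 29 percent). The report does not describe exactly how its Z statistic was computed, and the function name `two_proportion_z` is hypothetical.

```python
import math


def two_proportion_z(p1, n1, p2, n2):
    """Pooled two-proportion z statistic with a one-tailed p-value (H1: p1 > p2)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_err
    # Upper-tail area of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


N_QUESTIONS = 252  # questions asked in each group, as given in table 43

# Long questions (A and B) against short questions used alone (C).
z_long, p_long = two_proportion_z(0.40, N_QUESTIONS, 0.29, N_QUESTIONS)
# Short questions (A and B) against short questions used alone (C).
z_short, p_short = two_proportion_z(0.38, N_QUESTIONS, 0.29, N_QUESTIONS)
# Long against short questions within questionnaires A and B.
z_pair, p_pair = two_proportion_z(0.40, N_QUESTIONS, 0.38, N_QUESTIONS)

for label, z, p in [
    ("long (A, B) vs. short only (C)", z_long, p_long),
    ("short (A, B) vs. short only (C)", z_short, p_short),
    ("long (A, B) vs. short (A, B)", z_pair, p_pair),
]:
    print(f"{label}: z = {z:.2f}, one-tailed p = {p:.3f}")
```

Under these simplifying assumptions the first two comparisons fall below the .05 level and the third does not, which is the pattern the text describes. Because answers from the same respondent are not independent, the real standard errors could be larger than this sketch suggests.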

Two suggestions emerge from the data shown in table 43. First, the probability of report may be enhanced by the use of longer questions. Second, this effect may be carried over from long to short questions when blocks of both types are used in the same interview. Thus, while there seems to be a positive effect of question length per se upon the probability of report, this effect also appears throughout the entire interview.


In summary, the results of this pilot experiment offer the following proposition. When information demand is held constant, long questions do not produce noticeably long responses, but do elicit a greater frequency of report. Under these circumstances, increases in interviewer speech duration affect the content of the respondent speech without affecting its duration. Somehow a long question provides cues which tend to elicit more information from the respondent even though the response duration stays practically unchanged. Tentative explanations of this question length effect will be proposed in the conclusions to this section, along with an experimental design for the test of some derived hypotheses.

EXPERIMENT 2: EFFECTS OF QUESTION LENGTH ON VALIDITY OF REPORT

While there is some evidence from experiment 1 that increases in question length result in greater reporting frequency, no data are available so far on the validity of the extra information obtained by long questions. It may be that respondents overreported a number of health events or conditions, as is known to occur sometimes in health interviews. However, there is no particular hypothesis to predict that overreporting would increase as a function of question length. On the contrary, drawing again from the cue-search model of the interview, it is hypothesized that respondents may interpret long questions as calling for completeness and accuracy of report. This would decrease underreporting, and possibly overreporting, while improving the overall validity of the data.

The purpose of experiment 2ʷ was to ascertain the validity of health information reported in response to long questions. To achieve this objective, a sample of respondents was drawn from a population of patients who had visited a physician in a prepaid clinic during a 6-month period prior to the survey. At the time of the survey, the physician was asked to complete a checklist of 13 chronic conditions indicating whether the patient had or did not have each listed condition, or whether sufficient diagnostic information was available. Information about the patient was obtained by the physician from the patient's record and from his own knowledge of the patient's health. A weighted sample of persons was used in which 88 percent had at least one of the listed chronic conditions and 12 percent had none of them. Respondents were white females, 18 to 60 years of age, who resided in the greater Detroit metropolitan area.

Among other experimental techniques used in the study was a test of the effects of question length on accuracy of reporting. A questionnaire was prepared using standard short questions. These call for the reporting of various health conditions and behaviors. In the middle of the questionnaire, checklist-type questions were introduced which asked about the presence or absence of the 13 chronic conditions listed on the physician summary form.

A second version of the questionnaire was prepared using the same questions but in long form. Questions were lengthened in a way somewhat comparable to the method used in experiment 1. Since experiment 1 had shown a carryover effect of question length from long to short questions, it was hypothesized that contrast between questions of various lengths rather than specific question length per se was the operating variable. In order to increase this contrast effect, a mixture of questions varying in length, rather than the previous large blocks of short and long questions, was used. Also, in contrast to the procedure used in the first experiment, standard short probe questions were introduced after all items in both questionnaire forms. This meant that the more information a respondent reported, the more short probe questions she would be asked. This procedure was aimed at decreasing the contrast in length between the two questionnaire forms as a function of reporting frequency, thus minimizing the expected effect of question length on the number of items reported. Finally, the strategy used in lengthening the questions was slightly different. Some questions contained three statements as in the original experiment, while others contained only two.

Thus, while question length was implemented on the basis of comparable principles (redundancy, cliches, and fillers with the information characteristics of the question held constant), less differentiation was probably attained between the two experimental treatments. This might result in some minimization of the question length effect.

ʷFor a full report of this experiment, which also included an investigation of the effects of verbal reinforcement and reinterview, see reference 53.

Ten white female interviewers were employed in the study. Question treatments, long and short, were assigned at random within geographic clusters of respondents, with each interviewer administering both types of interviews. Budget considerations led to some compromise, both in terms of experimental design and sample size, which might be reflected in less than optimal stability of the results. One hundred and six persons were interviewed with the short-question procedure and 96 with the long-question procedure.

Since, for various reasons, errors can exist in the physician forms as well as in the respondent reports, a high degree of agreement between the two sources was not expected. However, it was assumed that better validity of the respondent reporting would increase the agreement rates between the two sources. In other words, if agreement rates on presence and absence of chronic conditions were found to be higher under long-question interviews than under short-question interviews, presumably an improvement in validity had been obtained in the former procedure.

Probability of agreement between physician and respondent on the presence or absence of the listed 13 chronic conditions was computed on the basis of match and mismatch in "Yes" and "No" responses provided by the two information sources. Excluding cases where the physician or respondent lacked sufficient information to determine the presence of a condition, the following four possibilities of match or mismatch existed for each chronic condition:

                          PHYSICIAN
                        Yes      No
RESPONDENT    Yes        A        C
              No         B        D

Results and Discussion

The original report of the study treated the data in terms of two types of mismatch:

Type X = B/(A+B) and type Y = C/(C+D)

Type X mismatch was expected to represent the extent of underreporting (false negative responses); type Y, the extent of overreporting (false positive responses), assuming that the physician was right in his evaluation. According to the analysis, the overreporting error was relatively infrequent and was not affected by the question length (.10 under both interview techniques). On the other hand, the underreporting error was more frequent and it was reduced significantly (p < .05) in the interviews in which long questions were used (.46 with short questions and .38 with long questions).

Since one may reasonably question the adequacy of the physician's report, as well as the validity of respondent report, an analysis of the data from a slightly different point of view is presented. For both interview treatments, table 44 shows probabilities of agreement relating to the presence of chronic conditions. Table 45 presents the corresponding probabilities of agreement on the absence of chronic conditions.

This new approach avoids the assumption that the medical records were more valid than the respondent's report. It does assume that greater agreement between the two sources indicates higher validity of the reported data.
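The agreement rates used in tables 44 and 45 can be written out as a short computation. The sketch below assumes the cell layout implied by the row labels of those tables (A = both sources say "yes", B = physician "yes" and respondent "no", C = physician "no" and respondent "yes", D = both say "no"). The counts are hypothetical and chosen only for illustration; the class name `AgreementTable` and the reading of the type X and type Y mismatches as B/(A+B) and C/(C+D) are assumptions inferred from the reported figures, not definitions quoted from the study.

```python
from dataclasses import dataclass


@dataclass
class AgreementTable:
    a: int  # physician "yes", respondent "yes"
    b: int  # physician "yes", respondent "no"
    c: int  # physician "no",  respondent "yes"
    d: int  # physician "no",  respondent "no"

    def presence_rates(self):
        """Agreement on presence: A/(A+B), A/(A+C), A/(A+B+C)."""
        return (
            self.a / (self.a + self.b),
            self.a / (self.a + self.c),
            self.a / (self.a + self.b + self.c),
        )

    def absence_rates(self):
        """Agreement on absence: D/(D+C), D/(D+B), D/(D+B+C)."""
        return (
            self.d / (self.d + self.c),
            self.d / (self.d + self.b),
            self.d / (self.d + self.b + self.c),
        )

    def mismatch_rates(self):
        """Assumed type X (underreporting) and type Y (overreporting) rates."""
        return (
            self.b / (self.a + self.b),
            self.c / (self.c + self.d),
        )


# Hypothetical counts, not data from the study.
table = AgreementTable(a=60, b=40, c=20, d=380)

print("presence:", [round(r, 3) for r in table.presence_rates()])
print("absence: ", [round(r, 3) for r in table.absence_rates()])
print("mismatch:", [round(r, 3) for r in table.mismatch_rates()])
```

With this reading, the type X rate is the complement of the first presence rate and the type Y rate is the complement of the first absence rate, so the two ways of presenting the data carry the same information about each source taken as the starting point.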

Based on the material shown in table 44, whether one starts from the physician data (row 1), or from the respondent data (row 2), or from both (row 3), the probability of agreement on the existence of a chronic condition is consistently higher with interviews using long questions than it is with interviews using short questions. For two agreement rates the obtained improvement is statistically significant. The increase in overall probability of agreement obtained with long-question interviews amounts to 16 percent.
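The percentage quoted here is a relative increase, not a difference in percentage points. As a minimal check, assuming the percent increase is taken relative to the short-question rate, the overall agreement rates of .338 and .392 and the respondent-based rates of .477 and .516 from table 44 give the following (the helper name `percent_increase` is hypothetical):

```python
def percent_increase(short_rate, long_rate):
    """Relative increase of the long-question rate over the short-question rate, in percent."""
    return 100.0 * (long_rate - short_rate) / short_rate


overall = percent_increase(0.338, 0.392)           # overall agreement on presence
respondent_based = percent_increase(0.477, 0.516)  # starting from the respondent data

print(f"overall agreement: {overall:.1f} percent")
print(f"respondent-based agreement: {respondent_based:.1f} percent")
```

Rounded to whole numbers these come to 16 and 8 percent, which matches the corresponding entries in the percent-increase column of table 44.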

The data in table 45 show that while agreement rates on the absence of chronic conditions are not noticeably enhanced by the use of long questions, neither are they diminished by question length.

Table 44. Probability of physician-respondent agreement on the presence of chronic conditions, by type of agreement rate and question length (original interview)

                                        Questionnaire procedure¹
Type of agreement rate                  Short        Long         Difference    Percent increase due
                                        questions    questions                  to question length

A/(A+B): Probability that chronic conditions checked as present by the physician were reported as present by the respondent
                                        .537         .622         [illegible]   +17

A/(A+C): Probability that chronic conditions reported as present by the respondent were checked as present by the physician
                                        .477         .516         [illegible]   +8

A/(A+B+C): Overall probability of agreement between physician and respondent for chronic conditions mentioned as present by either of them
                                        .338         .392         [illegible]   +16

¹Number of persons interviewed were 106 in short-question procedure and 96 in long-question procedure.
²p < .05, one-tailed, based on Z.
³p < .10, one-tailed, based on Z.

Table 45. Probability of physician-respondent agreement on the absence of chronic conditions, by type of agreement rate and question length (original interview)

                                        Questionnaire procedure¹
Type of agreement rate                  Short        Long         Difference    Percent increase due
                                        questions    questions                  to question length

D/(D+C): Probability that chronic conditions checked as absent by the physician were reported as absent by the respondent
                                        [illegible; only the value .901 survives in this row]

D/(D+B): Probability that chronic conditions reported as absent by the respondent were checked as absent by the physician
                                        .921         .934         .013          +1

D/(D+B+C): Overall probability of agreement between physician and respondent for chronic conditions mentioned as absent by either of them
                                        [illegible]  [illegible]  .011          +1

¹Number of persons interviewed were 106 in short-question procedure and 96 in long-question procedure.

This information provides an interesting complement to the earlier findings. Experiment 1 has shown that more information is elicited by long questions than by short questions, and the present experiment indicates that long questions also elicit information of higher validity.

Even though the study was designed primarily to test the validity of reporting of chronic conditions rather than simply the amount reported, attention was also given to the number of reported conditions. From the list of 13 chronic conditions, an average of 2.26 were accounted for in the long-question treatment as compared to 2.06 in the short-question treatment. Although this amounts to only a 10-percent increase in the number of chronic conditions reported, it is not negligible in view of the fact that the recall task is easier and the likelihood of responding is expected to be uniformly higher when recognition stimuli like checklist questions are provided.⁵³ For health behavior noted in other parts of the questionnaire, the effect of question length on number of items reported was modest and not statistically significant. Several reasons for a minimization of the expected effects of question length in this study were proposed earlier.

Since reinterview was another variable to be investigated in the study, 50 percent of the respondents selected for original interview were designated to be contacted 2 weeks later.ˣ The content of the followup was very close to that of the original interview. Some new questions about health insurance were introduced to avoid excessive repetition. The questionnaire contained the identical chronic conditions checklist. All respondents who originally had been given short-question interviews were given short-question reinterviews; respondents originally asked long questions were reinterviewed with long questions. Thus, data were available to examine potential variations in agreement rates from the first interview to the later one. While agreement rates were poorer the second time than they were originally for both short- and long-question treatments, the deterioration in validity was lower with long questions than it was with short questions.

Since one focus of this study is question length rather than reinterview, tables 46 and 47 show the same type of agreement rates on presence and absence of chronic conditions as presented in tables 44 and 45, but computed this time on the basis of the reinterview data.

ˣFor reasons of field efficiency, all respondents originally interviewed during the first half of the data collection period were designated for reinterview.

Table 46. Probability of physician-respondent agreement on the presence of chronic conditions, by type of agreement rate and question length (reinterview)

                                        Questionnaire procedure¹
Type of agreement rate                  Short        Long         Difference    Percent increase due
                                        questions    questions                  to question length

A/(A+B): Probability that chronic conditions checked as present by the physician were reported as present by the respondent
                                        [illegible]  .697         [illegible]   [illegible]

A/(A+C): Probability that chronic conditions reported as present by the respondent were checked as present by the physician
                                        .408         .527         ².119         +29

A/(A+B+C): Overall probability of agreement between physician and respondent for chronic conditions mentioned as present by either of them
                                        .301         [illegible]  [illegible]   [illegible]

¹Number of persons reinterviewed were 49 in short-question procedure and 53 in long-question procedure.
²p < .05, one-tailed, based on Z.

Table 47. Probability of physician-respondent agreement on the absence of chronic conditions, by type of agreement rate and question length (reinterview)

                                        Questionnaire procedure¹
Type of agreement rate                  Short        Long         Difference    Percent increase due
                                        questions    questions                  to question length

D/(D+C): Probability that chronic conditions checked as absent by the physician were reported as absent by the respondent
                                        [illegible]  [illegible]  [illegible]   [illegible]

D/(D+B): Probability that chronic conditions reported as absent by the respondent were checked as absent by the physician
                                        [illegible]  [illegible]  [illegible]   [illegible]

D/(D+B+C): Overall probability of agreement between physician and respondent for chronic conditions mentioned as absent by either of them
                                        [illegible]  [illegible]  [illegible]   [illegible]

[The only entries that survive in this copy are "(³)" and "+3" in the percent-increase column; their row positions cannot be recovered.]

¹Number of persons reinterviewed were 49 in short-question procedure and 53 in long-question procedure.
²p < .10, one-tailed, based on Z.
³More than zero, but less than 1 percent.

All three agreement rates shown in table 46 were substantially improved by the use of long questions in reinterview. The obtained increase in two of the agreement rates reaches statistical significance (p < .05). The increase in overall probability of agreement between physician and respondent on the presence of chronic conditions amounts to 29 percent.

While increases in agreement on the absence of conditions due to question length are very modest (table 47), there is indication again that no deterioration occurred for this type of agreement as a result of question length in reinterviews. For the first agreement rate, the increase obtained in long-question reinterviews reached statistical significance at the 10-percent level. Thus, the data obtained the second time confirm the findings obtained originally. Long questions elicit a more valid report than do short questions under conditions of reinterview, as well as under conditions of initial interview.

In summary, this second experiment demonstrated that improvement in the validity of the reporting of health conditions can be achieved without changing the content or meaning of the questions, but only by increasing their length. Somehow the retrieval of more accurate information is facilitated by a question of long duration. The request for accurate reporting is implicitly conveyed by long questions, even though the explicit demand for information and apparently the response duration are unchanged.

CONCLUSIONS

On the basis of the data presented, three major suggestions have been proposed which can be summarized as follows: When the length of survey interview questions is substantially increased and their information demand held constant, (a) no appreciable increase is obtained in response duration; yet, (b) the response contains more information; and (c) the reported information is more valid.

The first suggestion contradicts other research in which it was found that there was a speech-length matching effect in the interview. This matching effect in other research might have resulted from an uncontrolled increase in the information demanded by long questions. The present study indicates that, when information demand is controlled, long questions do not elicit long responses. Experimental manipulations of information demanded, associated with a control imposed on question length, might help to solve the issue in further research.

That long questions in comparison with short ones might elicit more information and a more accurate report is contradictory to common assumptions and current survey methodology. However, the evidence in this paper leads to the conclusion that lengthy and redundant questions, as designed in this study, elicit increased accuracy of report, even though the responses do not last any longer than those in response to short questions. These findings are puzzling, and they raise questions of importance to survey research. The following should be considered as results of preliminary investigations that require replication:

a. The length of the question has cueing effects upon reporting behavior, causing increased accuracy but not extending to response duration. A long question may convey to the respondent the idea that, because the interviewer has spent much time asking the question, the task is important and, therefore, requires serious efforts. Furthermore, a long question may indicate to the respondent that the interviewer is not in a hurry, and thus releases perception-of-time constraints possibly detrimental to adequate performance in regular interviews. Finally, the responding behavior may also gain in effectiveness because some of the initial ruminating-type activity has already taken place while the question was being asked. A long question may therefore provide cues leading to more adequate performance and at the same time prepare the respondent for the expression of more efficient verbal behavior.

b. Question length or interviewer speech duration is only a vehicle or a proxy for another influential variable, namely, the time given for recall activity. A long question increases the time available for search activity and thus improves the outcome of recall.

c. Since increases in question length have been implemented partially by introducing redundancy in question wording, the multiple presentation of the stimulus may be the influential variable. It may act either because of increased exposure time or because of a repeated-trials effect. Finally, it may be that redundancy improves the clarity of the question, which also leads to better reporting performance.

According to this analysis, the effects of at least three variables should be investigated in further research: question length per se and its cue-giving properties; time provided by the question for recall activity; and redundancy of the question. In the experiments described earlier in this paper, all three dimensions have been varied simultaneously so that the specific effect of each could not be isolated. In the experimental treatments, questions were longer; they also provided more time for recall since their first statement always referred to a major part of the question content; and they were redundant.

The following design proposal is an attempt to investigate the specific effects of variations in total question length and recall time, while partially controlling for redundancy. Experimental questions could be designed using the following pool of statements, equal in length, according to various arrangements:

Q = question in its short standard form;

F = filler statement introducing extraneous information of inconsequential character and unrelated to the specific question demand; and

q = introductory statement describing the topic of the question in a manner sufficient to stimulate relevant recall activity.

Each experimental questionnaire would contain only one type of question structure, as described below:

Questionnaire 1 = Q: questions in their short standard form. Question length and recall time are low.

Questionnaire 2 = FQ: the short question preceded by a filler statement. The total length of the question is roughly doubled, whereas the time for recall activity is unchanged since the filler is entirely unrelated to the question demand. Question length is medium and recall time is low.

Questionnaire 3 = qQ: the short question preceded this time by a statement introducing the major question demand. Question length and recall time are both medium.

Questionnaire 4 = FqQ: the short question is preceded by the introductory statement, which is itself preceded by an irrelevant filler. Question length is high, and recall time is medium.

Questionnaire 5 = qFQ: the short question is preceded by a filler, which is itself preceded by an introductory statement. Question length and recall time are both high.

Redundancy occurs whenever a question uses both q and Q, so that this variable is controlled within groups of treatments 1-2 where there is no redundancy and within groups 3-4-5 where redundancy has been introduced.

This experimental design is presented in figure 2, where each cell represents one questionnaire procedure and indicates the strategy used in question wording for this procedure. Comparison of data from cells 1 and 2 will detect the effect of increased length under conditions of equal recall time and no redundancy. Comparison of cells 3 and 4 will detect an effect of increased length with equal recall time and redundancy. A comparison of cells 2 and 3 will explore a combined effect of increased recall time and redundancy, under conditions of equal length. Comparison of cells 4 and 5 will detect the effect of increased recall times with equal length and equal redundancy. Diagonal comparisons of cells will provide a means of examining the effects of various combinations of the three variables. The application of this design may generate three types of outcomes:

a. A replication of the earlier findings;

b. An identification of the most efficient strategy or question anatomy; and

c. Some identification of the variables causing improvement in reporting.

                          TIME FOR RECALL ACTIVITY
QUESTION LENGTH       Low          Medium        High
Low                   (1) Q
Medium                (2) FQ       (3) qQ
High                               (4) FqQ       (5) qFQ

Cells 1 and 2: no redundancy. Cells 3, 4, and 5: redundancy.
Q = question in its short standard form; F = filler statement; q = introductory statement.

Figure 2. Proposal for a research design.
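The proposed design can be checked mechanically by encoding each questionnaire as a string of statement types and deriving its three factors. The sketch below is an illustration under stated assumptions, not part of the original proposal: the function names are hypothetical, question length is taken as the number of statements, recall time is taken as the number of statements that follow the introductory statement q, and redundancy is taken as the presence of q. These rules were chosen because they reproduce the levels stated in the text for all five questionnaires.

```python
# Question structures for questionnaires 1 to 5, as described in the text.
STRUCTURES = {1: "Q", 2: "FQ", 3: "qQ", 4: "FqQ", 5: "qFQ"}
LEVELS = ["low", "medium", "high"]


def describe(structure):
    """Derive length, recall time, and redundancy from a question structure."""
    length = LEVELS[len(structure) - 1]
    if "q" in structure:
        # Assumed rule: recall time grows with the statements spoken after q.
        recall_steps = len(structure) - structure.index("q") - 1
    else:
        recall_steps = 0
    return {
        "length": length,
        "recall": LEVELS[recall_steps],
        "redundancy": "q" in structure,
    }


def differing_factors(cell_a, cell_b):
    """Return the set of factors on which two questionnaire cells differ."""
    da = describe(STRUCTURES[cell_a])
    db = describe(STRUCTURES[cell_b])
    return {name for name in da if da[name] != db[name]}


for cell, structure in STRUCTURES.items():
    print(cell, structure, describe(structure))

for pair in [(1, 2), (3, 4), (2, 3), (4, 5)]:
    print(pair, sorted(differing_factors(*pair)))
```

Each adjacent comparison named in the text isolates the factors it is meant to: cells 1 and 2 and cells 3 and 4 differ only in length, cells 4 and 5 differ only in recall time, and cells 2 and 3 differ in recall time and redundancy together.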

REFERENCES

1Cannell, C. F., and Fowl r, F. J.: A Study of theReporting of Visits to Doctors in the National HealthSuruey. Ann Arbor, Mich. Survey Resetrch Center. TheUniversity of Michiran, Oct. 1963. Mimeographed.

2Marquis, K. H., and Cannel!, C. F.: A Study ofInterviewer-Respondent Interaction in the Urban Em-ployment Survey. Ann Arbor, Mich. Survey RescuchCenter. The University of Mich:gm, Aug. 1969.

3National Center for Health Statistics: Interview dataon ch7onic conditions compared with information de-rive.1 from medical records. Vital and Health Statistics.PHS Pub. No. 1000-Series 2-No. 23. Public HealthSerriee. Washington. U.S. Government Printing Office,May 1967.

4National Center for Health Statistics: Health inter-view responses compared with medical records. Vital andHealth Statistics. PHS Pub. No. 1000-Series 2-No. 7.Publie Health Service:Washington. U.S. GovernmentPrinting Office, ju1y1965.

5Riee, S. A.: Contagious bias in the interview. Am. J.Social. XXXV: 420-423, 1929.

6Stanton, F., and Baker, K. H.: Interview bias andthe reeal of incompletely learned materials. Sociornetry5: 123-124, 1942.

7Katz, D.: Do interviewers bias poll results. Publicn. Q. 6: 248-268, 1942.

8Robinsoh D and Rohde S.: Two experiments withan anti-Semitism poll. J. Abnorm. Sac. Psychal. Q.41: 136-144, 1946.

Nyers, R. J.: Errors and bias in the reporting of agesin census cl2ta. Trans. Actuarial Soc. Am. 41: 359-415,1940.

1 0 Neely, T.: A Study of Error in the Interview.Privately_printed, 1937. --- _ - .

11 Perry, H.,- aid Crossley, H.: Validity of responsesto survey questions. Public Opin. Q. 14, 1950.

12Lansing, J. B., Ginsberg, G. P., and Braaten, K.: Aninvestigation of Response Error. Urbaha, Ill. Bureau ofEconomic and Business Research, University of Illinois,1961.

13 Ferber, R.: Collecting Financial Data bY Consumer-_ Panel Techniques. Urbana, Ill. Bureau of Economic-and 27Taffel, C.: Anxiety and the conditioning of verbal

Business Research, University of Ilinois, 1959. behavior. j Abnorm. Soc. Psychol._51: 496-501,1955.

14bansing, J. B., mid Blood, D. The ChangingTravel Market. Ann Arbor, Mich. Survey ResearchCenter, The University of Michigan. Monograph No. 38,1964.

15Goddard, K. E., Broder, G., and Wenar, C.:Reliability of pediatric histories, a preliminary study.Pediatrics 28 1011-1018, 1961.

16Weiss, D. J., et al.: Validity of work historiesobtained by interview. 'Minn. Stud. Vocat12(41): 1961.

17National Center for Heath Statistics: Reporting ofhospitalization in Lhe Health Interview Survey. Vital andHealth Statistics. PHS Pub. No. 1000-Series 2-No. 6.Public Health Service. Washington. U.S. GovernmentPrinting Office, July 1965.

"National Center for Health Statistics: Comparisonof hospitalization reporting in three survey procedures.Vital and Health Statistics. PHS Pub. No. 1000-

Series 2-No. 8. Public Health Service. Washington. U.S.Government Printing Office, July 1965.

19Thorndike, E. L.: Educational Psychology. NewYork. Teachers College Press, Columbia University,1913.

20Ebbinghaus, H.: Memory. Translated by H. A.Ruger and C. E. Busscnius. New York. Teachers College,1913. (Reissued as paperback, New York: Dover, 1964.

21 Koffka, K.: Principles of Gestalt Psychology. NewYork. Harcourt Brace, 1935.

22Rdhler, W.: Gestalt PsycholoLiveright, 1947.

23McGeoch, J. A.: Forgetting and the law of disuse.Psychol. Rev. 39: 352-370, 1932.--24Postman,L.: -The, present- status of interference

theory, in C. N. Cofer; ed., Verbal Learning and VerbalBehavior. New York. McGraw-Hill, 1961. pp. 152-179.

251ingard, E. R., and Bower, G. H.: Theories ofLearning, 3rd ed. New York. Appleton, 1966.

2 6Cobb, S., and Cannell, C. F.: Some thoug hts aboutinterview data. Int. Epidemiol. Assoc. Bull. 13: 43-54,

"Greenspoon, J.: The reinforcing effect of twospoken sounds on the frequency of two responses. Am.J. Psycho!. 68: 409416, 1955.

29Verplanck, W. S.: The convol of the content ofconversation: Reinforcement of statements of opinion.J. Abnorm. Soc. Psycho!. 51: 668-676, 1955.

"Azarin, N. H., et al.: The control of conversationthrough reinforcement. J. Exp. Anal. Behav. 4: 25-30,1961.

3ICenters, R.: A laboratory adaptation of the conver-sational procedure for the conditioning of verbal ()per-ants. J. Abnorm. Soc. Psycho'. 67: 334-339, 1964.

32Kanfer, F. H., and McBrearty, J. F.: Minimal socialinforcement and interview content. J. Clin. Psyched.

18: 210-215, Apr. 1962.33Hildrum, D. C., and Brown, R. W.: Verbal re-

inforcement and interviewer laim. J. Abnorm. Soc.Psycho!. 53: 108-111, 1956.

34Nuthman, A. M.: Conditioning of a response classon a personality test. J. Abnorrn. Soc. Psycho!. 54:19-23, 1957.

"Staats, A. W., et al.: Operant conditioning of factoranalytic personality traits. J. Gen. Psycho!. 66: 101-114,an. 1962.

36Singer, R. D.: Verbal conditioning and generaliza-tion of prodemocratic responses. J. Abnorm. Soc.Psychol. 63: 43-46, July 1961.

37Insko, C. A.: Verbal reinforcement and attitude. J.Pers. Soc. Psycho'. 2: 621-623, Oct. 1965.

"Kahn, R. L., and Cannell, C. F.: The Dynamics ofInterviewing.- New York. John Wiley & Sons, Inc., 1957.

39Cannell, C. F., and Kahn, R. L.: Interviewing,Iri G.L. Lindiey and E. Aronson, eds., The Handbook ofSocial Psychology. Reading, Mass. Addison Wesley,1968.

"National Center for Health Statistics: Internationalcompatisons of medical care utilization: A feasibilitystudy. Vital and Health Statistics. PHS Pub. No. 1000-Series 2-No. 33:Public Health Service. Washington. U.S.Government Printing Office, June 1969.

4IMooney, H. W.: Methodology in Two CaliforniaHealth Surveys. Public Health Monograph No. 70. PHSPub. No. 942. Public Health Service. Washington. U.S.Government Printing Office, 1962.

42National Center for Health Statistics: Optimumrecall period for reporting persons injured in motor

_ vehicle accidents. Vital and Health_ Statistics.=_Scries2-No. 50. DHEW Pub. No. (HSM) 72-1050. HealthServices and Mental Health Administration. Washington.U.S. Government Printing Office, Apr. 1972.

43World Health Oiganization: Manual of the Inter-national Statistical Classification of Diseases, Injurzes,and Causes of Death, Based on the Recommendations ofthe Seventh Revision Conference, 1955. Geneva:-WorldHealth Organization, 1957.

44 Fowler, F. J., Jr.: Education, Interaction, and Interview Performance. Doctoral dissertation, The University of Michigan, 1965.

45 Weiss, C. H.: Validity of Interview Responses of Welfare Mothers. New York. Bureau of Applied Social Research, Columbia University, Feb. 1968.

46 National Center for Health Statistics: The influence of interviewer and respondent psychological and behavioral variables on the reporting in household interviews. Vital and Health Statistics. PHS Pub. No. 1000-Series 2-No. 26. Public Health Service. Washington. U.S. Government Printing Office, March 1968.

47 Mueller, J. H., Schuessler, K. F., and Costner, H. L.: Statistical Reasoning in Sociology. Boston. Houghton-Mifflin, 1970. pp. 279-291.

48 deKoning, T. L.: Interviewer and Respondent Interaction in the Household Interview. Doctoral dissertation, The University of Michigan, 1969.

49 Marshall, J., Marquis, K. H., and Oskamp, S.: Effects of kind of question and atmosphere of interrogation on accuracy and completeness of testimony. Harv. Law Rev. 84(7): 1620-1643, May 1971.

50 National Center for Health Statistics: Effect of some experimental interviewing techniques on reporting in the Health Interview Survey. Vital and Health Statistics. PHS Pub. No. 1000-Series 2-No. 41. Public Health Service. Washington. U.S. Government Printing Office, May 1971.

51 Dollard, J., and Miller, N. E.: Personality and Psychotherapy. New York. McGraw-Hill, 1950.

52 Skinner, B. F.: Verbal Behavior. New York. Appleton-Century-Crofts, 1957.

53 National Center for Health Statistics: Reporting of health events in household interviews: Effects of reinforcement, question length, and reinterviews. Vital and Health Statistics. Series 2-No. 45. DHEW Pub. No. (HSM) 72-1028. Health Services and Mental Health Administration. Washington. U.S. Government Printing Office, Mar. 1972.

54 Marquis, K. H., and Laurent, A.: Accuracy of Health Reporting in Household Interviews as a Function of Reinforcement, Question Length, and Reinterviews. Unpublished manuscript, 1969.

55 Cannell, C. F., Fowler, J., and Marquis, K. H.: Respondents Talk About the National Health Survey Interview. Ann Arbor, Mich. Survey Research Center, The University of Michigan, Mar. 1965. Mimeographed.

56 Cannell, C. F., Fowler, F. J., and Marquis, K. H.: Report on Development of Brochures for H.I.S. Respondents. Ann Arbor, Mich. Survey Research Center, The University of Michigan, Mar. 1965. Unpublished manuscript.

57 Zajonc, R. B.: Social Psychology, An Experimental Approach. Belmont, Calif. Wadsworth, 1966. (Basic Concepts in Psychology series.)

58 Edwards, A. L.: The Social Desirability Variable in Personality Assessment and Research. New York. Dryden Press, 1957.

59 Bales, R. F.: Interaction Process Analysis. Cambridge, Mass. Addison-Wesley Press, 1950.

60 Hyman, H. H., et al.: Interviewing in Social Research. Chicago. University of Chicago Press, 1954.

61 Cannell, C. F.: Private communication: Proposal, 1969.

62 Krasner, L.: Studies of the conditioning of verbal behavior. Psychol. Bull. 55: 3, 1958.

63 Spielberger, C. D.: The role of awareness in verbal conditioning, in C. W. Erikson, ed., Behaviorism and Awareness. Durham, N.C. Duke University Press, 1962. pp. 73-101.

64 Spielberger, C. D., and De Nike, L. D.: Descriptive behaviorism vs. cognitive theory in verbal operant conditioning. Psych. Rev. 73(2): 306-326, 1966.

65 Dulany, D. E.: Hypotheses and habits in verbal "operant conditioning." J. Abnorm. Soc. Psychol. 63: 251-263, 1961.

66 Oakes, W. F.: Verbal operant conditioning, intertrial activity, awareness and the extended interview. J. Pers. Soc. Psychol. 6: 198-202, 1967.

67 Marquis, K. H.: Report of Legal Study. Ann Arbor, Mich. Survey Research Center, The University of Michigan, 1967. Unpublished manuscript.

68 Asch, S. E.: A reformulation of the problem of associations. Am. Psychol. 24: 92-102, 1969.

69 National Center for Health Statistics: Reporting health events in household interviews: Effects of an extensive questionnaire and a diary procedure. Vital and Health Statistics. Series 2-No. 49. DHEW Pub. No. (HSM) 72-1049. Health Services and Mental Health Administration. Washington. U.S. Government Printing Office, Apr. 1972.

70 Parten, M.: Surveys, Polls, and Samples: Practical Procedures. New York. Harper & Bros., 1950.

71 Oppenheim, A. N.: Questionnaire Design and Attitude Measurement. New York. Basic Books, 1966.

72 Matarazzo, J. D., Saslow, G., and Wiens, A. N.: Studies in interview speech behavior, in L. Krasner and L. P. Ullman, eds., Research in Behavior Modification. New York. Holt, Rinehart and Winston, 1965. pp. 179-210.

73 Matarazzo, J. D., Wiens, A. N., Saslow, G., Dunham, R. H., and Voas, R. B.: Speech durations of astronaut and ground communicator. Science 143: 148-150, 1964.

74 Ray, M. L., and Webb, E. J.: Speech duration effects in the Kennedy news conferences. Science 153: 899-901, 1966.

75 Heller, K., Davis, J. D., and Myers, R. A.: The effects of interviewer style in a standardized interview. J. Consult. Psychol. 30(6): 501-508, 1966.

APPENDIX

APPLICATION OF SURVEY RESEARCH CENTER FINDINGS TO HEALTH INTERVIEW SURVEY PROCEDURES

Based on the experience of many researchers in the health field and other areas, it was expected that underreporting would be one of the major problems to be solved in the data collection phase of the Health Interview Survey. However, before any attempt could be made to remedy this shortcoming of the interview method, it was necessary to obtain some idea of the magnitude of the problem, to learn something about the characteristics of persons who fail to report health events, and to identify the particular kinds of events that tend to be underreported. The need for this kind of information led to a number of studies involving a comparison of information provided by interview respondents with independent medical records of known validity. Several of these studies, conducted by the Survey Research Center (SRC) of the University of Michigan, provided information that led to major revisions in the collection and processing of Health Interview Survey (HIS) data.

Recall of Health Events

One of the early studies carried out by SRC in 1958-59 consisted of a comparison of hospitalizations reported in interviews with actual hospital records. A sample of patients discharged from hospitals participating in the Professional Activity Study was interviewed, and the results were compared with the discharge records (see ref. 17 for a complete description of this study). Findings of this study, described earlier in this report, demonstrate that underreporting of hospitalization in a health interview situation is influenced by the impact of the hospitalization, the threat or embarrassment caused by the nature of the condition causing hospitalization, and the time elapsed between the interview and the period spent in the hospital. Since the first two of these findings involved intrinsic characteristics of the health event, no immediate solution to underreporting associated with these causes was available.

However, the consistent increase in underreporting with the time elapsed between hospitalization and the interview was a finding that appeared to have practical application to data collected in the Health Interview Survey. As a result, in the derivation of estimates of the volume of hospital discharges from the basic HIS data collected beginning in July 1958, the 6-month period preceding the date of interview was used as the period of reference. By doubling the weight attached to each of the reported events within that period, it was possible to produce estimates comparable to those based on 12 months of recall, but with considerably less of the underreporting bias introduced by the use of the longer recall period.

The use of the 12-month recall period was continued in the collection of data on hospital experience because of the several kinds of estimates produced from the Survey. To combine the hospital episodes of sample persons in order to estimate the number of persons with one or more episodes in a given year, it is necessary to consider a year's experience for each sample person. On the other hand, in estimating the annual volume of hospital discharges, any recall period can be used if the weight attached to each event in the estimating procedure is properly adjusted. Since the length of the recall period is inversely related to the magnitude of the sampling error, the 6-month reference period was selected so that response bias could be appreciably reduced without an undue increase in the size of the sampling error.
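The weight adjustment described above can be illustrated with a minimal sketch. It is not the Survey's actual estimating procedure: the function name, the tuple layout, the base weights, and the approximation of a month as 365/12 days are all assumptions made for illustration. Each reported discharge inside the reference period is counted with its base weight multiplied by 12 divided by the length of the reference period in months, which gives a factor of 2 for the 6-month period.

```python
from datetime import date, timedelta


def annual_discharge_estimate(discharges, reference_months=6):
    """Estimate annual discharge volume from a shorter reference period.

    discharges: list of (interview_date, discharge_date, base_weight) tuples.
    The tuple layout and the base weights are hypothetical.
    """
    # Inflation factor: 12 / 6 = 2 for the 6-month reference period.
    inflation = 12 / reference_months
    # Assumption: a month is approximated as 365 / 12 days.
    window = timedelta(days=round(365 * reference_months / 12))
    total = 0.0
    for interview_date, discharge_date, base_weight in discharges:
        # Count only discharges inside the reference period before interview.
        if interview_date - window <= discharge_date < interview_date:
            total += base_weight * inflation
    return total


sample = [
    (date(1958, 7, 15), date(1958, 5, 1), 1000.0),   # within 6 months
    (date(1958, 7, 15), date(1957, 10, 1), 1000.0),  # 7 to 12 months back
]
six_month = annual_discharge_estimate(sample, reference_months=6)
twelve_month = annual_discharge_estimate(sample, reference_months=12)
```

With complete reporting the two estimates agree in expectation. The 6-month version relies on fewer events, which is the source of the larger sampling error mentioned in the text.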

The imprecision with which respondents recalled dates of health events during an interview, brought to light by several of the SRC studies, led to the use of a recall period in the collection phase of hospital data extending beyond the year preceding the interview. For example, persons interviewed during July of a particular year were asked about their hospital experience since May of the previous year. This innovation improved the reporting of events that occurred near the beginning of the reference year, as well as hospitalizations that started prior to the year preceding the interview but extended into the reference year. During the processing phase, those hospitalizations for which no days during the year prior to interview were recorded were eliminated from the hospitalization data.
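The processing-phase rule can be sketched as a date-overlap check. This is an illustration under stated assumptions, not the Survey's processing code: the function names are hypothetical, a hospital day is counted as a night between admission and discharge, and the reference year is taken as the 365 days before the interview date.

```python
from datetime import date, timedelta


def days_in_reference_year(admitted, discharged, interview_date):
    """Nights of a stay that fall within the 365 days before interview."""
    year_start = interview_date - timedelta(days=365)
    first = max(admitted, year_start)
    last = min(discharged, interview_date)
    return max((last - first).days, 0)


def keep_reportable(stays, interview_date):
    """Drop stays with no days in the year prior to interview."""
    return [
        (admitted, discharged)
        for admitted, discharged in stays
        if days_in_reference_year(admitted, discharged, interview_date) > 0
    ]


interview = date(1961, 7, 15)
stays = [
    (date(1960, 7, 10), date(1960, 7, 20)),  # began earlier, extends into the year
    (date(1960, 5, 20), date(1960, 6, 1)),   # entirely before the reference year
]
kept = keep_reportable(stays, interview)
```

The first stay is retained because part of it falls inside the reference year. The second is collected by the extended probe but removed during processing.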

The SRC record-check studies also revealed inaccuracies in the reporting of physician visits. Even though the recall period for the reporting of physician visits was limited to the 2-week period prior to week of interview, some of the visits occurring during that period were not reported and, in other instances, visits occurring prior to the period were reported as happening within the recall period. This finding eventually led to the decision to enumerate physician visits on the questionnaire by date of visit so that comparison of the occurrence and interview dates would establish that the event had occurred during the appropriate recall period.
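The date comparison can be sketched as follows. The sketch assumes that the interview week begins on a Monday and that the recall period is the 14 days ending on the Sunday before that week. The report does not state the exact calendar convention, so this convention, the function names, and the labels are hypothetical.

```python
from datetime import date, timedelta


def two_week_recall_period(interview_date):
    """Return (first_day, last_day) of the 2 weeks before the interview week."""
    # Assumption: the interview week starts on Monday.
    week_start = interview_date - timedelta(days=interview_date.weekday())
    return week_start - timedelta(days=14), week_start - timedelta(days=1)


def classify_visit(visit_date, interview_date):
    """Label a reported visit by where its date falls relative to the recall period."""
    first_day, last_day = two_week_recall_period(interview_date)
    if visit_date < first_day:
        return "before_recall_period"   # telescoped forward; excluded
    if visit_date > last_day:
        return "interview_week"         # too recent; excluded
    return "in_recall_period"


interview = date(1966, 1, 19)  # a Wednesday
labels = [
    classify_visit(date(1966, 1, 5), interview),
    classify_visit(date(1966, 1, 1), interview),
    classify_visit(date(1966, 1, 18), interview),
]
```

Only visits labeled `in_recall_period` would be counted. The other two labels correspond to the kinds of misplaced visits that the date-of-visit entry was meant to screen out.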

Effective Probing for Health Events

Several of the Survey Research Center studies have indicated that much of the information not reported during an interview has not been repressed, nor has it disappeared, from memory. It was simply not elicited because the questioning procedures failed to bring it forth. This finding suggested that the probe questions designed to encourage the reporting of health events in the HIS were not stimulating retrieval of information sufficiently, and that questions constructed to elicit certain kinds of responses should be added.

Prior to July 1962, respondents in the HIS were asked about overnight stays in hospitals of family members during the previous 12 months, and about stays in nursing homes and sanitariums. Comparison of the estimates of hospitalizations for delivery derived from these data with natality figures for the years 1958-62 indicated that hospitalizations of this type were underreported in the interview survey. To correct this situation, a probe question directed particularly to the population at risk was added to the questionnaire. In households where children 1 year of age or under were reported as household members, the following probe questions were asked: "When was (the child) born?" "Was (the child) born in a hospital?" "Is this hospitalization included in the number you gave me?" If the hospitalization had occurred during the reference period and it had not been reported in response to earlier probes, then the entries on the questionnaire for the mother and the child were corrected. The addition of this series of questions resulted in an appreciable decrease in the amount of underreporting in this area of the questionnaire.
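The branching of this probe series can be sketched as a small decision rule. The function names and arguments are hypothetical, and the reference period is assumed to be the 365 days before the interview. The questionnaire itself was administered by interviewers on paper, so this only illustrates the logic.

```python
from datetime import date, timedelta


def needs_birth_probe(ages_in_years):
    """True if any household member is 1 year of age or under."""
    return any(age <= 1 for age in ages_in_years)


def birth_hospitalization_missed(birth_date, born_in_hospital,
                                 already_counted, interview_date):
    """True if the entries for mother and child should be corrected.

    Mirrors the three probes: when the child was born, whether the birth
    was in a hospital, and whether the stay was already reported.
    """
    # Assumption: the reference period is the 365 days before interview.
    reference_start = interview_date - timedelta(days=365)
    in_reference_period = reference_start <= birth_date < interview_date
    return born_in_hospital and in_reference_period and not already_counted


interview = date(1962, 9, 10)
probe = needs_birth_probe([34, 31, 0])
correct_entries = birth_hospitalization_missed(
    date(1962, 3, 1), True, False, interview
)
```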

Through June 1964, the reporting of information on the number of physician visits during the 2 weeks prior to week of interview was dependent on one probe question: "Last week or the week before, did anyone in the family talk to a doctor or go to a doctor's office or clinic?" Beginning in January 1966, the next period during which information on physician visits was collected in the Survey, two probe questions were added: "During that 2-week period has anyone in the family been to a doctor's office or clinic for shots, X-rays, tests, or examinations?" and, "During that period, did anyone in the family get any medical advice from a doctor over the telephone?" The first of these questions was added to remind the respondent of visits that were made for preventive care or, in some instances, for reasons other than treatment of illness. The second question informed the respondent that telephone calls to obtain medical advice were considered as physician visits in the Survey. Both of these questions had been used by SRC in the study designed to evaluate interviewer performance over time.

Because these probe questions were added at the same time as the question regarding the date of the visit, their effect on the data was not as obvious as it otherwise would have been. The procedure of relating the date of the occurrence of a visit to the date of interview to determine if it actually occurred during the proper recall period effectively excluded all overreporting of visits that had actually occurred prior to the recall period or during the interview week. Previously such visits would have compensated for some of the underreporting. Their removal from the data made it difficult to evaluate the effectiveness of the added probe questions in terms of additional visits reported, but there is evidence that the yield from these questions was substantial.

Interviewer-Respondent Communication

Many of the studies conducted by SRC have emphasized the importance of the influence the interviewer exerts on the respondent and, in turn, on the completeness and accuracy of the reported data. The interviewer's attitude, her expectations, the kind of feedback she provides, and her behavior during the interview are only a few of the factors that determine the kind and amount of information obtained during an interview. However, to take advantage of this phenomenon in order to improve the quality of reported information, controls must be initiated to avoid the introduction of interviewer biases. One of the best methods of exercising control of interviewer behavior is to include devices, questions, and statements in the questionnaire which will improve communication between the participants in the interview but will not direct the responses.

Some of the innovations in the HIS questionnaire that have resulted from this type of research include the following:

a. A simple introductory statement has been prepared in which the interviewer identifies herself at the door of the household and explains very briefly the purpose of her visit. In case the respondent (or another family member) wants to know more about the purpose of the survey or the uses of the collected data, a more detailed statement is available to the interviewer.

b. Within the questionnaire, introductory statements are used to explain the subject matter about which questions are to be asked and to serve as transition devices from one health topic to another. For example, a section on X-ray visits was introduced by the statement, "Exposure to all kinds of X-rays is a matter of particular interest to the Public Health Service, and I have some questions about X-rays and fluoroscopes."

c. A small calendar card, with the appropriate 2-week recall period outlined in red, is handed to the respondent early in the interview so that she is constantly aware of the specific 2-week period referred to throughout the interview.

d. Nondirective probe questions have been included on the questionnaires in areas where nonspecific or ambiguous information is likely to be reported. For example, if the respondent reports that she visited the doctor at a clinic, the interviewer is instructed to ask: "Was it a hospital outpatient clinic, a company clinic, or some other kind of clinic?"

In the SRC study on interviewer performance over time, it was found that interviewers became less careful or conscientious in using the techniques they were trained to use. There was also evidence that interviewers who performed well inspired their respondents to perform well, as measured by the reporting of hospitalizations. This study also brought to light the need for interviewer training to include devices for stimulating the interviewer's enthusiasm for her job in addition to retraining in the use of interviewing techniques. These findings have been reinforced by records of interviewer performance maintained by the Bureau of the Census, and have been taken into account in the preparation of material for the periodic training and retraining sessions conducted for interviewers in the Health Interview Survey.

Other Considerations

As in any series of research studies, some of the experimental measures in the SRC series, when tested as methods for reducing underreporting in interviews, did not contribute any significant findings. In other instances, encouraging results from studies were either inconclusive or needed further testing in a nonlaboratory situation. In this latter category is the finding that long questions are more effective than short ones in bringing forth complete and accurate responses. More research is needed to determine if long questions are productive because they have cuing effects on reporting behavior, allow more time for recall activity, or merely because they introduce redundancy of stimuli. Until the specific variables causing improved reporting are identified, the introduction of longer questions on the HIS questionnaire, which would lead to longer interviews and increased costs, could not be justified.

Verbal reinforcement by the interviewer has been shown to have cognitive and motivational effects on the respondent by instilling awareness of respondent task requirements and encouraging adequate responses to subsequent questions. However, it will be necessary to develop ways in which the interviewer can effectively use reinforcement without introducing an undue amount of bias in the collected data before this device can be seriously considered.

All of the studies in which the SRC has attempted to measure the amounts and types of underreporting in interviews indicate that some basic principles of memory and retrieval can be used to improve reporting. Further research is needed on the ways in which information is stored and on effective methods of retrieving that information.


U.S. GOVERNMENT PRINTING OFFICE: 1977-241-1

VITAL AND HEALTH STATISTICS PUBLICATIONS SERIES

Formerly Public Health Service Publication No. 1000

Series 1. Programs and Collection Procedures. Reports which describe the general programs of the National Center for Health Statistics and its offices and divisions, data collection methods used, definitions, and other material necessary for understanding the data.

Series 2. Data Evaluation and Methods Research. Studies of new statistical methodology including experimental tests of new survey methods, studies of vital statistics collection methods, new analytical techniques, objective evaluations of reliability of collected data, contributions to statistical theory.

Series 3. Analytical Studies. Reports presenting analytical or interpretive studies based on vital and health statistics, carrying the analysis further than the expository types of reports in the other series.

Series 4. Documents and Committee Reports. Final reports of major committees concerned with vital and health statistics, and documents such as recommended model vital registration laws and revised birth and death certificates.

Series 10. Data from the Health Interview Survey. Statistics on illness; accidental injuries; disability; use of hospital, medical, dental, and other services; and other health-related topics, based on data collected in a continuing national household interview survey.

Series 11. Data from the Health Examination Survey. Data from direct examination, testing, and measurement of national samples of the civilian, noninstitutionalized population provide the basis for two types of reports: (1) estimates of the medically defined prevalence of specific diseases in the United States and the distributions of the population with respect to physical, physiological, and psychological characteristics; and (2) analysis of relationships among the various measurements without reference to an explicit finite universe of persons.

Series 12. Data from the Institutionalized Population Surveys. Discontinued effective 1975. Future reports from these surveys will be in Series 13.

Series 13. Data on Health Resources Utilization. Statistics on the utilization of health manpower and facilities providing long-term care, ambulatory care, hospital care, and family planning services.

Series 14. Data on Health Resources: Manpower and Facilities. Statistics on the numbers, geographic distribution, and characteristics of health resources including physicians, dentists, nurses, other health occupations, hospitals, nursing homes, and outpatient facilities.

Series 20. Data on Mortality. Various statistics on mortality other than as included in regular annual or monthly reports. Special analyses by cause of death, age, and other demographic variables; geographic and time series analyses; and statistics on characteristics of deaths not available from the vital records, based on sample surveys of those records.

Series 21. Data on Natality, Marriage, and Divorce. Various statistics on natality, marriage, and divorce other than as included in regular annual or monthly reports. Special analyses by demographic variables; geographic and time series analyses; studies of fertility; and statistics on characteristics of births not available from the vital records, based on sample surveys of those records.

Series 22. Data from the National Mortality and Natality Surveys. Discontinued effective 1975. Future reports from these sample surveys based on vital records will be included in Series 20 and 21, respectively.

Series 23. Data from the National Survey of Family Growth. Statistics on fertility, family formation and dissolution, family planning, and related maternal and infant health topics derived from a biennial survey of a nationwide probability sample of ever-married women 15-44 years of age.

For a list of titles of reports published in these series, write to:

Scientific and Technical Information Branch
National Center for Health Statistics
Public Health Service, HRA
Rockville, Md. 20857
