
IEEE TRANSACTIONS ON RELIABILITY, VOL. R-22, NO. 3, AUGUST 1973

Error Categorization and Analysis in Man-Computer Communication Systems

LEON H. NAWROCKI, MICHAEL H. STRUB, AND ROSS M. CECIL

Abstract-This paper briefly examines traditional approaches to human reliability and presents a technique which permits the system designer to derive a mutually exclusive and exhaustive set of operator error categories in a man-computer system. These error categories are defined in terms of process failures and provide the system designer with a qualitative index suitable for determining error causes and consequences. The technique is demonstrated, and the utility of the resulting error categories is evaluated in the context of two studies on a military information processing system. The paper concludes with a brief discussion of detectable and non-detectable errors and a suggestion for determining the impact of errors on ultimate system goals.

Reader Aids:
Purpose: widen state of the art
Special mathematics needed for explanations: none
Special mathematics needed for results: none
Results useful to: system designers

Manuscript received December 7, 1972; revised January 30, 1973. The views expressed in this paper are those of the authors and do not necessarily reflect the view of the United States Army or the Department of Defense. Reproduction in whole or part is permitted for any purpose of the United States Government. The authors are with the Department of the Army, U.S. Army Research Institute for the Behavioral and Social Sciences, 1300 Wilson Boulevard, Arlington, Va. 22209.

INTRODUCTION AND BACKGROUND

From the point of view of those involved in system design and development it is becoming increasingly evident that the introduction of computers has magnified the importance of an efficient man-machine interface [1, 2]. Particularly critical is the reduction of human error which occurs when the man in the system (operator) must transform or recode incoming data into a form suitable for use by the computer [3]. The difficulty is that while the operation(1) of the computer itself may be virtually error free, accuracy may be negated if operator errors are introduced and accepted during the data input phase. Once imbedded in the computer data base, such errors will be perpetuated and may result in system output errors with consequences of varying degrees of severity for the successful completion of system goals. The purpose of this paper is to examine briefly the traditional approach to operator error in terms of system design and to present a methodology and procedure for evaluating and interpreting operator error in a system context.

(1) The hard/software mechanics/operations of the system itself are distinguished from the less tangible software such as inputting procedures and rules at the interface level. Of course, the latter imply modifications of the former when errors occur.

The existing literature on operator error in system design typically emphasizes either a quantitative or qualitative approach. The quantitative view of operator error is represented by those who are concerned with overall system reliability in probabilistic terms [4-6]. The quantitative approach assumes that one has frequency data available which can be expressed as the probability of error for each major component of the system. In its simplest form, one assumes that

1) the probability of error for the operator is P,
2) the probability of machine, or computer, error is Q,
3) the two errors are statistically independent.

Given these or similar data and the rules of combinatorial probability, the probability of system error, E, can be determined. If the operator and computer are arranged sequentially, E = PQ; if arranged in parallel, 1 - E = (1 - P)(1 - Q). Considering the values of system reliability which are obtained as a function of varying the values of any component or combination of system components, and knowing the minimum acceptable level of overall system reliability, one can determine the most beneficial component linkage (serial or parallel) or the component most in need of attention in order to provide the greatest payoff for reducing system error.
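
As an illustration of the simple combinatorial model just described, the short sketch below (Python, added here for illustration and not part of the original paper) computes the system error probability under the stated independence assumption, using the sequential and parallel combination rules exactly as given in the text; the example values of P and Q are hypothetical.

```python
def system_error(p_operator: float, q_computer: float, linkage: str) -> float:
    """Probability of system error E under the independence assumption.

    Combination rules follow the text: for a sequential linkage E = P*Q,
    and for a parallel linkage 1 - E = (1 - P)(1 - Q).
    """
    if linkage == "sequential":
        return p_operator * q_computer
    if linkage == "parallel":
        return 1.0 - (1.0 - p_operator) * (1.0 - q_computer)
    raise ValueError("linkage must be 'sequential' or 'parallel'")

# Hypothetical component error rates, purely for illustration.
P, Q = 0.05, 0.01
print(system_error(P, Q, "sequential"))  # 0.0005
print(system_error(P, Q, "parallel"))    # about 0.0595
```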

The weakness of this approach is that all too often component errors are assumed to be statistically independent, error rates for each component are assumed to be constant, and for any particular component each error is assumed to produce equal system consequences. None of these assumptions is particularly valid for most systems employing a human operator. Clearly, computer errors are subject in part to the accuracy of the data which are input by the operator and, hence, not at all independent of the operator error rate. There are, of course, statistical techniques available for such conditional probability situations, but there still remains the assumption that all operator errors affect the computer component equally, and this is not always the case. Error rate constancy is another assumption that is seldom, if ever, true of human performance; at the least, the quantitative approach requires specification of the statistical distribution of operator error over time. Finally, and perhaps most important, all errors do not produce equal effects on system output. This last point warrants illustration by means of actual cases.

Vicino, Andrews, and Ringel [7] examined the ability of subjects to detect the addition, deletion, or replacement of symbolic data in sequentially presented displays. Overall error rate increased directly with the number of changes in the displays. More important, however, they noted that errors of omission (failure to report a change) increased at 2.5 times the rate of errors of commission (reporting a change which had not occurred). This finding is an example not only of an error rate as a function of data change, but of two distinct error types emanating from the same component. Each error type can produce dissimilar system consequences. That is, failure to record a data change may have a considerably different impact on system goals than recording a change which has not in fact occurred. The selection of system components might depend on such differences.

For example, Nawrocki [8] compared alphanumeric to pictorial codes in a task which required subjects to construct a set of instructions which would satisfy changes depicted in displays of military unit organization. Overall operator accuracy was identical for both display codes, but further inspection revealed that the proportion of omissions was significantly greater than that of commissions for the pictorial codes. This finding produced the design recommendation that the selection of a display code depends upon which type of error is potentially more disadvantageous to system goals: alphanumeric codes are preferable if we wish to reduce omissions, and pictorial codes are preferable for decreasing commissions.

An important point illustrated by the preceding experiment is that while manipulation of a system variable may produce negligible effects on total error rate, the composition of the errors may shift, which in turn may affect final system output. In short, the simple strictly-quantitative approach requires a considerable number of assumptions which are oversimplifications of the system designer's problem. Moreover, the potential causes of error are at best specified in terms of component failure, without suggesting which aspect of the component may account for a failure. Thus, this approach identifies the source which contributes most to system error but does not indicate appropriate corrective solutions other than substitution or addition of components, neither of which may be feasible or cost effective.

An alternate view of operator error may be termed the qualitative, or descriptive, approach. It consists of clarifying and identifying all those components and processes in the system which may introduce error [9-11]. In essence the qualitative approach provides a structural framework to insure that the system designer does not lose sight of those system aspects relevant to the evaluation and recognition of errors. Several authors [2, 11] in discussing qualitative error analysis suggest that operator errors occur solely because there exists a mismatch between the operator and the system components with which he interacts. In other words, there are no errors inherent in the operator, only the potential for producing errors. In addition, errors are defined as deviations from some predetermined output. Thus the system designer must have a criterion of desired system output and must compare this criterion to obtained system output. Differences between obtained and desired output (deviations) can be considered as mismatches between operator capabilities and the system requirements of the operator. The qualitative approach is oriented toward isolating the location, cause, and consequence of these mismatches. Meister and Rabideau [10] suggest this is accomplished by recording errors according to type, frequency of occurrence, apparent cause, and the seriousness of the error in terms of system consequences. The intuitive obviousness of much of the preceding is the major shortcoming of the qualitative approach. That is, the system designer is given a logical structure which suggests what he ought to be concerned with in considering operator error, but only at a molar level. The designer is not given a detailed procedure for obtaining the error characteristics of the operator. So while this approach stresses the quality of error rather than only the quantity, the problem of defining errors still remains. The remainder of this paper is devoted to describing a methodology for defining operator errors in man-computer systems and to illustrating the utility of such a procedure.

PROPOSED EVALUATION PROCEDURE

Two basic conditions must be met in order to define operator errors in a man-computer system. First, there must be predetermined system performance criteria. Often it is the user who must work in conjunction with the system designer to establish these criteria. Second, the system must be capable of generating performance data. This can be accomplished either through operation of the actual system or by simulating the operation of the system or critical elements within the system. Given that these conditions are met, the system designer is in a position to determine the type and frequency of operator errors. The causes and impact of errors are primarily a logical issue which is simplified if the types of errors are properly classified. In fact, it is the contention of this paper that an effective error classification scheme will essentially answer the questions of cause, consequence, and correction of operator error.

The following procedure is suggested as a means for obtaining an effective error classification scheme for man-computer systems:

1. Determine the desired computer input by organizing the original data as it appeared to the operator into the form in which it was intended to enter the computer (ideally).
2. Observe system operation and record the data which the operator receives, the operator's input based on these data, and the final form in which the data are accepted by the computer.
3. Compare each distinct data item input by the operator to the desired input.
4. When a discrepancy occurs between operator input and desired input, consider the process which the operator was required to perform. Identify the error category for this discrepancy in terms of failure of this process or a portion of it.
5. Determine if the error category has already been identified. Any discrepancy which does not fit a previous category requires that a new category be defined.
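
The comparison and categorization steps above lend themselves to a simple bookkeeping loop. The sketch below (Python, illustrative only; the record layout and the categorize() callback are hypothetical, not taken from the paper) pairs each obtained entry with its desired counterpart, tallies known categories, and flags any discrepancy that does not fit an existing category so that a new one can be defined. The resulting tally also supplies the per-category frequencies used later for the quantitative reliability estimate.

```python
from collections import Counter

def analyze(desired: dict, obtained: dict, categorize) -> Counter:
    """Steps 3-5: compare entries, tally known categories, flag new ones.

    `desired` and `obtained` map format positions to entry strings; `desired`
    should list every position (with "" where no entry is wanted) so that
    commissions are caught as well. `categorize(position, desired, obtained)`
    returns a category label, or None when no existing category fits, in which
    case the analyst must define a new category for that process failure.
    """
    tally = Counter()
    for position, want in desired.items():
        got = obtained.get(position, "")
        if got == want:
            continue                    # no discrepancy for this entry
        label = categorize(position, want, got)
        if label is None:
            label = "UNDEFINED - new category needed"
        tally[label] += 1
    return tally
```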

When all discrepancies have been accounted for, the category definitions should be checked to insure that they are mutually exclusive. That is, an error should not be potentially assignable to more than one category. If this is not the case, there is an insufficient number of categories. Ambiguous categories should be redefined until such overlap does not occur.

The above procedure insures that error categories will be mutually exclusive (non-overlapping), exhaustive (all categories have been considered), and of sufficient depth to be useful to the system designer in determining error cause, consequence, and potential corrective measures. In addition, the number of cases in each category provides the designer with the error frequencies which can be used to arrive at a quantitative estimate of system reliability and to provide a baseline for comparing alternate system design modifications. On the other hand, one category may be a subcategory of another. If so, there are too many categories and the designer should combine some. If the definitions are based on process failure, there will be as many distinct categories as there are operator processes.

ILLUSTRATION: EXPERIMENT I

The proposed procedure can be illustrated by, and was initially employed on, data obtained from a study conducted by Strub [12] to evaluate alternate man-computer input procedures for an automated military information processing system. The operator's basic task was to transform a free text message into a format acceptable for input to a computer data base. Figure 1 shows a typical message and a correctly completed format. All items within brackets are those which are obtained from the free text. The remaining items are from a predetermined format which is selected on the basis of the general message category according to overall message content.

[Fig. 1. Sample message and correctly completed message format. Sample message (Msg #45601, UNIT ID: 1/BN/AR/2/AR): "The CO has decided to attack the area enclosed by AA100001, BB200002, CC300003, DD400004, AA100001. He assigns this area to Alpha Company and informs them he expects this area to be secured by 171600Z Jun 72. He has named this area 'MICKEY'."]

In the current experimental system one operator transforms each message onto a paper format. A second operator calls up the same format on a Cathode Ray Tube (CRT) and then enters the data obtained from the first operator into the computer by typing the paper format items onto the CRT format. The study evaluated both the existing procedure and four alternate procedures. The four alternatives were selected to examine format preparation method and verification, i.e., error check effectiveness. Format preparation consisted of the operator's either translating the free text onto a paper format and then entering this on the CRT (off-line preparation), or translating the free text directly onto the CRT (on-line preparation). In the non error-check alternative one operator was employed (unverified), while in the error-check alternative two operators simultaneously and independently prepared their formats but checked one another's products for accuracy (verified). The four categories were

1) off-line and verified,
2) off-line and unverified,
3) on-line and verified,
4) on-line and unverified.

An error analysis was performed on the data by employing the previously described error categorization procedure. Comparisons were made between

a) the formats completed and entered by the operators (obtained output), and
b) the formats as they would appear if converted from the free text with perfect accuracy (desired output, or system criteria).

A discrepancy was defined as any mismatch between a single entry and the corresponding desired entry. In this task an entry was considered as all the data in a space dedicated to a single unit of information (indicated in the format between bracketed sections). A more universal definition would be that an entry is the smallest self-contained unit of information employed to satisfy each specific system requirement.

Each time an undefined discrepancy occurred, a new error category was established which, in effect, described the manner in which the required transform process had failed. For example, if the desired entry was "MICKEY" and the obtained entry was "MICKOY," the process required was to place the correct character, E, in the appropriate place. This might be defined as "an error in which at least one character is added (O) and another character deleted (E)." Furthermore, suppose for the same entry "MICKY" is obtained from another operator. This category might be defined as "at least one or more characters is deleted from an entry (E)." In either case, since the steps required to correctly input this entry are correctly reading, copying, or typing, both categories can be labeled typographical. The first definition is more specifically Typographical Exchange and the second, Typographical Commission. The process failure is identical in either case and, similarly, the result for both is a character error. Therefore, upon reviewing the completed list of categories it may be necessary to collapse these categories into a more encompassing definition: "all errors in which one or more individual characters for an entry are incorrect."

The category name then becomes the common element, Typographic. Analysis of the Strub data initially revealed twelve categories, which upon final review for functional overlap reduced to the following six:

Omission - a required entry is completely omitted
Commission - an entry is provided where none is desired
Incorrect - an inappropriate entry is provided in an appropriate position
Location - an appropriate entry is provided in an inappropriate position
Typographic - one or more characters in the entry are incorrect
Abbreviation - an entry employs an inappropriate code
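
To make the six definitions concrete, the decision rules below sketch how a (position, desired entry, obtained entry) triple might be mapped onto the categories. This is an illustrative Python rendering, not code from the study: the character-count threshold separating Typographic from Incorrect, the way Location is recognized, and the omission of Abbreviation (which needs the coding glossary) are simplifications of the judgments described in the text.

```python
def classify(position, desired, obtained, desired_format):
    """Assign one of the six Experiment I categories to a single discrepancy.

    `desired_format` maps every format position to its desired entry, so a
    valid-but-misplaced entry (Location) can be recognized. Abbreviation is
    omitted here because identifying an inappropriate code requires the
    coding glossary (and, in Experiment II, repeat counts per operator).
    """
    if desired and not obtained:
        return "Omission"       # required entry completely omitted
    if obtained and not desired:
        return "Commission"     # entry provided where none is desired
    if any(obtained == entry for pos, entry in desired_format.items() if pos != position):
        return "Location"       # appropriate entry, inappropriate position
    char_diff = sum(a != b for a, b in zip(obtained, desired)) + abs(len(obtained) - len(desired))
    if char_diff <= 2:
        return "Typographic"    # only one or a few characters are wrong
    return "Incorrect"          # inappropriate entry in an appropriate position

# Example: classify("OBJ-1-NAME", "MICKEY", "MICKOY", fmt) returns "Typographic".
```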

TABLE 1
Mean Error Rates (Percent) for Each Error Category and System Input Technique

                          Input Technique
Error          Current   Off-Line     On-Line      Off-Line   On-Line
Category                 Unverified   Unverified   Verified   Verified
Omission         5.4        5.2          3.3          2.6        2.0
Commission       1.6        1.9          1.8          2.7        1.7
Incorrect        2.5        2.3          2.0          1.8         .8
Location         1.2        1.1          1.3           .5         .4
Typographic      8.2        4.2          3.0          2.9        2.1
Abbreviation      .4         .8           .8           .5         .4
Total           19.3       15.5         12.2         11.0        7.4

Table 1 presents the error rates occurring in each error category for each of the five alternate system input techniques. An analysis of variance indicated that the following three effects were statistically significant: 1, 2) the overall error rate was less for on-line techniques (1% level) and for techniques requiring verification (0.1% level); 3) the current experimental-system technique produced more errors than all other techniques (1% level). As can be seen in Table 1, techniques producing fewer errors tended to do so for all error categories. The major exception is the unusually high proportion of typographic errors found in the current system. Since the current technique is the only one requiring an operator to copy another operator's input without verification, we assume that it is this translation step that is responsible for the high typographic error rate. From the system designer's viewpoint it is clear that the best system technique is on-line, verified. As the proportion of errors of all types in on-line verified is equal to or less than the proportion of errors produced in all other techniques, the designer need not be concerned with weighting the error categories for seriousness of system consequences. Of course, this may not always be the case. For all the techniques examined, the greatest number of errors is accounted for by the omission and typographic categories. That is, one third of the error categories account for nearly 60% of the errors. Assuming that similar system consequences result from each type of error, it is obvious that attempts to reduce errors should concentrate on omissions and typographic errors. The category titles themselves imply error causes. Thus "omission" suggests the operator is not attending to all elements within the free text. If this is so, there should be fewer errors of omission in a procedure calling for verification of format, which is the case. Typographic errors imply copying (misreading) and typing errors. The fewer typographic errors in on-line and verified techniques support this contention. Knowing the causes, the system designer can then direct his attention to appropriate human factors techniques for correcting these specific deficiencies in operator performance.
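
The "nearly 60%" figure can be checked directly against Table 1. The short computation below (Python, added only as a worked check of that arithmetic) derives the share of total errors contributed by the Omission and Typographic categories for each technique.

```python
# Error rates from Table 1, in percent, keyed by input technique.
rates = {
    "current":             {"Omission": 5.4, "Commission": 1.6, "Incorrect": 2.5,
                            "Location": 1.2, "Typographic": 8.2, "Abbreviation": 0.4},
    "off-line unverified": {"Omission": 5.2, "Commission": 1.9, "Incorrect": 2.3,
                            "Location": 1.1, "Typographic": 4.2, "Abbreviation": 0.8},
    "on-line unverified":  {"Omission": 3.3, "Commission": 1.8, "Incorrect": 2.0,
                            "Location": 1.3, "Typographic": 3.0, "Abbreviation": 0.8},
    "off-line verified":   {"Omission": 2.6, "Commission": 2.7, "Incorrect": 1.8,
                            "Location": 0.5, "Typographic": 2.9, "Abbreviation": 0.5},
    "on-line verified":    {"Omission": 2.0, "Commission": 1.7, "Incorrect": 0.8,
                            "Location": 0.4, "Typographic": 2.1, "Abbreviation": 0.4},
}

for technique, r in rates.items():
    share = (r["Omission"] + r["Typographic"]) / sum(r.values())
    print(f"{technique}: {share:.0%}")   # roughly 50-70% per technique, near 60% overall
```
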
ILLUSTRATION: EXPERIMENT II

The data from Experiment I were used primarily to determine the feasibility of the error classification procedure. The utility and generalizability of the procedure were evaluated in a second study conducted by Strub [13]. Since subsequent statistical analyses indicated no statistically significant differences in the error rate between the experimental alternatives employed by Strub, these need not be discussed here. However, the error analyses uncovered an important format variable that was previously unrecognized and also indicated a potential system design difficulty which had not previously been recognized.

The task was essentially identical to that of the on-line input procedure previously described. Since the task and the system being examined were similar to those in Experiment I, the generality of the error classification scheme would be supported to the extent that the original categories were sufficient to account for all obtained errors. Thus the existing error categories were employed with two minor modifications. The "Incorrect" category was relabeled "Glossary" to imply that the cause of the error was "selecting an inappropriate entry from among a glossary of potential entries available to the operator." Also, the "Location" label was redesignated "Category" to imply that the cause of the error was "selecting an entry from a category, or list, inappropriate for a particular format position." In addition, the "Abbreviation" category definition required clarification. It is difficult to determine from a single instance whether a character error in a coded word is a Typographic or an Abbreviation error. To minimize subjective decisions in the data reduction process, this issue was resolved by declaring an Abbreviation error as a character error which occurs more than once for the same operator.
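
The repeat-count rule for separating Abbreviation from Typographic errors is easy to mechanize. The sketch below (Python, illustrative only; the record layout and example field names are hypothetical) relabels a character error as Abbreviation when the same operator produces the same erroneous entry more than once, as described above.

```python
from collections import Counter

def relabel_character_errors(errors):
    """errors: list of dicts like
    {"operator": "op1", "desired": "TANK", "obtained": "TINK"}.

    Returns a parallel list of labels: a character error repeated by the
    same operator is called Abbreviation; a one-off is called Typographic.
    """
    seen = Counter((e["operator"], e["desired"], e["obtained"]) for e in errors)
    return ["Abbreviation"
            if seen[(e["operator"], e["desired"], e["obtained"])] > 1
            else "Typographic"
            for e in errors]
```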

Table 2 lists the error rates for each category (percent of responses which were in error) and the proportional contribution of each error to total error rate. Since there were two general types of messages, those concerning friendly unit activities and those about enemy unit activities, these figures are given for each type of message. Message type was a hitherto unexamined variable which error analysis exposed as being considerably important. Enemy activity messages produce a higher error rate than friendly messages, and the bulk of this difference is in the category of omissions. Clearly, the design of this system calls for a modification of operator processing of enemy-activity information. Of more general interest, these data indicate that the greatest proportion of the errors (81-83%) are those labeled Omission, Commission, and Glossary. This is of considerable interest from a system design standpoint, as these three categories represent errors which would remain undetected by computer edit-and-validate routines. That is, it is feasible for the computer to run error checks on entries for which there are predetermined requirements, but error checks cannot be provided for entries which are optional depending upon message content. For example, if it is known that only certain items and codes are applicable to a particular entry, then the computer can scan the entry and notify the operator of an inconsistency. This procedure is relatively effective, although not perfect, for Category, Typographic, and Abbreviation error types. The typographical error "TINK" for "TANK" could be detected, as there is no vehicle class known as "TINK" (not to date!). Omission, Commission, and Glossary errors are more difficult to detect.
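
Such an edit-and-validate check amounts to membership testing against the set of codes legal for each constrained entry. The fragment below is a minimal sketch (Python; the entry name and code list are hypothetical) of the kind of scan described above: it would catch "TINK" where a vehicle class is expected, but it cannot catch an omitted optional entry or a plausible-but-wrong value.

```python
# Hypothetical glossary of legal codes for entries with predetermined requirements.
GLOSSARY = {
    "VEHICLE-CLASS": {"TANK", "APC", "TRUCK"},
}

def edit_and_validate(entry_name, value):
    """Return a message for the operator if the value fails the check, else None."""
    legal = GLOSSARY.get(entry_name)
    if legal is None or value == "":
        return None                # optional or unconstrained entry: cannot be checked
    if value not in legal:
        return f"{entry_name}: '{value}' is not a recognized code"
    return None

print(edit_and_validate("VEHICLE-CLASS", "TINK"))  # flagged as unrecognized
print(edit_and_validate("VEHICLE-CLASS", "TANK"))  # None: passes the check
```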

TABLE 2
Mean Error Rates (Percent) for Each Error Category Across Friendly and Enemy Messages

                        Friendly                      Enemy
Error           Error Rate %  % Contribution  Error Rate %  % Contribution
Category
Omission            5.3            23             14.8            48
Commission          6.4            28              1.8             6
Glossary            6.9            30              8.7            29
  Subtotal         18.6            81             25.3            83
Category            1.8             8              2.3             8
Typographic         2.6            11              1.6             5
Abbreviation         .1             0              1.3             4
  Subtotal          4.5            19              5.2            17

If a particular entry always requires data, i.e., is a mandatory entry, then omissions for that entry could be detected. But when an entry need not be completed, i.e., is an optional entry, then omissions and commissions are not detectable. Indeed, for optional entries the computer cannot determine whether the entry should be completed or not unless the original message were known. Clearly, to detect glossary errors, as defined, would also require the computer to have knowledge of the original message. If the operator enters "155 mm" when "75 mm" is correct, this will not be detected as a glossary error since both entries are logical alternatives.

The results of Experiment II suggest that only one fifth of the error types are detectable. Thus, an edit-and-validate software addition would be expected to reduce errors by approximately 20%. The remaining 80% of the errors are undetectable; increased operator instruction appears to be the only means of reducing this type of error. Hence, the system designer might recommend that initial error reduction effort would be better spent on improved operator training rather than on additional edit-and-validate software. If the ratio of undetected to detected errors remains relatively constant (about 4:1 in this case), then the measure of detected error rate permits one to infer the undetected error rate.
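
If the 4:1 ratio of undetected to detected errors is taken as stable, the inference described in the last sentence is a one-line estimate. The snippet below (Python, a sketch under that assumption; the example detected rate is hypothetical) scales an observed detected-error rate into estimates of the undetected and total error rates.

```python
def infer_error_rates(detected_rate, undetected_to_detected=4.0):
    """Estimate undetected and total error rates from the detected rate,
    assuming the undetected:detected ratio stays roughly constant
    (about 4:1 in Experiment II)."""
    undetected_rate = undetected_to_detected * detected_rate
    return undetected_rate, detected_rate + undetected_rate

# Example: 3% of entries are flagged by the edit-and-validate routines.
print(infer_error_rates(0.03))   # (0.12, 0.15): about 12% undetected, 15% overall
```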

Demonstrating that the categories modified for Experiment II are sufficient to describe all errors shows only that the categories are exhaustive. To determine the extent to which the error categories are mutually exclusive, two judges independently categorized errors, and Cohen's kappa statistic [14] was computed to measure interjudge agreement for each operator. The median value of interjudge agreement across operators was 0.78, a statistically significant level of agreement. This value indicates that the requirement of mutually exclusive categories was, for the most part, attained and suggests that the categories can be employed as is with reasonable assurance of proper usage.
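
For reference, Cohen's kappa [14] corrects raw inter-judge agreement for the agreement expected by chance. The sketch below (Python, not from the paper; the judge labels are made up) computes kappa from two judges' category assignments for one operator's discrepancies.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two judges' category assignments over the same errors.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e is the agreement expected by chance from each judge's marginal totals.
    """
    n = len(labels_a)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical labels for eight discrepancies judged by two raters.
judge1 = ["Omission", "Glossary", "Typographic", "Omission",
          "Commission", "Glossary", "Category", "Omission"]
judge2 = ["Omission", "Glossary", "Typographic", "Commission",
          "Commission", "Glossary", "Category", "Omission"]
print(round(cohens_kappa(judge1, judge2), 2))   # 0.84 for this made-up sample
```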

In summary, Experiment II provided support for the proposed classification methodology and indicated that the categories as defined are suitable for other man-computer systems of a similar nature.

One issue which needs additional clarification is that of system consequence. Baker [15] pointed out that there are often performance fluctuations within a system which have little or no impact on ultimate system output. The present error classification system emphasizes the operator component and does little to provide the designer with data on final system output. However, the use of a process-failure definition for error categories does provide a starting point for the evaluation of system consequences. For example, in the preceding study certain error categories produced detectable errors if edit-and-validate software was employed, while other errors remained undetected. Detected errors require the operator to correct his original input and increase the input time. Therefore, the impact of detected errors would have to be evaluated in terms of the consequences of delays in processing. Undetected errors do not affect processing speed but must be evaluated in terms of the consequences of inappropriate data. In either case, the evaluation must consider the system user. That is, the designer's knowledge of the immediate effect of errors permits him to predict resulting system outputs, and the user can identify those errors or delays in output which impair his ability to perform a task. For example, in the context of the present system the user(s) could be shown examples of omission errors for all possible entries and asked to rank order the impact of omitted data on their decisions. In essence, then, error categorization permits the designer to construct hypothetical system outputs which can be evaluated in conjunction with the system user.

Another potential use of error categorization is in modeling and system simulation. Error rates for different types of errors provide important model parameters. Such data have been successfully employed in a computer model which simulates the system being evaluated in the Strub studies [16]. Moreover, by employing parameter values from existing data, the validity of the model can be determined.

CONCLUSION

The present methodology offers the designer a systematic means for deriving error categories for the operator component of a man-computer system. The use of such categories provides the designer with error causes and immediate consequences on which system modifications can be based. The categories used to illustrate this methodology appear acceptable for systems in which the operator must transform and input data. These are basically information reduction tasks. The utility of the classification for information transmission and creation tasks remains to be examined.


REFERENCES

[1] J. D. Baker, D. J. Mace, and J. M. McKendry, "The transform operation in TOS: Assessment of the human component," Behavior and Syst. Res. Lab., TRN 212, AD 697 716, Aug. 1969. 3)
[2] R. W. Bailey, "Testing manual procedures in computer-based business information systems," in Proc. 16th Annu. Meeting of the Human Factors Soc., Oct. 1972, pp. 395-401. (Human Factors Soc., Box 1369, Santa Monica, Calif. 90406.)
[3] A. Chapanis, "Prelude to 2001: Explorations in human communication," Amer. Psychol., vol. 26, pp. 949-961, Nov. 1971.
[4] A. Chapanis, W. R. Garner, and C. T. Morgan, Applied Experimental Psychology. New York: Wiley, 1957, pp. 39-56.
[5] I. G. Wilson and M. E. Wilson, Information, Computers and System Design. New York: Wiley, 1965, ch. 14.
[6] W. Edwards, "Men and computers," in Psychological Principles in System Development, R. M. Gagné, Ed. New York: Holt, Rinehart and Winston, 1967.
[7] F. Vicino, R. Andrews, and S. Ringel, "Conspicuity coding of updated symbolic information," Behavior and Syst. Res. Lab., TRN 152, AD 616 600, May 1965. 3)
[8] L. H. Nawrocki, "Alpha-numeric versus graphic displays in a problem-solving task," Behavior and Syst. Res. Lab., TRN 226, AD 748 799, Sept. 1972. 3)
[9] J. S. Kidd, "Human tasks and equipment design," in Psychological Principles in System Development, R. M. Gagné, Ed. New York: Holt, Rinehart and Winston, 1967.
[10] D. Meister and G. F. Rabideau, Human Factors in System Development. New York: Wiley, 1965.
[11] D. Meister, Human Factors: Theory and Practice. New York: Wiley Interscience, 1971, pp. 21-56.
[12] M. H. Strub, "Evaluation of man-computer input techniques for military information systems," Behavior and Syst. Res. Lab., TRN 226, AD 730 315, May 1971. 3)
[13] M. H. Strub, "Automated aids to on-line tactical data inputting," U. S. Army Res. Inst., 1972. 2)
[14] J. Cohen, "A coefficient of agreement for nominal scales," Educational and Psychological Measurement, vol. 20, pp. 37-46, Spring 1960.
[15] J. D. Baker, "Quantitative modeling of human performance in information systems," Ergonomics, vol. 13, pp. 645-664, Nov. 1970.
[16] A. I. Siegel, J. J. Wolf, and W. R. Leahy, "A digital simulation model of message handling in the TOS," U. S. Army Res. Inst., TRN, Apr. 1973.

2) Currently available from Nawrocki, Strub, or Cecil in draft form.
3) Available from NTIS, Springfield, Va. 22151.

Leon H. Nawrocki was born in Cleveland, Ohio in 1940. He received his B.Sc., M.A., and Ph.D. from the Ohio State University in 1965, 1967, and 1969, respectively. Since 1970 he has been employed as a Research Psychologist with the U. S. Army Research Institute in Arlington, Va. Dr. Nawrocki is a member of the American Psychological Association and the Human Factors Society.

Michael H. Strub was born in Washington, D. C. in 1942. He received his A.B. from Fordham Univ. in 1964 and his M.A. and Ph.D. from the Ohio State Univ. in 1966 and 1968, respectively. Since 1968 he has been employed as a research psychologist by the U. S. Army Research Institute in Arlington, Va. Dr. Strub is a member of the American Psychological Association, the Human Factors Society, and the American Association for the Advancement of Science.

Ross M. Cecil was born in Kenosha, Wisconsin in 1949. He received his B.A. from the Univ. of Wisconsin in 1971. Since 1971 he has been serving as an enlisted man in the U. S. Army and is currently assigned in a research assistant capacity to the U. S. Army Research Institute in Arlington, Va. Specialist Cecil is a member of the Association of American Geographers and the American Association for the Advancement of Science.

Human Reliability in Computer-Based Business Information Systems

ROBERT W. BAILEY, STEPHEN T. DEMERS, AND ALLEN I. LEBOWITZ

Abstract-The study of human reliability has traditionally taken an "error-rate" approach, with the emphasis on identifying and reporting supposedly consistent and basic error-rate levels for various manual activities. This paper approaches the study of human reliability by identifying numerous factors that tend to increase the probability that errors will occur in computer-based business information systems. The causal factor categories are personal, design, documentation, training, source data, man-machine interface, and environment. This causal factor approach suggests that most human-generated errors can be prevented by eliminating the factors that cause errors to occur.

Reader Aids:
Purpose: Advance state of the art
Special math needed for explanations: none
Special math needed for results: none
Results useful to: system designers, reliability engineers, human reliability specialists

Manuscript received January 29, 1973; revised February 22, 1973. The authors are with Bell Laboratories, New Brunswick, N.J. 08903.