
IEEE TRANSACTIONS ON NUCLEAR SCIENCE, VOL. 54, NO. 6, DECEMBER 2007 2687

Development of Human Performance Measures for Human Factors Validation in the Advanced MCR of APR-1400

Jun Su Ha, Poong Hyun Seong, Member, IEEE, Myeong Soo Lee, and Jin Hyuk Hong

Abstract—The main control room (MCR) man-machine interface (MMI) design of advanced nuclear power plants (NPPs) such as the APR-1400 (Advanced Power Reactor-1400) can be validated through performance-based tests to determine whether it acceptably supports safe operation of the plant. In this paper, plant performance, personnel task performance, situation awareness, workload, teamwork, and anthropometric/physiological factors are considered for the human performance evaluation. For the development of human performance measures, attention is paid to considerations and constraints such as the changed environment in an advanced MCR, the need for a practical and economical evaluation, and the suitability of evaluation criteria. Measures generally used in various industries and empirically proven to be useful are adopted as the main measures, with some modifications. In addition, complementary measures are developed to overcome some of the limitations associated with the main measures. The development of the measures is addressed based on theoretical and empirical background. Finally, we discuss the way in which the measures can be effectively integrated. HUPESS (HUman Performance Evaluation Support System), which is under development, is also briefly introduced.

Index Terms—Advanced MCR, anthropometric/physiological factor, human performance, HUPESS, personnel task performance, plant performance, situation awareness, teamwork, workload.

I. INTRODUCTION

RESEARCH and development for enhancing reliability and safety in nuclear power plants (NPPs) have mainly focused on areas such as automation of facilities, securing the safety margins of safety systems, and improvement of main process systems. However, studies of Three Mile Island (TMI), Chernobyl, and other NPP events have revealed that deficiencies in human factors, such as poor control room design, procedures, and training, are significant contributing factors to NPP incidents and accidents [1]–[5]. Accordingly, more attention has been focused on the study of human factors. As the processing and information-presentation capabilities of modern computers have increased, modern computer techniques have been gradually introduced into the design of advanced

Manuscript received March 8, 2007; revised August 6, 2007. This work was supported in part by "The Development of the HFE V&V System for the Advanced Digitalized MCR MMIS" project.

J. S. Ha and P. H. Seong are with the Department of Nuclear and Quantum Engineering, Korea Advanced Institute of Science and Technology, Daejeon 305-701, Korea (e-mail: [email protected]; [email protected]).

M. S. Lee and J. H. Hong are with the MMIS Group, Korea Electric Power Research Institute, Daejeon 305-380, Korea (e-mail: [email protected]; [email protected]).

Digital Object Identifier 10.1109/TNS.2007.907549

main control rooms (MCRs) of NPPs [6], [7]. The design of instrumentation and control (I&C) systems for various plant systems is also rapidly moving toward fully digital I&C [8], [9]. For example, CRT (or LCD)-based displays, large display panels (LDPs), soft controls, a computerized procedure system, and an advanced alarm system were applied to the APR-1400 (Advanced Power Reactor-1400) [10]. Hence, the role of the operators in advanced NPPs shifts from that of a manual controller to that of a supervisor or decision-maker [11], and the operators' tasks have become more cognitive. As a result, human factors engineering has become more important in designing the MCR of an advanced NPP. In order to support advanced reactor design certification reviews, the human factors engineering program review model (HFE PRM) was developed with the support of the U.S. NRC [4]. Integrated system validation (ISV) is part of this review activity. An integrated system design is evaluated through performance-based tests to determine whether it acceptably supports safe operation of the plant [12]. NUREG-0711 and NUREG/CR-6393 provide general guidelines for the ISV. However, in order to validate a real system, appropriate measures should be developed in consideration of the actual application environment. Many techniques for the evaluation of human performance have been developed in a variety of industrial areas. In particular, the OECD Halden Reactor Project has conducted many studies of human factors in the nuclear industry [13]–[18]. There have also been R&D projects concerning human performance evaluation in South Korea [10], [19]. These studies provide not only valuable background but also measures for human performance evaluation.

In this paper, human performance measures are developed in order to validate the MMI design in the advanced MCR of the APR-1400. Plant performance, personnel task performance, situation awareness, workload, teamwork, and anthropometric and physiological factors are considered as factors for the human performance evaluation. Measures generally used in various industries and empirically proven to be useful are adopted as main measures, with some modifications. In addition, complementary measures are developed in order to overcome some of the limitations associated with the main measures. The development of the measures is addressed based on theoretical and empirical background and on the regulatory guidelines for the ISV, namely NUREG-0711 and NUREG/CR-6393. In addition, for the development of the measures for each of the factors, attention is paid to considerations and constraints, which are addressed in the following section.

0018-9499/$25.00 © 2007 IEEE


Fig. 1. Factors for human performance evaluation.

II. HUMAN PERFORMANCE EVALUATION: NEEDS, CONSIDERATIONS, CONSTRAINTS, AND PERFORMANCE CRITERIA

The objective of the ISV is to provide evidence that the integrated system adequately supports plant personnel in the safe operation of the relevant NPP [12]. The safety of an NPP is a concept that is not directly observed but must be inferred from available evidence. The evidence can be obtained through a series of performance-based tests. Consequently, if the integrated system is assured to operate within acceptable performance ranges, the integrated system is considered to support plant personnel in safe operation. The operators' tasks are generally performed through a series of cognitive activities such as monitoring the environment, detecting data or information, understanding and assessing the situation, diagnosing the symptoms, making decisions, planning responses, and implementing the responses [5]. Hence, the MMI design of an MCR should have the capability to support the operators in performing these cognitive activities by providing sufficient and timely data and information in an appropriate format. Effective means for system control should be provided in an integrated manner as well. If the MMI design has this capability, the operators can effectively monitor and detect the data and information representing the plant status and understand the state of the plant system correctly, which in turn supports appropriate diagnosis of the plant system, decision-making, and thus response planning, and then implementation of the responses. Consequently, the suitability of the MMI design of an MCR is validated by evaluating the human (operator) performance resulting from this series of cognitive activities. A dynamic mock-up including the simulator for the APR-1400 is utilized as the validation facility.

In this paper, plant performance, personnel task performance, situation awareness, workload, teamwork, and anthropometric and physiological factors are considered for the human performance evaluation (see Fig. 1), as is also recommended in the regulatory guidelines [4], [12].

A. Considerations and Constraints

In this paper, human performance measures are developed based on some considerations and constraints. Firstly, the operating environment in an advanced MCR has changed from the conventional analog-based MMI to a digitalized one. As O'Hara and Robert [97] pointed out, there are three important trends in the evolution of advanced MCRs: increased automation, the development of compact and computer-based workstations, and the development of intelligent operator aids. Increases in automation result in a shift of the operator's role from that of a manual controller to that of a supervisor or decision-maker. The role change is typically viewed as positive from a reliability standpoint, since unpredictable human actions can be removed or reduced. Thus, by automating routine, tedious, physically demanding, or difficult tasks, the operator can better concentrate on supervising the overall performance and safety of the system. However, inappropriate allocation of functions between automated systems and the operator may result in adverse consequences such as poor task performance and out-of-the-loop control coupled with poor situation awareness [12]. In addition, the shift in the operator's role may lead to a shift from high physical to high cognitive workload, even though the overall workload can be reduced. The computer-based workstations of advanced MCRs, which have the considerable flexibility offered by software-driven interfaces such as various display formats (e.g., lists, tables, flow charts, graphs, etc.) and diverse soft controls (e.g., touch screens, mice, joysticks, etc.), are thought to affect operator performance as well. Information is typically presented in pre-processed or integrated forms rather than as raw parameter data, and much information is condensed into a small screen. In addition, the operator has to manage the display in order to obtain the data and information that he or she wants to check. Hence, poorly designed displays may mislead and/or confuse the operator and thus excessively increase cognitive workload, which can lead to human errors.
Due to these changes in the operating environment, the operator's tasks in an advanced MCR are conducted in a different way from those in a conventional one. Hence, enhanced attention should be paid to operator task performance and to cognitive measures such as situation awareness and workload. Secondly, the evaluation of human performance should be practical and economical. Since the aim of the performance evaluation considered in this paper is ultimately to provide an effective tool for the validation of the MMI design of an advanced MCR, the evaluation techniques should be able to provide a practical technical basis for obtaining the operating license. In addition, the ISV is performed through a series of tests which require considerable resources (e.g., time, labor, or money) from preparation to execution. Hence, economical methods which are able to save resources are required. In order to address these constraints, techniques proven to be empirically practical in various industries are adopted as main measures, with some modifications, and complementary measures are developed to supplement the limitations associated with the main measures. Both main measures and complementary measures are used for the evaluation of plant performance, personnel task performance, situation awareness, and workload. Teamwork and anthropometric and physiological factors are evaluated with only a main measure. In addition, all the measures are developed so that they can be evaluated simultaneously without interfering with each other. For example, if simulator-freezing techniques such as SAGAT (situation awareness global assessment technique) or SACRI (situation awareness control room inventory) were adopted for the evaluation of situation awareness, the simultaneous evaluation of workload might be interfered with by that of situation awareness. Thirdly, the evaluation criteria for the performance measures should be clear. If it is not possible to provide clear criteria, the criteria should at least be reasonable given the state of the art. As mentioned before, empirically proven techniques from various industries are adopted as main measures, with some modifications, in order to provide clear or reasonable criteria. More specifically, we focused on techniques which have been used in the nuclear industry, so that we may utilize the results of those studies as reference criteria. Main measures are used to determine whether the performance is acceptable or not, whereas complementary measures are used to compare and scrutinize the performance among operators or shifts, or to supplement the limitations of the main measures.

B. Performance Criteria

The performance measures represent only the extent of the performance on the relevant measures. For example, if the NASA-TLX (National Aeronautics and Space Administration task load index), which uses a 7-point scale, is used for the evaluation of workload during operators' tasks in NPPs, scores such as 4 or 6 represent the extent of the workload induced by the relevant tasks. The scores can be interpreted as evaluation results only with reasonable evaluation criteria. Hence, the acceptability of the performance on each of the measures should be evaluated on the basis of performance criteria. The literature [12] summarizes approaches to establishing criteria, which vary by the type of comparison: requirement referenced, benchmark referenced, normative referenced, and expert-judgment referenced. Firstly, the requirement-referenced approach is a comparison of the performance of the integrated system under consideration with an accepted and quantified performance requirement based on engineering analyses, technical specifications, operating procedures, safety analysis reports, and/or design documents. Specific values of the plant parameters required by technical specifications and time requirements for critical operator actions can be used as criteria for the requirement-referenced comparison. When the requirement-referenced comparison is not applicable, the other approaches are typically employed. Secondly, the benchmark-referenced approach is a comparison of the performance of the integrated system under consideration with that of a benchmark system which is predefined as acceptable under the same or equivalent conditions. There was a project for the ISV of a modernized NPP MCR which was based on the benchmark-referenced comparison [98]. The MCR of a 30-year-operated NPP was renewed with modernization of the major part of the MCR MMI. In that project, it was judged that the human performance level in the existing MCR could be used as an acceptance criterion for the human performance in the modernized MCR.
Hence, if the human performance in the modernized MCR is evaluated as better than, or at least equal to, that in the existing MCR, the modernized MCR can be considered acceptable. This approach is also applicable if a totally new MCR (i.e., an advanced MCR) is considered for the ISV. For example, the requirement that the operator workload in an advanced MCR not exceed that in a reference MCR (a conventional one) identified as acceptable can be used as a criterion for the benchmark-referenced comparison. Thirdly, the normative-referenced comparison is based on norms established for performance measures through their use in many system evaluations. The performance of the integrated system under consideration is compared to the norms established under the same or equivalent conditions. In the aerospace industry, the use of the Cooper-Harper scale and the NASA-TLX for workload assessment are examples of this approach [12]. Finally, the expert-judgment-referenced comparison is based on criteria established through the judgment of subject matter experts (SMEs).

In the following section, the human performance measures are described one by one, together with the performance criteria considered in this paper.

III. HUMAN PERFORMANCE MEASURES

A. Plant Performance

The principal objective of the operators in an NPP MCR is to operate the NPP safely. The operators' performance can be evaluated by observing whether the plant system is operated within an acceptable safety level, which can be specified by the process parameters of the NPP. The operators' performance as measured by observing, analyzing, and then evaluating the process parameters of an NPP is hence referred to as plant performance. Since an NPP is usually operated by a crew as a team, plant performance is considered a crew performance rather than an individual performance. Plant performance is a result of the operators' activities, including individual tasks, cognitive activities, teamwork, and so on. Hence, measures of plant performance can be considered product measures, whereas the other measures, for personnel task performance, situation awareness, workload, teamwork, and anthropometric and physiological factors, can be considered process measures. Product measures provide an assessment of results, while process measures provide an assessment of how those results were achieved [17]. Since the achievement of safety and/or operational goals in NPPs is generally determined by the values of process parameters, plant performance can be directly interpreted in terms of whether the goals are achieved or not, which is a favorable merit. There are usually values (e.g., set-points) of each process parameter required to assure the safety of an NPP (or its sub-systems). Another favorable merit of plant performance is that an objective evaluation can be conducted because explicit data are obtainable. In a loss of coolant accident (LOCA), for example, an important goal is to maintain the pressurizer level, which can be evaluated by examining the plant performance measure regarding the pressurizer level.
However, information on how the pressurizer level is maintained at the required level is not provided by the plant performance measures, which is a demerit of these measures. Braarud and Skraaning [98] argued that plant performance measures in isolation do not inform about human performance. As Moracho [17] pointed out, plant performance should be considered the global performance of a crew's control, that is, a product. The human performance accounting for the process should be evaluated by the other measures, for personnel task performance, situation awareness, workload, teamwork, and anthropometric and physiological factors. There can also be the challenging case that the plant is operated within acceptable ranges even though design faults in human factors exist. For example, a highly experienced crew may operate a plant system within acceptable ranges even though the MMI is poorly designed. This is another reason that the other performance measures should be considered to complement plant performance [12]. In order for the plant performance measures to be more informative, attention should be deliberately paid to three problems: the preparation of test scenarios, the selection of important process parameters, and integrated analysis with the other measures. Firstly, test scenarios must be designed so that the effects of the MMI design (e.g., a new design or a design upgrade) of interest can be manifested in the operators' performance, which is also expected to improve the quality of the evaluations with the other performance measures. Secondly, process parameters sensitive to and representative of the operators' performance must be selected as the important process parameters. Thirdly, plant performance should be analyzed together with the other measures in an integrated manner, which is addressed in more detail in Section IV.

In this paper, the operational achievement of important process parameters is considered for the evaluation of plant performance. Several important process parameters are selected by SMEs (process experts). Whether the values of the selected process parameters are maintained within the upper and lower operational limits (i.e., within the acceptable range) is used as the main measure for the evaluation of plant performance. In addition, the discrepancy between operationally suitable values and observed values of the selected process parameters is utilized to score plant performance as a complementary measure. Also, at the end of a test scenario, the process parameters should be within a range of values, called the target range, to achieve plant safety. The elapsed time from the event (e.g., a transient or accident) to entry into the target range for each of the selected process parameters is calculated from the simulator logging data.

1) Main Measure: Checking Operational Limits: First of all, when test scenarios are developed, SMEs (process experts) select important process parameters (empirically, 5 to 7) for each of the scenarios. After reviewing the operating procedures, technical specifications, safety analysis reports, design documents, and so on, the SMEs (process experts) determine upper and lower operational limits for the safe operation of the NPP. During a test (a validation test), it is confirmed whether or not the values of the selected parameters exceed the upper and lower limits. If the values do not exceed the limits, plant performance is evaluated as acceptable. The evaluation criterion of this measure is based on the requirement-referenced comparison. The values of the parameters can be obtained from the logging data of the simulator.
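The main measure amounts to a pass/fail check of simulator logging data against SME-determined limits. The following is a minimal sketch of such a check; the parameter names and limit values are illustrative placeholders, not taken from APR-1400 documentation.

```python
# Sketch of the main plant-performance measure: verify that each selected
# process parameter stays within its SME-determined operational limits.
# Parameter names and limit values below are hypothetical examples.

def check_operational_limits(log, limits):
    """log: {param: [logged values over the test]};
    limits: {param: (lower, upper)}.
    Returns {param: bool} -- True if the parameter never left its band."""
    result = {}
    for param, (lower, upper) in limits.items():
        values = log[param]
        result[param] = all(lower <= v <= upper for v in values)
    return result

limits = {"PZR_level_pct": (20.0, 60.0), "PZR_pressure_MPa": (12.0, 16.5)}
log = {"PZR_level_pct": [45.1, 38.2, 25.7, 31.0],
       "PZR_pressure_MPa": [15.2, 14.8, 13.9, 14.3]}
print(check_operational_limits(log, limits))
# Plant performance is evaluated as acceptable only if every parameter passes.
```

In an actual validation test, `log` would be populated from the simulator logging data for the selected 5 to 7 parameters.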

2) Complementary Measure: Discrepancy Score & Elapsed Time From Event to Target Range: During the test, the discrepancies between the operationally suitable values and the observed values of the selected process parameters are calculated. This kind of evaluation technique was applied in PPAS (plant performance assessment system) and effectively utilized for the evaluation of plant performance [13], [17]. The operationally suitable value is assessed by the SMEs (process experts) as a range rather than a point value, because it is not easy to assess the operationally suitable value as a specific point; hence, it has upper and lower bounds. The range should represent the good performance expected for a specific scenario (e.g., a LOCA or transient scenario). The assessment of the operationally suitable value should also be based on references such as operating procedures, technical specifications, safety analysis reports, design documents, and so on. If the value of a process parameter goes above the range (i.e., its upper bound) or below the range (i.e., its lower bound), the discrepancy is used in the calculation of the complementary measure. In mathematical form, the discrepancy for each parameter is first obtained as follows:

$$d_i(t) = \begin{cases} \dfrac{x_i(t) - U_i}{\bar{x}_i}, & x_i(t) > U_i \\[4pt] \dfrac{L_i - x_i(t)}{\bar{x}_i}, & x_i(t) < L_i \\[4pt] 0, & \text{otherwise} \end{cases} \qquad (1)$$

where
$d_i(t)$ = discrepancy of parameter $i$ at time $t$ during the test,
$x_i(t)$ = value of parameter $i$ at time $t$ during the test,
$U_i$ = upper bound of the operationally suitable value,
$L_i$ = lower bound of the operationally suitable value,
$\bar{x}_i$ = mean value of parameter $i$ during the initial steady state,
$t$ = simulation time after the event occurs.

Here, the discrepancy between the observed value and the operationally suitable value of each parameter is normalized by dividing it by the mean value of the parameter obtained during the initial steady state, because the discrepancies of all the parameters are eventually integrated into a single measure, a kind of total discrepancy. The normalized discrepancy of parameter $i$ is then summed over the test time $T$:

$$D_i = \frac{1}{T}\sum_{t=0}^{T} d_i(t) \qquad (2)$$

where
$D_i$ = averaged sum of the normalized discrepancy of parameter $i$ over the test time $T$.

The next step is to obtain the weights of the selected process parameters. The analytic hierarchy process (AHP) is used as the tool for evaluating the weights. The AHP has the merits of being useful for structuring a decision problem hierarchically and for obtaining the weighting values quantitatively. The AHP serves as a framework to structure complex decision problems and provides judgments based on the experts' knowledge and experience to derive a set of weighting values by using pair-wise comparison [20]. The averaged sums for the parameters are multiplied by the weights of the relevant parameters, and the products are then summed, as follows:

$$TD = \sum_{i=1}^{N} w_i D_i \qquad (3)$$

where
$TD$ = total discrepancy during the test,
$N$ = total number of selected parameters,
$w_i$ = weighting value of parameter $i$.
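To illustrate the AHP weighting step, the sketch below derives weights from a pairwise comparison matrix using the geometric-mean (row) method, a common approximation to the principal-eigenvector weights; the paper does not specify which prioritization method is used, and the SME judgments in the matrix are made up for three hypothetical parameters.

```python
import math

def ahp_weights(matrix):
    """Derive normalized weights from an AHP pairwise comparison matrix
    using the geometric-mean method (approximates the principal eigenvector)."""
    n = len(matrix)
    gm = [math.prod(row) ** (1.0 / n) for row in matrix]  # row geometric means
    total = sum(gm)
    return [g / total for g in gm]

# Hypothetical SME judgments on the Saaty 1-9 scale for three parameters:
# parameter 1 judged 3x as important as parameter 2, 5x as important as 3, etc.
M = [[1.0,   3.0,   5.0],
     [1/3.0, 1.0,   2.0],
     [1/5.0, 1/2.0, 1.0]]
w = ahp_weights(M)
print([round(x, 3) for x in w])  # weights sum to 1, ordered by importance
```

The resulting weights $w_i$ would then be multiplied by the averaged discrepancies $D_i$ as in Eq. (3).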


At the end of the test, another measure for the discrepancy is calculated, which can represent the ability of a crew to complete an operational goal:

$$
d_i^{end} =
\begin{cases}
\dfrac{x_i^{end} - UB_i}{m_i}, & x_i^{end} > UB_i \\[4pt]
\dfrac{LB_i - x_i^{end}}{m_i}, & x_i^{end} < LB_i \\[4pt]
0, & \text{otherwise}
\end{cases}
\qquad (4)
$$

where $d_i^{end}$ is the discrepancy of parameter $i$ at the end of the test, $x_i^{end}$ is the value of parameter $i$ at the end of the test, $UB_i$ and $LB_i$ are the upper and lower bound values of the operationally suitable range, and $m_i$ is the mean value of parameter $i$ during the initial steady state.

The normalized discrepancy of each parameter is multiplied by the weight of that parameter and the products are then summed, as follows:

$$
TD_{end} = \sum_{i=1}^{N} w_i \, d_i^{end}
\qquad (5)
$$

where $TD_{end}$ is the total discrepancy at the end of the test.

A low total discrepancy means better plant performance. This measure is used for comparing performance among crews or test scenarios rather than for determining whether it is acceptable or not.

Finally, the elapsed time from an event (e.g., a transient or accident) to entering the target range of each selected process parameter is calculated from simulator logging data. This measure is based on the fact that a shorter time spent in accomplishing a task goal represents better performance. It is calculated at the end of the test. To allow for fluctuation in a parameter, the time at which the parameter is stabilized within the range should be taken as the measure. The evaluation criteria of these measures are hence based on both the requirement-referenced and the expert-judgment-referenced comparisons.
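The elapsed-time measure can be sketched as below. The hold-count rule for deciding when a fluctuating parameter counts as stabilized inside its target range is an assumption introduced here purely for illustration; the sample data are hypothetical.

```python
# Sketch of the elapsed-time measure: the first sample index (after event
# onset) at which a parameter enters its target range and stays there.
# The `hold` count is an illustrative way to discount brief excursions.

def time_to_stabilize(trace, lb, ub, hold=3):
    """Return the first index at which the parameter is inside [lb, ub]
    and remains there for `hold` consecutive samples; None if it never
    stabilizes during the test."""
    run = 0
    for t, x in enumerate(trace):
        if lb <= x <= ub:
            run += 1
            if run == hold:
                return t - hold + 1
        else:
            run = 0
    return None
```

A shorter stabilization time indicates that the crew accomplished the task goal more quickly.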

B. Personnel Task Performance

Even though plant performance is maintained within acceptable ranges, design faults or shortcomings may result in unnecessary work being placed on operators. Personnel task measures provide complementary data to plant performance measures and can reveal potential human performance problems that were not found in the evaluation of plant performance [12]. As mentioned before, personnel tasks in the MCR can be summarized as a series of cognitive activities. Consequently, operators' tasks can be evaluated by observing whether they monitor or detect the data or information relevant to the situation, whether they perform correct responses, and whether the sequence of the series of activities is appropriate [18].

1) Main Measure: Confirming Indispensable Tasks and Completion Time: Whether the cognitive activities are performed correctly can be evaluated by observing a series of tasks. Some elements of the cognitive activities are observable, even though the others are not observable but inferable. The activities related to detection or monitoring and to execution can be considered observable, whereas the other cognitive activities can be inferred from the observable activities [21]. Consequently, personnel task performance can be evaluated by observing whether the operators monitor and detect the appropriate data and information, whether they perform appropriate responses, and finally whether the sequence of the processes is appropriate. Both the primary task and the secondary task should be evaluated for the personnel task evaluation. For an analytic and logical measurement, a validation test scenario is hierarchically analyzed and an optimal solution for the scenario is then developed, as shown in Fig. 2.

Fig. 2. Hierarchical task analysis.

Since operators' tasks in NPPs are generally based on goal-oriented procedures, the operating procedure provides the guide for the development of the optimal solution. The main goal refers to the goal to be accomplished in a scenario. It is located at the highest rank and breaks down into sub-goals; the sub-goals can also break down further, if needed. At the next rank there are detections, operations, and sequences for achieving the relevant sub-goal. Detections and operations break down into detailed tasks to achieve the relevant detections and operations, respectively. The tasks located in the bottom rank in Fig. 2 comprise the crew's tasks required for completion of the main goal. Top-down and bottom-up approaches are utilized for the development of the optimal solution. Next, the indispensable tasks required for safe NPP operation are determined by SMEs (process experts). During the test, SMEs (the same or other process experts) observe the operators' activities; collect data such as operators' speech, behavior, cognitive process, and logging data; and then evaluate whether the tasks located in the bottom rank are appropriately performed. If all the indispensable tasks are satisfied, personnel task performance is considered acceptable. The evaluation criterion of this measure is hence based on both the requirement-referenced and the expert-judgment-referenced comparisons. It should be noted that the operators may implement the tasks in a way different from the optimal solution, according to a strategy not considered by the SMEs (process experts) in advance. In this case, the SMEs should check and record the


operators' activities during the test; the relevant parts of the optimal solution are then revised based on the observed activities after the test. The task performance is reevaluated with the revised solution and the collected data.

In addition, task completion time is also evaluated. The time to complete each of the tasks in the bottom rank is estimated based on the experience and expertise of the SMEs. The summation of the estimated times can be interpreted as the required time to complete a goal. If the actual time spent for the completion of a goal in a test is less than or equal to the required time, the time performance of the personnel task is considered acceptable.
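The two acceptance checks of the main measure can be sketched as follows; the task names and times used in the example are hypothetical.

```python
# Sketch of the main personnel-task measure: all indispensable bottom-rank
# tasks must be judged satisfied by the SMEs, and the actual
# goal-completion time must not exceed the sum of the SME-estimated task
# times. Data are hypothetical.

def indispensable_satisfied(satisfied, indispensable):
    """True if every indispensable task was judged satisfied."""
    return all(satisfied.get(task, False) for task in indispensable)

def required_time(estimated_task_times):
    """Required goal-completion time: sum of SME-estimated task times."""
    return sum(estimated_task_times)

def time_performance_acceptable(actual_time, estimated_task_times):
    """Acceptable if the actual time does not exceed the required time."""
    return actual_time <= required_time(estimated_task_times)
```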

2) Complementary Measure: Scoring Task Performance: The main measure, a kind of descriptive measure, is complemented by scoring the task performance, which can be used for analyzing and comparing performance among crews or test scenarios. First, the weights of the elements in the optimal solution shown in Fig. 2 are calculated using the AHP. Second, as mentioned in the above section, the operators' activities are observed and evaluated during a test. Third, SMEs (process experts) evaluate whether the respective tasks are satisfied in an appropriate sequence. Finally, the task performance is scored with the observed and evaluated data and the weights of the tasks. A higher score means higher task performance. Hence the evaluation criterion of this measure is based on the expert-judgment-referenced comparison. This kind of measure was used in OPAS (operator performance assessment system) and reported to be a reliable, valid, and sensitive indicator of human performance in dynamic operating environments [18]. In mathematical form, two kinds of scores are calculated: task scores and sequence scores. Each task score is calculated as follows:

(6)

where $TS_i$ is the task-$i$ score. Each sequence score is calculated as follows:

(7)

where $SS_j$ is the sequence-$j$ score. Finally, the personnel task score can be calculated by summing up the weighted task scores and sequence scores:

$$
PTS = \sum_{i=1}^{N_T} w^{task}_i \, TS_i + \sum_{j=1}^{N_S} w^{seq}_j \, SS_j
\qquad (8)
$$

where $PTS$ is the personnel task score, $TS_i$ and $SS_j$ are the task and sequence scores of (6) and (7), $N_T$ is the total number of tasks in the bottom rank, $N_S$ is the total number of sequences considered, $w^{task}_i$ is the weighting value of task $i$, and $w^{seq}_j$ is the weighting value of sequence $j$.
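The weighted summation of Eq. (8) can be sketched as below. Since the detailed scoring rules of Eqs. (6) and (7) are not reproduced here, binary satisfaction scores (1 if the SMEs judge the task or sequence satisfied, 0 otherwise) are assumed purely for illustration.

```python
# Sketch of the personnel task score, Eq. (8): the AHP-weighted sum of
# task scores and sequence scores. Binary scores are an assumption here.

def personnel_task_score(task_scores, task_weights, seq_scores, seq_weights):
    """Weighted sum of task and sequence scores, Eq. (8)."""
    return (sum(w * s for w, s in zip(task_weights, task_scores))
            + sum(w * s for w, s in zip(seq_weights, seq_scores)))

# Example: three bottom-rank tasks (second not satisfied) and one sequence.
pts = personnel_task_score(task_scores=[1, 0, 1],
                           task_weights=[0.5, 0.3, 0.2],
                           seq_scores=[1],
                           seq_weights=[1.0])
```

A higher `pts` indicates better personnel task performance for the crew or scenario being compared.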

C. Situation Awareness (SA)

In NPPs, the operator's actions must always be based on identification of the operational state of the system. As shown in the TMI accident [22], incorrect SA may contribute to the occurrence or propagation of accidents. Consequently, SA is frequently considered a crucial key to improving performance and reducing error [23]–[25]. Definitions of SA have been discussed in the literature [26]–[29]. One of the most influential perspectives on SA has been put forth by Endsley, who informally notes that SA concerns "knowing what is going on" [29]. More precisely, Endsley defined that "situation awareness is the perception of the elements in the environment within a volume of time and space, the comprehension of their meaning, and the projection of their status in the near future" [29]. Considering that the operator's tasks in NPPs can be summarized as a series of cognitive activities such as monitoring, detecting, understanding, diagnosing, decision-making, planning, and implementing, the operator's tasks can be significantly influenced by the operator's SA. Consequently, correct SA is recognized as one of the most critical contributors to safe operation in NPPs. Moreover, the advanced MCR in APR-1400 adopts new technologies such as CRT (or LCD)-based displays, large display panels (LDP), soft controls, a computerized procedure system, and an advanced alarm system. Even though operators are expected to be better aware of the situation with the new technologies, there is also the possibility that the changed operational environment can deteriorate the SA of the operators. As O'Hara and Robert [97] pointed out, there can be difficulty in navigating and finding important information through computerized systems, loss of operator vigilance due to automated systems, and loss of the ability to utilize well-learned and rapid eye-scanning patterns and pattern recognition from spatially fixed parameter displays.
Hence, the new design of an advanced MCR should be validated through the ISV tests, that is, performance-based tests. Measurement techniques developed for SA measurement can be categorized into four groups: performance-based, direct query and questionnaire, subjective rating, and physiological measurement techniques [12], [25]. O'Hara et al. [12] pointed out that performance-based techniques have both logical ambiguities in their interpretation and practical problems in their administration; thus they may not be well suited for ISV tests. Direct query and questionnaire techniques can be categorized into post-test, on-line-test, and freeze techniques according to the evaluation point over time [30]. These techniques are based on questions and answers regarding SA. Among them, the detailed questions and answers generally used in the post-test technique take up much time, which can lead to incorrect-memory problems for operators. In addition, operators have a tendency to overgeneralize or rationalize their answers [31]. The on-line-test techniques require questions and answers during the test to overcome the memory problem. However, the questions and answers can be considered another task, which may distort operator performance [12]. The freeze techniques require questions and answers during random freezes of the simulation to overcome the demerits of the post-test and on-line-test techniques. One of the most representative techniques is SAGAT (Situation Awareness Global Assessment Technique), which has been employed


across a wide range of dynamic tasks including air traffic control, driving, and NPP control [32]. SAGAT has the advantages of being easy to use (in a simulator environment), possessing good external indices of information accuracy, and possessing well-accepted face validity [33]. However, a criticism of SAGAT has been that the periodic interruptions are too intrusive, contaminating any performance measures; this is related to the concern that the questions may cue participants (e.g., operators) to some details of the scenario, setting up an expectancy for certain types of questions [12], [34], [35]. Meanwhile, Endsley has shown that performance measures (e.g., kills and losses in an air-to-air fighter sweep mission) are not significantly affected by the conditions of simulation freeze or non-freeze [32], or by question point in time, question duration, or question frequency for the SAGAT measurement [36], [37]. However, she recognized that it is never possible to "prove" that SAGAT does not influence performance, pointing out that all of the studies collected so far indicate that it does not appear to significantly influence performance as long as the stops (or freezes) are unpredictable to the subject [32]. There are studies of SACRI (situation awareness control room inventory), which was adapted from SAGAT for use in an NPP [38], [39]. SACRI was developed for use in the NORS simulator in the Halden reactor project (HRP). Subjective rating techniques typically involve assigning a numerical value to the quality of SA during a particular period of an event [40]. Subjective rating techniques are popular because they are fairly inexpensive, easy to administer, and non-intrusive [33], [40]. However, there have been criticisms. First, participants' (or operators') knowledge may not be correct, and the reality of the situation may be quite different from what they believe [41].
Second, SA may be highly influenced by self-assessments of performance [33]. Third, operators will probably be inclined to rationalize or overgeneralize about their SA [41]. In addition, some measures such as SART and SA-SWORD include workload factors rather than limiting the techniques to SA measurement itself [12]. Physiological measurement techniques have been used to study complex cognitive domains such as mental workload and fatigue, but very few experiments have been conducted to study SA [42]. Even though physiological measures are likely to require a high cost of collection, analysis, and interpretation compared with the subjective rating and performance-based measurement techniques, they have unique properties considered attractive to researchers in the SA field. First, they do not require intrusive interference such as freezing the simulation. Second, they can provide a continuous indication of SA, in contrast to the above-mentioned techniques. Third, it is possible to go back and assess the situation, because the data are continuously recorded. In the nuclear industry, eye fixation measurement has been used as an indicator for SA, called VISA (visual indicator of situation awareness) [16]. In an experimental study of the VISA, time spent on eye fixation was proposed as a visual indicator of SA. The results of the VISA study showed that SACRI scores were correlated with the VISA, although the correlation was somewhat inconsistent between the two experiments in the study. Even though these techniques cannot show clearly how much information is retained in memory, whether the information is registered correctly, or what comprehension the subject has of those elements

[31], [42], it is believed that physiological techniques can be potentially helpful and useful indicators regarding SA.

In this paper, a subjective rating measure is used as the main measure for the SA evaluation, even though it has some drawbacks mentioned before. Eye fixation measurement is also used as the complementary measure.

1) Main Measure: KSAX: KSAX [10] is a subjective rating technique adapted from the SART [43]. After completion of a test, the operators subjectively assess their own SA on a rating scale and provide a description of, or the reason for, their rating. One of the crucial problems in the use of SART was that workload factors could not be separated from the SA evaluation. In the KSAX, Endsley's SA model has been applied to the evaluation regime of the SART. The KSAX has been successfully utilized in the evaluation of the suitability of the design of the soft control and safety console for the APR-1400 [10]. Since SA is evaluated based on a questionnaire after a test, the operators are not interfered with by the evaluation activities. Consequently, it does not affect the evaluations of the other performance measures, especially cognitive workload, which leads to an economic evaluation of human performance for the ISV: all the measures considered in this paper can be evaluated in one test. In addition, the KSAX results from an antecedent study for the APR-1400 [10] can be utilized as a criterion based on the benchmark-referenced comparison for the ISV, which is considered an important merit [12]. The questionnaire of the KSAX consists of several questions regarding the level 1, 2, and 3 SA defined by Endsley. Usually a 7-point scale is used for the measurement. The rating scale is not fixed, but the use of a 7-point scale is recommended, because the antecedent study used a 7-point scale. The questions used in KSAX are made such that SA in an advanced NPP is compared with that in the already licensed NPPs. Generally, operators who have been working in the licensed NPPs are selected as participants for the validation tests. Therefore, if SA in an advanced NPP is evaluated as better than or equal to that in the licensed NPP, the result of the SA evaluation is considered acceptable. The evaluation criterion of this measure is hence based on the benchmark-referenced comparison.

2) Complementary Measure: Continuous Measure Based on Eye Fixation Measurement: The subjective measure of SA can be complemented by a continuous measure based on eye fixation data, which is a kind of physiological measure. Since KSAX is evaluated subjectively after a test, it is not possible to measure the operator's SA continuously or to secure objectivity. A physiological method generally involves the measurement and data processing of one or more variables related to human physiological processes. Physiological measures are known to be objective and can provide continuous information on the activities of subjects. These days, eye tracking systems have been developed which can measure a subject's eye movement without direct contact. Hence the measurement of the eye movement is not intrusive to the operators' activities. In the majority of cases, the primary means of information input to the operator is the visual channel. An analysis of the manner in which the operator's eyes move and fixate gives an indication of the information input. Hence, even though the eye fixation measurement cannot capture the operator's SA exactly, we believe that it can extract an indication of the operator's SA, which can then be used as a complementary indicator for the SA evaluation. In NPPs, there are many information sources that should be monitored, but the operators have only a limited capacity of attention and memory. Because it is impossible to monitor all information sources, the operators continuously decide where to allocate their attentional resources. This kind of cognitive skill is called selective attention. The operators use this cognitive skill to overcome the limitations of human attention. The stages of information processing depend on mental or cognitive resources, a sort of pool of attention or mental effort that is of limited availability and can be allocated to processes as required [44]. When an abnormal situation occurs in an NPP, the operators try to understand what is going on in the plant. The operators receive information from the environment (e.g., indicators or other operators) and process the information to establish a situation model based on their mental model. As O'Hara et al. [45] summarized, a situation model is an operator's understanding of the specific situation, and the model is constantly updated as new information is received. A mental model refers to the general knowledge governing the performance of highly experienced operators. The mental model includes expectancies of how the NPP will behave in various abnormal situations. For example, when a LOCA occurs, the pressurizer pressure, temperature, and level will decrease, and the containment radiation will increase. These expectancies form rules on the dynamics of the NPP, and the mental model is established based on these rules [46]. When an abnormal or accident situation occurs, operators usually first recognize it by the onset of salience, such as an alarm or a deviation of process parameters from the normal condition.
Then, they develop their situation awareness, or establish their situation model, by selectively attending to the important information sources. The maintenance of their situation awareness, or confirmation of their situation model, is accomplished by iterating the selective attention. The selection of the information sources to attend to is typically driven by four factors: salience, expectancy, value, and effort [44]. The operators are expected to attend to salient information sources. With expectancy, attention is shifted to the specific sources that are most likely to provide information. For example, the pressurizer pressure, temperature, and level in NPPs decrease in both accidents, LOCA and SGTR (steam generator tube rupture). The two accidents can be distinguished by observing the containment radiation or the feed/steam flow deviation (there is also other information that distinguishes the two accidents). The containment radiation changes in a LOCA but not in an SGTR. The feed/steam flow deviation in one of the steam generators changes in an SGTR but not in a LOCA. If the pressurizer pressure, temperature, and level decrease, the operators may frequently look at the salient indicators of the pressure, temperature, and level and may consider that the accident may be a LOCA or an SGTR. They are then expected to attend to the indicator of the containment radiation or the indicators representing the feed/steam flow deviation. If the containment radiation increases, the operators probably consider the accident to be a LOCA. However, there is a possibility of failure of the containment radiation indicator, even though the likelihood is very small. Hence, they may

look at the indicators representing the feed/steam flow deviation in order to get more information (or evidence). In this case, if the accident is a LOCA, no change would be observed in the feed/steam flow deviation. This information is also important for understanding the situation correctly, even though no salience is provided by it. Also, the frequency of looking at or attending to an information source is modified by how valuable it is to look at. As mentioned before, there can be other information that distinguishes the two accidents. However, there are usually representative dynamics in NPPs which should be established in the operators' mental model through training and experience. The mental model determines the values of the information sources. Therefore, well-trained and experienced operators are expected to attend to the valuable information sources more frequently. Finally, selective attention may be inhibited if it is effortful compared to its value. If there are two information sources with the same value (importance), the operators may attend to the information source that is easier to access. To summarize the selective attention: the operators are expected to attend to the information sources which are salient, important (valuable), and easy to access. Consequently, in order for the operators to effectively monitor, detect, and thus understand the state of a system, they should allocate their attentional resources not only to the salient information sources but also to the valuable information sources. The eye fixations on areas of interest (AOIs) that are important for solving the problems can be considered an index of monitoring and detection, which can then be interpreted as the perception of the elements (level 1 SA). As we think about or manipulate perceived information in working memory, an action is delayed or not executed at all [44].
Consequently, the time spent on the AOIs by the operators can be understood as an index for the comprehension of their meaning (level 2 SA). As mentioned before, the selective attention is associated with expectancy for the near future. The projection of their status in the near future (level 3 SA) can therefore be inferred from the sequence of the eye fixations.

In this paper, the eye fixations on the AOIs, the time spent on AOIs, and the sequence of the fixations are used for the SA evaluation. SMEs (process and/or human factors experts) analyze the eye fixation data after the completion of a test. It is recommended that the analysis be performed for specific periods representing the task steps in the optimal solution of the personnel task performance. For example, the times spent in achieving the sub-goals in the optimal solution can be used as the specific periods for the analysis. Attention should be paid to finding deficiencies of the MMI design or operator incompetence leading to inappropriate patterns of eye fixation. For each of the periods, SMEs analyze the eye fixation data and grade the SA as excellent, appropriate, or not appropriate. The evaluation criterion of this measure is hence based on the expert-judgment-referenced comparisons. Even though this technique has the drawback that the eye fixation data must be analyzed by the SMEs, which requires much effort and time, only the SMEs are thought to be able to provide a meaningful evaluation from the eye fixation data, because the SMEs usually have the most knowledge and experience of the system and the operation. The authors performed an experimental study with a simplified NPP simulator [47]. In the experiments, the eye fixation data


during complex diagnostic tasks were analyzed. The results showed that the eye fixation patterns of subjects with high, medium, or low expertise differed under the same operating conditions. A subject with more knowledge of the system fixated on various information sources with short fixation times, iteratively fixated on the important information sources, and then reported the situation with high confidence. In contrast, a subject with poor knowledge spent much time on salient information sources, did not fixate on the various information sources important for solving the problem, and then reported the situation with low confidence (it seemed just a guess). As shown in Fig. 3, a computerized system for eye fixation analysis facilitates the SA evaluation. The number centered in each circle represents the order of the fixation. The area of a circle is proportional to the fixation time, which is displayed at the bottom of the circle.

Fig. 3. Example of the eye fixation measurement.
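The per-period AOI summaries examined by the SMEs can be sketched as below: fixation counts per AOI (an index for level 1 SA), dwell time per AOI (level 2), and the fixation sequence (level 3). The record format and the AOI names are hypothetical.

```python
# Sketch of summarizing eye-fixation data over areas of interest (AOIs)
# for one analysis period. Input format and AOI names are hypothetical.

def aoi_summary(fixations):
    """fixations: list of (aoi, duration_ms) tuples in temporal order.
    Returns (counts, dwell_ms, sequence): fixations per AOI, total time
    per AOI, and the ordered list of fixated AOIs."""
    counts, dwell, sequence = {}, {}, []
    for aoi, duration in fixations:
        counts[aoi] = counts.get(aoi, 0) + 1
        dwell[aoi] = dwell.get(aoi, 0) + duration
        sequence.append(aoi)
    return counts, dwell, sequence

# Example period: pressurizer indicators, then containment radiation.
counts, dwell, sequence = aoi_summary([
    ("PZR_pressure", 300),
    ("CTMT_radiation", 500),
    ("PZR_pressure", 200),
])
```

The SMEs would then judge, from summaries like these, whether the crew attended to the salient and valuable information sources for each task step.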

D. Workload

Workload has an important relationship to human performance and error [12]. Despite its importance, a generally accepted definition of cognitive workload is still not available [48]–[50]. O'Donnell and Eggemeier defined workload as the portion of the operator's limited capacity actually required to perform a particular task [51]. Consequently, more mental resources are required as the cognitive workload increases. If the cognitive workload exceeds the limit of the operator's capacity, human errors may occur and human performance will deteriorate [52]. In advanced MCRs, advanced information technologies are applied, and thus the environment requires the operators to play the role of supervisor or decision-maker rather than manual controller. The operators' tasks are expected to require increased mental rather than physical activity. Consequently, the evaluation of cognitive workload has been considered one of the most important factors to be evaluated for the ISV. Generally, techniques for measuring cognitive workload can be divided into two broad types: predictive and empirical [12]. Predictive techniques are usually based on mathematical modeling, task analysis, simulation modeling, and expert opinion. These techniques do not require operators to participate in simulation exercises. Thus, they are typically used in the

early stages of the design process and are therefore thought not to be suitable for the ISV stage [12]. Empirical techniques can be divided into three types: performance-based, subjective ratings, and physiological measures [53]. Performance-based techniques are categorized into primary task measures and secondary task measures. Primary task measures are not suitable for the measurement of cognitive workload associated with monitoring or decision-making tasks as in NPPs, and secondary task measures have the drawback that they can contaminate human performance by interfering with the primary tasks [44]. Subjective rating techniques measure the cognitive workload experienced by a subject (or an operator) through a questionnaire and an interview. Since subjective measures have been found to be reliable, sensitive to changes in workload level, minimally intrusive, diagnostic, easy to administer, independent of tasks (or relevant to a wide variety of tasks), and possessive of a high degree of operator acceptance, they have been the most frequently used in a variety of domains [54]–[60]. Representative subjective measures include overall workload (OW), the modified Cooper-Harper scale (MCH), the subjective workload assessment technique (SWAT), and the National Aeronautics and Space Administration task load index (NASA-TLX). Hill et al. verified the models of SWAT, NASA-TLX, OW, and MCH by examining the reliability of the methods and evaluated that NASA-TLX is superior in validity and that NASA-TLX and OW are superior in usability [60]. Physiological techniques measure the physiological changes of the autonomic or central nervous system associated with cognitive workload [44]. The electroencephalogram (EEG), evoked potentials, heart rate related measures, and eye movement related measures are representative tools for cognitive workload evaluation based on physiological measurements.
Even though various studies have reported that EEG measures are sensitive to variations of mental workload during tasks such as in-flight missions [63], [64], air traffic control [65], automobile driving [66], and so on, the use of EEG is thought to be limited for the ISV, because multiple electrodes usually must be attached to an operator's head to measure the EEG signals, which may restrict the operator's activities and thus contaminate the operator's performance in dynamic situations. With regard to evoked potential (EP) or event-related potential (ERP) analysis, wave patterns regarding the latencies and amplitudes of each peak are analyzed after providing specific stimulations. The EP is thought not to be applicable to the study of complex cognitive activities in the ISV, because the event evoking the EP should be simple and iterated many times [67]. Measures of heart rate (HR) and heart rate variability (HRV) have proven sensitive to variations in the difficulty of tasks such as flight maneuvers and phases of flight (e.g., straight and level, takeoffs, landings) [68], [69], automobile driving [66], air traffic control [65], and electro-energy process control [70]. However, since the heart rate related measures are likely to be influenced by the physical or psychological state of a subject, they do not always produce the same pattern of effects with regard to their sensitivity to mental workload and task difficulty [71], [73]. The eye movement related measures are generally based on blinking, fixation, and pupillary response. There have been many studies suggesting that eye movement related measures could be used as effective tools for the evaluation of cognitive workload [74]–[78]. Conventionally,

Authorized licensed use limited to: Korea Advanced Institute of Science and Technology. Downloaded on July 5, 2009 at 08:18 from IEEE Xplore. Restrictions apply.

cumbersome equipment such as a head-mounted eye tracking system was used to obtain the eye movement data, which is thought to be intrusive to the operator's tasks; hence it was considered inappropriate for the cognitive workload evaluation in the ISV. Recently, however, eye tracking systems have been developed that can measure the eye movement data without direct contact (non-intrusively) [79], [80].

In this paper, NASA-TLX, one of the most widely used subjective rating techniques, is used as the main measure for the evaluation of cognitive workload, and continuous measures based on eye movement data are used as the complementary measures.

1) Main Measure: NASA-TLX: A subjective measure is considered an indicator of the participants' internal experience. As mentioned before, subjective rating techniques have been the most widely used for the evaluation of workload in various fields. In particular, NASA-TLX has been extensively used in multitask contexts such as real and simulated flight tasks [81]–[84], air combat [85], [86], remote control of vehicles [87], and simulator-based NPP operation [10], [14]–[16], [78], [88], [89]. NASA-TLX is an instrument recommended for assessing cognitive workload by the U.S. Nuclear Regulatory Commission (NRC) [90]. In addition, the NASA-TLX results from antecedent studies for the APR-1400 [10], [89] can be utilized as reference criteria for the ISV, which is considered an important merit. NASA-TLX divides the workload experience into six components: mental demand, physical demand, temporal demand, performance, effort, and frustration [91]. After completion of a test, the operators subjectively assess their own workload on a rating scale and describe the reason why they give the rating. In this paper, the six questions used in NASA-TLX are constructed so that workload in an advanced NPP is compared with that in the already licensed NPPs. Hence, if the NASA-TLX result in an advanced NPP is evaluated as lower than or equal to that in the licensed NPP, the result of the workload evaluation should be considered acceptable. Usually a 7-point scale is used for the measurement. The rating scale is not fixed, but the use of a 7-point scale is recommended, because the antecedent studies used it. The evaluation criterion of this measure is hence based on the benchmark-referenced comparison.
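To make the scoring procedure concrete, the overall NASA-TLX score is a (weighted) average of the six subscale ratings; in the standard procedure the weights come from 15 pairwise comparisons of the subscales. The sketch below is illustrative only: the rating and weight values are hypothetical, and the 7-point scale follows the convention adopted in this paper rather than the original 0–100 scale.

```python
def nasa_tlx(ratings, weights=None):
    """Overall NASA-TLX score from six subscale ratings.

    ratings: dict mapping each subscale to a rating (here on the
             7-point scale recommended in this paper).
    weights: optional dict of subscale weights obtained from the 15
             pairwise comparisons of the standard procedure; if
             omitted, the unweighted mean ("raw TLX") is returned.
    """
    scales = ["mental", "physical", "temporal",
              "performance", "effort", "frustration"]
    if weights is None:
        return sum(ratings[s] for s in scales) / len(scales)
    total = sum(weights[s] for s in scales)  # 15 in the standard procedure
    return sum(ratings[s] * weights[s] for s in scales) / total

# Hypothetical ratings and weights for one operator.
ratings = {"mental": 5, "physical": 2, "temporal": 4,
           "performance": 3, "effort": 5, "frustration": 2}
weights = {"mental": 5, "physical": 1, "temporal": 3,
           "performance": 2, "effort": 3, "frustration": 1}
print(nasa_tlx(ratings))           # raw (unweighted) TLX: 3.5
print(nasa_tlx(ratings, weights))  # weighted TLX
```

In a benchmark-referenced evaluation such as the one described above, the resulting score would be compared against the scores collected in the antecedent APR-1400 studies rather than judged in isolation.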

2) Complementary Measure: Continuous Measures Based on Eye Movement Measurement: In a similar way to the evaluation of SA, the subjective measure of cognitive workload can be complemented by continuous measures based on eye movement data. Since the NASA-TLX is evaluated subjectively after the completion of a test, it is not possible to continuously measure the operator's workload or to secure objectivity. Hence, continuous measures based on eye movement data are utilized as complementary measures for the evaluation of cognitive workload. Blink rate, blink duration, number of fixations, and fixation dwell time are used as indices representing cognitive workload. Blinking refers to a complete or partial closure of the eye. Since visual input is disabled during eye closure, a reduced blink rate helps to maintain continuous visual input. The duration and the number of eye blinks should decrease when the cognitive demands of the task increase [75], [76]. A recent study showed that blink rates and durations during diagnostic tasks in simulated NPP operation correlated with NASA-TLX

and MCH scores, which means that they can be used as a cognitive workload index [78]. Even though some studies have shown that a higher level of arousal or attention increases the blink rate [92], [93], considering that the operator's tasks in NPPs are a series of cognitive activities, an increased blink rate can be used as a clue indicating a point that requires a high level of concentration or attention. The eye fixation parameters include the number of fixations on an area of interest and the duration of the fixation, also called dwell time. The more eye fixations are made for problem-solving, the more information processing is required. A longer fixation duration means that more time is required to correctly understand the relevant situation or object. In other words, if an operator experiences higher cognitive workload, the number of fixations and the fixation dwell time increase. The number of fixations and the fixation dwell time were found to be sensitive for the measurement of mental workload [74], [94]. More specifically, the dwell time can serve as an index of the resources required for information extraction from a single source [44]. Bellenkes et al. [95] found that dwells were longest on the most information-rich flight instrument and that dwells were much longer for novice than for expert pilots, reflecting the novices' greater workload. The authors also found that subjects with low expertise spent more time fixating on a single component than subjects with high expertise during complex diagnostic tasks in simulated NPP operations [47]. In addition, the eye fixation pattern (or visual scanning) can be used as a diagnostic index of the source of workload within a multi-element display environment [44]. The authors observed that more frequent and extended dwells were made on the more important instruments during the diagnostic tasks [47]. Bellenkes et al. [95] also found that long novice dwells were coupled with more frequent visits and hence served as a major "sink" for visual attention. Little time was left for novices to monitor other instruments, and as a result, their performance declined on tasks using those other instruments. Consequently, the eye fixation parameters can be effectively used for evaluating the strategic aspects of resource allocation. The evaluation of these measures should be performed by SMEs to find out valuable aspects. Hence, these measures are based on the expert-judgment referenced comparison. The authors performed an experimental study to investigate cognitive workload during complex diagnostic tasks in simulated NPP operations [78]. This study showed that eye movement related measures such as blink rate, blink duration, number of fixations, and fixation dwell time correlate with NASA-TLX and MCH scores. Hence, we conclude that continuous measures based on eye movement data are very useful tools for complementing the subjective rating measure.
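A minimal sketch of how the continuous indices named above (number of fixations, dwell time per area of interest, blink rate and duration) could be derived from eye tracking output. The `Fixation` record, the field names, and the AOI labels used in the example are assumptions for illustration, not the HUPESS data format.

```python
from dataclasses import dataclass

@dataclass
class Fixation:
    """One eye fixation: AOI name and start/end timestamps in ms."""
    aoi: str
    start_ms: int
    end_ms: int

def dwell_stats(fixations, duration_s):
    """Return ({aoi: (fixation count, mean dwell time in ms)},
    overall fixation rate in fixations per second)."""
    acc = {}
    for f in fixations:
        n, total = acc.get(f.aoi, (0, 0))
        acc[f.aoi] = (n + 1, total + (f.end_ms - f.start_ms))
    per_aoi = {aoi: (n, total / n) for aoi, (n, total) in acc.items()}
    return per_aoi, len(fixations) / duration_s

def blink_stats(blinks_ms, duration_s):
    """Blinks per minute and mean blink duration (ms) from a list of
    (start_ms, end_ms) blink intervals."""
    if not blinks_ms:
        return 0.0, 0.0
    mean_dur = sum(e - s for s, e in blinks_ms) / len(blinks_ms)
    return len(blinks_ms) / (duration_s / 60.0), mean_dur
```

Per-AOI dwell statistics computed this way can then be inspected by SMEs, e.g., to check whether the longer and more frequent dwells fall on the more important instruments, as discussed above.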

E. Teamwork

An NPP is operated by a crew, not by an individual operator. There are individual tasks which should be performed by the relevant operators, and there are some tasks which require the cooperation of the crew. The cooperative tasks should be appropriately divided and then allocated to the relevant operators to achieve the operational goal. The advanced MCR of APR-1400 is equipped with the large display panel designed to support team performance by providing a common reference display for discussions. The advanced MCR design also allows operators to be located nearer to

one another than in the conventional MCRs and to access the plant information from workstations allocated to the relevant operators for exclusive use. These interface changes are expected to improve operator performance by facilitating verbal and visual communication among the operators [88], [96] and thus improve teamwork. In order to evaluate teamwork, the BARS (behaviorally anchored rating scale) is used in this paper [88]. The BARS includes task focus/decision-making, coordination as a crew, communication, openness, and team spirit. In each of these components, several example behaviors (positive or negative) and anchors (or critical behaviors) indicating good/bad team interactions are identified by SMEs (usually a process expert and/or a human factors expert) during a test. The example behaviors and the anchors identified are used as criteria for the final (or overall) rating of teamwork by the SMEs after the test. Usually a 7-point scale (1–7) is used for the BARS ratings, with 7 being the best team interaction. The rating scale is not fixed, but the use of a 7-point scale is recommended, because the BARS results with a 7-point scale from an antecedent study for the APR-1400 [10] can be utilized as reference criteria. In this measure, attention should be focused on the findings which are considered to influence teamwork. Finally, the experts determine whether the teamwork is acceptable or not based on their experience and knowledge. Hence, the evaluation criterion of this measure is based on the expert-judgment referenced comparisons.
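As a sketch of how such observations might be captured for later rating, the following hypothetical logger records time-tagged positive/negative anchor behaviors per BARS dimension. The dimension names follow the list above; the class and its fields are illustrative, not part of the BARS instrument itself.

```python
from collections import defaultdict

BARS_DIMENSIONS = ("task focus/decision-making", "coordination as a crew",
                   "communication", "openness", "team spirit")

class BarsLog:
    """Time-tagged record of anchor behaviors observed by an SME,
    kept as supporting evidence for the final 7-point BARS ratings."""

    def __init__(self):
        # dimension -> list of (time_s, behavior, positive?)
        self.events = defaultdict(list)

    def note(self, t_s, dimension, behavior, positive):
        """Record one observed example behavior or anchor."""
        if dimension not in BARS_DIMENSIONS:
            raise ValueError(f"unknown BARS dimension: {dimension}")
        self.events[dimension].append((t_s, behavior, positive))

    def summary(self, dimension):
        """Counts of observed behaviors, to support (not replace)
        the SME's overall rating for that dimension."""
        obs = self.events[dimension]
        pos = sum(1 for _, _, p in obs if p)
        return {"observed": len(obs), "positive": pos,
                "negative": len(obs) - pos}
```

The final 1–7 rating remains an expert judgment; the log only makes the identified behaviors and their times available for the time line analysis discussed later.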

F. Anthropometric-Physiological Factors

Anthropometric and physiological factors include such concerns as the visibility and audibility of indications, the accessibility of control devices for operator reach and manipulation, and the design and arrangement of equipment [12]. Generally, many of these concerns are evaluated earlier in the design process with the HFE V&V checklist. Since the ISV is a kind of feedback step for design validation and improvement, attention should be focused on those anthropometric and physiological factors that can only be addressed in real or almost real (high-fidelity simulation) operating conditions, e.g., the ability of the operators to effectively use or manipulate various controls, displays, workstations, or consoles in an integrated manner [12]. Consequently, items related to these factors in the HFE V&V checklist are selected before the validation test and then reconfirmed during the validation test by SMEs. It should also be checked whether there are anthropometric and physiological problems caused by unexpected design faults, which can be done during the test or after the test with audio/video (AV) recording data. The evaluation criterion of this measure is hence based on both the requirement-referenced (HFE V&V checklist) and the expert-judgment referenced comparisons.
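The selection-and-reconfirmation step described above can be sketched as a simple filter over checklist items. The item structure and its fields are hypothetical; an actual HFE V&V checklist would carry the requirement text and acceptance criteria.

```python
from dataclasses import dataclass

@dataclass
class ChecklistItem:
    """One HFE V&V checklist item (fields are illustrative)."""
    item_id: str
    description: str
    dynamic_only: bool       # judgeable only under (near-)real operating conditions
    confirmed: bool = False  # reconfirmed by an SME during the validation test

def select_for_isv(items):
    """Items that must be re-examined during the validation test,
    i.e., those addressable only in dynamic conditions."""
    return [it for it in items if it.dynamic_only]

def unresolved(items):
    """Selected items not yet reconfirmed, to be rechecked against
    the AV recording after the test."""
    return [it for it in select_for_isv(items) if not it.confirmed]
```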

IV. DISCUSSIONS: STRATEGIES FOR EFFECTIVE HUMAN PERFORMANCE EVALUATION

In this paper, human performance measures are developed for human factors validation, the so-called ISV, in the advanced MCR of the APR-1400. The measures for plant performance, personnel task performance, situation awareness, workload, teamwork, and anthropometric/physiological factors are thought to provide multilateral information for the ISV. Preferably, the evaluation of human performance should be performed in an integrated manner to produce results with the most information. In order for the human performance to be effectively evaluated, the times of the operators' activities should be recorded during the validation test. The operators' activities include the bottom-rank tasks considered in the evaluation of personnel task performance, the example behaviors and the critical behaviors in the teamwork evaluation, and activities belonging to the anthropometric and physiological factors. The time-tagging can be easily conducted with a computerized system. All that the SMEs (as evaluators) have to do is check items listed in a computerized system based on their observation. The computerized system can automatically record the checked items and the relevant times. This time-tagged information can facilitate the integrated evaluation of human performance. Firstly, the plant performance can be connected to the personnel task performance with the time-tagged information. A computerized system for human performance evaluation can be connected with the relevant simulator to acquire logging data representing the plant state (e.g., process parameters and alarms) and the control activities performed by operators. Whether the plant system is operated well or not can be evaluated by observing and evaluating the process parameters. Even though the plant performance is maintained within acceptable ranges, design faults or shortcomings may require unnecessary work or an inappropriate manner of operation. This kind of problem can be identified by analyzing the system state together with the operators' activities. If the operators' activities are time-tagged, the system state can be analyzed along with the operators' activities. Since the logging data provided by the simulator are time-tagged as well, inappropriate or unnecessary activities performed by the operators can be compared with the logging data representing the plant state.
This kind of analysis can provide diagnostic information on the operators' activities. For example, if the operators have to navigate the workstation or move around in a scrambled way in order to operate the NPP within acceptable ranges, the MMI design of the MCR should be considered inappropriate and revisions should follow, even though the plant performance is maintained within acceptable ranges. Secondly, the eye tracking measures for the SA and workload evaluation can be connected to the personnel task performance with the time-tagged information. The eye tracking measures can be analyzed for each of the tasks defined in the optimal solution. This means that we can evaluate the SA and workload in each task step by considering the cognitive aspects specified by the task attribute, which is expected to increase the level of detail of the measurement. Also, the eye fixation data can be used to determine whether the operators are monitoring and detecting the environment correctly or not. This information can be used for the evaluation of personnel task performance. Thirdly, the evaluation of the personnel task performance, the teamwork, and the anthropometric/physiological factors can be analyzed in an integrated manner with the time-tagged information, which is expected to provide diagnostic information for the human performance evaluation. Teamwork is required in the context of the operators' tasks, many of which are a series of cognitive activities. The example behaviors and the critical behaviors attributable to teamwork can be investigated in the series of the operators' tasks with time line analysis. Hence, it can be analyzed whether

Fig. 4. Human performance evaluation with HUPESS.

behaviors attributable to teamwork contribute to good or poor performance of the operators' tasks, or whether overloaded operators' tasks inhibit teamwork. Also, anthropometric/physiological problems that were not expected in advance but were observed during a test can be analyzed in the context of the operators' tasks, which may be useful for analyzing the cause of those problems. Finally, AV recording data can be effectively utilized together with the real-time evaluation data. The AV recording data can provide information which may have been missed or not processed by SMEs during a test.
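The time line analysis described above depends on matching time-tagged operator activities against the simulator's time-tagged logging data. A minimal sketch, assuming simple (time, description) records and an arbitrary 30-second window; both assumptions are for illustration only:

```python
import bisect

def activities_near(plant_events, operator_acts, window_s=30):
    """For each time-tagged plant event, collect the operator
    activities logged within +/- window_s seconds, so that SMEs can
    review them together in a time line analysis.

    plant_events:  list of (t_s, description) from the simulator log
    operator_acts: time-sorted list of (t_s, description) checked by SMEs
    """
    times = [t for t, _ in operator_acts]
    aligned = {}
    for t, ev in plant_events:
        lo = bisect.bisect_left(times, t - window_s)
        hi = bisect.bisect_right(times, t + window_s)
        aligned[(t, ev)] = operator_acts[lo:hi]
    return aligned
```

Such an alignment makes it straightforward to ask, for a given plant-state excursion, which operator activities (or which BARS-relevant behaviors) surrounded it.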

In addition, considering that the operators' tasks in NPPs are generally based on goal-oriented procedures, the operators' tasks are analyzed and then constructed into an optimal solution in a hierarchical form. The optimal solution consists of the main goal, the sub-goals, the observable cognitive tasks, and the sub-tasks. The relative importance (or weight value) of the elements in the optimal solution is obtained by using the AHP. This means that the operators' tasks can be ranked by the weight values of the tasks. Hence, we can allocate the analysis resources according to the relative importance of the tasks. For example, if a specific task in a context of the operators' tasks is more important than other tasks, we can analyze the specific task with more resources (e.g., more time or more additional consideration can be allocated to the analysis). It is expected that the analysis of the human performance in a test takes a lot of time, and moreover, many tests covering a sufficient spectrum of the operational situations in the NPP should be performed to validate the MMI design. Consequently, the importance-based approach is thought to be an efficient strategy.
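The AHP weight values mentioned above are conventionally obtained as the principal eigenvector of a reciprocal pairwise-comparison matrix [20]. A small sketch using power iteration, with a hypothetical comparison of three sub-tasks (the judgment values are invented for illustration):

```python
def ahp_weights(pairwise, iters=100):
    """Relative weights from an AHP pairwise-comparison matrix via
    power iteration toward the principal eigenvector.

    pairwise[i][j] is the judged importance of element i over j
    (reciprocal matrix: pairwise[j][i] == 1 / pairwise[i][j]).
    """
    n = len(pairwise)
    w = [1.0 / n] * n
    for _ in range(iters):
        w = [sum(pairwise[i][j] * w[j] for j in range(n)) for i in range(n)]
        s = sum(w)
        w = [x / s for x in w]  # normalize so the weights sum to 1
    return w

# Hypothetical judgments: task 0 is 3x as important as task 1
# and 5x as important as task 2.
m = [[1, 3, 5],
     [1 / 3, 1, 2],
     [1 / 5, 1 / 2, 1]]
print([round(x, 3) for x in ahp_weights(m)])
```

The resulting weight vector ranks the tasks, and analysis resources can then be allocated in proportion to those weights.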

The authors have been developing a computerized system for the human performance evaluation, called the "HUman Performance Evaluation Support System (HUPESS)". This system is developed based on the measures and the strategy considered in this paper. Hence, the authors expect that the human performance can be evaluated in an integrated and effective way with the HUPESS. The HUPESS is briefly introduced here. The HUPESS is interfaced with the APR-1400 simulator installed in the dynamic mock-up, as shown in Fig. 4.

The HUPESS acquires the simulator logging data during a test. The logging data include data representing the plant system events and status (e.g., status changes of controlled components, alarms and flags, and process variables/parameters) and data representing operator activities (e.g., display navigation, alarm control, soft control, and CPS (computerized procedure system) control/navigation). The plant performance is evaluated with the logging data by the HUPESS during the test. The operators operate the plant system during the test and then evaluate the KSAX and the NASA-TLX after the test. The SMEs observe the operators' activities during the test, checking the activities related to the personnel task performance, the teamwork, and the anthropometric/physiological factors, and complete the evaluations of the personnel task performance and the BARS based on the observations after the test. The HUPESS includes an eye tracking system (ETS) and an AV recording system. The HUPESS acquires the eye tracking data from the ETS and processes them into the measures for the SA and the workload evaluations. Also, the test is recorded by the AV system. The data observed, checked, evaluated, and recorded during the test can be further evaluated by time line analysis in an integrated way. In addition, the HUPESS has useful functions such as various statistical analyses and a convenient reporting function. The HUPESS was designed to be effectively used for the ISV in the advanced MCR of the APR-1400 through reviews by SMEs including one process expert and two human factors experts. Consequently, the HUPESS is expected to be used as an effective tool for the ISV in the advanced MCR of the Shin Kori 3 & 4 NPPs (APR-1400 type), which are under construction in South Korea.

V. SUMMARY AND CONCLUSIONS

The MMI design in the advanced MCR of APR-1400 can be validated through performance-based tests to determine whether it acceptably supports safe operation of the plant. In this paper, plant performance, personnel task performance, situation awareness, workload, teamwork, and anthropometric/physiological factors are considered as factors for the human performance evaluation. For the development of measures for each of these factors, attention is paid to considerations and constraints such as the changed environment in an advanced MCR, the need for a practical and economic evaluation, and the suitability of evaluation criteria. Measures generally used in various industries and empirically proven to be useful are adopted as main measures with some modifications. In addition, complementary measures are developed in order to overcome some of the limitations associated with the main measures. The development of the measures is addressed based on theoretical and empirical background and also on the regulatory guidelines for the ISV, such as NUREG-0711 and NUREG/CR-6393. Consequently, we conclude that the measures developed in this paper can be effectively used for the ISV in the advanced MCR of the APR-1400. Also, a computerized system for the human performance evaluation, called HUPESS, is briefly introduced. The HUPESS is being developed based on the measures developed and the strategies discussed in this paper. The HUPESS is expected to be used as an effective tool for the ISV in the advanced MCR of the Shin Kori 3 & 4 NPPs (APR-1400 type), which are under construction in South Korea.

ACKNOWLEDGMENT

The authors would like to thank J. C. Ra and S. B. Jo of Korea Power Engineering Company (KOPEC) and Y. C. Shin and J. H. Kim of Korea Hydro and Nuclear Power (KHNP) for

their continuous support and valuable comments. The authors also express their gratitude to Prof. S. N. Byun of Kyunghee University, Prof. J. H. Park of Hankyong National University, and Dr. S. N. Choi of the Korea Institute of Nuclear Safety (KINS) for their valuable advice, comments, and encouragement.

REFERENCES

[1] Functional Criteria for Emergency Response Facilities, NUREG-0696, US Nuclear Regulatory Commission, Washington, DC, 1980.

[2] Clarification of TMI Action Plan Requirements, NUREG-0737, US Nuclear Regulatory Commission, Washington, DC, 1980.

[3] J. M. O'Hara, W. S. Brown, P. M. Lewis, and J. J. Persensky, Human-System Interface Design Review Guidelines, NUREG-0700, Rev. 2, US NRC, 2002.

[4] J. M. O'Hara, J. C. Higgins, J. J. Persensky, P. M. Lewis, and J. P. Bongarra, Human Factors Engineering Program Review Model, NUREG-0711, Rev. 2, US NRC, 2004.

[5] M. Barriere, D. Bley, S. Cooper, J. Forester, A. Kolaczkowski, W. Luckas, G. Parry, A. Ramey-Smith, C. Thompson, D. Whitehead, and J. Wreathall, Technical Basis and Implementation Guidelines for a Technique for Human Event Analysis (ATHEANA), NUREG-1624, Rev. 1, US NRC, 2000.

[6] S. H. Chang, S. S. Choi, J. K. Park, G. Heo, and H. G. Kim, "Development of an advanced human-machine interface for next generation nuclear power plants," Reliab. Eng. Syst. Safety, vol. 64, pp. 109–126, 1999.

[7] I. S. Kim, "Computerized systems for on-line management of failures: A state-of-the-art discussion of alarm systems and diagnostic systems applied in the nuclear industry," Reliab. Eng. Syst. Safety, vol. 44, pp. 279–295, 1994.

[8] H. Yoshikawa, T. Nakagawa, Y. Nakatani, T. Furuta, and A. Hasegawa, "Development of an analysis support system for man-machine system design information," Contr. Eng. Practice, vol. 5, no. 3, pp. 417–425, 1997.

[9] H. Yoshikawa, "Human-machine interaction in nuclear power plants," Nucl. Eng. Technol., vol. 37, no. 2, pp. 151–158, 2005.

[10] S. J. Cho et al., "The evaluation of suitability for the design of soft control and safety console for APR1400," KHNP, Daejeon, Korea, Tech. Rep. A02NS04.S2003.EN8, 2003.

[11] T. B. Sheridan, Telerobotics, Automation, and Human Supervisory Control. Cambridge, MA: MIT Press, 1992.

[12] J. M. O'Hara, W. F. Stubler, J. C. Higgins, and W. S. Brown, Integrated System Validation: Methodology and Review Criteria, NUREG/CR-6393, US NRC, 1997.

[13] G. Andresen and A. Drøivoldsmo, Human Performance Assessment: Methods and Measures, HPR-353, OECD Halden Reactor Project, 2000.

[14] P. Ø. Braarud, Subjective Task Complexity in Control Room, HWR-621, OECD Halden Reactor Project, 2000.

[15] P. Ø. Braarud and H. Brendryen, Task Demand, Task Management, and Teamwork, HWR-657, OECD Halden Reactor Project, 2001.

[16] A. Drøivoldsmo et al., Continuous Measure of Situation Awareness and Workload, HWR-539, OECD Halden Reactor Project, 1998.

[17] M. Moracho, Plant Performance Assessment System (PPAS) for Crew Performance Evaluations: Lessons Learned from an Alarm Study Conducted in HAMMLAB, HWR-504, OECD Halden Reactor Project, 1998.

[18] G. Skraning, Jr., The Operator Performance Assessment System (OPAS), HWR-538, OECD Halden Reactor Project, 1998.

[19] B. S. Sim et al., The Development of Human Factors Technologies: The Development of Human Factors Experimental Evaluation Techniques, KAERI/RR-1693, Daejeon, Korea, 1996.

[20] T. L. Saaty, The Analytic Hierarchy Process. New York: McGraw-Hill, 1980.

[21] E. Hollnagel, Cognitive Reliability and Error Analysis Method. Amsterdam, The Netherlands: Elsevier, 1998.

[22] J. Kemeny, The Need for Change: The Legacy of TMI, Report of the President's Commission on the Accident at Three Mile Island. New York: Pergamon, 1979.

[23] M. J. Adams, Y. J. Tenney, and R. W. Pew, "Situation awareness and the cognitive management of complex systems," Human Factors, vol. 37, no. 1, pp. 85–104, 1995.

[24] F. T. Durso and S. Gronlund, "Situation awareness," in The Handbook of Applied Cognition, F. T. Durso, R. Nickerson, R. W. Schvaneveldt, S. T. Dumais, D. S. Lindsay, and M. T. H. Chi, Eds. New York: Wiley, 1999, pp. 284–314.

[25] M. R. Endsley and D. J. Garland, Eds., Situation Awareness: Analysis and Measurement. Mahwah, NJ: Erlbaum, 2001.

[26] C. P. Gibson and A. J. Garrett, "Toward a future cockpit: The prototyping and pilot integration of the mission management aid (MMA)," presented at Situational Awareness in Aerospace Operations, Copenhagen, Denmark, 1990, unpublished.

[27] R. M. Taylor, "Situational awareness rating technique (SART): The development of a tool for aircrew systems design," presented at Situational Awareness in Aerospace Operations, Copenhagen, Denmark, 1990, unpublished.

[28] M. M. Wesler, W. P. Marshak, and M. M. Glumm, "Innovative measures of accuracy and situational awareness during landing navigation," presented at the Human Factors and Ergonomics Society 42nd Annual Meeting, 1998, unpublished.

[29] M. R. Endsley, "Toward a theory of situation awareness in dynamic systems," Human Factors, vol. 37, no. 1, pp. 32–64, 1995.

[30] D. H. Lee and H. C. Lee, "A review on measurement and applications of situation awareness for an evaluation of Korea next generation reactor operator performance," IE Interface, vol. 13, no. 4, pp. 751–758, 2000.

[31] R. E. Nisbett and T. D. Wilson, "Telling more than we can know: Verbal reports on mental processes," Psycholog. Rev., vol. 84, pp. 231–259, 1977.

[32] M. R. Endsley, "Direct measurement of situation awareness: Validity and use of SAGAT," in Situation Awareness Analysis and Measurement, M. R. Endsley and D. J. Garland, Eds. Mahwah, NJ: Lawrence Erlbaum, 2000.

[33] M. R. Endsley, "Situation awareness measurement in test and evaluation," in Handbook of Human Factors Testing and Evaluation, T. G. O'Brien and S. G. Charlton, Eds. Mahwah, NJ: Lawrence Erlbaum, 1996.

[34] N. B. Sarter and D. D. Woods, "Situation awareness: A critical but ill-defined phenomenon," Int. J. Aviation Psychol., vol. 1, no. 1, pp. 45–57, 1991.

[35] R. W. Pew, "The state of situation awareness measurement: Heading toward the next century," in Situation Awareness Analysis and Measurement, M. R. Endsley and D. J. Garland, Eds. Mahwah, NJ: Lawrence Erlbaum, 2000.

[36] M. R. Endsley, "A methodology for the objective measurement of situation awareness," in Situational Awareness in Aerospace Operations, Neuilly-sur-Seine, France, 1990, AGARD-CP-478, pp. 1/1–1/9, NATO-AGARD.

[37] M. R. Endsley, "The out-of-the-loop performance problem and level of control in automation," Human Factors, vol. 37, no. 2, pp. 381–394, 1995.

[38] S. G. Collier and K. Folleso, "SACRI: A measure of situation awareness for nuclear power plant control rooms," in Experimental Analysis and Measurement of Situation Awareness, D. J. Garland and M. R. Endsley, Eds. Daytona Beach, FL: Embry-Riddle Univ. Press, 1995, pp. 115–122.

[39] D. N. Hogg, K. Follesø, F. S. Volden, and B. Torralba, "Development of a situation awareness measure to evaluate advanced alarm systems in nuclear power plant control rooms," Ergonomics, vol. 38, no. 11, pp. 2394–2413, 1995.

[40] M. L. Fracker and M. A. Vidulich, "Measurement of situation awareness: A brief review," in Proc. 11th Congr. Int. Ergonomics Association: Designing for Everyone, Y. Queinnec and F. Daniellou, Eds. London, U.K.: Taylor & Francis, 1991, pp. 795–797.

[41] M. R. Endsley, "Measurement of situation awareness in dynamic systems," Human Factors, vol. 37, no. 1, pp. 65–84, 1995.

[42] G. F. Wilson, "Strategies for psychophysiological assessment of situation awareness," in Situation Awareness Analysis and Measurement, M. R. Endsley and D. J. Garland, Eds. Mahwah, NJ: Lawrence Erlbaum, 2000.

[43] R. M. Taylor, "Situational awareness rating technique (SART): The development of a tool for aircrew systems design," in Situational Awareness in Aerospace Operations, Neuilly-sur-Seine, France, 1990, AGARD-CP-478, pp. 3/1–3/17, NATO-AGARD.

[44] C. D. Wickens and J. G. Hollands, Engineering Psychology and Human Performance, 3rd ed. Englewood Cliffs, NJ: Prentice Hall, 2000.

[45] J. M. O'Hara, J. C. Higgins, W. F. Stubler, and J. Kramer, Computer-Based Procedure Systems: Technical Basis and Human Factors Review Guidance, NUREG/CR-6634, US NRC, 2002.

[46] M. C. Kim and P. H. Seong, "A computational model for knowledge-driven monitoring of nuclear power plant operators based on information theory," Reliab. Eng. Syst. Safety, vol. 91, pp. 283–291, 2006.

[47] J. S. Ha and P. H. Seong, "An experimental study: EEG analysis with eye fixation data during complex diagnostic tasks in nuclear power plants," presented at the Int. Symp. Future I&C for NPPs (ISOFIC), Chungmu, Korea, 2005.

[48] C. D. Wickens, “Workload and situation awareness: An analogy of his-tory and implications,” Insight, vol. 94, 1992.

[49] N. Moray, Mental Workload: Its Theory and Measurement. NewYork: Plenum Press, 1979.

[50] P. Hancock and N. Meshkati, in Human Mental Workload, NY, 1988,North-Holland.

[51] R. D. O’Donnell and F. T. Eggemeier, “Workload assessment method-ology,” in Handbook of Perception and Human Performance: Vol. II.Cognitive Processes and Performance, K. R. Boff, L. Kaufman, and J.Thomas, Eds. New York: Wiley, 1986.

[52] D. A. Norman and D. G. Bobrow, “On data-limited and resource-lim-ited process,” Cognit. Psychol., vol. 7, pp. 44–64, 1975.

[53] R. Williges and W. W. Wierwille, “Behavioral measures of aircrewmental workload,” Human Factors, vol. 21, pp. 549–574, 1979.

[54] S. G. Charlton, , S. G. Charlton and T. G. O. Brien, Eds., “Measurementof cognitive states in test and evaluation,” in Handbook of Human Fac-tors Testing and Evaluation. Mahwah, NJ: Lawrence Erlbaum , 2002.

[55] F. T. Eggemeier and G. F. Wilson, “Subjective and performance-basedassessment of workload in multi-task environments,” in Multiple TaskPerformance, D. Damos, Ed. London, U.K.: Taylor & Francis, 1991.

[56] S. Rubio, E. Diaz, J. Martin, and J. M. Puente, “Evaluation of subjective mental workload: A comparison of SWAT, NASA-TLX, and workload profile,” Appl. Psychol., vol. 53, pp. 61–86, 2004.

[57] W. W. Wierwille, M. Rahimi, and J. G. Casali, “Evaluation of 16 measures of mental workload using a simulated flight task emphasizing mediational activity,” Human Factors, vol. 27, pp. 489–502, 1985.

[58] G. Johannsen, N. Moray, R. Pew, J. Rasmussen, A. Sanders, and C. Wickens, “Final report of the experimental psychology group,” in Mental Workload: Its Theory and Measurement, N. Moray, Ed. New York: Plenum, 1979.

[59] N. Moray, “Subjective mental workload,” Human Factors, vol. 24, pp. 25–40, 1982.

[60] S. G. Hill, H. P. Iavecchia, J. C. Byers, A. C. Bittner, A. L. Zaklad, and R. E. Christ, “Comparison of four subjective workload rating scales,” Human Factors, vol. 34, pp. 429–440, 1992.

[61] B. Sterman and C. Mann, “Concepts and applications of EEG analysis in aviation performance evaluation,” Biol. Psychol., vol. 40, pp. 115–130, 1995.

[62] A. F. Kramer, E. J. Sirevaag, and R. Braune, “A psychophysiological assessment of operator workload during simulated flight missions,” Human Factors, vol. 29, no. 2, pp. 145–160, 1987.

[63] J. Brookings, G. F. Wilson, and C. Swain, “Psychophysiological responses to changes in workload during simulated air traffic control,” Biol. Psychol., vol. 42, pp. 361–378, 1996.

[64] K. A. Brookhuis and D. D. Waard, “The use of psychophysiology to assess driver status,” Ergonomics, vol. 36, pp. 1099–1110, 1993.

[65] E. Donchin and M. G. H. Coles, “Is the P300 component a manifestation of cognitive updating?,” Behav. Brain Sci., vol. 11, pp. 357–427, 1988.

[66] L. C. Boer and J. A. Veltman, “From workload assessment to system improvement,” presented at the NATO Workshop on Technologies in Human Engineering Testing and Evaluation, Brussels, 1997, unpublished.

[67] A. H. Roscoe, “Heart rate monitoring of pilots during steep-gradient approaches,” Aviation, Space Environmental Med., vol. 46, pp. 1410–1415, 1975.

[68] R. Rau, “Psychophysiological assessment of human reliability in a simulated complex system,” Biol. Psychol., vol. 42, pp. 287–300, 1996.

[69] A. F. Kramer and T. Weber, “Application of psychophysiology to human factors,” in Handbook of Psychophysiology, J. T. Cacioppo et al., Eds. Cambridge, U.K.: Cambridge Univ. Press, 2000, pp. 794–814.

[70] P. G. A. M. Jorna, “Spectral analysis of heart rate and psychological state: A review of its validity as a workload index,” Biol. Psychol., vol. 34, pp. 237–257, 1992.

[71] L. J. M. Mulder, “Measurement and analysis methods of heart rate and respiration for use in applied environments,” Biol. Psychol., vol. 34, pp. 205–236, 1992.

[72] S. W. Porges and E. A. Byrne, “Research methods for the measurement of heart rate and respiration,” Biol. Psychol., vol. 34, pp. 93–130, 1992.

[73] G. F. Wilson, “Applied use of cardiac and respiration measures: Practical considerations and precautions,” Biol. Psychol., vol. 34, pp. 163–178, 1992.

[74] Y. Lin, W. J. Zhang, and L. G. Watson, “Using eye movement parameters for evaluating human-machine interface frameworks under normal control operation and fault detection situations,” Int. J. Human Computer Studies, vol. 59, pp. 837–873, 2003.

[75] J. A. Veltman and A. W. K. Gaillard, “Physiological indices of workload in a simulated flight task,” Biol. Psychol., vol. 42, pp. 323–342, 1996.

[76] L. O. Bauer, R. Goldstein, and J. A. Stern, “Effects of information-processing demands on physiological response patterns,” Human Factors, vol. 29, pp. 219–234, 1987.

[77] J. H. Goldberg and X. P. Kotval, “Eye movement-based evaluation of the computer interface,” in Advances in Occupational Ergonomics and Safety, S. K. Kumar, Ed. Amsterdam, The Netherlands: IOS Press, 1998.

[78] C. H. Ha and P. H. Seong, “Investigation on relationship between information flow rate and mental workload of accident diagnosis tasks in NPPs,” IEEE Trans. Nucl. Sci., vol. 53, no. 3, pp. 1450–1459, Jun. 2006.

[79] [Online]. Available: http://www.seeingmachines.com/

[80] [Online]. Available: http://www.smarteye.se/home.html

[81] R. Shively, V. Battiste, J. Matsumoto, D. Pepiton, M. Bortolussi, and S. Hart, “In flight evaluation of pilot workload measures for rotorcraft research,” in Proc. 4th Symp. Aviation Psychology, Columbus, OH, 1987, pp. 637–643.

[82] V. Battiste and M. Bortolussi, “Transport pilot workload: A comparison of two subjective techniques,” in Proc. Human Factors Society 32nd Ann. Meeting, Santa Monica, CA, 1988, pp. 150–154.

[83] M. Nataupsky and T. S. Abbott, “Comparison of workload measures on computer-generated primary flight displays,” in Proc. Human Factors Society 31st Ann. Meeting, Santa Monica, CA, 1987, pp. 548–552.

[84] P. S. Tsang and W. W. Johnson, “Cognitive demand in automation,” Aviation, Space, Environ. Med., vol. 60, pp. 130–135, 1989.

[85] A. V. Bittner, J. C. Byers, S. G. Hill, A. L. Zaklad, and R. E. Christ, “Generic workload ratings of a mobile air defense system (LOS-F-H),” in Proc. Human Factors Society 33rd Ann. Meeting, Santa Monica, CA, 1989, pp. 1476–1480.

[86] S. G. Hill, J. C. Byers, A. L. Zaklad, and R. E. Christ, “Workload assessment of a mobile air defense system,” in Proc. Human Factors Society 32nd Ann. Meeting, Santa Monica, CA, 1988, pp. 1068–1072.

[87] J. C. Byers, A. V. Bittner, S. G. Hill, A. L. Zaklad, and R. E. Christ, “Workload assessment of a remotely piloted vehicle (RPV) system,” in Proc. Human Factors Society 32nd Ann. Meeting, Santa Monica, CA, 1988, pp. 1145–1149.

[88] A. Sebok, “Team performance in process control: Influences of interface design and staffing,” Ergonomics, vol. 43, no. 8, pp. 1210–1236, 2000.

[89] S. N. Byun and S. N. Choi, “An evaluation of the operator mental workload of advanced control facilities in Korea next generation reactor,” J. Korean Inst. Indust. Eng., vol. 28, no. 2, pp. 178–186, 2002.

[90] C. Plott, T. Engh, and V. Barnes, Technical Basis for Regulatory Guidance for Assessing Exemption Requests From the Nuclear Power Plant Licensed Operator Staffing Requirements Specified in 10 CFR 50.54, NUREG/CR-6838, US NRC, 2004.

[91] S. G. Hart and L. E. Staveland, “Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research,” in Human Mental Workload, P. A. Hancock and N. Meshkati, Eds. Amsterdam, The Netherlands: North-Holland, 1988.

[92] J. A. Stern, L. C. Walrath, and R. Goldstein, “The endogenous eyeblink,” Psychophysiology, vol. 21, pp. 22–23, 1984.

[93] Y. Tanaka and K. Yamaoka, “Blink activity and task difficulty,” Perceptual Motor Skills, vol. 77, pp. 55–66, 1993.

[94] J. H. Goldberg and X. P. Kotval, “Eye movement-based evaluation of the computer interface,” in Advances in Occupational Ergonomics and Safety, S. K. Kumar, Ed. Amsterdam, The Netherlands: IOS Press, 1998.

[95] A. H. Bellenkes, C. D. Wickens, and A. F. Kramer, “Visual scanning and pilot expertise: The role of attentional flexibility and mental model development,” Aviation, Space, Environment. Med., vol. 68, no. 7, pp. 569–579, 1997.

[96] E. M. Roth, R. J. Mumaw, and W. F. Stubler, “Human factors evaluation issues for advanced control rooms: A research agenda,” Proc. IEEE, pp. 254–265, 1993.

[97] J. M. O’Hara and R. E. Hall, “Advanced control rooms and crew performance issues: Implications for human reliability,” IEEE Trans. Nucl. Sci., vol. 39, no. 4, pp. 919–923, Aug. 1992.

[98] P. Ø. Braarud and G. Skraaning, Jr., “Insights from a benchmark integrated system validation of a modernized NPP control room: Performance measurement and the comparison to the benchmark system,” in NPIC&HMIT 2006, Albuquerque, NM, Nov. 2006, pp. 12–16.
