

30. Automating Errors and Conflicts Prognostics and Prevention

Xin W. Chen, Shimon Y. Nof

Errors and conflicts exist in many systems. A fundamental question from industries is: How can errors and conflicts in systems be eliminated by automation, or can we at least use automation to minimize their damage? The purpose of this chapter is to illustrate the theoretical background and applications of how to automatically prevent errors and conflicts with various devices, technologies, methods, and systems. Eight key functions to prevent errors and conflicts are identified, and their theoretical background and applications in both production and service are explained with examples. As systems and networks become larger and more complex, such as global enterprises and the Internet, error and conflict prognostics and prevention become more important and challenging; the focus is shifting from passive response to proactive prognostics and prevention. Additional theoretical developments and implementation efforts are needed to advance the prognostics and prevention of errors and conflicts in many real-world applications.

30.1 Definitions ........................................... 503
30.2 Error Prognostics and Prevention Applications .................. 506
     30.2.1 Error Detection in Assembly and Inspection ........... 506
     30.2.2 Process Monitoring and Error Management ............. 506
     30.2.3 Hardware Testing Algorithms ....................... 507
     30.2.4 Error Detection in Software Design ................. 509
     30.2.5 Error Detection and Diagnostics in Discrete-Event Systems ... 510
     30.2.6 Error Detection in Service and Healthcare Industries ...... 511
     30.2.7 Error Detection and Prevention Algorithms for Production and Service Automation ... 511
     30.2.8 Error-Prevention Culture (EPC) .................... 512
30.3 Conflict Prognostics and Prevention ....................... 512
30.4 Integrated Error and Conflict Prognostics and Prevention ....... 513
     30.4.1 Active Middleware ............................... 513
     30.4.2 Conflict and Error Detection Model ................. 514
     30.4.3 Performance Measures ............................ 515
30.5 Error Recovery and Conflict Resolution .................... 515
     30.5.1 Error Recovery ................................. 515
     30.5.2 Conflict Resolution ............................. 520
30.6 Emerging Trends ..................................... 520
     30.6.1 Decentralized and Agent-Based Error and Conflict Prognostics and Prevention ... 520
     30.6.2 Intelligent Error and Conflict Prognostics and Prevention ... 521
     30.6.3 Graph and Network Theories ....................... 521
     30.6.4 Financial Models for Prognostics Economy ............ 521
30.7 Conclusion ......................................... 521
References ............................................ 522

30.1 Definitions

All humans commit errors ("To err is human") and encounter conflicts. In the context of automation, there are two main questions: (1) Does automation commit errors and encounter conflicts? (2) Can automation help humans prevent errors and eliminate conflicts? All human-made automation includes human-committed errors and conflicts, for example, human programming errors, design errors, and conflicts between human planners. Two automation systems, designed separately by different human teams, will encounter conflicts when they are expected to collaborate, for instance, the need for communication protocol standards to enable computers to interact automatically. Some errors and conflicts are inherent to automation, similar to all human-made creations, for instance, a robot mechanical structure that collapses under weight overload.

504 Part C Automation Design: Theory, Elements, and Methods

Table 30.1 Examples of errors and conflicts in production automation

Errors:
• A robot drops a circuit board while moving it between two locations
• A machine punches two holes on a metal sheet while only one is needed, because the size of the metal sheet is recognized incorrectly by the vision system
• A lathe stops processing a shaft due to a power outage
• The server of a computer-integrated manufacturing system crashes due to high temperature
• A facility layout generated by a software program cannot be implemented due to irregular shapes

Conflicts:
• Two numerically controlled machines request help from the same operator at the same time
• Three different software packages are used to generate an optimal schedule of jobs for a production facility; the schedules generated are totally different
• Two automated guided vehicles collide
• A DWG (drawing) file prepared by an engineer with AutoCAD cannot be opened by another engineer with the same software
• Overlapping workspaces defined by two cooperating robots

An error is any input, output, or intermediate result that has occurred or will occur in a system and does not meet the system specification, expectation, or comparison objective. A conflict is an inconsistency between different units' goals, plans, tasks, or other activities in a system. A system usually has multiple units, some of which collaborate, cooperate, and/or coordinate to complete tasks. The most important difference between an error and a conflict is that an error can involve only one unit, whereas a conflict involves two or more units in a system. An error at a unit may cause other errors or conflicts, for instance, a workstation that cannot provide the required number of products to an assembly line (a conflict) because one machine at the workstation breaks down (an error). Similarly, a conflict may cause other errors and conflicts, for instance, a machine that did not receive required products (an error) because the automated guided vehicles that carry the products collided when they were moving toward each other on the same path (a conflict). These phenomena, errors leading to other errors or conflicts and conflicts leading to other errors or conflicts, are called error and conflict propagation.

Fig. 30.1a–f Errors and conflicts in a pin insertion task: (a) successful insertion; (b–f) unsuccessful insertions, which are (1) errors if the pin and the two other components are considered as one unit in a system, or (2) conflicts if the pin is a unit and the two other components are considered as another unit in a system [30.1]

Errors and conflicts are different but related. The definition of the two terms is often subject to the understanding and modeling of a system and its units. Mathematical equations can help define errors and conflicts. An error is defined as

    ∃E[u_{r,i}(t)],   if ϑ_i(t) −Dissatisfy→ con_r(t) .   (30.1)

E[u_{r,i}(t)] is an error, u_i(t) is unit i in a system at time t, ϑ_i(t) is unit i's state at time t that describes what has occurred with unit i by time t, con_r(t) denotes constraint r in the system at time t, and −Dissatisfy→ denotes that a constraint is not satisfied. Similarly, a conflict is defined as

    ∃C[n_r(t)],   if ϑ_i(t) −Dissatisfy→ con_r(t) .   (30.2)

C[n_r(t)] is a conflict and n_r(t) is a network of units that need to satisfy con_r(t) at time t. The use of constraints helps define errors and conflicts unambiguously. A constraint is the system specification, expectation, comparison objective, or acceptable difference between different units' goals, plans, tasks, or other activities. Tables 30.1 and 30.2 illustrate errors and conflicts in automation with some typical examples. There are also human errors and conflicts that exist in automation systems. Figure 30.1 describes the difference between errors and conflicts in pin insertion.

Automating Errors and Conflicts Prognostics and Prevention 30.1 Definitions 505

Table 30.2 Examples of errors and conflicts in service automation

Errors:
• The engine of an airplane shuts down unexpectedly during the flight
• A patient's electronic medical records are accidentally deleted during system recovery
• A pacemaker stops working
• Traffic lights go off due to lightning
• A vending machine does not deliver drinks or snacks after the payment
• Automatic doors do not open
• An elevator stops between two floors
• A cellphone automatically initiates phone calls due to a software glitch

Conflicts:
• The time between two flights in an itinerary generated by an online booking system is too short for the transition from one flight to the other
• A ticket machine sells more tickets than the number of available seats
• An ATM dispenses $250 when a customer withdraws $260
• Translation software incorrectly interprets text
• Two surgeries are scheduled in the same room due to a glitch in a sensor that determines if the room is empty
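Definitions (30.1) and (30.2) can be read as a constraint check. The sketch below is not from the chapter; the units, states, and constraints are hypothetical. It classifies a violated constraint as an error when the constraint binds a single unit, and as a conflict when it binds a network of two or more units:

```python
# Illustrative sketch of definitions (30.1) and (30.2): a dissatisfied
# constraint over one unit is an error E[u_{r,i}(t)]; over a network of
# units it is a conflict C[n_r(t)]. All names here are hypothetical.

def check(units, constraints):
    """Classify constraint violations as errors (one unit) or conflicts (several)."""
    errors, conflicts = [], []
    for con in constraints:
        if not con["predicate"](*(units[u] for u in con["units"])):
            if len(con["units"]) == 1:
                errors.append(con["name"])     # single unit dissatisfies con_r
            else:
                conflicts.append(con["name"])  # a network of units dissatisfies con_r
    return errors, conflicts

# Hypothetical states at time t: a hole diameter produced by a machine,
# and the positions of two automated guided vehicles.
units = {"machine": {"hole_mm": 5.3}, "agv1": {"pos": 7}, "agv2": {"pos": 7}}
constraints = [
    {"name": "hole size in tolerance", "units": ["machine"],
     "predicate": lambda m: abs(m["hole_mm"] - 5.0) <= 0.2},
    {"name": "no AGV collision", "units": ["agv1", "agv2"],
     "predicate": lambda a, b: a["pos"] != b["pos"]},
]
errors, conflicts = check(units, constraints)
print(errors)     # ['hole size in tolerance']
print(conflicts)  # ['no AGV collision']
```

The single-unit tolerance violation mirrors an error at one unit, while the shared-path condition over two AGVs mirrors a conflict over a network of units.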

This chapter provides a theoretical background and illustrates applications of how to prevent errors and conflicts automatically in production and service. Different terms have been used to describe the concept of errors and conflicts, for instance, failure (e.g., [30.2–5]), fault (e.g., [30.4,6]), exception (e.g., [30.7]), and flaw (e.g., [30.8]). Error and conflict are the most popular terms appearing in the literature (e.g., [30.3,4,6,9–15]). The related terms listed here are also useful descriptions of errors and conflicts. Depending on the context, some of these terms are interchangeable with error; some are interchangeable with conflict; and the rest refer to both error and conflict.

Eight key functions have been identified as useful to prevent errors and conflicts automatically, as described below [30.16–19]. Functions 5–8 prevent errors and conflicts with the support of functions 1–4. Functions 6–8 prevent errors and conflicts by managing those that have already occurred. Function 5, prognostics, is the only function that actively determines which errors and conflicts will occur, and prevents them. All seven other functions are designed to manage errors and conflicts that have already occurred, although as a result they can prevent future errors and conflicts directly or indirectly. Figure 30.2 describes error and conflict propagation and their relationship with the eight functions:

1. Detection is a procedure to determine if an error or a conflict has occurred.

2. Identification is a procedure to identify the observation variables most relevant to diagnosing an error or conflict; it answers the question: Which of them has already occurred?

3. Isolation is a procedure to determine the exact location of an error or conflict. Isolation provides more information than the identification function, in which only the observation variables associated with the error or conflict are determined. Isolation does not provide as much information as the diagnostics function, however, in which the type, magnitude, and time of the error or conflict are determined. Isolation answers the question: Where has an error or conflict occurred?

4. Diagnostics is a procedure to determine which error or conflict has occurred, what their specific characteristics are, or the cause of the observed out-of-control status.

5. Prognostics is a procedure to prevent errors and conflicts through analysis and prediction of error and conflict propagation.

6. Error recovery is a procedure to remove or mitigate the effect of an error.

7. Conflict resolution is a procedure to resolve a conflict.

8. Exception handling is a procedure to manage exceptions. Exceptions are deviations from an ideal process that uses the available resources to achieve the task requirement (goal) in an optimal way.

Fig. 30.2 Error and conflict propagation and eight functions to prevent errors and conflicts

There has been extensive research on the eight functions, except prognostics. Various models, methods, tools, and algorithms have been developed to automate the management of errors and conflicts in production and service. Their main limitation is that most of them are designed for a specific application area, or even a specific error or conflict. The main challenge of automating the management of errors and conflicts is how to prevent them through prognostics, which is supported by the other seven functions and requires substantial research and development.

30.2 Error Prognostics and Prevention Applications

30.2.1 Error Detection in Assembly and Inspection

As the first step to prevent errors, error detection has attracted much attention, especially in assembly and inspection; for instance, researchers [30.3] have studied an integrated sensor-based control system for a flexible assembly cell that includes an error detection function. An error knowledge base has been developed to store information about previous errors that had occurred in assembly operations, and the corresponding recovery programs that had been used to correct them. The knowledge base provides support for both error detection and recovery. In addition, a similar machine-learning approach to error detection and recovery in assembly has been discussed. To realize error recovery, failure diagnostics has been emphasized as a necessary step after detection and before recovery. It is noted that, in assembly, error detection and recovery are often integrated.
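The error knowledge base described above can be sketched as a mapping from an observed error signature to the recovery program that previously corrected it. The signatures and recovery actions below are hypothetical illustrations, not the cited system's actual contents:

```python
# Hypothetical error knowledge base for an assembly cell: each known error
# signature maps to the recovery program that corrected it in the past.
error_kb = {
    ("gripper", "part_slipped"): "reopen gripper, re-pick part from feeder",
    ("insertion", "excess_force"): "retract 5 mm, re-align, retry insertion",
}

def recover(signature):
    """Return a stored recovery program, or escalate unknown errors to diagnostics."""
    program = error_kb.get(signature)
    if program is None:
        return "unknown error: escalate to failure diagnostics"
    return program

print(recover(("insertion", "excess_force")))  # retract 5 mm, re-align, retry insertion
print(recover(("vision", "part_missing")))     # unknown error: escalate to failure diagnostics
```

The fallback branch reflects the point made above: diagnostics is a necessary step between detection and recovery whenever a detected error has no stored recovery program.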

Automatic inspection has been applied in various manufacturing processes to detect, identify, and isolate errors or defects with computer vision. It is mostly used to detect defects on printed circuit boards [30.20–22] and dirt in paper pulp [30.23,24]. The use of robots has enabled automatic inspection of hazardous materials (e.g., [30.25]) and in environments that human operators cannot access, e.g., pipelines [30.26]. Automatic inspection has also been adopted to detect errors in many other products such as fuel pellets [30.27], the printed contents of soft drink cans [30.28], oranges [30.29], aircraft components [30.30], and microdrills [30.31]. The key technologies involved in automatic inspection include, but are not limited to, computer or machine vision, feature extraction, and pattern recognition [30.32–34].
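At its simplest, template-based visual inspection compares an acquired image against a defect-free reference. The minimal sketch below uses hypothetical 4x4 binary images and omits the registration, feature extraction, and pattern recognition steps that real inspection systems require:

```python
# Minimal sketch of template-based visual inspection: compare an inspected
# binary image against a "golden" defect-free reference and report the
# coordinates of deviating pixels. The images are hypothetical.
golden = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
inspected = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],   # missing material at row 1, column 2
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

def find_defects(reference, image):
    """Return (row, col) positions where the image deviates from the reference."""
    return [(r, c)
            for r, row in enumerate(reference)
            for c, pixel in enumerate(row)
            if image[r][c] != pixel]

print(find_defects(golden, inspected))  # [(1, 2)]
```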

30.2.2 Process Monitoring and Error Management

Process monitoring, or fault detection and diagnostics in industrial systems, has become a new subdiscipline within the broad subject of control and signal processing [30.35]. Three approaches to manage faults for process monitoring are summarized in Fig. 30.3. The analytical approach generates features using detailed mathematical models. Faults can be detected and diagnosed by comparing the observed features with the features associated with normal operating conditions, directly or after some transformation [30.19]. The data-driven approach applies statistical tools to large amounts of data obtained from complex systems. Many quality control methods are examples of the data-driven approach. The knowledge-based approach uses qualitative models to detect and analyze faults. It is especially suited for systems in which detailed mathematical models are not available. Among these three approaches, the data-driven approach is considered the most promising because of its solid theoretical foundation compared with the knowledge-based approach and its ability to deal with large amounts of data compared with the analytical approach. The knowledge-based approach, however, has gained much attention recently. Many errors and conflicts can be detected and diagnosed only by experts who have extensive knowledge and experience, which need to be modeled and captured to automate error and conflict prognostics and prevention.
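As a concrete instance of the data-driven approach, a Shewhart control chart flags observations falling outside limits of mean ± 3 sigma estimated from in-control reference data. The measurements in this sketch are hypothetical:

```python
import statistics

# Sketch of a univariate data-driven monitoring technique: a Shewhart chart
# flags samples outside mean +/- 3 sigma of in-control reference data.
# The measurements are hypothetical.
reference = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]   # in-control history
mu = statistics.mean(reference)
sigma = statistics.stdev(reference)
lcl, ucl = mu - 3 * sigma, mu + 3 * sigma                    # control limits

new_samples = [10.0, 10.1, 11.2, 9.9]
alarms = [x for x in new_samples if not (lcl <= x <= ucl)]
print(alarms)  # [11.2]
```

CUSUM and EWMA charts refine this idea by accumulating evidence over successive samples, which makes them more sensitive to small, sustained shifts than the one-sample-at-a-time rule shown here.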

30.2.3 Hardware Testing Algorithms

The three fault management approaches discussed in Sect. 30.2.2 can also be classified according to the way a system is modeled. In the analytical approach, quantitative models are used, which require the complete specification of system components, state variables, observed variables, and functional relationships among them for the purpose of fault management. The data-driven approach can be considered an effort to develop qualitative models in which previous and current data obtained from a system are used. Qualitative models usually require less information about a system than quantitative models do. The knowledge-based approach uses qualitative models and other types of models; for instance, pattern recognition techniques use multivariate statistical tools and employ qualitative models, whereas the signed directed graph is a typical dependence model that represents cause–effect relationships in the form of a directed graph [30.36].

Similar to algorithms used in quantitative and qualitative models, optimal and near-optimal test sequences have been developed to diagnose faults in hardware [30.36–45]. The goal of the test sequencing problem is to design a test algorithm that is able to unambiguously identify the occurrence of any system state (faulty or fault-free state) using the tests in the test set and minimizes the expected testing cost [30.37].

Fig. 30.3 Techniques of fault management in process monitoring:
• Analytical approach: parameter estimation; observers; parity relations
• Data-driven approach: univariate statistical monitoring (Shewhart charts, cumulative sum (CUSUM) charts, exponentially weighted moving average (EWMA) charts); multivariate statistical techniques (principal component analysis (PCA), Fisher discriminant analysis (FDA), partial least squares (PLS), canonical variate analysis (CVA))
• Knowledge-based approach: causal analysis techniques (signed directed graph (SDG), symptom tree model (STM)); expert systems; pattern recognition techniques (artificial neural networks (ANN), self-organizing map (SOM))

Fig. 30.4 Single-fault test strategy: an AND/OR binary decision tree in which OR nodes are labeled by suspect sets of system states (e.g., {S0, S1, S2, S3, S4}) and AND nodes denote tests (T1, T2, T3), each branching on p (test passes) or f (test fails)

The test sequencing problem belongs to the general class of binary identification problems. The problem of diagnosing a single fault is a perfectly observed Markov decision problem (MDP). The solution to the MDP is a deterministic AND/OR binary decision tree with OR nodes labeled by the suspect set of system states and AND nodes denoting tests (decisions) (Fig. 30.4). It is well known that the construction of the optimal decision tree is an NP-complete problem [30.37].

Fig. 30.5 Digraph model of an example system: a directed graph of 15 components, some with tests and some without

To subdue the computational explosion of the optimal test sequencing problem, algorithms that integrate concepts from information theory and heuristic search have been developed; they were first used to diagnose faults in electronic and electromechanical systems with a single fault [30.37]. An X-Windows-based software tool, the testability engineering and maintenance system (TEAMS), has been developed for testability analysis of large systems containing as many as 50 000 faults and 45 000 test points [30.36]. TEAMS can be used to model individual systems and generate near-optimal diagnostic procedures. Research on test sequencing then expanded to diagnose multiple faults [30.41–45] in various real-world systems including the Space Shuttle's main propulsion system. Test sequencing algorithms with unreliable tests [30.40] and multivalued tests [30.45] have also been studied.

Table 30.3 D-matrix of the example system derived from Fig. 30.5

State/test  T1(5) T2(6) T3(8) T4(11) T5(12) T6(13) T7(14) T8(15)
S1(1)       0     1     0     1      1      0      0      0
S2(2)       0     0     1     1      0      1      1      0
S3(3)       0     0     0     0      0      0      0      1
S4(4)       0     0     1     0      1      0      1      0
S5(5)       1     0     0     0      0      0      1      0
S6(6)       0     1     0     0      1      0      0      0
S7(7)       0     0     0     1      0      0      0      0
S8(8)       0     0     1     0      0      0      1      0
S9(9)       0     0     0     0      1      0      0      0
S10(10)     0     0     0     0      0      0      0      1
S11(11)     0     0     0     1      0      0      0      0
S12(12)     0     0     0     0      1      0      0      0
S13(13)     0     0     0     0      0      1      0      0
S14(14)     0     0     0     0      0      0      1      0
S15(15)     0     0     0     0      0      0      0      1

To diagnose a single fault in a system, the relationship between the faulty states and tests can be modeled by a directed graph (digraph model) (Fig. 30.5). Once a system is described in a digraph model, the full-order dependences among failure states and tests can be captured by a binary test matrix, also called a dependency matrix (D-matrix, Table 30.3). Other researchers have used digraph models to diagnose faults in hypercube microprocessors [30.46]. The directed graph is a powerful tool to describe dependences among system components and tests.
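The D-matrix supports single-fault isolation directly: a failed test implicates the states marked 1 in that test's column, while a passed test exonerates them. The sketch below uses a small hypothetical 4-state, 3-test matrix (not the one in Table 30.3) to show how test outcomes narrow the suspect set:

```python
# Sketch of single-fault isolation with a dependency matrix (D-matrix).
# Each row gives the outcome pattern of tests (T1, T2, T3) if that state
# is faulty: 1 means the test depends on (and would detect) the fault.
# The 4-state, 3-test matrix here is hypothetical.
d_matrix = {
    "S1": (0, 1, 0),
    "S2": (0, 0, 1),
    "S3": (1, 0, 1),
    "S4": (1, 1, 0),
}

def isolate(test_results):
    """Narrow the suspect set; test_results maps test index -> True if failed."""
    suspects = set(d_matrix)
    for t, failed in test_results.items():
        implicated = {s for s, row in d_matrix.items() if row[t] == 1}
        # A failed test keeps only implicated states; a passed test removes them.
        suspects &= implicated if failed else (set(d_matrix) - implicated)
    return suspects

# T1 fails and T2 passes: only S3 depends on T1 but not on T2.
print(isolate({0: True, 1: False}))  # {'S3'}
```

A test sequencing algorithm chooses which test to apply next, typically the one whose outcome best splits the current suspect set, which is where the information-theoretic heuristics cited above come in.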

Three important issues have been brought to light by extensive research on the test sequencing problem and should be considered when diagnosing faults in hardware:

1. The order of dependences. The first-order cause–effect dependence between two nodes, i.e., how a faulty node affects another node directly, is the simplest dependence relationship between two nodes. Earlier research did not consider the dependences among nodes [30.37,38], whereas in most recent research different algorithms and test strategies have been developed with consideration of not only the first-order, but also high-order dependences among nodes [30.43–45]. The high-order dependences describe relationships between nodes that are related to each other through other nodes.

2. Types of faults. Faults can be classified into two categories: functional faults and general faults. A component or unit in a complex system may have more than one function. Each function may become faulty. A component may therefore have one or more functional faults, each of which involves only one function of the component. General faults are those faults that cause faults in all functions of a component. If a component has a general fault, all its functions are faulty. Models that describe only general faults are often called worst-case models [30.36] because of their poor diagnosing ability.

3. Fault propagation time. Systems can be classified into two categories: zero-time and nonzero-time systems [30.45]. Fault propagation in zero-time systems is instantaneous to an observer, whereas in nonzero-time systems it is several orders of magnitude slower than the response time of the observer. Zero-time systems can be abstracted by taking the propagation times to be zero.

Another interesting aspect of the test sequencing problem is the list of assumptions that have been discussed in several articles, which are useful guidelines for the development of algorithms for hardware testing:

1. There is at most one faulty state (component or unit) in a system at any time [30.37]. This may be achieved if the system is tested frequently enough [30.42].

2. All faults are permanent faults [30.37].
3. Tests can identify system states unambiguously [30.37]. In other words, a faulty state is either identified or not identified. There is no situation such as: there is a 60% probability that a faulty state has occurred.
4. Tests are 100% reliable [30.40,45]. Both false positive and false negative rates are zero.
5. Tests do not have common setup operations [30.42]. This assumption has been proposed to simplify the cost comparison among tests.
6. Faults are independent [30.42].
7. Failure states that are replaced/repaired are 100% functional [30.42].
8. Systems are zero-time systems [30.45].

Note the critical difference between assumptions 3 and 4. Assumption 3 is related to diagnostics ability. When an unambiguous test detects a fault, the conclusion is that the fault has definitely occurred, with 100% probability. Nevertheless, this conclusion could be wrong if the false positive rate is not zero. This is the test (diagnostics) reliability described in assumption 4. When an unambiguous test does not detect a fault, the conclusion is that the fault has not occurred, with 100% probability. Similarly, this conclusion could be wrong if the false negative rate is not zero. Unambiguous tests have better diagnostics ability than ambiguous tests. If a fault has occurred, ambiguous tests conclude that the fault has occurred with a probability less than one. Similarly, if the fault has not occurred, ambiguous tests conclude that the fault has not occurred with a probability less than one. In summary, if assumption 3 is true, a test gives only two results: a fault has occurred or has not occurred, always with probability 1. If both assumptions 3 and 4 are true, (1) a fault must have occurred if the test concludes that it has occurred, and (2) a fault must not have occurred if the test concludes that it has not occurred.
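The effect of relaxing assumption 4 can be quantified with Bayes' rule: with nonzero false positive and false negative rates, a failed test yields only a posterior probability that the fault is present, not certainty. The prior and the error rates in this sketch are hypothetical:

```python
# Sketch of diagnosis with an unreliable test (assumption 4 relaxed):
# a failed test yields a posterior fault probability via Bayes' rule.
# The prior and the error rates below are hypothetical.
prior = 0.01   # prior probability that the fault is present
fp = 0.05      # false positive rate: P(test fails | no fault)
fn = 0.10      # false negative rate: P(test passes | fault)

# Total probability that the test fails, then the posterior P(fault | fail).
p_fail = (1 - fn) * prior + fp * (1 - prior)
posterior = (1 - fn) * prior / p_fail
print(round(posterior, 3))  # 0.154
```

Even with a fairly accurate test, a rare fault remains far from certain after a single failed test, which is why test sequencing with unreliable tests [30.40] must combine evidence from several tests.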

30.2.4 Error Detection in Software Design

The most prevalent method to detect errors in software is model checking. As Clarke et al. [30.47] state, model checking is a method to verify algorithmically whether the model of a software or hardware design satisfies given requirements and specifications, through exhaustive enumeration of all the states reachable by the system and the behaviors that traverse them. Model checking has been successfully applied to identify incorrect hardware and protocol designs, and recently there has been a surge in work on applying it to reason about a wide variety of software artifacts; for example, model checking frameworks have been applied to reason about software process models (e.g., [30.48]), different families of software requirements models (e.g., [30.49]), architectural frameworks (e.g., [30.50]), design models (e.g., [30.51]), and system implementations (e.g., [30.52–55]). The potential of model checking technology for (1) detecting coding errors that are hard to detect using existing quality assurance methods, e.g., bugs that arise from unanticipated interleavings in concurrent programs, and (2) verifying that system models and implementations satisfy crucial temporal properties and other lightweight specifications has led a number of international corporations and government research laboratories, such as Microsoft, IBM, Lucent, NEC, the National Aeronautics and Space Administration (NASA), and the Jet Propulsion Laboratory (JPL), to fund their own software model checking projects.
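The exhaustive enumeration at the core of model checking can be sketched as a breadth-first search over reachable states that checks a safety property in each one. The toy system below (two processes that may enter a critical section only while holding a shared lock, checked for mutual exclusion) is hypothetical and ignores the symbolic and reduction techniques real model checkers rely on:

```python
from collections import deque

# Toy explicit-state model checking sketch: enumerate all reachable states
# of a system by breadth-first search and check a safety invariant in each.
# The modeled system (two processes sharing one lock) is hypothetical.

def successors(state):
    lock, in_cs = state                      # lock holder (None, 0, or 1); procs in CS
    for p in (0, 1):
        if lock is None and p not in in_cs:  # acquire the lock and enter the CS
            yield (p, in_cs | {p})
        if lock == p and p in in_cs:         # leave the CS and release the lock
            yield (None, in_cs - {p})

def check_invariant(initial, invariant):
    """Return (True, None) if the invariant holds in every reachable state,
    else (False, counterexample_state)."""
    seen, frontier = {initial}, deque([initial])
    while frontier:
        state = frontier.popleft()
        if not invariant(state):
            return False, state
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return True, None

initial = (None, frozenset())
ok, bad = check_invariant(initial, lambda s: len(s[1]) <= 1)  # mutual exclusion
print(ok)  # True
```

The state-explosion problem discussed next follows directly from this structure: the `seen` set grows with the product of the component state spaces, so interleaved concurrent processes quickly exhaust memory without the mitigation techniques listed below.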

A drawback of model checking is the state-explosion problem. Software tends to be less structured than hardware and is considered a concurrent but asynchronous system; in other words, two independent processes in software executing concurrently in either order result in the same global state [30.47]. Failing to complete checking because there are too many states is a particularly serious problem for software. Several methods, including symbolic representation, partial order reduction, compositional reasoning, abstraction, symmetry, and induction, have been developed either to decrease the number of states in the model or to accommodate more states, although none of them has been able to solve the problem by allowing a general number of states in the system.

Based on the observation that software model checking has been particularly successful when it can be optimized by taking into account properties of a specific application domain, Hatcliff and colleagues have developed Bogor [30.56], a highly modular model-checking framework that can be tailored to specific domains. Bogor's extensible modeling language allows new modeling primitives that correspond to domain properties to be incorporated into the modeling language as first-class citizens. Bogor's modular architecture enables its core model-checking algorithms to be replaced by optimized domain-specific algorithms. Bogor has been incorporated into Cadena and tailored to checking avionics designs in the common object request broker architecture (CORBA) component model (CCM), yielding orders-of-magnitude reduction in verification costs. Specifically, Bogor's modeling language has been extended with primitives to capture CCM interfaces and a real-time CORBA (RT-CORBA) event channel interface, and Bogor's scheduling and state-space exploration algorithms were replaced with a scheduling algorithm that captures the particular scheduling strategy of the RT-CORBA event channel and a customized state-space storage strategy that takes advantage of the periodic computation of avionics software.

Despite this successful customizable strategy, there are additional issues that need to be addressed when incorporating model checking into an overall design/development methodology. A basic problem concerns incorrect or incomplete specifications: before verification, specifications in some logical formalism (usually temporal logic) need to be extracted from design requirements (properties). Model checking can verify whether a model of the design satisfies a given specification. It is impossible, however, to determine whether the derived specifications are consistent with or cover all design properties that the system should satisfy. That is, it is unknown whether the design satisfies any unspecified properties, which are often assumed by designers. Even if all necessary properties are verified through model checking, code generated to implement the design is not guaranteed to meet design specifications or, more importantly, design properties. Model-based software testing is being studied to connect the two ends of software design: requirements and code.

The detection of design errors in software engineering has received much attention. In addition to model checking and software testing, for instance, Miceli et al. [30.8] have proposed a metric-based technique for design flaw detection and correction. In parallel computing, synchronization errors are major problems, and a nonintrusive detection method for synchronization errors using execution replay has been developed [30.14]. Concurrent error detection (CED) is also well known for detecting errors in distributed computing systems; its reliance on duplication [30.9, 57] is sometimes considered a drawback.

30.2.5 Error Detection and Diagnostics in Discrete-Event Systems

Recently, Petri nets have been applied in fault detection and diagnostics [30.58–60] and fault analysis [30.61–63]. Petri nets are a formal modeling and analysis tool for discrete-event or asynchronous systems. For hybrid systems that have both event-driven and time-driven (synchronous) elements, Petri nets can be extended to global Petri nets to model both discrete-time and event elements. To detect and diagnose faults in discrete-event systems (DES), Petri nets can be used together with finite-state machines (FSM) [30.64, 65]. The notion of diagnosability and a construction procedure for the diagnoser have been developed to detect faults in diagnosable systems [30.64]. A summary of the use of Petri nets in error detection and recovery before the 1990s can be found in the work of Zhou and DiCesare [30.66].

To detect and diagnose faults with Petri nets, some of the places in a Petri net are assumed observable and others are not. All transitions in the Petri net are also unobservable. Unobservable places, i.e., faults, indicate that the number of tokens in those places is not observable, whereas unobservable transitions indicate that their occurrences cannot be observed [30.58, 60]. The objective of the detection and diagnostics is to identify the occurrence and type of a fault based on observable places within finite steps of observation after the occurrence of the fault. It is clear that to detect and diagnose faults with Petri nets, system modeling is complex and time consuming because faulty transitions and places must be included in a model. Research on this subject has mainly involved the extension of previous work using FSM and has made limited progress.
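A minimal sketch of the idea, using a toy net of our own design: an unobservable fault transition silently diverts a token, and a diagnoser infers the fault from the observable places alone.

```python
class PetriNet:
    """Minimal Petri net: transitions consume and produce tokens in places."""
    def __init__(self, marking):
        self.marking = dict(marking)
        self.transitions = {}          # name -> (inputs, outputs)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (inputs, outputs)

    def fire(self, name):
        inputs, outputs = self.transitions[name]
        assert all(self.marking.get(p, 0) >= n for p, n in inputs.items())
        for p, n in inputs.items():
            self.marking[p] -= n
        for p, n in outputs.items():
            self.marking[p] = self.marking.get(p, 0) + n

# Normal flow moves the token start -> done; the fault transition silently
# diverts it to an unobservable fault place instead.
net = PetriNet({"start": 1})
net.add_transition("t_ok",    {"start": 1}, {"done": 1})
net.add_transition("t_fault", {"start": 1}, {"fault_place": 1})  # unobservable

net.fire("t_fault")            # the fault occurs; no event is observed

# Diagnoser: only "start" and "done" are observable. One step has elapsed,
# "start" lost its token, yet "done" gained none -> infer a fault.
observable = {p: net.marking.get(p, 0) for p in ("start", "done")}
fault_inferred = observable["start"] == 0 and observable["done"] == 0
```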

Faults in discrete-event systems can also be diagnosed with the decentralized approach [30.67]. Distributed diagnostics can be performed by diagnosers communicating with each other, either directly or through a coordinator. Alternatively, diagnostics decisions can be made completely locally without combining the information gathered [30.67]. The decentralized approach is a viable direction for error detection and diagnostics in large and complex systems.

30.2.6 Error Detection in Service and Healthcare Industries

Errors tend to occur frequently in certain service industries that involve intensive human operations. As the use of computers and other automation devices, e.g., handwriting recognition and sorting machines in postal service, becomes increasingly popular, errors can be effectively and automatically prevented and reduced to a minimum in many service industries, including delivery, transportation, e-Business, and e-Commerce. In some other service industries, especially in healthcare systems, error detection is critical, and only limited research has been conducted to help develop systems that can automatically detect human errors and other types of errors [30.68–72]. Several systems and modeling tools have been studied and applied to detect errors in healthcare with the help of automation devices (e.g., [30.73–76]). Much more research needs to be conducted to advance the development of automated error detection in service industries.

30.2.7 Error Detection and Prevention Algorithms for Production and Service Automation

The fundamental work system has evolved from manual power, human–machine systems, and computer-aided and computer-integrated systems to e-Work [30.77], which enables distributed and decentralized operations where errors and conflicts propagate and affect not only the local workstation, but the entire production/service network. Agent-based algorithms, e.g., (30.3), have been developed to detect and prevent errors in the process of providing a single product/service in a sequential production/service line [30.78, 79]. Qi is the performance of unit i. U′m and L′m are the upper limit and lower limit, respectively, of the acceptable performance of unit m. Um and Lm are the upper limit and lower limit, respectively, of the acceptable level of the quality of a product/service after the operation of unit m. Units 1 through m−1 complete their operation on a product/service before unit m starts its operation on the same product/service. An agent deployed at unit m executes (30.3) to prevent errors

∃E(um) , if { Um − L′m < Σ(i=1..m−1) Qi } ∪ { Lm − U′m > Σ(i=1..m−1) Qi } .  (30.3)
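Condition (30.3) can be transcribed almost directly; the union in (30.3) is an OR of the two conditions, and the numerical limits below are illustrative, not taken from the cited work.

```python
def error_expected(Q_upstream, Um, Lm, U_prime_m, L_prime_m):
    """Check (30.3) at unit m: an error E(u_m) is expected if the cumulative
    upstream performance sum(Q_i, i = 1..m-1) falls outside what unit m can
    compensate for, given its performance limits [L'_m, U'_m] and the
    quality limits [L_m, U_m] required after its operation."""
    s = sum(Q_upstream)
    return (Um - L_prime_m < s) or (Lm - U_prime_m > s)

# Upstream units have consumed most of the tolerance: error expected.
flag_err = error_expected([5.0, 4.0], Um=10.0, Lm=2.0,
                          U_prime_m=3.0, L_prime_m=1.5)
# Comfortable margins: no error expected.
flag_ok = error_expected([3.0, 2.0], Um=10.0, Lm=2.0,
                         U_prime_m=3.0, L_prime_m=1.5)
```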

In the process of providing multiple products/services, traditionally, the centralized algorithm (30.4) is used to predict errors in a sequential production/service line. Ii(0) is the quantity of available raw materials for unit i at time 0. ηi is the probability that a product/service is within specifications after being operated on by unit i, assuming the product/service is within specifications before being operated on by unit i. ϕm(t) is the needed number of qualified products/services after the operation of unit m at time t. Equation (30.4) predicts at time 0 the potential errors that may occur at unit m at time t. Equation (30.4) is executed by a central control unit that is aware of Ii(0) and ηi of all units. Equation (30.4) often has low reliability, i.e., a high false-positive rate (errors are predicted but do not occur), or low preventability, i.e., a high false-negative rate (errors occur but are not predicted), because it is difficult to obtain accurate ηi when there are many units in the system.

∃E[um(t)] , if min(i=1..m) { Ii(0) × Π(j=i..m) ηj } < ϕm(t) .  (30.4)
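A sketch of the centralized prediction, reading the product in (30.4) as the chained conformity probabilities of units i through m; all numbers are illustrative.

```python
from math import prod

def central_error_predicted(I0, eta, phi_m):
    """Centralized check in the spirit of (30.4): with m = len(I0) units,
    predict an error at unit m if the tightest supply I_i(0), discounted by
    the conformity probabilities of units i..m, cannot deliver phi_m(t)
    qualified products/services."""
    m = len(I0)
    bottleneck = min(I0[i] * prod(eta[i:m]) for i in range(m))
    return bottleneck < phi_m

# Three-unit line: unit 1 is the bottleneck, 100*0.9*0.95*0.9 = 76.95 < 80.
predicted = central_error_predicted([100, 120, 90], [0.9, 0.95, 0.9], phi_m=80)
```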

Fig. 30.6 Incident mapping (elements: error, initial response, known and unknown causes, contingent action, root causes identified after error analysis, preventive action, circumstance)

To improve reliability and preventability, agent-based error prevention algorithms, e.g., (30.5), have been developed to prevent errors in the process of providing multiple products/services [30.80]. Cm(t′) is the number of cumulative conformities produced by unit m by time t′. Nm(t′) is the number of cumulative nonconformities produced by unit m by time t′. An agent deployed at unit m executes (30.5) by using information about unit m−1, i.e., Im−1(t′), ηm−1, and Cm−1(t′), to prevent errors that may occur at time t, t′ < t. Multiple agents deployed at different units can execute (30.5) simultaneously to prevent errors. Each agent can have its own attitude, i.e., optimistic or pessimistic, toward the possible occurrence of errors. Additional details about agent-based error prevention algorithms can be found in the work by Chen and Nof [30.80]:

∃E[um(t)] , if min[ Im(t′), Im−1(t′) × ηm−1 + Cm−1(t′) − Nm(t′) − Cm(t′) ] × ηm + Cm(t′) < ϕm(t) , t′ < t .  (30.5)
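The agent-side check (30.5) transcribes similarly; note that the agent only needs quantities local to units m−1 and m. Values below are illustrative.

```python
def agent_error_predicted(I_m, I_m1, eta_m1, eta_m, C_m1, C_m, N_m, phi_m):
    """Agent-side check (30.5) at unit m, using only quantities observable
    at units m-1 and m at time t' to predict a shortfall at a later time t."""
    incoming = min(I_m, I_m1 * eta_m1 + C_m1 - N_m - C_m)
    return incoming * eta_m + C_m < phi_m

# min(40, 50*0.9 + 30 - 3 - 25) = 40; 40*0.95 + 25 = 63 < 70 -> error predicted
predicted = agent_error_predicted(I_m=40, I_m1=50, eta_m1=0.9, eta_m=0.95,
                                  C_m1=30, C_m=25, N_m=3, phi_m=70)
```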

30.2.8 Error-Prevention Culture (EPC)

To prevent errors effectively, an organization is expected to cultivate an enduring error-prevention culture (EPC) [30.81], i.e., the organization knows what to do to prevent errors when no one is telling it what to do. The EPC model has five components [30.81]:

1. Performance management: the human performance system helps manage valuable assets and involves five key areas: (a) an environment to minimize errors, (b) human resources that are capable of performing tasks, (c) task monitoring to audit work, (d) feedback provided by individuals or teams through collaboration, and (e) consequences provided to encourage or discourage people's behaviors.

2. System alignment: an organization's operating systems must be aligned to get work done with discipline, routines, and best practices.

3. Technical excellence: an organization must promote a shared technical and operational understanding of how a process, system, or asset should technically perform.

4. Standardization: standardization supports error prevention with a balanced combination of good manufacturing practices.

5. Problem-resolution skills: an organization needs people with effective statistical diagnostics and issue-resolution skills to address operational process challenges.

Not all errors can be prevented manually and/or by automation systems. When an error does occur, incident mapping (Fig. 30.6) [30.81], as one of the exception-handling tools, can be used to analyze the error and proactively prevent future errors.

30.3 Conflict Prognostics and Prevention

Conflicts can be categorized into three classes [30.82]: goal conflicts, plan conflicts, and belief conflicts. Goals of an agent are modeled with an intended goal structure (IGS; e.g., Fig. 30.7), which is extended from a goal structure tree [30.83]. Plans of an agent are modeled with the extended project estimation and review technique (E-PERT) diagram (e.g., Fig. 30.8). An agent has (1) a set of goals, which are represented by circles (Fig. 30.7) or circles containing a number (Fig. 30.8), (2) activities, such as Act 1 and Act 2, to achieve the goals, (3) the time needed to complete an activity, e.g., T1, and (4) resources, e.g., R1 and R2 (Fig. 30.8). Goal conflicts are detected by agents comparing goals. Each agent has a PERT diagram, and plan conflicts are detected if agents fail to merge PERT diagrams or the merged PERT diagrams violate certain rules [30.82].
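Goal-conflict detection by pairwise comparison can be sketched as follows; the encoding of goals as (object, desired state) pairs is our simplification of an IGS.

```python
def goal_conflicts(goals_a, goals_b):
    """Pairwise goal comparison between two agents: two goals conflict here
    when they target the same object but demand different states."""
    return [(obj_a, st_a, st_b)
            for obj_a, st_a in goals_a
            for obj_b, st_b in goals_b
            if obj_a == obj_b and st_a != st_b]

agent_a = [("machine_1", "run"), ("buffer", "fill")]
agent_b = [("machine_1", "stop"), ("buffer", "fill")]
found = goal_conflicts(agent_a, agent_b)   # machine_1 is contested
```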

The three classes of conflicts can also be modeled by Petri nets with the help of four basic modules [30.84]: sequence, parallel, decision, and decision-free, to detect conflicts in a multiagent system. Each agent's goal and plan are modeled by separate Petri nets [30.85], and many Petri nets are integrated using a bottom-up approach [30.66, 84] with three types of operations [30.85]: AND, OR, and precedence. The synthesized Petri net is analyzed to detect conflicts. Only normal transitions and places are modeled in Petri nets for conflict detection. The Petri-net-based approach for conflict detection developed so far has been rather limited. It has emphasized the modeling of a system and its agents more than the analysis process through which conflicts are detected.

The three common characteristics of available conflict detection approaches are: (1) they use the agent concept, because a conflict involves at least two units in a system; (2) an agent is modeled multiple times, because each agent has at least two distinct attributes, goal and plan; and (3) they not only detect, but mainly prevent conflicts, because goals and plans are determined before agents start any activities to achieve them. The main difference between the IGS and PERT approach and the Petri net approach is that agents communicate with each other to detect conflicts in the former, whereas a centralized control unit analyzes the integrated Petri net to detect conflicts in the latter [30.85]. The Petri net approach does not detect conflicts using agents, although systems are modeled with agent technology. Conflict detection has been mostly applied in collaborative design [30.86–88]. The ability to detect conflicts in distributed design activities is vital to their success because multiple designers tend to pursue individual (local) goals prior to considering common (global) goals.

Fig. 30.7 Development of agent A's intended goal structure (IGS) over time

Fig. 30.8 Merged project estimation and review technique (PERT) diagram (three agents; activities Act1–Act8 with durations T1–T8, resources R1–R4, and dummy activities)

30.4 Integrated Error and Conflict Prognostics and Prevention

30.4.1 Active Middleware

Middleware was originally defined as software that connects two separate applications or separate products, serving as the glue between them; for example, in Fig. 30.9, middleware can link several different database systems to several different web servers. The middleware allows users to request data from any database system that is connected to the middleware using the form displayed on the web browser of one of the web servers.
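A toy version of this database/web-server glue, with in-memory dictionaries standing in for the databases (all names are ours):

```python
class Middleware:
    """Toy glue layer: web servers hand a query form to the middleware,
    which forwards it to whichever registered database can answer."""
    def __init__(self):
        self.databases = {}

    def register(self, name, records):
        self.databases[name] = records

    def query(self, key):
        for name, records in self.databases.items():
            if key in records:
                return name, records[key]     # answering database, value
        return None, None

mw = Middleware()
mw.register("db1", {"order_42": "shipped"})
mw.register("db2", {"order_43": "pending"})
source, status = mw.query("order_43")         # served by db2
```

The requesting web server never needs to know which database holds the record; that indirection is the point of the middleware layer.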

Fig. 30.9 Middleware in a database server system

Fig. 30.10 Active middleware architecture (after [30.89])

Fig. 30.11 Conflict and error detection model (CEDM)

Active middleware is one of the four circles of the "e-" in e-Work as defined by the PRISM Center (Production, Robotics, and Integration Software for Manufacturing & Management) at Purdue University [30.77]. Six major components of active middleware have been identified [30.89, 90]: modeling tool, workflows, task/activity database, decision support system (DSS), multiagent system (MAS), and collaborative work protocols. Active middleware has been developed to optimize the performance of interactions in heterogeneous, autonomous, and distributed (HAD) environments; it provides an e-Work platform and enables a universal model for error and conflict prognostics and prevention in a distributed environment. Figure 30.10 shows the structure of the active middleware; each component is described below:

1. Modeling tool: the goal of a modeling tool is to create a representation model for a multiagent system. The model can be transformed into next-level models, which will be the base of the system implementation.

2. Workflows: workflows describe the sequence and relations of tasks in a system. Workflows store the answer to two questions: (1) Which agent will benefit from the task when it is completed by one or more given agents? (2) Which task must be finished before other tasks can begin? The workflows are specific to the given system, and can be managed by a workflow management system (WFMS).

3. Task/activity database: this database is used to record and help allocate tasks. There are many tasks in a large system, such as those applied in automotive industries. Certain tasks are performed by several agents, and others are performed by one agent. The database records all task information and the progress of tasks (activity) and helps allocate and reallocate tasks if required.

4. Decision support system (DSS): the DSS for the active middleware is like the operating system for a computer. In addition, the DSS has programs running for monitoring, analysis, and optimization. It can allocate/delete/create tasks, bring in or take off agents, and change workflows.

5. Multiagent system (MAS): the MAS includes all agents in a system. It stores information about each agent, for example, capacity and number of agents, functions of an agent, working time, and effective date and expiry date of the agent.

6. Cooperative work protocols: cooperative work protocols define communication and interaction protocols between components of active middleware. It is noted that communication between agents also includes communication between components because active middleware includes all agents in a system.

30.4.2 Conflict and Error Detection Model

A conflict and error detection model (CEDM; Fig. 30.11), supported by the conflict and error detection protocol (CEDP, part of the collaborative work protocols) and conflict and error detection agents (CEDAs, part of the MAS), has been developed [30.91] to detect errors and conflicts in different network topologies. The CEDM integrates the CEDP, CEDAs, and four error and conflict detection components (Fig. 30.11). A CEDA is deployed at each unit of a system to (1) detect errors and conflicts with three components (detection policy generation, error detection, and conflict evaluation), which interact with and are supported by an error knowledge base, and (2) communicate with other CEDAs to send and receive error and conflict announcements with the support of the CEDP. The CEDM has been applied to four different network topologies, and the results show that the performance of the CEDM is sometimes counterintuitive, i.e., it performs better on networks that seem more complex. The ability to detect both errors and conflicts is desired when they exist in the same system. Because errors are different from conflicts, the activities to detect them are often different and need to be integrated.
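A simplified sketch of CEDAs announcing detections to their neighbors on a line topology; the threshold-based detection rule is our placeholder, not the published detection policy.

```python
class CEDA:
    """Conflict-and-error detection agent: checks its own unit and
    announces detections to neighboring agents."""
    def __init__(self, unit, threshold):
        self.unit, self.threshold = unit, threshold
        self.neighbors, self.inbox = [], []

    def announce(self, reading):
        """Detect locally (reading above threshold) and notify neighbors."""
        if reading > self.threshold:
            for nb in self.neighbors:
                nb.inbox.append((self.unit, reading))
            return True
        return False

# Line topology: u1 - u2 - u3
a1, a2, a3 = CEDA("u1", 10), CEDA("u2", 10), CEDA("u3", 10)
a1.neighbors = [a2]
a2.neighbors = [a1, a3]
a3.neighbors = [a2]

raised = a2.announce(12.5)    # u2 detects an error and informs u1 and u3
```

How fast such announcements propagate depends on the topology, which is one reason detection performance varies across networks as noted above.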

30.4.3 Performance Measures

Performance measures are necessary for the evaluation and comparison of various error and conflict prognostics and prevention methods. Several measures have already been defined and developed in previous research:

1. Detection latency: the time between the instant that an error occurs and the instant that the error is detected [30.10, 91].

2. Error coverage: the percentage of detected errors with respect to the total number of errors [30.10].

3. Cost: the overhead caused by including error detection capability with respect to the system without the capability [30.10].

4. Conflict severity: the severity of a conflict; the sum of the severity caused by the conflict at each involved unit [30.91].

5. Detectability: the ability of a detection method; a function of detection accuracy, cost, and time [30.92].

6. Preventability: the ratio of the number of errors prevented divided by the total number of errors [30.80].

7. Reliability: the ratio of the number of errors prevented divided by the number of errors identified or predicted, or the ratio of the number of errors detected divided by the total number of errors [30.40, 45, 80].

Other performance measures, e.g., total damage and cost–benefit ratio, can be developed to compare different methods. Appropriate performance measures help determine how a specific method performs in different situations and are often required when there are multiple methods available.
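Several of these measures reduce to simple ratios and differences; a sketch with an illustrative event log (all counts are ours):

```python
def coverage(detected, total_errors):
    """Error coverage: detected errors / total errors."""
    return detected / total_errors

def preventability(prevented, total_errors):
    """Preventability: errors prevented / total errors."""
    return prevented / total_errors

def reliability(prevented, predicted):
    """Reliability (one of the definitions above):
    errors prevented / errors identified or predicted."""
    return prevented / predicted

def detection_latency(t_occurred, t_detected):
    """Time from error occurrence to its detection."""
    return t_detected - t_occurred

# Illustrative log: 10 errors, 9 detected, 8 predicted, 6 prevented.
cov = coverage(9, 10)          # 0.9
prv = preventability(6, 10)    # 0.6
rel = reliability(6, 8)        # 0.75
lat = detection_latency(t_occurred=100.0, t_detected=100.8)
```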

30.5 Error Recovery and Conflict Resolution

When an error or a conflict occurs and is detected, identified, isolated, or diagnosed, there are three possible consequences: (1) other errors or conflicts caused by the error or conflict have occurred; (2) other errors or conflicts caused by the error or conflict will (probably) occur; (3) other errors or conflicts, or the same error or conflict, will (probably) occur if the current error or conflict is not recovered or resolved, respectively. One of the objectives of error recovery and conflict resolution is to avoid the third consequence when an error or a conflict occurs. They are therefore part of error and conflict prognostics and prevention.

There has been extensive research on automated error recovery and conflict resolution, which are often domain specific. Many methods have been developed and applied in various real-world applications in which the main objective of error recovery and conflict resolution is to keep the production or service flowing; for instance, Fig. 30.12 shows a recovery tree for rheostat pick-up and insertion, which is programmed for automatic error recovery. Traditionally, error recovery and conflict resolution are not considered as an approach to prevent errors and conflicts. In the next two sections, we describe two examples, error recovery in robotics [30.93] and conflict resolution in collaborative facility design [30.88, 94], to illustrate how to perform these two functions automatically.
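The branching discipline of such a recovery tree (succeed: go down; fail: go right if possible, otherwise left) can be sketched generically; the tree below is a loose, partial rendering of the rheostat example, not the full programmed tree.

```python
def run_recovery(node, sense):
    """Walk a simplified recovery tree: at each sensing node ('?' in the
    figure), on success take the downward branch; on failure take the right
    branch if present, otherwise the left. Action nodes are executed and
    then followed downward until a branch ends."""
    actions = []
    while node:
        if "test" in node:
            ok = sense[node["test"]]
            node = node["down"] if ok else node.get("right", node.get("left"))
        else:
            actions.append(node["action"])
            node = node.get("down")
    return actions

# Partial, hypothetical rendering of the rheostat recovery tree.
tree = {"test": "rheostat_available",
        "down": {"test": "rheostat_positioned",
                 "down": {"action": "move_to_next_rheostat"},
                 "right": {"action": "discard_rheostat",
                           "down": {"action": "return_to_start"}}},
        "right": {"action": "go_to_next_feeder",
                  "down": {"action": "return_to_start"}}}

# Rheostat present but mis-positioned: discard it and return to start.
steps = run_recovery(tree, {"rheostat_available": True,
                            "rheostat_positioned": False})
```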

30.5.1 Error Recovery

Error recovery cannot be avoided when using robots because errors are an inherent characteristic of robotic applications [30.95], which are often not fault tolerant. Most error recovery applications implement preprogrammed, nonintelligent corrective actions [30.95–98]. Due to the large number of possible errors and the


Fig. 30.12 Recovery tree for rheostat pick-up and insertion recovery. A branch may only be entered once; on success branch downward; on failure branch to the right if possible, otherwise branch left; when the end of a branch is reached, unless otherwise specified, return to the last sensing position; "?" signifies a sensing position where sensors or variables are evaluated (after [30.1])

inherent complexity of recovery actions, fully automating error recovery without human intervention is difficult. The emerging trend in error recovery is to equip

Table 30.4 Multiapproach conflict resolution in collaborative design (Mcr) structure [30.88, 94] (after [30.94], courtesy Elsevier, 2008)


Mcr(1) Direct negotiation. Steps: 1. An agent prepares a resolution proposal and sends it to its counterparts. 2. The counterpart agents evaluate the proposal; if they accept it, go to step 5, otherwise go to step 3. 3. The counterpart agents prepare a counteroffer and send it back to the originating agent. 4. The agent evaluates the counteroffer; if accepted, go to step 5, otherwise go to Mcr(2). 5. End of the conflict resolution process. Methodologies and tools: heuristics; knowledge-based interactions; multiagent systems.

Mcr(2) Third-party mediation. Steps: 1. A third-party agent prepares a resolution proposal and sends it to the counterparts. 2. The counterpart agents evaluate the proposal; if accepted, go to step 5, otherwise go to step 3. 3. The counterpart agents prepare a counteroffer and send it back to the third-party agent. 4. The third-party agent evaluates the counteroffer; if accepted, go to step 5, otherwise go to Mcr(3). 5. End of the conflict resolution process. Methodologies and tools: heuristics; knowledge-based interactions; multiagent systems; PERSUADER [30.99].

Mcr(3) Incorporation of additional parties. Steps: 1. A specialized agent prepares a resolution proposal and sends it to the counterparts. 2. The counterpart agents evaluate the proposal; if accepted, go to step 5, otherwise go to step 3. 3. The counterpart agents prepare a counteroffer and send it back to the specialized agent. 4. The specialized agent evaluates the counteroffer; if accepted, go to step 5, otherwise go to Mcr(4). 5. End of the conflict resolution process. Methodologies and tools: heuristics; knowledge-based interactions; expert systems.

Mcr(4) Persuasion. Steps: 1. A third-party agent prepares persuasive arguments and sends them to the counterparts. 2. The counterpart agents evaluate the arguments. 3. If the arguments are effective, go to step 4, otherwise go to Mcr(5). 4. End of the conflict resolution process. Methodologies and tools: PERSUADER [30.99]; case-based reasoning.

Mcr(5) Arbitration. Steps: 1. If conflict management and analysis results in common proposals (X), conflict resolution is achieved through management and analysis. 2. If conflict management and analysis results in mutually exclusive proposals (Y), conflict resolution is achieved through conflict confrontation. 3. If conflict management and analysis results in no conflict resolution proposals (Z), conflict resolution must be used. Methodologies and tools: graph model for conflict resolution (GMCR) [30.100] for conflict management and analysis; adaptive neural-fuzzy inference system (ANFIS) [30.101] for conflict confrontation; dependency analysis [30.102] and product flow analysis for conflict resolution.
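The staged escalation through Mcr(1)–Mcr(5) can be sketched as a chain of attempts in which arbitration always terminates the process; the stage behaviors below are stubs of our own, not the published methodologies.

```python
def resolve(conflict, stages):
    """Escalate through the Mcr stages in order: each stage either returns
    a decision or defers (None) to the next; arbitration, as the final
    stage, is expected always to decide."""
    for name, attempt in stages:
        outcome = attempt(conflict)
        if outcome is not None:
            return name, outcome
    raise RuntimeError("final stage must always return a decision")

stages = [
    ("Mcr(1) direct negotiation",    lambda c: None),   # parties disagree
    ("Mcr(2) third-party mediation", lambda c: None),   # still no accord
    ("Mcr(3) additional parties",    lambda c: None),
    ("Mcr(4) persuasion",            lambda c: None),
    ("Mcr(5) arbitration",           lambda c: "split resource 50/50"),
]
stage, decision = resolve({"issue": "shared fixture"}, stages)
```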


Table 30.5 Summary of error and conflict prognostics and prevention theories, applications, and open challenges

Applications (columns, left to right): assembly and inspection | process monitoring: analytical | data-driven | knowledge-based | hardware testing | software testing

Methods/technologies:
Assembly and inspection: control theory; knowledge base; computer/machine vision; robotics; feature extraction; pattern recognition
Process monitoring: analytical; data-driven; knowledge-based
Hardware testing: information theory; heuristic search
Software testing: model checking; Bogor; Cadena; concurrent error detection (CED)

Functions (marks per column, left to right):
Detection: × × × × × ×
Diagnostics: × × × × ×
Identification: × × × × × ×
Isolation: × × × × × ×
Error recovery: ×
Conflict resolution: (none)
Prognostics: × × ×
Exception handling: ×

Errors (E)/conflicts (C): E E E E E E
Centralized (C)/decentralized (D): C C C C C C

Strengths:
Assembly and inspection: integration of error detection and recovery
Analytical: accurate and reliable
Data-driven: can process a large amount of data
Knowledge-based: does not require detailed system information
Hardware testing: accurate and reliable
Software testing: thorough verification with formal methods

Weaknesses:
Assembly and inspection: domain specific; lack of general methods
Analytical: requires mathematical models that are often not available
Data-driven: relies on the quantity, quality, and timeliness of data
Knowledge-based: results are subjective and may not be reliable
Hardware testing: difficult to derive optimal algorithms to minimize cost; time consuming for large systems
Software testing: state explosion; duplications needed in CED; cannot deal with incorrect or incomplete specifications

References: [30.3, 20–34] | [30.17, 19, 35] | [30.36–45] | [30.8, 9, 14, 47–57]

systems with human intelligence so that they can correct errors through reasoning and high-level decision making. An example of an intelligent error recovery system is the neural-fuzzy system for error recovery (NEFUSER) [30.93].

The NEFUSER is both an intelligent system and a design tool of fuzzy logic and neural-fuzzy models for error detection and recovery. The NEFUSER has been applied to a single robot working in an assembly cell. The NEFUSER enables interactions among the robot, the operator, and computer-supported applications. It interprets data and information collected by the robot and provided by the operator, analyzes data and information with fuzzy logic and/or neural-fuzzy models, and makes appropriate error recovery decisions. The NEFUSER has learning ability to im-


Table 30.5 (cont.)

Applications (columns, left to right): discrete-event systems | collaborative design (two columns) | production and service (three columns)

Methods/technologies:
Discrete-event systems: Petri net; finite-state machine (FSM)
Collaborative design (first column): intended goal structure (IGS); project evaluation and review technique (PERT); Petri net; conflict detection and management system (CDMS)
Collaborative design (second column): facility description language (FDL); Mcr; CDMS
Production and service (first column): detection and prevention algorithms; reliability theory; process modeling; workflow
Production and service (second column): conflict and error detection model (CEDM); active middleware
Production and service (third column): fuzzy logic; artificial intelligence

Functions (marks per column, left to right):
Detection: × × × ×
Diagnostics: (none)
Identification: × × × ×
Isolation: × × × ×
Error recovery: ×
Conflict resolution: ×
Prognostics: × × ×
Exception handling: × ×

Errors (E)/conflicts (C): E C C E E/C E
Centralized (C)/decentralized (D): C/D C/D C/D C/D D C/D

Strengths:
Discrete-event systems: formal method applicable to various systems
Collaborative design (IGS/PERT): modeling of systems with agent-based technology
Collaborative design (FDL/Mcr): integration of traditional human conflict resolutions and computer-based learning
Production and service (algorithms): reliable; easy to apply
Production and service (CEDM): short detection time
Production and service (fuzzy logic/AI): correct errors through reasoning and high-level decision making

Weaknesses:
Discrete-event systems: state explosion for large systems; system modeling is complex and time-consuming
Collaborative design (IGS/PERT): an agent may be modeled multiple times due to the many conflicts involved
Collaborative design (FDL/Mcr): the adaptability of the methods to other design activities has not been validated
Production and service (algorithms): limited to sequential production and service lines; domain specific
Production and service (CEDM): needs further development and validation
Production and service (fuzzy logic/AI): needs further development for various applications

References: [30.58–67] | [30.66, 82–88] | [30.77, 86, 88, 94] | [30.68–80] | [30.77, 89–91] | [30.93, 95–98, 103] [30.104–114]

prove corrective actions and adapt to different errors.The NEFUSER therefore increases the level of automa-tion by decreasing the number of times that the robothas to stop and the operator has to intervene due toerrors.

Figure 30.13 shows the interactions between therobot, the operator, and computer-supported applica-

tions. The NEFUSER is the error recovery brain andis programmed and run on MATLAB, which providesa friendly windows-oriented fuzzy inference system(FIS) that incorporates the graphical user interface toolsof the fuzzy logic toolbox [30.103]. The example inFig. 30.13 includes a robot and an operator in an as-sembly cell. In general, the NEFUSER design for error


Fig. 30.13 Interactions with NEFUSER (after [30.93]). The NEFUSER exchanges requests for help, recovery strategies, and recovery instructions with the operator, who interacts with the system and assists; it receives sensor data and sensor information from the robot controller, which senses the production process and carries out operations and recovery actions.

recovery includes three main tasks: (1) design the FIS, (2) manage and evaluate information, and (3) train the FIS with real data and information.
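To make the FIS idea concrete, the sketch below implements one Mamdani-style fuzzy inference step that maps sensor deviations to a recovery decision. It is only an illustration under assumed inputs: the variables, triangular membership functions, rule base, and thresholds are hypothetical, not NEFUSER's actual design.

```python
# Illustrative sketch only: a tiny Mamdani-style fuzzy inference step for
# choosing a robot error-recovery action. The variables, membership
# functions, and rules are hypothetical, not NEFUSER's actual rule base.

def tri(x, a, b, c):
    """Triangular membership function rising from a, peaking at b, falling to c."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def recovery_score(force_error, position_error):
    """Map normalized sensor deviations (0..1) to a recovery score (0..1):
    low -> retry the operation, high -> stop and call the operator."""
    small_f, large_f = tri(force_error, -0.5, 0.0, 0.6), tri(force_error, 0.4, 1.0, 1.5)
    small_p, large_p = tri(position_error, -0.5, 0.0, 0.6), tri(position_error, 0.4, 1.0, 1.5)
    # Rule firing strengths (min as AND), each paired with an output level.
    rules = [
        (min(small_f, small_p), 0.1),   # both deviations small -> just retry
        (min(small_f, large_p), 0.5),   # position off -> re-align and retry
        (min(large_f, small_p), 0.7),   # force off -> adjust grip
        (min(large_f, large_p), 0.95),  # both large -> stop, request operator
    ]
    num = sum(w * out for w, out in rules)
    den = sum(w for w, _ in rules)
    return num / den if den else 0.0   # weighted-average defuzzification

action = "request operator help" if recovery_score(0.9, 0.8) > 0.6 else "retry"
```

Training the FIS (task 3 above) would then amount to tuning the membership parameters and rule outputs against recorded error and recovery data.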

30.5.2 Conflict Resolution

There is a growing demand for knowledge-intensive collaboration in distributed design [30.94, 113, 114].

Conflict detection has been studied extensively in collaborative design, as has conflict resolution, which is often the next step after a conflict is detected. There has been extensive research on conflict resolution (e.g., [30.105–110]). Recently, a multiapproach method for conflict resolution in collaborative design has been introduced with the development of the facility description language–conflict resolution (FDL-CR) [30.88]. The critical role of computer-supported conflict resolution in distributed organizations has been discussed in great detail [30.77, 104, 111, 112]. In addition, Ceroni and Velásquez [30.86] have developed the conflict detection and management system (CDMS); their work shows that both product complexity and the number of participating designers have a statistically significant effect on the ratio of conflicts resolved to conflicts detected, but that only complexity has a statistically significant effect on design duration.

Building on this previous work, a new method, Mcr (Table 30.4), has recently been developed to automatically resolve conflict situations common in collaborative facility design using computer-supported tools [30.88, 94]. The method combines traditional human conflict-resolution approaches that have been used successfully by others with principles of conflict prevention to improve design performance, and applies computer-based learning to improve its usefulness. A graph model for conflict resolution is used to facilitate conflict modeling and analysis. The performance of the new method has been validated by implementing its conflict-resolution capabilities in the FDL, a computer tool for collaborative facility design, and by applying FDL-CR to resolve typical conflict situations. Table 30.4 describes the Mcr structure.
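The detection step that precedes any such resolution can be sketched very simply: compare designers' proposals on shared design variables and flag each mismatch. The sketch below is only an illustration of that basic idea; the variable names and data layout are assumptions, and it is not the FDL-CR or Mcr algorithm itself.

```python
# Hypothetical sketch of the detection step behind computer-supported
# conflict resolution: designers' proposals are compared on shared design
# variables, and each mismatch is flagged as a conflict to be resolved.

def detect_conflicts(proposals):
    """proposals: {designer: {variable: value}} -> list of conflict tuples."""
    conflicts = []
    designers = list(proposals)
    for i, a in enumerate(designers):
        for b in designers[i + 1:]:
            shared = proposals[a].keys() & proposals[b].keys()
            for var in sorted(shared):
                if proposals[a][var] != proposals[b][var]:
                    conflicts.append((var, a, proposals[a][var], b, proposals[b][var]))
    return conflicts

found = detect_conflicts({
    "layout_team": {"aisle_width_m": 3.0, "dock_doors": 4},
    "safety_team": {"aisle_width_m": 3.5, "dock_doors": 4},
})
# One conflict: the two teams propose different aisle widths.
```

A resolution method such as Mcr would then take each flagged tuple and apply negotiation, prevention, or learned rules to choose a value both parties accept.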

Table 30.5 summarizes error and conflict prognostics and prevention methods and technologies in various production and service applications.

30.6 Emerging Trends

30.6.1 Decentralized and Agent-Based Error and Conflict Prognostics and Prevention

Most error and conflict prognostics and prevention methods developed so far are centralized approaches (Table 30.5), in which a central control unit controls data and information and executes some or all of the eight functions to prevent errors and conflicts. The centralized approach often requires substantial time to execute the various functions, and the central control unit often possesses incomplete or incorrect data and information [30.80]. These disadvantages become apparent when a system has many units that need to be examined for errors and conflicts.

To overcome the disadvantages of the centralized approach, the decentralized approach, which takes advantage of the parallel activities of multiple agents, has been developed [30.16, 67, 79, 80, 91]. In the decentralized approach, distributed agents detect, identify, or isolate


errors and conflicts at individual units of a system, and communicate with each other to diagnose and prevent errors and conflicts. The main challenge of the decentralized approach is to develop robust protocols that ensure effective communication between agents. Further research is needed to develop and improve decentralized approaches for implementation in various applications.
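A minimal sketch of this division of labor, under assumed names and a deliberately simple "protocol" (one broadcast per cycle, not the protocols from [30.80]): each agent checks only its own unit against a local tolerance, and a diagnosis emerges from the pooled messages.

```python
# Illustrative sketch of decentralized detection: each agent monitors one
# unit locally and broadcasts any out-of-tolerance observation; diagnosis
# is made from the pooled broadcasts rather than by a central controller.

class Agent:
    def __init__(self, unit_id, tolerance):
        self.unit_id = unit_id
        self.tolerance = tolerance  # (low, high) band for this unit only

    def detect(self, reading):
        """Local detection: flag an error if the reading leaves the band."""
        low, high = self.tolerance
        return None if low <= reading <= high else (self.unit_id, reading)

def diagnose(messages):
    """Pool the agents' broadcasts; two or more flagged units in one cycle
    are treated here (illustratively) as a propagated, system-level error."""
    flagged = [m for m in messages if m is not None]
    if not flagged:
        return "normal"
    return "local error" if len(flagged) == 1 else "propagated error"

agents = [Agent("u1", (0.0, 1.0)), Agent("u2", (0.0, 1.0)), Agent("u3", (0.0, 1.0))]
readings = {"u1": 0.4, "u2": 1.7, "u3": 0.9}
messages = [a.detect(readings[a.unit_id]) for a in agents]
status = diagnose(messages)   # only u2 is out of tolerance
```

The detection work runs in parallel across agents; only the short messages are shared, which is the advantage over shipping all raw data to one central unit.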

30.6.2 Intelligent Error and Conflict Prognostics and Prevention

Compared with humans, automation systems perform better when they are used to prevent errors and conflicts through the violation of specifications or violation in comparisons [30.13]. Humans, however, have the ability to prevent errors and conflicts through the violation of expectations, i.e., with tacit knowledge and high-level decision making. To increase the effective degree of automation of error and conflict prognostics and prevention, it is necessary to equip automation systems with human intelligence through appropriate modeling techniques such as fuzzy logic, pattern recognition, and artificial neural networks. There has been some preliminary work to incorporate high-level human intelligence in error detection and recovery (e.g., [30.3, 93]) and conflict resolution [30.88, 94]. Additional work is needed to develop self-learning, self-improving artificial intelligence systems for error and conflict prognostics and prevention.
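The contrast between the two detection modes can be sketched directly. Below, a specification violation is a hard limit check, while an expectation violation compares a reading with what recent history predicts; all limits, the history window, and the factor k are illustrative assumptions.

```python
# Sketch of the two detection modes discussed above (illustrative values):
# a specification violation breaks an explicit limit, while an expectation
# violation is a reading far from what past behavior predicts, even when
# the reading is still inside the specification limits.

from statistics import mean, stdev

def violates_spec(reading, low, high):
    """Specification violation: the reading breaks an explicit limit."""
    return not (low <= reading <= high)

def violates_expectation(reading, history, k=3.0):
    """Expectation violation: the reading deviates from the recent-history
    mean by more than k standard deviations."""
    mu, sigma = mean(history), stdev(history)
    return abs(reading - mu) > k * sigma

history = [10.0, 10.2, 9.9, 10.1, 10.0, 9.8, 10.1]
reading = 12.0
spec_alarm = violates_spec(reading, 0.0, 50.0)          # within spec limits
expect_alarm = violates_expectation(reading, history)   # but an unusual jump
```

The second check is a crude stand-in for the tacit, experience-based judgment described above; learned models (fuzzy, neural) refine the same idea.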

30.6.3 Graph and Network Theories

The performance of an error and conflict prognostics and prevention method is significantly influenced by the number of units in a system and their relationships. A system can be viewed as a graph or a network with many nodes, each of which represents a unit in the system; the relationship between units is represented by the links between nodes. The study of network topologies has a long history stretching back at least to the 1730s. The classic model of a network, the random network, was first discussed in the early 1950s [30.115] and was rediscovered and analyzed in a series of papers published in the late 1950s and early 1960s [30.116–118]. More recently, several network models have been discovered and extensively studied, for instance the small-world network (e.g., [30.119]), the scale-free network (e.g., [30.120–123]), and the Bose–Einstein condensation network [30.124]. Bioinspired network models for collaborative control have recently been studied by Nof [30.125] (see also Chap. 75 for more details).

Because the same prognostics and prevention method may perform quite differently on networks with different topologies and attributes, or on the same network topology and attributes but with different parameters, it is imperative to study the performance of prognostics and prevention methods with respect to different networks for the best match between methods and networks. There is ample room for research, development, and implementation of error and conflict prognostics and prevention methods supported by graph and network theories.
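A toy comparison makes the topology effect tangible: the same flooding notification, started at one unit, needs far fewer communication rounds on a hub-centered (star, scale-free-like) network than on a ring of the same size. The networks and the round-counting criterion are assumptions chosen for illustration.

```python
# Illustrative sketch of why topology matters for prognostics: count the
# message-passing rounds (BFS depth) until every unit has received an error
# notification that originates at one node, on two ten-node topologies.

from collections import deque

def rounds_to_inform_all(adj, start):
    """BFS depth = rounds of neighbor-to-neighbor forwarding needed for a
    notification from `start` to reach every node of the network."""
    seen, frontier, rounds = {start}, deque([(start, 0)]), 0
    while frontier:
        node, depth = frontier.popleft()
        rounds = max(rounds, depth)
        for nb in adj[node]:
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, depth + 1))
    return rounds

n = 10
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
star = {0: list(range(1, n)), **{i: [0] for i in range(1, n)}}

ring_rounds = rounds_to_inform_all(ring, 0)   # grows linearly with n
star_rounds = rounds_to_inform_all(star, 1)   # at most two hops via the hub
```

The hub also concentrates load and risk, which is exactly the kind of method-to-topology trade-off the text argues must be studied.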

30.6.4 Financial Models for Prognostics Economy

Most errors and conflicts must be detected, isolated, identified, diagnosed, or prevented. Certain errors and conflicts, however, may be tolerable in certain systems, i.e., fault-tolerant systems. Also, the cost of automating some or all of the eight functions of error and conflict prognostics and prevention may far exceed the damage caused by certain errors and conflicts. In both situations, cost–benefit analyses can be used to determine whether an error or a conflict needs to be dealt with. In general, financial models are used to analyze the economy of prognostics and prevention methods for specific errors and conflicts, and to help decide which of the eight functions will be executed and how they will be executed, e.g., with what frequency. There has been limited research on how to use financial models to help justify the automation of error and conflict prognostics and prevention [30.92, 126]. One of the challenges is how to appropriately evaluate or assess the damage of errors and conflicts, e.g., short-term damage, long-term damage, and intangible damage. Additional research is needed to address these economic decisions.
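The core cost–benefit comparison can be sketched in a few lines: automate a prognostics function only when the expected damage it avoids exceeds its cost. All figures, the coverage factor, and the linear expected-loss model below are made-up assumptions for illustration, not a model from the cited work.

```python
# Hedged sketch of the cost-benefit reasoning described above: automate a
# prognostics function only if the damage it is expected to avoid exceeds
# the cost of running it. Every number here is illustrative.

def expected_damage(prob_per_period, damage_if_occurs, periods):
    """Expected loss from one error type over a planning horizon."""
    return prob_per_period * damage_if_occurs * periods

def worth_automating(prevention_cost, prob, damage, periods, coverage=0.9):
    """Automate if the avoided damage (scaled by detection coverage)
    exceeds the cost of deploying and running the prognostics function."""
    avoided = coverage * expected_damage(prob, damage, periods)
    return avoided > prevention_cost

# A frequent, costly error justifies automation; a rare one may not.
frequent = worth_automating(prevention_cost=50_000, prob=0.02,
                            damage=4_000, periods=1_000)
rare = worth_automating(prevention_cost=50_000, prob=0.0001,
                        damage=4_000, periods=1_000)
```

The hard part flagged in the text, pricing long-term and intangible damage, enters through the `damage` figure, which this sketch simply takes as given.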

30.7 Conclusion

In this chapter we have discussed the eight functions that automate error and conflict prognostics and prevention, and their applications in various production and service areas. Prognostics and prevention


methods for errors and conflicts are developed based on extensive theoretical advancements in many science and engineering domains, and have been successfully applied to various real-world problems. As systems and networks become larger and more complex, such as global enterprises and the Internet, error and conflict prognostics and prevention become more important, and the focus is shifting from passive response to active prognostics and prevention.

References

30.1 S.Y. Nof, W.E. Wilhelm, H.-J. Warnecke: Industrial Assembly (Chapman Hall, New York 1997)
30.2 L.S. Lopes, L.M. Camarinha-Matos: A machine learning approach to error detection and recovery in assembly, Proc. IEEE/RSJ Int. Conf. Intell. Robot. Syst. 95, 'Human Robot Interaction and Cooperative Robots', Vol. 3 (1995) pp. 197–203
30.3 H. Najjari, S.J. Steiner: Integrated sensor-based control system for a flexible assembly, Mechatronics 7(3), 231–262 (1997)
30.4 A. Steininger, C. Scherrer: On finding an optimal combination of error detection mechanisms based on results of fault injection experiments, Proc. 27th Annu. Int. Symp. Fault-Toler. Comput., FTCS-27, Digest of Papers (1997) pp. 238–247
30.5 K.A. Toguyeni, E. Craye, J.C. Gentina: Framework to design a distributed diagnosis in FMS, Proc. IEEE Int. Conf. Syst. Man Cybern. 4, 2774–2779 (1996)
30.6 J.F. Kao: Optimal recovery strategies for manufacturing systems, Eur. J. Oper. Res. 80(2), 252–263 (1995)
30.7 M. Bruccoleri, Z.J. Pasek: Operational issues in reconfigurable manufacturing systems: exception handling, Proc. 5th Biannu. World Autom. Congr. (2002)
30.8 T. Miceli, H.A. Sahraoui, R. Godin: A metric based technique for design flaws detection and correction, Proc. 14th IEEE Int. Conf. Autom. Softw. Eng. (1999) pp. 307–310
30.9 C. Bolchini, W. Fornaciari, F. Salice, D. Sciuto: Concurrent error detection at architectural level, Proc. 11th Int. Symp. Syst. Synth. (1998) pp. 72–75
30.10 C. Bolchini, L. Pomante, F. Salice, D. Sciuto: Reliability properties assessment at system level: a co-design framework, J. Electron. Test. 18(3), 351–356 (2002)
30.11 M.D. Jeng: Petri nets for modeling automated manufacturing systems with error recovery, IEEE Trans. Robot. Autom. 13(5), 752–760 (1997)
30.12 G.A. Kanawati, V.S.S. Nair, N. Krishnamurthy, J.A. Abraham: Evaluation of integrated system-level checks for on-line error detection, Proc. IEEE Int. Comput. Perform. Dependability Symp. (1996) pp. 292–301
30.13 B.D. Klein: How do actuaries use data containing errors?: models of error detection and error correction, Inf. Resour. Manag. J. 10(4), 27–36 (1997)
30.14 M. Ronsse, K. Bosschere: Non-intrusive detection of synchronization errors using execution replay, Autom. Softw. Eng. 9(1), 95–121 (2002)
30.15 O. Svenson, I. Salo: Latency and mode of error detection in a process industry, Reliab. Eng. Syst. Saf. 73(1), 83–90 (2001)
30.16 X.W. Chen, S.Y. Nof: Prognostics and diagnostics of conflicts and errors over e-Work networks, Proc. 19th Int. Conf. Prod. Res. (2007)
30.17 J. Gertler: Fault Detection and Diagnosis in Engineering Systems (Marcel Dekker, New York 1998)
30.18 M. Klein, C. Dellarocas: A knowledge-based approach to handling exceptions in workflow systems, Comput. Support. Coop. Work 9, 399–412 (2000)
30.19 A. Raich, A. Cinar: Statistical process monitoring and disturbance diagnosis in multivariable continuous processes, AIChE J. 42(4), 995–1009 (1996)
30.20 C.-Y. Chang, J.-W. Chang, M.D. Jeng: An unsupervised self-organizing neural network for automatic semiconductor wafer defect inspection, IEEE Int. Conf. Robot. Autom. (ICRA) (2005) pp. 3000–3005
30.21 M. Moganti, F. Ercal: Automatic PCB inspection systems, IEEE Potentials 14(3), 6–10 (1995)
30.22 H. Rau, C.-H. Wu: Automatic optical inspection for detecting defects on printed circuit board inner layers, Int. J. Adv. Manuf. Technol. 25(9–10), 940–946 (2005)
30.23 J.A. Calderon-Martinez, P. Campoy-Cervera: An application of convolutional neural networks for automatic inspection, IEEE Conf. Cybern. Intell. Syst. (2006) pp. 1–6
30.24 F. Duarte, H. Araujo, A. Dourado: Automatic system for dirt in pulp inspection using hierarchical image segmentation, Comput. Ind. Eng. 37(1–2), 343–346 (1999)
30.25 J.C. Wilson, P.A. Berardo: Automatic inspection of hazardous materials by mobile robot, Proc. IEEE Int. Conf. Syst. Man Cybern. 4, 3280–3285 (1995)
30.26 J.Y. Choi, H. Lim, B.-J. Yi: Semi-automatic pipeline inspection robot systems, SICE-ICASE Int. Jt. Conf. (2006) pp. 2266–2269
30.27 L.V. Finogenov, A.V. Beloborodov, V.I. Ladygin, Y.V. Chugui, N.G. Zagoruiko, S.Y. Gulyaevskii, Y.S. Shul'man, P.I. Lavrenyuk, Y.V. Pimenov: An optoelectronic system for automatic inspection of


the external view of fuel pellets, Russ. J. Nondestr. Test. 43(10), 692–699 (2007)
30.28 C.W. Ni: Automatic inspection of the printing contents of soft drink cans by image processing analysis, Proc. SPIE 3652, 86–93 (2004)
30.29 J. Cai, G. Zhang, Z. Zhou: The application of area-reconstruction operator in automatic visual inspection of quality control, Proc. World Congr. Intell. Control Autom. (WCICA), Vol. 2 (2006) pp. 10111–10115
30.30 O. Erne, T. Walz, A. Ettemeyer: Automatic shearography inspection systems for aircraft components in production, Proc. SPIE 3824, 326–328 (1999)
30.31 C.K. Huang, L.G. Wang, H.C. Tang, Y.S. Tarng: Automatic laser inspection of outer diameter, run-out and taper of micro-drills, J. Mater. Process. Technol. 171(2), 306–313 (2006)
30.32 L. Chen, X. Wang, M. Suzuki, N. Yoshimura: Optimizing the lighting in automatic inspection system using Monte Carlo method, Jpn. J. Appl. Phys. Part 1 38(10), 6123–6129 (1999)
30.33 W.C. Godoi, R.R. da Silva, V. Swinka-Filho: Pattern recognition in the automatic inspection of flaws in polymeric insulators, Insight Nondestr. Test. Cond. Monit. 47(10), 608–614 (2005)
30.34 U.S. Khan, J. Iqbal, M.A. Khan: Automatic inspection system using machine vision, Proc. 34th Appl. Imag. Pattern Recognit. Workshop (2005) pp. 210–215
30.35 L.H. Chiang, R.D. Braatz, E. Russell: Fault Detection and Diagnosis in Industrial Systems (Springer, London 2001)
30.36 S. Deb, K.R. Pattipati, V. Raghavan, M. Shakeri, R. Shrestha: Multi-signal flow graphs: a novel approach for system testability analysis and fault diagnosis, IEEE Aerosp. Electron. Syst. Mag. 10(5), 14–25 (1995)
30.37 K.R. Pattipati, M.G. Alexandridis: Application of heuristic search and information theory to sequential fault diagnosis, IEEE Trans. Syst. Man Cybern. 20(4), 872–887 (1990)
30.38 K.R. Pattipati, M. Dontamsetty: On a generalized test sequencing problem, IEEE Trans. Syst. Man Cybern. 22(2), 392–396 (1992)
30.39 V. Raghavan, M. Shakeri, K. Pattipati: Optimal and near-optimal test sequencing algorithms with realistic test models, IEEE Trans. Syst. Man Cybern. A 29(1), 11–26 (1999)
30.40 V. Raghavan, M. Shakeri, K. Pattipati: Test sequencing algorithms with unreliable tests, IEEE Trans. Syst. Man Cybern. A 29(4), 347–357 (1999)
30.41 M. Shakeri, K.R. Pattipati, V. Raghavan, A. Patterson-Hine, T. Kell: Sequential Test Strategies for Multiple Fault Isolation (IEEE, Atlanta 1995)
30.42 M. Shakeri, V. Raghavan, K.R. Pattipati, A. Patterson-Hine: Sequential testing algorithms for multiple fault diagnosis, IEEE Trans. Syst. Man Cybern. A 30(1), 1–14 (2000)
30.43 F. Tu, K. Pattipati, S. Deb, V.N. Malepati: Multiple Fault Diagnosis in Graph-Based Systems (International Society for Optical Engineering, Orlando 2002)
30.44 F. Tu, K.R. Pattipati: Rollout strategies for sequential fault diagnosis, IEEE Trans. Syst. Man Cybern. A 33(1), 86–99 (2003)
30.45 F. Tu, K.R. Pattipati, S. Deb, V.N. Malepati: Computationally efficient algorithms for multiple fault diagnosis in large graph-based systems, IEEE Trans. Syst. Man Cybern. A 33(1), 73–85 (2003)
30.46 C. Feng, L.N. Bhuyan, F. Lombardi: Adaptive system-level diagnosis for hypercube multiprocessors, IEEE Trans. Comput. 45(10), 1157–1170 (1996)
30.47 E.M. Clarke, O. Grumberg, D.A. Peled: Model Checking (MIT Press, Cambridge 2000)
30.48 C. Karamanolis, D. Giannakopoulou, J. Magee, S. Wheater: Model checking of workflow schemas, 4th Int. Enterp. Distrib. Object Comput. Conf. (2000) pp. 170–181
30.49 W. Chan, R.J. Anderson, P. Beame, D. Notkin, D.H. Jones, W.E. Warner: Optimizing symbolic model checking for state charts, IEEE Trans. Softw. Eng. 27(2), 170–190 (2001)
30.50 D. Garlan, S. Khersonsky, J.S. Kim: Model checking publish-subscribe systems, Proc. 10th Int. SPIN Workshop Model Checking Softw. (2003)
30.51 J. Hatcliff, W. Deng, M. Dwyer, G. Jung, V.P. Ranganath: Cadena: an integrated development, analysis, and verification environment for component-based systems, Proc. 2003 Int. Conf. Softw. Eng. (ICSE 2003) (Portland 2003)
30.52 T. Ball, S. Rajamani: Bebop: a symbolic model checker for Boolean programs, Proc. 7th Int. SPIN Workshop, Lect. Notes Comput. Sci. 1885, 113–130 (2000)
30.53 G. Brat, K. Havelund, S. Park, W. Visser: Java PathFinder – a second generation of a Java model checker, Proc. Workshop Adv. Verif. (2000)
30.54 J.C. Corbett, M.B. Dwyer, J. Hatcliff, S. Laubach, C.S. Pasareanu, Robby, H. Zheng: Bandera: extracting finite-state models from Java source code, Proc. 22nd Int. Conf. Softw. Eng. (2000)
30.55 P. Godefroid: Model-checking for programming languages using VeriSoft, Proc. 24th ACM Symp. Princ. Program. Lang. (POPL '97) (1997) pp. 174–186
30.56 Robby, M.B. Dwyer, J. Hatcliff: Bogor: an extensible and highly-modular model checking framework, Proc. 9th Eur. Softw. Eng. Conf. held jointly with the 11th ACM SIGSOFT Symp. Found. Softw. Eng. (2003)
30.57 S. Mitra, E.J. McCluskey: Diversity techniques for concurrent error detection, Proc. IEEE 2nd Int. Symp. Qual. Electron. Des. (2001) pp. 249–250
30.58 S.-L. Chung, C.-C. Wu, M. Jeng: Failure Diagnosis: A Case Study on Modeling and Analysis by Petri Nets (IEEE, Washington 2003)


30.59 P.S. Georgilakis, J.A. Katsigiannis, K.P. Valavanis, A.T. Souflaris: A systematic stochastic Petri net based methodology for transformer fault diagnosis and repair actions, J. Intell. Robot. Syst. Theory Appl. 45(2), 181–201 (2006)
30.60 T. Ushio, I. Onishi, K. Okuda: Fault Detection Based on Petri Net Models with Faulty Behaviors (IEEE, San Diego 1998)
30.61 M. Rezai, M.R. Ito, P.D. Lawrence: Modeling and Simulation of Hybrid Control Systems by Global Petri Nets (IEEE, Seattle 1995)
30.62 M. Rezai, P.D. Lawrence, M.R. Ito: Analysis of Faults in Hybrid Systems by Global Petri Nets (IEEE, Vancouver 1995)
30.63 M. Rezai, P.D. Lawrence, M.R. Ito: Hybrid Modeling and Simulation of Manufacturing Systems (IEEE, Los Angeles 1997)
30.64 M. Sampath, R. Sengupta, S. Lafortune, K. Sinnamohideen, D. Teneketzis: Diagnosability of discrete-event systems, IEEE Trans. Autom. Control 40(9), 1555–1575 (1995)
30.65 S.H. Zad, R.H. Kwong, W.M. Wonham: Fault diagnosis in discrete-event systems: framework and model reduction, IEEE Trans. Autom. Control 48(7), 1199–1212 (2003)
30.66 M. Zhou, F. DiCesare: Petri Net Synthesis for Discrete Event Control of Manufacturing Systems (Kluwer, Boston 1993)
30.67 Q. Wenbin, R. Kumar: Decentralized failure diagnosis of discrete event systems, IEEE Trans. Syst. Man Cybern. A 36(2), 384–395 (2006)
30.68 A. Brall: Human reliability issues in medical care – a customer viewpoint, Proc. Annu. Reliab. Maint. Symp. (2006) pp. 46–50
30.69 H. Furukawa: Challenge for preventing medication errors – learn from errors: what is the most effective label display to prevent medication error for injectable drugs?, Proc. 12th Int. Conf. Hum.-Comput. Interact.: HCI Intell. Multimodal Interact. Environ., Lect. Notes Comput. Sci. 4553, 437–442 (2007)
30.70 G. Huang, G. Medlam, J. Lee, S. Billingsley, J.-P. Bissonnette, J. Ringash, G. Kane, D.C. Hodgson: Error in the delivery of radiation therapy: results of a quality assurance review, Int. J. Radiat. Oncol. Biol. Phys. 61(5), 1590–1595 (2005)
30.71 A.-S. Nyssen, A. Blavier: A study in anesthesia, Ergonomics 49(5/6), 517–525 (2006)
30.72 K.T. Unruh, W. Pratt: Patients as actors: the patient's role in detecting, preventing, and recovering from medical errors, Int. J. Med. Inform. 76(1), 236–244 (2007)
30.73 C.C. Chao, W.Y. Jen, M.C. Hung, Y.C. Li, Y.P. Chi: An innovative mobile approach for patient safety services: the case of a Taiwan health care provider, Technovation 27(6–7), 342–361 (2007)
30.74 S. Malhotra, D. Jordan, E. Shortliffe, V.L. Patel: Workflow modeling in critical care: piecing together your own puzzle, J. Biomed. Inform. 40(2), 81–92 (2007)
30.75 T.J. Morris, J. Pajak, F. Havlik, J. Kenyon, D. Calcagni: Battlefield medical information system-tactical (BMIST): the application of mobile computing technologies to support health surveillance in the Department of Defense, Telemed. J. e-Health 12(4), 409–416 (2006)
30.76 M. Rajendran, B.S. Dhillon: Human error in health care systems: bibliography, Int. J. Reliab. Qual. Saf. Eng. 10(1), 99–117 (2003)
30.77 S.Y. Nof: Design of effective e-Work: review of models, tools, and emerging challenges, Prod. Plan. Control 14(8), 681–703 (2003)
30.78 X. Chen: Error detection and prediction agents and their algorithms. M.S. Thesis (School of Industrial Engineering, Purdue University, West Lafayette 2005)
30.79 X.W. Chen, S.Y. Nof: Error detection and prediction algorithms: application in robotics, J. Intell. Robot. Syst. 48(2), 225–252 (2007)
30.80 X.W. Chen, S.Y. Nof: Agent-based error prevention algorithms, submitted to IEEE Trans. Autom. Sci. Eng. (2008)
30.81 K. Duffy: Safety for profit: building an error-prevention culture, Ind. Eng. Mag. 9, 41–45 (2008)
30.82 K.S. Barber, T.H. Liu, S. Ramaswamy: Conflict detection during plan integration for multi-agent systems, IEEE Trans. Syst. Man Cybern. B 31(4), 616–628 (2001)
30.83 G.M.P. O'Hare, N. Jennings: Foundations of Distributed Artificial Intelligence (Wiley, New York 1996)
30.84 M. Zhou, F. DiCesare, A.A. Desrochers: A hybrid methodology for synthesis of Petri net models for manufacturing systems, IEEE Trans. Robot. Autom. 8(3), 350–361 (1992)
30.85 J.-Y. Shiau: A formalism for conflict detection and resolution in a multi-agent system. Ph.D. Thesis (Arizona State University 2002)
30.86 J.A. Ceroni, A.A. Velásquez: Conflict detection and resolution in distributed design, Prod. Plan. Control 14(8), 734–742 (2003)
30.87 T. Jiang, G.E. Nevill Jr.: Conflict cause identification in web-based concurrent engineering design system, Concurr. Eng. Res. Appl. 10(1), 15–26 (2002)
30.88 M.A. Lara, S.Y. Nof: Computer-supported conflict resolution for collaborative facility designers, Int. J. Prod. Res. 41(2), 207–233 (2003)
30.89 P. Anussornnitisarn, S.Y. Nof: The design of active middleware for e-Work interactions, PRISM Res. Memorandum (School of Industrial Engineering, Purdue University, West Lafayette 2001)
30.90 P. Anussornnitisarn, S.Y. Nof: e-Work: the challenge of the next generation ERP systems, Prod. Plan. Control 14(8), 753–765 (2003)


30.91 X.W. Chen, S.Y. Nof: An agent-based conflict and error detection model, submitted to Int. J. Prod. Res. (2008)
30.92 C.L. Yang, S.Y. Nof: Analysis, detection policy, and performance measures of detection task planning errors and conflicts, PRISM Res. Memorandum 2004-P2 (School of Industrial Engineering, Purdue University, West Lafayette 2004)
30.93 J. Avila-Soria: Interactive Error Recovery for Robotic Assembly Using a Neural-Fuzzy Approach. Master Thesis (School of Industrial Engineering, Purdue University, West Lafayette 1999)
30.94 J.D. Velásquez, M.A. Lara, S.Y. Nof: Systematic resolution of conflict situations in collaborative facility design, Int. J. Prod. Econ. 116(1), 139–153 (2008)
30.95 S.Y. Nof, O.Z. Maimon, R.G. Wilhelm: Experiments for planning error-recovery programs in robotic work, Proc. Int. Comput. Eng. Conf. Exhib. 2, 253–264 (1987)
30.96 M. Imai, K. Hiraki, Y. Anzai: Human-robot interface with attention, Syst. Comput. Jpn. 26(12), 83–95 (1995)
30.97 T.C. Lueth, U.M. Nassal, U. Rembold: Reliability and integrated capabilities of locomotion and manipulation for autonomous robot assembly, Robot. Auton. Syst. 14, 185–198 (1995)
30.98 H.-J. Wu, S.B. Joshi: Error recovery in MPSG-based controllers for shop floor control, Proc. IEEE Int. Conf. Robot. Autom. (ICRA) 2, 1374–1379 (1994)
30.99 K. Sycara: Negotiation planning: an AI approach, Eur. J. Oper. Res. 46(2), 216–234 (1990)
30.100 L. Fang, K.W. Hipel, D.M. Kilgour: Interactive Decision Making (Wiley, New York 1993)
30.101 J.-S.R. Jang: ANFIS: adaptive-network-based fuzzy inference systems, IEEE Trans. Syst. Man Cybern. 23, 665–685 (1993)
30.102 A. Kusiak, J. Wang: Dependency analysis in constraint negotiation, IEEE Trans. Syst. Man Cybern. 25(9), 1301–1313 (1995)
30.103 J.-S.R. Jang, N. Gulley: Fuzzy Systems Toolbox for Use with MATLAB (The MathWorks, 1997)
30.104 C.Y. Huang, J.A. Ceroni, S.Y. Nof: Agility of networked enterprises: parallelism, error recovery and conflict resolution, Comput. Ind. 42, 73–78 (2000)
30.105 M. Klein, S.C.-Y. Lu: Conflict resolution in cooperative design, Artif. Intell. Eng. 4(4), 168–180 (1989)
30.106 M. Klein: Supporting conflict resolution in cooperative design systems, IEEE Trans. Syst. Man Cybern. 21(6), 1379–1390 (1991)
30.107 M. Klein: Capturing design rationale in concurrent engineering teams, IEEE Computer 26(1), 39–47 (1993)
30.108 M. Klein: Conflict management as part of an integrated exception handling approach, Artif. Intell. Eng. Des. Anal. Manuf. 9, 259–267 (1995)
30.109 X. Li, X.H. Zhou, X.Y. Ruan: Study on conflict management for collaborative design system, J. Shanghai Jiaotong Univ. (English ed.) 5(2), 88–93 (2000)
30.110 X. Li, X.H. Zhou, X.Y. Ruan: Conflict management in closely coupled collaborative design system, Int. J. Comput. Integr. Manuf. 15(4), 345–352 (2000)
30.111 S.Y. Nof: Tools and models of e-Work, Proc. 5th Int. Conf. Simul. AI (Mexico City 2000) pp. 249–258
30.112 S.Y. Nof: Collaborative e-Work and e-Manufacturing: challenges for production and logistics managers, J. Intell. Manuf. 17(6), 689–701 (2006)
30.113 X.F. Zha, H. Du: Knowledge-intensive collaborative design modeling and support, part I: review, distributed models and framework, Comput. Ind. 57, 39–55 (2006)
30.114 X.F. Zha, H. Du: Knowledge-intensive collaborative design modeling and support, part II: system implementation and application, Comput. Ind. 57, 56–71 (2006)
30.115 R. Solomonoff, A. Rapoport: Connectivity of random nets, Bull. Math. Biophys. 13, 107–117 (1951)
30.116 P. Erdős, A. Rényi: On random graphs, Publ. Math. Debr. 6, 290–297 (1959)
30.117 P. Erdős, A. Rényi: On the evolution of random graphs, Magy. Tud. Akad. Mat. Kutató Int. Közl. 5, 17–61 (1960)
30.118 P. Erdős, A. Rényi: On the strength of connectedness of a random graph, Acta Math. Acad. Sci. Hung. 12, 261–267 (1961)
30.119 D.J. Watts, S.H. Strogatz: Collective dynamics of 'small-world' networks, Nature 393(6684), 440–442 (1998)
30.120 R. Albert, H. Jeong, A.-L. Barabási: Internet: diameter of the World-Wide Web, Nature 401(6749), 130–131 (1999)
30.121 A.-L. Barabási, R. Albert: Emergence of scaling in random networks, Science 286(5439), 509–512 (1999)
30.122 A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan, R. Stata, A. Tomkins, J. Wiener: Graph structure in the Web, Comput. Netw. 33(1), 309–320 (2000)
30.123 D.J. de Solla Price: Networks of scientific papers, Science 149, 510–515 (1965)
30.124 G. Bianconi, A.-L. Barabási: Bose–Einstein condensation in complex networks, Phys. Rev. Lett. 86(24), 5632–5635 (2001)
30.125 S.Y. Nof: Collaborative control theory for e-Work, e-Production, and e-Service, Annu. Rev. Control 31(2), 281–292 (2007)
30.126 C.L. Yang, X. Chen, S.Y. Nof: Design of a production conflict and error detection model with active protocols and agents, Proc. 18th Int. Conf. Prod. Res. (2005)
