
Applied Intelligence 13, 125–147, 2000
© 2000 Kluwer Academic Publishers. Manufactured in The Netherlands.

Exception Handling in Workflow Systems

ZONGWEI LUO, AMIT SHETH, KRYS KOCHUT AND JOHN MILLER
Large Scale Distributed Information System Lab,∗ 415 GSRC, Computer Science Department, University of Georgia, Athens, GA 30602, USA

[email protected]

[email protected]

[email protected]

[email protected]

Abstract. In this paper, defeasible workflow is proposed as a framework to support exception handling for workflow management. By using the “justified” ECA rules to capture more contexts in workflow modeling, defeasible workflow uses context dependent reasoning to enhance the exception handling capability of workflow management systems. In particular, this limits possible alternative exception handler candidates in dealing with exceptional situations. Furthermore, a case-based reasoning (CBR) mechanism with integrated human involvement is used to improve the exception handling capabilities. This involves collecting cases to capture experiences in handling exceptions, retrieving similar prior exception handling cases, and reusing the exception handling experiences captured in those cases in new situations.

Keywords: case-based reasoning (CBR), context-dependent reasoning, exception handling, ontology, workflow system

1. Introduction

Workflow technology is now considered an essential technique for integrating distributed and often heterogeneous applications and information systems, as well as for improving the effectiveness and productivity of business processes [1–4]. A workflow management system (WfMS) is a set of tools providing support for the necessary services of workflow creation, workflow enactment, and administration and monitoring of workflow processes, which consist of a network of tasks. These workflow processes are constructed to conform to their workflow specifications. However, due to foreseen or unforeseen situations, such as system malfunctions caused by the failure of physical components or changes in the business environment, deviations (exceptions) of those workflow processes from their specifications are unavoidable. Workflow systems need exception handling mechanisms to deal with those deviations. Exception-handling constructs, as well as the underlying mechanisms, should be sufficiently general to cover various aspects of exception handling in a uniform way. In addition, it helps to separate the modules for handling exceptional situations from the modules for the normal cases. We believe that designing an integrated human-computer process may provide better performance than moving toward an entirely automated process in exception handling.

∗http://lsdis.cs.uga.edu

Let’s take a look at a motivating workflow application to characterize the scope of exception handling (see Fig. 1). This infant transportation application involves the transportation of a very low birth weight infant of less than 750 grams, at or below 25–26 weeks gestation, from a rural hospital located within 100 miles of the Neonatal Intensive Care Unit (NICU) at the Medical College of Georgia (MCG). Such transportation usually takes up to 2.5 hours. In the ambulance there are two or three healthcare professionals who perform different roles. During the transport the ambulance personnel perform the standard procedures to obtain medical data for the infant. The first step of this application involves the task that allocates resources, such as healthcare professionals, equipment, and so on. The last step is to prepare for admission to the NICU.

Figure 1. High-level workflow for newborn infant transportation.

If exceptions are raised during the transport, corrective actions must take place. Deciding on the corrective procedures involves collaboration and coordination between the ambulance personnel and the consultants at MCG’s NICU. This application is very dynamic because changes to the infant’s health status, as indicated by vital signs such as the known risk factors, may lead to changes in the treatment plan. Such changes can occur rapidly. For example, a low weight infant can dehydrate in as few as ten minutes, while an adult would take at least several hours to reach the same severity. Such changes to the infant’s status are modeled as exceptions. Consider a “normal” treatment plan as shown at the top of Fig. 2. The occurrence of a heart murmur, which is a known risk factor related to cardiac disease, would be modeled as an exception (from the normally expected and correspondingly modeled process). One way of handling such an exception is to change the process so that the cardiovascular related task is performed earlier than originally planned (as shown at the bottom of Fig. 2).

This healthcare application raises several requirements for workflow systems to support:

• The processes in this application are very dynamic. To support coordination of such processes, a systematic way of workflow evolution, including dynamic structure modifications, should be worked out.

• There are potential collaboration activities in this transport process; e.g., the healthcare professionals may need advice from specialists in the NICU. The advice is context based. The results from the collaborations may affect the progress of the ongoing process coordination.

• The health professionals on board are not necessarily experienced in every aspect of intensive care. A case repository used in the case-based reasoning (CBR) [5] based exception-handling system stores valuable experience learned to help them make decisions.

• Exceptions are not avoidable in such an environment. An abnormal situation can demand special attention from healthcare professionals. Those abnormal situations should be resolved as soon as possible due to the nature of this application, newborn infant transport. Prior experience gained in handling similar abnormal situations can facilitate the exception resolution process. At the same time, the set of exception handlers that need to be checked can be limited by capturing more workflow execution context.

Figure 2. Treatment plan change in infant transportation task.

Approaches to address the first two requirements, although topics of our research, are beyond the scope of this paper. In this paper, we discuss the defeasible workflow approach to address the last two requirements. Defeasible workflows target workflow applications in dynamic and uncertain business environments by modeling organizational processes through justified ECA (JECA) rules developed to support context-dependent reasoning processes. This approach makes three major contributions to exception handling in workflow management systems.

1. By using the JECA rules to capture more contexts in workflow modeling, defeasible workflow enhances the exception handling capabilities of WfMSs through supporting context dependent reasoning in dealing with uncertainties. It can limit possible alternative exception handlers in dealing with exceptional situations.

2. It considers workflow evolution, which is an active research topic, a good candidate for exception handling; this is usually called adaptive exception handling. That is, possible modifications of workflows are considered as exception handlers, along with other candidates such as ignore, retry, workflow recovery, and so on. Thus, several modification primitives are given in defeasible workflow that ensure the modified workflows meet the established correctness criteria.

3. A case-based reasoning (CBR) [5] based exception handling mechanism with integrated human involvement is used in defeasible workflow to support exception-handling processes. This mechanism enhances the exception handling capabilities through collecting cases to capture experiences in handling exceptions, retrieving similar prior exception handling cases, and reusing the exception handling experiences captured in those cases in new situations.

The organization of this paper is as follows: We give a perspective on exceptions in Section 2 that provides directions on how exception aware systems should be built. We introduce defeasible workflow in Section 3, which is used to support healthcare applications like the infant transport. In Section 4, we introduce a CBR-based approach for exception handling. In Section 5, we discuss related work. Finally, Section 6 concludes this paper.

2. What is an Exception?

Exceptions in our view refer to facts or situations that are not modeled by the information systems, or to deviations between what we plan and what actually happens. Exceptions are raised to signal errors, faults, failures, and other deviations. They depend on what we want and what we can achieve. For example, in the infant transport application, the space provided by an ambulance is limited. So is the transport time. The exception handling mechanisms might be different from those used in NICUs because of the differences in time, space, and place. In most realistic situations and non-trivial systems, there are always conflicts of interest between what we want and what we can achieve. It is more acceptable to design a system that operates as well as it can; when there is an exception, it can be handled by the system. We call such a system an exception-aware system.

Exceptions provide great opportunities for systems to learn, correct themselves, and evolve. To build an exception aware system, it is beneficial to clarify the nature of exceptions to get guidelines for the system’s development. As shown in Fig. 3, known, detectable, and resolvable form three dimensions of the exception knowledge space. The known dimension is usually captured through exception specification. Supervision is one of the approaches to enlarge the exception knowledge space in the detectable dimension. To resolve exceptions, capable exception handlers should be available; these make up the resolvable dimension of the exception knowledge space. Any position in this exception knowledge space can be represented as an exception point. The exception knowledge of an exception aware system is the set of all those points.

Figure 3. Three-dimensional analyses of exceptions.

• Known: Every person’s knowledge is limited. The same is true for a system. The world is governed by rules that either we know or we are still investigating, and our knowledge continues to expand through learning. As this learning process continues and unknowns or uncertainties become known, decisions may be revised and other uncertainties may be considered. When a system cannot cope with a new situation, exceptions occur. We call these kinds of exceptions unknown exceptions, since they are beyond the system’s current knowledge. Otherwise, we consider them known. For example, during the transport of the newborn infant, an unknown exception may be caused by an abnormal situation that has not been met before. A basic solution to unknown exceptions is to build an open system that is able to learn and can be adapted to handle those exceptions.

• Detectable: Exceptions can be classified based on a system’s capabilities to detect an exception. If a system can notice the occurrence of an exception, then we call it a detectable exception; otherwise we consider it undetectable. An unknown exception is usually undetectable because there is no way for the system to know about it until the system is improved to achieve that capability. Sometimes an unknown exception might be detected as another known exception. Detection of an exception depends on the system’s capabilities. For example, if there is no equipment on board to measure certain situations, e.g., neurologic checking of the newborn infant, exceptions related to neurologic situations might not be detected. Moreover, if there is no mapping from the errors occurring in the measuring equipment, for which an equipment related exception should be raised, to a workflow system’s exception, that equipment exception will not be detected either. If the system can notice that there is an exception, it may be possible for the system to derive capable exception handlers to handle such exceptional situations.

• Resolvable: Undetectable exceptions are not resolvable at the time of occurrence, for their occurrences are unknown to the system. Also, there are certain known exceptions that may be ignored during system modeling time for specific reasons, such as the frequency of their occurrence being so low that the effect caused by the exceptions on the system can be ignored. When such exceptions actually occur, the system cannot handle them (e.g., the Y2K bug). For example, on board there are necessary medicines for commonly occurring health conditions of newborn infants. When a medicine is not on board but is needed for a situation that rarely occurs, then the corresponding exception to the process is not resolvable at that time. Based on the system’s handling capability, exceptions can be categorized into two categories: resolvable and irresolvable. When exceptions occur and the system can derive a solution to resolve the deviations, such exceptions are called resolvable exceptions. When exceptions occur but the system cannot derive a solution to resolve the deviation and meet the requirement, they are called irresolvable exceptions.
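The three dimensions above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the `ExceptionPoint` class, its boolean fields, and the `classify` function are hypothetical names introduced here, and treating each dimension as a simple yes/no flag is a simplification of the exception knowledge space described in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExceptionPoint:
    """Hypothetical record for one position in the exception knowledge space."""
    name: str
    known: bool       # covered by an exception specification
    detectable: bool  # the system can notice its occurrence
    resolvable: bool  # a capable handler is available


def classify(point: ExceptionPoint) -> str:
    """Label a point along the known / detectable / resolvable dimensions."""
    known = "known" if point.known else "unknown"
    if not point.detectable:
        # Undetectable exceptions are not resolvable at the time of occurrence.
        return f"{known}, undetectable, irresolvable at occurrence"
    outcome = "resolvable" if point.resolvable else "irresolvable"
    return f"{known}, detectable, {outcome}"


# Examples loosely based on the infant transport scenario.
murmur = ExceptionPoint("heart murmur", known=True, detectable=True, resolvable=True)
no_medicine = ExceptionPoint("rare medicine missing", known=True, detectable=True, resolvable=False)
neuro = ExceptionPoint("neurologic problem, no equipment", known=True, detectable=False, resolvable=True)

for p in (murmur, no_medicine, neuro):
    print(p.name, "->", classify(p))
```

Note that the third example is labeled irresolvable at occurrence even though a handler exists, which mirrors the point that detection comes before resolution.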

Based on the above perspective, exception aware systems should be built to have adequate initial exception knowledge represented as exception points. To deal with exceptional situations, a system actually finds an exception point in the exception knowledge space through propagation and masking. Propagation allows a system to propagate an exception to a more appropriate system component to find an appropriate exception point. Masking usually means that when there are several exception points, the one closest to where the exception is detected is the best candidate. Defeasible workflow is proposed in Section 3 to support a systematic way of propagation and masking.
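As a rough illustration of propagation and masking, the following Python sketch walks from the component where an exception is detected toward its enclosing components and returns the first handler it finds. The `Component` class, the `find_handler` function, and the handler labels are hypothetical; the paper does not prescribe this structure, and a parent chain is only one way to model "closeness" to the detection point.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Component:
    """Hypothetical system component with locally registered handlers."""
    name: str
    parent: Optional["Component"] = None
    handlers: dict = field(default_factory=dict)  # exception name -> handler label


def find_handler(origin: Component, exception: str):
    """Propagate outward from `origin`; the nearest handler masks outer ones."""
    node = origin
    while node is not None:
        if exception in node.handlers:
            return node.name, node.handlers[exception]
        node = node.parent  # propagation to a more appropriate component
    return None  # no exception point found anywhere on the path


system = Component("wfms", handlers={
    "equipment_failure": "notify NICU",
    "dehydration": "escalate to consultant",
})
workflow = Component("transport_workflow", parent=system,
                     handlers={"dehydration": "adjust treatment plan"})
task = Component("monitor_vitals", parent=workflow)

print(find_handler(task, "dehydration"))        # handled at the workflow level
print(find_handler(task, "equipment_failure"))  # propagated up to the system
print(find_handler(task, "unseen_condition"))   # nothing found
```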

3. Defeasible Workflow

Organizational processes are often dynamic. They evolve over time and often involve uncertainty. To adapt to its environment, a workflow should be flexible enough that necessary modifications to its specifications and instances are allowed. These modifications need to be complemented with execution support or run-time solutions such as dynamic scheduling, dynamic resource binding, runtime workflow specification, and infrastructure reconfiguration. For example, uncertainties in clinical decision-making processes, such as incomplete data in a report, should be eliminated as early as possible, preferably at the design stage. However, if this is not possible, exceptions should be raised and handled at runtime. Clinical decision-making is a process by which alternative strategies of care are considered and selected. Those alternatives can potentially be limited if more useful contexts are available during clinical decision-making. Furthermore, the following basic facts about clinical decision processes [6] show that in very complicated situations, necessary contexts should be captured in order to make correct decisions efficiently:

• “Human and environmental influences are frequently complex, uncertain and difficult to control.” Necessary context can help analyze the complex situations.

• “Decisions should be clear-cut and error-free. The nursing and medical professions have expressed an interest in the precision and objectivity by which decisions are made, while simultaneously retaining an individualistic, holistic approach to patient management.” The context-dependent reasoning approach is one of the solutions to support such decision-making processes.

• “Standardization of management plans is usually accompanied by paying less attention to differences in patients’ needs.” This actually means such standardization should consider individual differences, which can only be achieved by adding exceptions to the standardization. Those exceptions are actually used to capture necessary context when the standardized plans are enforced.

Defeasible workflows target workflow applications in dynamic and uncertain business environments by capturing more workflow contexts and supporting context-dependent reasoning processes. They are characterized as follows:

• Organizational processes are modeled through JECA rules (see below).

• When no default consequences can be derived during the execution, or conflicts occur that cannot be resolved in the default evaluation, exceptions will be raised. WfMSs will try to derive capable handlers to handle the raised exceptions.

• If none of the derived handlers are suitable, or handlers cannot be derived, past experiences in handling exceptions will be sought and reused by retrieving and adapting similar exception handling cases stored in the case repository.

• Human interaction will be necessary if no acceptable solutions can be automatically derived. Solutions provided by a person, who is usually a specialist, will be abstracted and stored into the case repository.
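The escalation order described in this list (derive a handler, then reuse a stored case, then ask a person and store the answer) can be sketched in Python as follows. This is a minimal sketch under stated assumptions: `resolve`, `derive`, `ask_specialist`, and the dictionary used as a case repository are hypothetical stand-ins, and the exact-match lookup replaces the similarity-based retrieval and adaptation that a CBR system would perform.

```python
from typing import Callable, Optional, Tuple


def resolve(
    exception: str,
    derive: Callable[[str], Optional[str]],
    case_repository: dict,
    ask_specialist: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (handler, source), where source is 'derived', 'case', or 'human'."""
    # 1. Try to derive a capable handler automatically.
    handler = derive(exception)
    if handler is not None:
        return handler, "derived"

    # 2. Fall back to past experience (exact match as a stand-in for
    #    retrieving and adapting a similar case).
    handler = case_repository.get(exception)
    if handler is not None:
        return handler, "case"

    # 3. Ask a person, then store the solution for future reuse.
    handler = ask_specialist(exception)
    case_repository[exception] = handler
    return handler, "human"


repository = {}
no_rule = lambda exc: None
specialist = lambda exc: "consult NICU specialist"

first = resolve("rare_condition", no_rule, repository, specialist)
second = resolve("rare_condition", no_rule, repository, specialist)
print(first)   # solved by the specialist and stored
print(second)  # solved from the stored case
```

The second call illustrates why human solutions are stored: the same exception no longer requires human interaction.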

3.1. JECA Rules

We have extended the well known ECA rule specification to Justified Event-Condition-Action (JECA) rules to model business logic, based on work in context-dependent reasoning [7]. There are several workflow prototypes (e.g., [8, 9]) that have adopted ECA rules as modeling tools. However, the contexts that can be captured by ECA rules are limited. The C in an ECA rule, which is used to capture the rule evaluation context, is a condition that should be satisfied so that the action specified in that ECA rule can be executed. That is, an ECA rule, once triggered, can only be denied if its condition cannot be satisfied. This makes ECA rules incapable of modeling workflows in uncertain business environments. In JECA rules, justification (J) provides a reasoning context for the evaluation of ECA rules to support context dependent reasoning processes in dealing with uncertainties. Each JECA rule r (j, e, c, a) consists of four parts as follows:

• Justification (j): Justification forms the reasoning context in which the evaluation of the specific JECA rule is to be performed. Usually it is specified as a disqualifier, i.e., a JECA rule will be disqualified if its justification is evaluated true.

• Event (e): When the event occurs, related JECA rules will be evaluated. We say the JECA rules are triggered.

• Condition (c): Logic constraints to be satisfied so that the action in the rule can be taken if the rule is not disqualified.

• Action (a): Necessary actions to be taken if this rule is not disqualified, the events occur, and the conditions are met.

Consider a simple rule-processing algorithm given in Algorithm 3.1. This algorithm executes JECA rules by picking up triggered rules one by one. When there are no more triggered rules, execution will terminate.

while there are triggered JECA rules do:

    1. find a triggered JECA rule r
    2. evaluate r’s condition and justification
    3. if r’s condition is true, and justification is not true, then execute r’s action

Algorithm 3.1 A simple JECA rule execution algorithm
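Algorithm 3.1 can be sketched in Python as follows. This is an illustrative reading, not the authors’ implementation: the `JECARule` class, the `run` function, the event queue, and the `max_steps` guard are assumptions introduced here (the guard is added only because rule actions may trigger further rules, and termination is analyzed separately in Section 3.4).

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable


@dataclass
class JECARule:
    """Hypothetical in-memory form of a JECA rule r (j, e, c, a)."""
    name: str
    event: str
    justification: Callable[[dict], bool]  # disqualifier: True blocks the rule
    condition: Callable[[dict], bool]
    action: Callable[[dict], list]         # may update facts; returns new events


def run(rules, initial_events, facts, max_steps=1000):
    """Process triggered rules one by one until no triggering events remain."""
    queue = deque(initial_events)
    fired = []
    steps = 0
    while queue and steps < max_steps:  # max_steps is an assumed safety guard
        event = queue.popleft()
        steps += 1
        for rule in rules:
            if rule.event != event:
                continue  # rule is not triggered by this event
            if rule.condition(facts) and not rule.justification(facts):
                fired.append(rule.name)
                queue.extend(rule.action(facts))
    return fired


def assess(facts):
    facts["plan"] = "initial assessment for cardiac disease d"
    return []  # this action generates no further events


cardiac_rule = JECARule(
    name="cardiac_assessment",
    event="cardiac_exception",
    justification=lambda f: f.get("relative_has_disease", False),
    condition=lambda f: f.get("risk_factor") == "heart murmur",
    action=assess,
)

facts = {"risk_factor": "heart murmur"}
print(run([cardiac_rule], ["cardiac_exception"], facts), facts.get("plan"))
```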

Consider the following example of an exception handling rule dealing with an exception related to cardiac disease d. In this rule, the exception is related to a certain type of cardiac disease d, and the corrective action is to take an initial assessment for this type of cardiac disease d. The justification provides a reasoning context (as commonly followed in this medical specialty) that if there are one or more blood family members of opposite sex who had this type of cardiac disease d before, then this disease cannot be this type of cardiac disease d. The initial assessment can only be performed if the justification doesn’t disqualify this JECA rule.

Event: Cardiac disease d related exception event
Condition: Risk factor is heart murmur (related to the type Cardiac Disease d)
Action: Initial Assessment for Cardiac Disease d
Justification: Blood family member of opposite sex does have this type of Cardiac disease d

The above example, if modeled in ECA rules, can be:

Event: Cardiac disease d related exception event
Condition: Risk factor is heart murmur (related to the type Cardiac Disease d) and Blood family member of opposite sex doesn’t have this type of Cardiac disease d
Action: Initial Assessment for Cardiac Disease d.

When a cardiac disease d related exception event actually occurs, the necessary contexts should be available so that the condition in that ECA rule can be checked, i.e., the risk factor is heart murmur and a blood family member of opposite sex doesn’t have this type of cardiac disease. If the condition “Blood family member of opposite sex doesn’t have this type of cardiac disease d” cannot be checked true, then this ECA rule cannot be satisfied. Is it correct not to model the condition about the blood family member of opposite sex in the ECA rule? The ECA rule then becomes:

Event: Cardiac disease d related exception event
Condition: Risk factor is heart murmur (related to the type Cardiac Disease d)
Action: Initial Assessment for Cardiac Disease d.

However, the answer is no, because this ECA rule cannot model that the action should not be executed when the condition “Blood family member of opposite sex does have this type of cardiac disease d” is true. In the JECA approach, the condition “Blood family member of opposite sex does have this type of cardiac disease d” is a justification for the JECA rule. If the justification cannot be evaluated true, then the JECA rule will not be disqualified. So the action in that JECA rule can be executed if the risk factor of heart murmur is true.
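One way to read the difference between the two modeling styles is to look at what happens when the family history is simply not available. The Python sketch below is an interpretation under stated assumptions, not the paper’s formal semantics: the fact names `risk_factor` and `relative_has_disease` and the functions `eca_fires` and `jeca_fires` are hypothetical, and a missing dictionary key stands for "cannot be checked".

```python
def eca_fires(facts: dict) -> bool:
    """ECA reading: the negated clause is part of the condition and must be
    positively verified before the action may run."""
    murmur = facts.get("risk_factor") == "heart murmur"
    verified_no_relative = facts.get("relative_has_disease") is False
    return murmur and verified_no_relative


def jeca_fires(facts: dict) -> bool:
    """JECA reading: the clause is a justification (disqualifier); the rule is
    blocked only when the justification is evaluated true."""
    murmur = facts.get("risk_factor") == "heart murmur"
    disqualified = facts.get("relative_has_disease") is True
    return murmur and not disqualified


unknown_history = {"risk_factor": "heart murmur"}
negative_history = {"risk_factor": "heart murmur", "relative_has_disease": False}
positive_history = {"risk_factor": "heart murmur", "relative_has_disease": True}

for label, facts in [("unknown", unknown_history),
                     ("negative", negative_history),
                     ("positive", positive_history)]:
    print(label, "ECA:", eca_fires(facts), "JECA:", jeca_fires(facts))
```

Under this reading the two styles agree whenever the family history is known and differ only when it is unknown, which is the uncertain situation the justification is meant to cover.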

3.2. JECA Rules Evaluation

To apply the context-dependent reasoning processes [7], those JECA rules are transformed into a form that is usually used in context-dependent reasoning, called a default [7]. A default has three parts, prerequisite (P), justification (J), and consequence (C), specified in logic expressions. Those defaults are constructed from JECA rules with mappings from J to justifications, C to prerequisite, and A to consequence. Usually E in a JECA rule is associated with the justifications and prerequisite in the default that the JECA rule is mapped to. In order to apply the defaults, the prerequisite of the defaults should be proved. It is also required that the defaults be justified. This means the justifications should not deny the applicability of the default. Consequence, the third part of a default, forms a belief set through extension computation [7]. Please refer to [7] for detailed information about extension computation. Conflicts such as inconsistencies existing in the belief set should be resolved by a non-monotonic reasoning process in which reasoning results may become invalid if more facts become available or the reasoning context has changed. An algorithm for the execution of JECA rules through default evaluation is given in Algorithm 3.2.

while there are triggered defaults do:

    1. find related defaults
    2. evaluate defaults through context dependent reasoning, resolve any conflicts in the belief set
    3. actions in the belief set will be executed

Algorithm 3.2 JECA rule processing through default evaluation
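The mapping from a JECA rule to a default and a very naive evaluation step can be sketched in Python as follows. This is deliberately simplified and is not the extension computation of [7]: the `Default` class, the `blocker` field (a fact that, if believed, denies the default), and `apply_defaults` are hypothetical, facts are plain strings, and the sketch neither retracts earlier conclusions nor resolves conflicts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Default:
    """Hypothetical default built from a JECA rule."""
    prerequisite: str  # from C: must be in the belief set
    blocker: str       # from J: if believed, the default is not justified
    consequence: str   # from A: added to the belief set when applied


def apply_defaults(defaults, facts):
    """Repeatedly apply defaults whose prerequisite holds and which are not
    blocked, until the belief set stops growing."""
    beliefs = set(facts)
    changed = True
    while changed:
        changed = False
        for d in defaults:
            applicable = d.prerequisite in beliefs and d.blocker not in beliefs
            if applicable and d.consequence not in beliefs:
                beliefs.add(d.consequence)
                changed = True
    return beliefs


cardiac_default = Default(
    prerequisite="heart_murmur",
    blocker="relative_has_disease_d",
    consequence="initial_assessment_d",
)

print(apply_defaults([cardiac_default], {"heart_murmur"}))
print(apply_defaults([cardiac_default], {"heart_murmur", "relative_has_disease_d"}))
```

The loop terminates because the set of possible consequences is finite and beliefs are only ever added; a faithful non-monotonic reasoner would also have to revisit conclusions when new facts arrive.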

The defaults will be clustered into several sets called domains. Each default is associated with a local reasoning unit. If the local reasoning unit cannot make an acceptable decision, another reasoning unit at that domain that has access to the domain context will be consulted. A global reasoning unit is responsible for decision making in the global context. The local reasoning unit can also make a decision according to the partially available information without consulting another local unit or a unit at a higher level. This clustered reasoning architecture is well suited to workflow management because organizational processes modeled by workflows are clustered and/or layered in nature, so local decisions can often be reached. The criteria used in clustering can be either one or a combination of the following:

• Security: For security reasons, defaults must be clustered so that sensitive information can be exchanged in a controlled manner. In the infant transport example, certain medical records that may contain sensitive information, such as HIV or abnormal behavior of the infant’s parents, should be securely controlled.

• Organization: It is natural to cluster defaults into several domains to model the organizational structure of the organization. In the infant transport example, the defaults can be grouped according to the organizational roles those health professionals can play.

• Performance: To provide better performance, it is necessary to cluster the defaults into several domains in which defaults can be more efficiently found and evaluated.

3.3. Conflict Resolution

During JECA rule evaluation, dynamic resolution is used to select default values (resources, algorithms, routing, numerical values, etc.) and resolve conflicts. Inconsistent information as well as coarse rule granularity in JECA rules can lead to conflicts during default evaluation. The conflicts may lead to exceptions if they cannot be resolved in the default evaluation. Conflicts in default evaluation can be one or more of the following:

• Temporal conflict: Defaults formed at subsequent times result in conflicts. For example, the healthcare professionals make two decisions separately. The decisions are specified in defaults that are fed into the workflow system in the order the decisions are made. It is possible that the two decisions may result in conflicts. The first decision may involve certain kinds of drugs to be applied. The second decision may involve drugs that should not be applied after the drugs in the first decision are applied.

• Role conflict: Defaults formed or evaluated by different roles may result in conflicts. The healthcare professionals on board may have different opinions about the situation of the infant. During the assignment of healthcare professionals to the transport of the infant, there may exist role conflicts, such as agents not being available, or an agent who can play the caregiver role not being suitable to attend the infant.

• Semantic conflict: An example that might lead to semantic conflicts is alternative interpretation of the consequence of defaults. This is possible due to the nature of non-monotonic reasoning, which is context dependent. If the contexts are different, interpretations in different contexts may also be different.

To resolve those conflicts, the defaults are assigned priorities, or more generally, are ordered by partial ordering relations regulated in business policies (organizational, security, etc.). That is, some defaults may be preferable to others, and assuming they are applicable, they should be used first. To resolve any inconsistencies, policies are based on one of the following criteria:

• Special: Preference will be given to exceptions over normal cases. For example, whenever a local decision cannot be made in the default evaluation, an exception is always raised.

• Hierarchical: Consequences derived at higher positions in the hierarchy will be favored over those at lower positions. In the infant transport application, if the medical situation of the infant becomes serious, the decision made by the personnel at a higher position who can be responsible for the action taken will be preferred.

• Temporal: Consequences from the latest defaults will be favored over earlier ones. In the infant transport application, the personnel on board may consult the experts in MCG’s care unit (NICU). During the decision making process, the experts in the NICU might give one suggestion at one time, and later they might come up with another suggestion. The later one might be given a higher priority during the decision making process.

• Semantic: Interpretations that are more plausible will be preferred to less plausible ones. In the example application, if a symptom is discovered and it might be caused by several different reasons, the personnel on board may derive the one that they believe is the most reasonable.
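The four criteria can be illustrated by a small Python sketch that picks one of several conflicting consequences under a chosen policy. The `Consequence` class, its numeric fields, the `POLICIES` table, and `resolve_conflict` are hypothetical names introduced for illustration; reducing each criterion to a single sortable value is an assumption, whereas the text allows more general partial orderings regulated by business policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consequence:
    """Hypothetical candidate consequence with attributes used for ordering."""
    action: str
    is_exception: bool = False  # special: exceptions over normal cases
    rank: int = 0               # hierarchical: higher position wins
    time: int = 0               # temporal: later default wins
    plausibility: float = 0.0   # semantic: more plausible wins


# Each policy maps a candidate to the value it should be maximized on.
POLICIES = {
    "special": lambda c: c.is_exception,
    "hierarchical": lambda c: c.rank,
    "temporal": lambda c: c.time,
    "semantic": lambda c: c.plausibility,
}


def resolve_conflict(candidates, policy):
    """Return the preferred candidate under exactly one of the four criteria."""
    return max(candidates, key=POLICIES[policy])


onboard = Consequence("apply drug A", rank=1, time=2, plausibility=0.4)
nicu = Consequence("apply drug B", rank=2, time=1, plausibility=0.9)
alarm = Consequence("raise exception", is_exception=True)

print(resolve_conflict([onboard, nicu], "hierarchical").action)
print(resolve_conflict([onboard, nicu], "temporal").action)
print(resolve_conflict([onboard, nicu, alarm], "special").action)
```

Ties are broken by position in the candidate list here, which is an arbitrary choice; a real policy would need an explicit tie-breaking rule.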

3.4. Rule Graphs

In rule execution, there is a danger of non-termination. In this section, we give a sufficient condition for the termination of rule execution over a JECA rule set by extending the rule graphs in [10]. Furthermore, we adapt the results from rule analysis [11] for workflow modeling. Further details about rule analysis related proofs appear in [10, 11]. To ensure termination of the evaluation of J and C in JECA rule evaluation, we limit the logic expressions used in specifying the J and C components of JECA rules to be quantifier free. Furthermore, we assume that the facts that need to be checked in rule evaluations are finite.

Consider a JECA rule r (j, e, c, a). When event (e) occurs, r is triggered. In other words, event (e) triggers JECA rule r. Actions of JECA rules can generate events that can trigger other JECA rules, which may include themselves. The actions of those triggered JECA rules may further trigger more JECA rules. This series of JECA rule triggerings forms a triggering graph. Consider an arbitrary JECA rule set R. The triggering graph (TG) over R is a directed graph where each node corresponds to a rule r_i that belongs to R, and a directed
arc (r_i, r_k) means that the execution of rule r_i generates events that trigger rule r_k. Consider two JECA rules r_i (j, e, c, a) and r_j (j, e, c, a). When condition (c) of r_i is true, JECA rule r_i is activated. If action (a) of r_j can change the condition (c) of r_i from false to true, we say r_i is activated by r_j, or r_j activates r_i. When justification (j) of r_i is not true, JECA rule r_i is justified. If action (a) of r_j cannot change the justification (j) of r_i from false to true, we say r_i is justified by r_j, or r_j justifies r_i. Consider an arbitrary JECA rule set R. The activation graph (AG) over R is a directed graph where each node corresponds to a rule r_i that belongs to R, and a directed arc (r_i, r_k) means rule r_i activates rule r_k. The justification graph (JG) over R is a directed graph where each node corresponds to a rule r_i that belongs to R, and a directed arc (r_i, r_k) means rule r_i justifies rule r_k.

Consider a JECA rule set R, and TG, AG and JG over R. An irreducible rule set over R is a subset of R that includes only those JECA rules that have incoming arcs in all the directed graphs TG, AG and JG. This irreducible rule set is generated by iteratively discarding a rule that does not have an incoming arc in TG, or AG, or JG, and removing all its outgoing arcs. The iterations continue until all the rules have been removed, or until all the remaining rules have incoming arcs in all the directed graphs TG, AG, and JG. If the irreducible rule set over R is empty, then rule execution on R is guaranteed to terminate [11]. This is a sufficient condition for termination of rule execution over rule set R. To see this, assume that rule execution does not terminate although the irreducible rule set over R is empty. If rule execution does not terminate, then according to the rule execution algorithm in Algorithm 3.1 or 3.2 there are always triggered rules, and for some of those triggered rules the condition must be true and the justification must not be true. There must exist at least one triggering cycle, activation cycle and justification cycle involving the same rules, say r_1 and r_2, in the same direction. That is, r_1 triggers r_2, r_1 activates r_2, and r_1 justifies r_2. Thus, both r_1 and r_2 have incoming arcs in all directed graphs TG, AG, and JG. So the irreducible set over R is not empty. This contradicts the assumption.
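
The iterative discarding procedure can be sketched in a few lines of Python. This is a minimal illustration, assuming each graph is given as a set of (source, target) arcs over rule names; the function name and the sample rule sets are hypothetical.

```python
def irreducible_rule_set(rules, tg, ag, jg):
    """Return the irreducible subset of `rules`.

    tg, ag and jg are sets of (source, target) arcs for the triggering,
    activation and justification graphs.
    """
    remaining = set(rules)
    graphs = (tg, ag, jg)

    def has_incoming(rule, arcs):
        # Arcs leaving an already discarded rule no longer count.
        return any(dst == rule and src in remaining for src, dst in arcs)

    changed = True
    while changed:
        changed = False
        for rule in sorted(remaining):
            # Discard a rule lacking an incoming arc in TG, or AG, or JG.
            if not all(has_incoming(rule, arcs) for arcs in graphs):
                remaining.discard(rule)
                changed = True
    return remaining


chain = {("r1", "r2"), ("r2", "r3")}
cycle = {("r1", "r2"), ("r2", "r1")}
terminating = irreducible_rule_set({"r1", "r2", "r3"}, chain, chain, chain)
looping = irreducible_rule_set({"r1", "r2"}, cycle, cycle, cycle)
partial = irreducible_rule_set({"r1", "r2"}, cycle, cycle, set())
```

A chain of rules reduces to the empty set, so termination is guaranteed. A cycle present in all three graphs survives the reduction, while a cycle missing from any one graph (here JG) is discarded.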

3.5. Workflow Modeling

In our approach, a workflow system is considered as a reactive system that maintains an ongoing interaction with its environment. It is assumed that all variables that describe the properties of workflows are taken from a set of variables, called the workflow variable set or vocabulary. Instances of those variables form the workflow environment. The situation of the activities in workflows at a certain point of time is called a workflow state, which is specified through task states, data states, and the status of the workflow environment. Inter-state dependence constraints are enforced through workflow transitions that are specified in JECA rules. Thus, each workflow state S is associated with a JECA rule set R specifying workflow transitions. An initial workflow state is where a workflow starts. It has a special incoming transition, specified by a special JECA rule, called the initial rule. The initial rule only triggers those rules associated with initial workflow states. A final workflow state is where a workflow terminates. It is associated with a special JECA rule, called the final rule. The final rule cannot trigger any rules. A workflow is a series of workflow states linked by workflow transitions, starting from an initial workflow state and ending at a final workflow state.

Consider a JECA rule r and the workflow transition δ specified by r. If r is triggered, activated, and justified, r is enabled. δ is enabled if r is enabled. Given JECA rules r_i, r_k and r_j, if r_i enables r_k and r_k enables r_j, then r_i transitively enables r_j. Given JECA rules r_1, r_2, ..., r_n (n >= 3), r_1 transitively enables r_n if r_i enables r_(i+1) for 1 <= i < n. A workflow execution sequence consists of a series of workflow states linked by enabled JECA rules. The workflow execution is said to be in a deadlock if the last workflow state is not final. A workflow execution sequence is an execution history of a workflow instance. Each workflow instance is associated with a workflow specification (also called workflow type).

Consider a workflow specification and its associated JECA rule set R. A workflow graph (WG) is a directed graph, where
• Each node corresponds to a rule r_i that belongs to R.
• A directed arc (r_i, r_k) means rule r_i enables rule r_k.
• If r_i in a directed arc (r_i, r_k) does not have incoming arcs, r_i is the initial rule.
• If r_k in a directed arc (r_i, r_k) does not have outgoing arcs, r_k is the final rule.
• The initial rule transitively enables the final rule.
• The irreducible rule set obtained from R is empty.

A workflow specification is correct if all possible workflow execution sequences start from an initial workflow state and end at a final state. Consider a workflow specification and its associated JECA rule set R. If a workflow graph exists, the initial rule transitively enables the final rule. All workflow execution
sequences can only start from the initial workflow state because the initial rule is the only node in the workflow graph that does not have any incoming arcs but has outgoing arcs. Similarly, all workflow execution sequences can only end at the final workflow state because the final rule is the only node in the workflow graph that does not have any outgoing arcs but has incoming arcs. Since the irreducible rule set obtained from R is empty, rule execution is guaranteed to terminate. Thus, all workflow execution sequences can only start from an initial workflow state, and will end at a final state. So if a workflow graph can be obtained from its associated workflow specification, the specification is correct.
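
The structural part of this argument can be checked mechanically. The sketch below assumes the enabling relation is available as a set of (r_i, r_k) arcs; it checks for a unique initial rule, a unique final rule, and transitive enabling between them. The emptiness of the irreducible rule set is a separate condition that is not checked here, and all function and rule names are hypothetical.

```python
def transitively_enables(enables, start, goal):
    """True if `start` reaches `goal` by following enabling arcs."""
    seen, frontier = set(), [start]
    while frontier:
        rule = frontier.pop()
        for src, dst in enables:
            if src == rule and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return goal in seen


def initial_and_final_rules(rules, enables):
    """Rules with only outgoing arcs, and rules with only incoming arcs."""
    sources = {src for src, _ in enables}
    targets = {dst for _, dst in enables}
    initial = {r for r in rules if r in sources and r not in targets}
    final = {r for r in rules if r in targets and r not in sources}
    return initial, final


def has_workflow_graph(rules, enables):
    """Check the structural workflow graph conditions (not termination)."""
    initial, final = initial_and_final_rules(rules, enables)
    if len(initial) != 1 or len(final) != 1:
        return False
    return transitively_enables(enables, next(iter(initial)), next(iter(final)))


rules = {"init", "assign", "transport", "final"}
enables = {("init", "assign"), ("assign", "transport"), ("transport", "final")}
well_formed = has_workflow_graph(rules, enables)
broken = has_workflow_graph(rules, enables - {("assign", "transport")})
```

Removing the arc from assign to transport leaves two rules without incoming arcs, so no workflow graph exists for the second specification.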

3.6. Exception Modeling

To handle exceptions, JECA rules are used to model the exception handling process. The event part captures exceptional events. The condition and justification parts identify the context. The action part specifies the operations to handle the exceptions. Possible operations that can be specified in the action part are as follows.

• Masking. It includes corrective actions such as ignore, retry, workflow recovery operations, modifications of workflows, etc.
• Propagation. It sends out warnings and/or propagates the exceptions.
• Recording. It records the exception situations for future reference.

Figure 4. Three layered exception model.

We characterize the types of exceptions into three broad categories (see Fig. 4) based on our earlier work in error handling [12]. The infrastructure exceptions and application exceptions are mapped to workflow exceptions that include system exceptions and user exceptions. That is, an exception that occurs at the infrastructure layer will trigger a mapped exception at the workflow layer. The same is true for an application exception. This mapping scheme is adopted due to the heterogeneous nature of exceptions in applications and infrastructure.
• Infrastructure exceptions: these exceptions result from the malfunctioning of the underlying infrastructure that supports the WfMSs. These exceptions include hardware errors such as computer system crashes, errors resulting from network partitioning problems, errors resulting from interaction with the Web, errors returned due to failures within the Object Request Broker (ORB) environment, etc. In the newborn infant transport workflow, an infrastructure exception can be caused by an error in the telecommunication media between the ambulance and NICU.
• Workflow exceptions: All exceptions form a hierarchy rooted at class Exception. Two basic groups of exceptions include system exceptions and user-defined exceptions. A variety of system exceptions identify a number of possible system-related
deviations in the services provided by the workflow system. Examples of this include a crash of the workflow enactment component that could lead to errors in enforcing inter-task dependencies, or errors in recovering a failed workflow component after a crash, etc. User-defined exceptions are specified by the workflow designer and identify possible application-dependent deviations in task realizations. Specific user-defined exceptions depend on the specific workflow application. An exception handler is a description of action(s) that workflow runtime component(s), or possibly a workflow application, is going to perform in order to respond to the exception.
• Application exceptions: these exceptions are closely tied to each of the tasks, or groups of tasks, within the workflow. Due to their dependency on application-level semantics, these exceptions are also termed logical exceptions. For example, one such exception could involve database login errors that might be returned to a workflow task that tries to execute a transaction without having permission to do so at a particular DBMS. A runtime exception within a task that is caused by memory leaks is another example of an application exception. In the newborn infant transport workflow, an application exception can be caused by an error in the assignment of health professionals, such as agents not being found for the roles required.
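
The three-layer model can be illustrated with a small class hierarchy. In this sketch every class name is hypothetical, and the particular pairing (infrastructure errors to system exceptions, application errors to user exceptions) is an assumption made for the example; the text only states that both kinds are mapped to workflow exceptions.

```python
class WorkflowException(Exception):
    """Root of the workflow-layer exceptions."""


class SystemException(WorkflowException):
    """System-related deviation in the services of the workflow system."""


class UserException(WorkflowException):
    """Application-dependent deviation defined by the workflow designer."""


class InfrastructureError(Exception):
    """Raised below the WfMS, e.g. a telecommunication failure."""


class ApplicationError(Exception):
    """Raised inside a task, e.g. a database login failure."""


# Assumed pairing between the outer layers and the workflow layer.
LAYER_MAPPING = {
    InfrastructureError: SystemException,
    ApplicationError: UserException,
}


def to_workflow_exception(exc):
    """Map an infrastructure or application error to the workflow layer."""
    if isinstance(exc, WorkflowException):
        return exc
    for source_type, target_type in LAYER_MAPPING.items():
        if isinstance(exc, source_type):
            mapped = target_type(str(exc))
            mapped.__cause__ = exc   # keep the original for the handling history
            return mapped
    raise TypeError(f"no workflow-layer mapping for {type(exc).__name__}")


original = InfrastructureError("link between ambulance and NICU lost")
mapped = to_workflow_exception(original)
```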

3.7. Exception Detection

One of the important tasks in exception handling is to detect exceptional situations. The objective of exception detection is to capture deviations in the systems. Exception detection can be achieved by supervising the workflow system components' external inputs and outputs, and comparing their behavior with the specified behavior of the system. In this approach, the supervisor predicts a single likely behavior of the system and, if the observed behavior does not match the prediction, rolls back and creates a new prediction of the valid behavior. An exceptional situation is identified when the supervisor has explored all valid behaviors without matching the observed behavior, as a result of which an exception is raised. In the following, we discuss several supervision techniques.
• Under-specified specification supervision: Under-specified specification often results in a non-deterministic system specification. Non-determinism in specification is advantageous because the specification writers can avoid stating irrelevant behavior as mandatory, freeing the software designer to choose a behavioral alternative that would yield a more desirable implementation. The major difficulty in supervising such systems is that the supervisor must account for all possible behaviors that are permissible under the non-determinism present in the specification.
• Hierarchical supervision: this type of supervision is very natural due to the hierarchical construction of WfMSs. The hierarchical approach differs from common one-layer approaches in that supervision is split into several sub-problems, such as tracking the external behavior of the target system and detailed internal behavior checking.
• Time supervision: In time-critical software, a correct output that is not produced within a specified response time interval may also constitute an exception. Automatic detection of time-related exceptions is achieved through tracking the state of a workflow as well as the elapsed times between specified request and response pairs. The system's time behavior is derived directly from the workflow application's requirement specifications. Time supervision allows exceptions in a workflow system to be detected in real time based on the specification of its external behavior. This feature is of particular benefit because a WfMS often needs to integrate legacy systems, software components purchased from other vendors, and tasks developed by third parties.
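
Time supervision, as described in the last item, amounts to tracking request and response pairs against specified limits. The following sketch assumes timestamps in seconds supplied by the caller; the class, the exception name, and the consult_nicu request are hypothetical.

```python
from dataclasses import dataclass, field


class ResponseTimeout(Exception):
    """A response arrived later than the specified response time."""


@dataclass
class TimeSupervisor:
    limits: dict                                  # request name -> allowed seconds
    pending: dict = field(default_factory=dict)   # request id -> (name, start time)

    def request(self, name, request_id, at):
        self.pending[request_id] = (name, at)

    def response(self, request_id, at):
        name, started = self.pending.pop(request_id)
        elapsed = at - started
        if elapsed > self.limits[name]:
            raise ResponseTimeout(f"{name} answered after {elapsed}s")

    def overdue(self, now):
        """Request ids still waiting past their response-time limit."""
        return sorted(
            rid for rid, (name, started) in self.pending.items()
            if now - started > self.limits[name]
        )


sup = TimeSupervisor({"consult_nicu": 60})
sup.request("consult_nicu", "q1", at=0)
sup.response("q1", at=45)            # within the limit, no exception
sup.request("consult_nicu", "q2", at=100)
late = sup.overdue(now=200)          # q2 has waited 100s against a 60s limit
```

Because only the externally visible request and response events are observed, the same supervisor can watch a legacy or third-party task without access to its internals.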

3.8. Exception Handling

We have conducted several workflow projects, such as modeling and development for real-world workflow applications (e.g., the statewide immunization tracking application [13]) and using flexible transactions in multi-system telecommunication applications [14]. In the following, we formulate some of the essential requirements for exception handling based on our prior experiences and our understanding of the current state of workflow technology and its real-world or realistic applications [2, 15].

• Support specification for exception handling: We allow an application developer to specify various types of user-defined exceptions to catch anticipated failures or deviations.
• Support task-specific exception handling: Tasks are viewed as black boxes for a WfMS. The workflow enactment service does not have to be concerned with the realization of the tasks. The workflow tasks, in general, are more complex than database transactions, and represent a logical activity in the overall organizational workflow. It is therefore critical to be able to detect the exceptions returned by arbitrary tasks.
• Localize exceptions: The first attempt to handle exceptions is to isolate and mask them. This can prevent the problem from affecting other parts of the system.
• Support exception handling by forward recovery: we give a general discussion about exception handling through workflow recovery.
• Support human-assisted exception handling: It is impossible to guarantee the success of an exception handling mechanism due to the non-deterministic nature of exceptions. Therefore, the involvement of a human is critical for resolving erroneous conditions that cannot be dealt with by the WfMS automatically.

We adopt an exception handling mechanism that involves exception masking and propagation (see Fig. 5). As shown in Fig. 5, a task has four states: initial, execute, fail, and complete [16]. In the execute state, the task is actually being executed. When the execution finishes, the task enters either the fail state or the complete state. The task can throw many exceptions. An exception is masked if a local exception handler can handle it. Otherwise, the exception will be propagated. There are two types of exception propagation: structural exception propagation and knowledge based exception propagation. The structural exception propagation follows the control (or calling) structure of the workflow inter-task dependencies enforcement. The knowledge based exception propagation can directly propagate exceptions to knowledgeable workflow components such as a CBR based exception handling component and human agents.

Figure 5. Exception masking and propagation in defeasible workflow.
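
A minimal sketch of masking and propagation follows. It assumes that each component keeps a table of local handlers and a reference to its caller, and that the knowledge based component is tried only after structural propagation has been exhausted; that ordering, like the Component class and handler names, is an assumption of the example rather than something the mechanism above fixes.

```python
class Component:
    def __init__(self, name, parent=None, handlers=None):
        self.name = name
        self.parent = parent               # caller in the control structure
        self.handlers = handlers or {}     # exception type -> local handler


def handle(exc, component, fallback=None):
    """Return (who handled it, handler result) for an exception."""
    node = component
    while node is not None:
        handler = node.handlers.get(type(exc))
        if handler is not None:
            return node.name, handler(exc)   # masked at this level
        node = node.parent                    # structural propagation
    if fallback is not None:
        name, handler = fallback              # knowledge based propagation
        return name, handler(exc)
    raise exc


workflow = Component("workflow", handlers={ValueError: lambda e: "suspend"})
task = Component("assign_task", parent=workflow,
                 handlers={TimeoutError: lambda e: "retry"})

masked = handle(TimeoutError(), task)
propagated = handle(ValueError(), task)
escalated = handle(KeyError("role"), task,
                   fallback=("cbr", lambda e: "retrieve similar cases"))
```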

Exception masking prevents an exception in one part of the workflow systems from affecting other parts of the systems. In masking exceptions, our intention is to capture an exception as close to its point of occurrence as possible, so that a suitable handler at the same level as that point handles the exception first. Exception masking includes corrective actions such as ignore, warnings, retry, procedural actions, workflow recovery, workflow modifications, suspend/stop, and so on. In the following, we discuss two categories of exception-masking techniques (hierarchical and grouping) that are used to organize the corrective actions.

• Hierarchical exception masking: If a component depends on lower-level components to correctly provide its service, then an exception of a certain type at a lower level of abstraction can result in an exception
of a different type at a higher level of abstraction. Exception propagation among components situated at different abstraction levels of the hierarchy can be a complex phenomenon. For example, if a WfMS needs services provided by infrastructure, such as an ORB or the Internet, with arbitrary exception semantics, the WfMS will likely have arbitrary exception semantics, unless it has some means to check the correctness of the results provided by the infrastructure. In such hierarchical systems, exception handling provides a convenient way to propagate information about exception detection across abstraction levels and to mask low-level exceptions from higher-level components.
• Group exception masking: One way to ensure that a service remains available to clients despite server failures is to implement the service by a group of redundant, physically independent servers, so that if some of these fail, the remaining ones can provide the service. Group masking can mask the exceptions of an individual member whenever the group responds as specified to service requestors despite the individual's exceptional situation. With group masking, individual member exceptions are entirely hidden from service requestors by the group management mechanisms. Group masking uses redundant information to deliver the correct service.

3.8.1. Exception Handling Through Workflow Recovery. Like [17], we believe that ignore, retry, and backward recovery are good exception handling candidates that can be used in different situations. For example, retry is not applicable if the workflow applications cannot be retried. Serious exceptional situations cannot be ignored. Furthermore, in real-world workflow applications we have seen that most tasks are non-transactional, and often involve long-lived tasks, thereby not supporting the strict ACID properties of transactions [18]. Hence, although desirable, it might not be possible to recover failed non-transactional tasks using backward recovery. The use of backward recovery for most human-oriented tasks is not a viable solution since most erroneous actions, once performed, cannot be undone. It might be possible for the human to rectify all the inconsistencies caused by the error and redo the actions without affecting other tasks or data objects within the workflow; however, it would be rare to expect this behavior for most real-world human tasks. Backward recovery is useful for purely data-oriented tasks that are transactional tasks or networks. In addition, we also need a forward recovery based exception handling mechanism that would semantically undo, or compensate, a partially failed task.

3.8.2. Exception Handling Through Dynamic Workflow Changes. Due to foreseen and unforeseen situations, i.e., exceptions, pre-defined workflows may need to be modified to adapt to such exceptional events. However, the resulting workflows must eventually be correct, i.e., the workflow graph should exist after corrective actions are performed. Possible modifications to workflows can be at the instance level or at the model level. The ways of modification that will result in correct workflows are as follows.
• Modifying JECA rules: Any part of a JECA rule can be modified. To ensure a workflow graph exists in the resulting workflow, the enabling relationships should be kept.
• Inserting JECA rules: There are two ways of insertion. One way is to insert the rules parallel to an existing rule. If the inserted rules have the same enabling relationship as the existing rule, the resulting workflow remains correct. The other way is to insert a new rule r between two existing rules r_1 and r_2 where r_1 enables r_2. Then r_1 and r_2 must be modified such that r_1 enables r, and r enables r_2.
• Removing JECA rules: If the removal of a rule does not modify the enabling relationship, the resulting workflow remains correct. Otherwise, if rule r is removed from between two rules r_1 and r_2 where r_1 enables r and r enables r_2, rules r_1 and r_2 need to be modified such that r_1 enables r_2.
• Commutative change: For example, if the enabling relationship between rules r_1 and r_2 is commutative, "r_1 enables r_2" can be changed to "r_2 enables r_1".
• Combination of the above.
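
Insertion between two rules and removal of a rule can be sketched as operations on the enabling relation. The example below assumes the relation is stored as a set of (r_1, r_2) arcs and ignores the rule bodies themselves; the function names are hypothetical.

```python
def insert_between(enables, r1, new_rule, r2):
    """Replace the arc r1 -> r2 with r1 -> new_rule -> r2."""
    if (r1, r2) not in enables:
        raise ValueError(f"{r1} does not enable {r2}")
    return (enables - {(r1, r2)}) | {(r1, new_rule), (new_rule, r2)}


def remove_rule(enables, rule):
    """Drop `rule` and let each of its predecessors enable each successor."""
    preds = {src for src, dst in enables if dst == rule} - {rule}
    succs = {dst for src, dst in enables if src == rule} - {rule}
    kept = {(s, d) for s, d in enables if rule not in (s, d)}
    return kept | {(p, s) for p in preds for s in succs}


original = {("r1", "r2")}
extended = insert_between(original, "r1", "r", "r2")
restored = remove_rule(extended, "r")
```

Removing the rule that was just inserted restores the original enabling relation, which is the invariant the two items above ask for.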

3.8.3. User Exception Support. Users can anticipate and provide solutions to deal with certain exceptional situations. To support user exception handling, we provide tools to allow workflow application designers to define their own exceptions, called user exceptions. In the exception design, it is necessary to provide mappings between exceptions and exception sources (see Fig. 6). Exception sources include errors, faults, failures, and other exceptional situations. Like [19], we view constraints as exception sources too. That is, when constraints are broken, exceptions may be raised. By providing such a mapping mechanism, workflow application designers can decide which exceptions should be raised when those abnormal situations occur.

Figure 6. Mappings between exceptions, exception sources, and exception handlers.

Also, workflow application designers can provide mappings from exceptions to exception handlers so that when such exceptions are raised, systems can know which exception handlers are good candidates. The candidates for exception handlers are ignoring, warning, retry, suspend/stop/resume, workflow recovery operations (e.g., backward recovery, forward recovery, alternative tasks, etc.), workflow modifications and evolutions, and other exception masking and propagation operations. Those mappings will be specified in JECA rules.

The following JECA rule,

    Event: taskAssignException
    Condition: agent not available and task priority is urgent
    Action: task-reassignment
    Justification: newborn infant transport to NICU,

shows a mapping example of a task assignment exception in the infant transport application. In the resource allocation step, if there are no human agents available who can play the health professional role, then a taskAssignException exception will be raised. Since the infant should be transported immediately, another qualified healthcare professional for the healthcare professional role must be found.
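
As a data structure, such a rule might be represented as follows. This is a sketch under stated assumptions: the JecaRule class, the context keys agent_available and priority, and the lookup function are hypothetical, and the justification is carried only as a descriptive label because its defeasible evaluation is outside the scope of the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class JecaRule:
    justification: str                    # context label, not evaluated here
    event: str
    condition: Callable[[dict], bool]
    action: str


task_assign_rule = JecaRule(
    justification="newborn infant transport to NICU",
    event="taskAssignException",
    condition=lambda ctx: not ctx["agent_available"] and ctx["priority"] == "urgent",
    action="task-reassignment",
)


def candidate_actions(rules, event, context):
    """Actions of the rules triggered by `event` whose condition holds."""
    return [r.action for r in rules if r.event == event and r.condition(context)]


urgent = candidate_actions(
    [task_assign_rule], "taskAssignException",
    {"agent_available": False, "priority": "urgent"},
)
routine = candidate_actions(
    [task_assign_rule], "taskAssignException",
    {"agent_available": True, "priority": "urgent"},
)
```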

Figure 7 shows an example of a user exception-handling scenario that involves the taskAssignException exception. This exception will be declared as a taskAssignException class. By using the mappings designed, workflow systems will register the exception detector and handler for that taskAssignException exception. When the task assignment exceptional situation as defined by users is detected, a taskAssignException is raised. The taskAssignException exception object will be forwarded to the registered exception handler via the exception adapter. The exception adapter in this scenario acts as a bridge that de-couples exception detectors and handlers.
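
The decoupling role of the exception adapter can be sketched as a small registry. The class and function names below are hypothetical, and the lookup along the class hierarchy is a design choice of the example, not a documented part of the scenario in Fig. 7.

```python
class TaskAssignException(Exception):
    """User exception: no agent can play the required role."""


class ExceptionAdapter:
    """Bridge between detectors and handlers; neither knows the other."""

    def __init__(self):
        self._handlers = {}

    def register(self, exception_type, handler):
        self._handlers[exception_type] = handler

    def forward(self, exc):
        # Walk the class hierarchy so a handler for a base class also applies.
        for exception_type in type(exc).__mro__:
            handler = self._handlers.get(exception_type)
            if handler is not None:
                return handler(exc)
        raise exc


def detect_assignment(role, available_agents, adapter):
    """Detector: forwards a TaskAssignException when nobody can play `role`."""
    if not available_agents:
        return adapter.forward(TaskAssignException(role))
    return available_agents[0]


adapter = ExceptionAdapter()
adapter.register(TaskAssignException, lambda exc: f"reassign {exc.args[0]}")
handled = detect_assignment("caregiver", [], adapter)
assigned = detect_assignment("caregiver", ["nurse_a"], adapter)
```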

3.8.4. Knowledge Based Exception Handling. Exceptional situations are usually very complicated. A knowledge-based approach is a good candidate for dealing with such complicated situations. It can help workflow designers and participants better manage the exceptions that can occur during the enactment of a workflow by capturing and managing the knowledge about what types of exceptions can occur in workflows, how these exceptions can be detected, and how they can be resolved. Exception knowledge bases should be generic and reusable. This approach should allow users to navigate through that knowledge base to find support for their decisions on how to handle a certain exception. An explanatory module could also be incorporated into the knowledge systems to explain the exceptional situation and solutions. We propose a CBR-based exception handling mechanism that addresses these requirements.

Figure 7. User exception handling scenario.

4. CBR-Based Exception Handling

In this section, we focus on dealing with uncertainties, experience acquisition, and supporting human involvement in exception handling through a CBR-based exception handling mechanism. In the infant transport application, when an exceptional situation is discovered, the personnel on board should make decisions, assisted by the process automation mechanism, as to which tasks should be taken. A CBR-based exception handling mechanism with integrated human-computer processes is used to facilitate such decision making in handling exceptions. When the case-based reasoning component is notified about the exceptional situation of the infant, similar cases stored in the case repository will be retrieved and analyzed. The result from the analysis will be reused to facilitate the decision making process in exception handling.

4.1. Case Resolution

The CBR based approach models how reuse of stored experiences contributes to expertise [5]. In this approach, new problems are solved by retrieving stored information about previous problem solving steps and adapting it to suggest solutions to the new problems. The results are then added to the case repository for future use. A case usually consists of three parts: problem, solution, and effect, as illustrated in Fig. 8. Problems will be obtained through user or system input. There are solution candidates for those problems. The selection of the output solution will be based on the analysis of the effects of those solutions.

During the workflow execution, if an exception is propagated to the CBR based exception-handling component, the case-based reasoning process is used to derive an acceptable exception handler. Human involvement is needed when acceptable exception handlers cannot be automatically obtained. Solutions given by a person will also be incorporated into the case repository. Effects of the exception handler candidates on the workflow system and applications will be evaluated. Thus, when the exception is handled, necessary modifications to the workflow systems or applications may be made. There are four steps in exception handling by an exception handler that involves case-based reasoning:
• Retrieval of the most similar cases to the identified exceptional situation,
• Analysis of the solution from the most similar cases,
• Adaptation of the most similar cases, and
• Updating of the system by adding the verified solution to the case repository.

Figure 8. Exception handling via CBR based mechanism.

4.2. Case-Based Reasoning Architecture

As shown in Fig. 9, the case-based reasoning architecture consists of the following components.
• Abstractor: abstracts the exceptional situation based on ontology.
• Retrieval component: retrieves related cases from the case repository.
• Analyzer: analyzes the solutions in the cases, and tries to derive a possible solution.
• Retainer: writes the new case back into the repository and outputs the solutions.
• Case repository: the place where cases are stored.
• Editor: provides a GUI to modify cases.
• Browser: provides a GUI to allow users to browse the case repository.
• Explainer: explains the cases to users through a GUI. This component helps users understand cases, such as what the cases are, why the solutions are effective, and their effects.
• Ontology & Concept component: manages the concepts in multiple domains.
• Similarity Measure component: provides similarity measure algorithms during case retrieval. Cases are often heterogeneous, and different cases need different similarity measure algorithms. This is achieved by separating this component from the retrieval component.

When exceptions occur, the abstractor identifies the exceptional situation via input (system input or user input). It extracts information from the input. The output of the abstractor is fed to the retrieval component, which retrieves similar cases for analysis by the case analyzer. An acceptable solution may be derived at this stage. That is, the solution of a similar case can be applied. The newly derived case will be forwarded to the retainer. The retainer may write back the new case, and it constructs the solutions and supplies them to systems or users. In some situations no similar cases can be found, or the analyzer is not confident about the similarity measurement. That is, the value of the similarity measure is too low. In such situations, the case editor will allow users to interact in deriving acceptable solutions. During the interaction, the explainer helps users to understand the cases.
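
The flow through the components can be sketched as a single function. Several simplifications are assumed here: problems are abstracted to sets of index terms, Jaccard overlap stands in for the similarity measure, the confidence threshold is a free parameter, and the retainer always writes the new case back. All names and sample terms are hypothetical.

```python
def jaccard(a, b):
    """Overlap of two term sets; a stand-in for the similarity measure."""
    return len(a & b) / len(a | b) if a | b else 1.0


def resolve(problem, repository, similarity, threshold, ask_user):
    """Return (solution, source) and retain the new case."""
    # Retrieval: the stored case most similar to the abstracted problem.
    best = max(repository,
               key=lambda case: similarity(problem, case["problem"]),
               default=None)
    if best is not None and similarity(problem, best["problem"]) >= threshold:
        solution, source = best["solution"], "case"        # reuse
    else:
        solution, source = ask_user(problem, best), "user"  # human involvement
    # Retainer: write the new case back for future use.
    repository.append({"problem": problem, "solution": solution})
    return solution, source


repository = [
    {"problem": {"agent_unavailable", "urgent"}, "solution": "task-reassignment"},
]
ask = lambda problem, nearest: "consult NICU"

first = resolve({"agent_unavailable", "urgent", "night"},
                repository, jaccard, 0.5, ask)
second = resolve({"network_down"}, repository, jaccard, 0.5, ask)
```

The first problem shares two of three terms with the stored case and reuses its solution. The second has no overlap with any case, so the user is asked and the answer becomes a new case.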

4.3. Ontology-Based Case Management

Since this case-based reasoning system needs to be applied in various domains (such as healthcare, telecommunication, finance, etc.), we use ontology to describe the concepts used in various domains. Figure 10 shows an example of a high level ontology for disorders in medical intensive care that we developed for the healthcare workflow application. In this CBR based exception-handling approach, we usually need to acquire, represent, index and adapt existing cases to effectively apply the case-based reasoning process. In this paper, we acquire new cases by learning through exceptions. That is, we create new cases when exceptions occur. We adopt an ontology-based case management:

Figure 9. Case-based reasoning architecture in defeasible workflow.

Figure 10. A high level ontology for disorders in medical intensive care.

• Abstraction: we extract information from exceptional situations and represent them as cases via ontology.
• Retrieval: we use ontology to organize cases. The retrieval of similar cases is based on ontology-based similarity measurement.
• Adaptation: adaptation increases the usability of existing cases.

4.3.1. Case Abstraction. Cases are derived from exception instances. An instance of the class Exception, or one of its subclasses, represents an exception. The exception instance or object should carry information from the point at which the exception occurs to the
component that detects it. Due to exception masking and propagation, an exception object may pass among different components situated at the same or at different layers. During the propagation, the information contained in the exception object may grow to record the handling history.

That is, an exception object may change during the propagation. An instance of an exception can be serialized when necessary to obtain a persistent exception object. When the case-based reasoning component is called to derive a handler for an exception, the information contained in that exception object is used in case-based reasoning to identify the exceptional situation.
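An exception object that accumulates its handling history and can be made persistent might look like the sketch below. The class name `WorkflowException`, its fields, and the choice of JSON as the serialized form are assumptions made for illustration; the text above does not prescribe a particular representation.

```python
import json


class WorkflowException(Exception):
    """Exception object that carries its origin and a growing handling history."""

    def __init__(self, message, origin, history=None):
        super().__init__(message)
        self.message = message
        self.origin = origin                  # where the exception occurred
        self.history = list(history or [])    # handling steps, oldest first

    def record(self, component, action):
        """Append one handling step as the exception passes through a component."""
        self.history.append({"component": component, "action": action})
        return self

    def to_json(self):
        """Serialize the exception so that it can be stored persistently."""
        return json.dumps(
            {"message": self.message, "origin": self.origin, "history": self.history}
        )

    @classmethod
    def from_json(cls, text):
        """Rebuild an exception object from its serialized form."""
        data = json.loads(text)
        return cls(data["message"], data["origin"], data["history"])
```

A component that masks or propagates the exception would call `record` before passing the object on, so the case-based reasoning component sees the full history when it is finally invoked.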

Case abstraction involves two processes: determination of facts about exceptional situations, and extraction of texts that can be used to summarize an exception based on ontology. The goal of extraction is to identify exceptional situations and extract index terms that will go into the case repository. Rather than trying to determine specific facts, the goal for user input or text summarization is to extract a summary of an exceptional situation. The abstraction is used to represent the exceptional situations for search purposes or as a way to enhance the ability to browse a case repository. It is worth noting that the abstraction is not necessarily complete. It may be just a partial description of the exceptional situation.

4.3.2. Case Retrieval. The retrieval procedure of similar previous cases is based on the similarity measure

Figure 11. Example of two retrieved similar cases.

that takes into account both semantic and structural similarities and differences between the cases. A similarity measure can be obtained by computing the nearest-neighbor function of the quantified degrees of the semantic similarities (e.g., risk factors, age) between cases and the structural similarities (e.g., AND, OR building blocks) between workflow specifications. To conduct a similarity-based case retrieval, the similarities should be computed between the targeted case (new situation) and old cases for three components: problem, context and resolutions. Each component may have its own attributes, which are application dependent. They are weighted according to the similarity measure algorithms used. The quantified value of similarity for each attribute between two cases is based on the ontology, for example, the one shown in Fig. 10. Such setup information for the similarity measure is stored in the similarity measure component and in the ontology and concept components (see Fig. 9).

For example, there is a new situation in which a newborn infant, Mike Clinton, needs initial intravenous maintenance. His age is about 15 days. To derive the actual solution, similar cases are retrieved. In this case, there are two attributes that can describe the exceptional situation, age and risk factor. Suppose there are two similar cases retrieved as shown in Fig. 11. Case A is retrieved because the new situation is described by the same term as case A: intravenous maintenance. Case B is similar to the new situation because the heart murmur is a kind of cardiovascular risk factor. Both intravenous maintenance and heart murmur are related


Figure 12. Example of parameterized adaptation.

to the cardiovascular concept. In determining which case is more similar, a smaller weight is put on age (weight 1) than on the intravenous maintenance (weight 50). That is, the risk factor is more important than age if the age difference is less than 2 weeks. By using the one-level ontology tree (see Fig. 10), the problem difference between case A and the new situation is 0, while the difference between case B and the new situation is at least 1 (because there is at least one level of concept difference in the ontology tree). The similarity measure result between case A and the new situation is $(15 - 9)^2 \cdot 1 + 0 \cdot 50 = 36$. The similarity measure result between case B and the new situation is $(15 - 15)^2 \cdot 1 + 1 \cdot 50 = 50$. Thus, case A, which has the smaller difference, is the most similar case.
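The arithmetic of this example can be reproduced with a short sketch. It assumes, from the figures in the text, that case A has an age of 9 days and case B an age of 15 days, that the age term is the squared difference in days, and that the problem term is the number of ontology levels separating the two concepts, weighted by 50. The names `Case`, `difference`, `AGE_WEIGHT`, and `PROBLEM_WEIGHT` are hypothetical.

```python
from dataclasses import dataclass

AGE_WEIGHT = 1        # weight on age, as stated in the text
PROBLEM_WEIGHT = 50   # weight on the problem (risk factor) term, as stated in the text


@dataclass
class Case:
    name: str
    age_days: int
    concept_distance: int  # ontology levels between this case's term and the new one


def difference(new_age_days: int, case: Case) -> int:
    """Weighted difference to the new situation; smaller means more similar."""
    age_term = (new_age_days - case.age_days) ** 2 * AGE_WEIGHT
    problem_term = case.concept_distance * PROBLEM_WEIGHT
    return age_term + problem_term


cases = [
    Case("A", age_days=9, concept_distance=0),   # same term: intravenous maintenance
    Case("B", age_days=15, concept_distance=1),  # heart murmur, one level away
]
scores = {c.name: difference(15, c) for c in cases}
best = min(cases, key=lambda c: difference(15, c))
print(scores, best.name)
```

Because the measure is a weighted difference, retrieval selects the case with the minimum value rather than the maximum.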

4.3.3. Case Adaptation. There are two main approaches to realize case adaptation:

• Problem adaptation: One way to adapt a case is to enhance the partial description of the problem. Another way is to substitute the concept realization in a case.

• Solution adaptation: There are two ways associated with solution adaptation [5]: (1) reuse the past case solution (transformational reuse), and (2) reuse the past methods that constructed the solution (derivational reuse).

There might be combinations of the above two approaches. A case can also be used without any modification, which is usually called NULL adaptation. Partial matching during case retrieval is also one way to realize case adaptation. There are usually two approaches to case modification: parameterized modification and substitution modification. An example of parameterized modification is that a physician might change the dose of a drug or other solutions according to the situation of a baby, such as age or weight (see Fig. 12). Modification of a case by substitution usually results from the fact that the case or part of the case is not applicable in new situations or is not available. In the healthcare domain, a physician can usually prescribe different drugs to patients with similar symptoms according to either the patients' demands or the situation of the drug market (for example, a new, more effective drug has come onto the market). Figure 13 shows another example of adaptation (modification) by substitution.
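The three kinds of modification mentioned above (NULL, parameterized, and substitution) can be contrasted in a small sketch. Everything in it is hypothetical: the `Prescription` type, the per-kilogram dosing rule, and the drug names are placeholders chosen for illustration and carry no clinical meaning.

```python
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class Prescription:
    drug: str
    dose_per_kg: float  # hypothetical unit: amount per kg of body weight
    total_dose: float


def null_adaptation(old: Prescription) -> Prescription:
    """Reuse the retrieved solution unchanged."""
    return old


def parameterized_adaptation(old: Prescription, weight_kg: float) -> Prescription:
    """Keep the drug, recompute the dose from a parameter of the new situation."""
    return replace(old, total_dose=old.dose_per_kg * weight_kg)


def substitution_adaptation(old: Prescription, substitutes: Dict[str, str]) -> Prescription:
    """Swap the drug when a replacement is listed; otherwise keep the old solution."""
    new_drug = substitutes.get(old.drug)
    return old if new_drug is None else replace(old, drug=new_drug)
```

For example, `parameterized_adaptation(Prescription("drug-x", 2.0, 6.0), 4.0)` yields a copy with `total_dose` 8.0, while `substitution_adaptation` with the mapping `{"drug-x": "drug-y"}` yields a copy that differs only in the drug name.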

4.4. Integration of Case Management

There are two approaches for the integration of the CBR-based system into a WfMS. One is to integrate it as a built-in component. In this approach, an exception can be propagated to the CBR-based component through knowledge-based exception propagation (see Fig. 5). Since the CBR-based system usually does not participate in the enforcement of workflow inter-task dependencies, structural exception propagation that relies on control structures can also propagate the exception to the CBR-based system, usually as the last resort to handle exceptions. The other approach is to


Figure 13. Example of adaptation by substitution.

Figure 14. Cooperation between WfMSs in two different organizations.

outsource the CBR-based system in the context of inter-organizational workflows (see Fig. 14). Assume the CBR-based system has been integrated using the first approach in a workflow system. Outsourcing the CBR-based system to other workflow systems then becomes a matter of cooperation between two workflow systems.

When an exception occurs in organization B, organization B may want to document the exception handling process. The enactment component of the WfMS in organization B needs to acquire a reference to the service requestor of the CBR-based service of organization A. Such handshaking is accomplished through the interoperable gateway shown in Fig. 14. This gateway specifies how a service requestor can be obtained, how a service can be activated once the service

requestor is obtained and how the required acknowledgement is to be returned. An example of such a gateway is jFlow [20], a proposal accepted by the OMG to meet the request for workflow interoperability. A more complicated gateway is proposed in [21]. The requestor is responsible for identifying the correct CBR-based service process. The process manager, which acts as a process factory, creates the process. The requestor is associated with the process when it is created and will receive state-change notifications from the process.
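The roles described above (a requestor, a process manager acting as a factory, and a process that notifies its requestor of state changes) can be sketched as follows. The class and method names are hypothetical and are not taken from the jFlow interface definitions; the sketch shows only the association and notification pattern and omits the gateway and any remote communication.

```python
class Requestor:
    """Party in organization B that asks for the CBR-based service."""

    def __init__(self, name):
        self.name = name
        self.notifications = []

    def receive_state_change(self, process_id, state):
        # Called by the process whenever its state changes.
        self.notifications.append((process_id, state))


class Process:
    """One running instance of the CBR-based service process."""

    def __init__(self, process_id, requestor):
        self.process_id = process_id
        self.requestor = requestor  # association fixed at creation time
        self.state = "created"

    def set_state(self, state):
        self.state = state
        self.requestor.receive_state_change(self.process_id, state)


class ProcessManager:
    """Factory for processes of one service in organization A."""

    def __init__(self, service_name):
        self.service_name = service_name
        self._next_id = 1

    def create_process(self, requestor):
        process = Process(f"{self.service_name}-{self._next_id}", requestor)
        self._next_id += 1
        return process
```

Binding the requestor at creation time means the process never has to look up whom to notify, which matches the description that the association is made when the process is created.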

5. Related Work

The importance of incorporating the exception handling mechanism into WfMSs has been identified by researchers and projects (e.g., METEOR [16], WAMO


[17]). The role of exceptions in office information systems has been discussed at length in [22]. The author presents a theoretical basis, based on Petri nets, for dealing with different types of exceptions. The taxonomy presented is driven purely by organizational semantics rather than by a workflow process model. [23] reports several works on workflow exception handling, including the use of ECA rules to model expected exceptions [24] and a general discussion of exceptions in systems based on object-oriented databases [25]. [23] also reports a taxonomy for exceptions in workflow systems. That exception taxonomy can be reused in our CBR-based exception-handling system to help measure the similarities among cases during case retrieval and analysis. Since exceptional situations are often very complicated, knowledge-based systems are good candidates in such situations.

An exception is usually raised to signal exceptional situations like errors and failures. Error handling in database systems has typically been achieved by aborting transactions that result in an error [26]. In [27] a failure handling mechanism uses a combination of programming language concepts and transaction processing techniques. However, aborting or canceling a workflow task is not always appropriate or necessary in a workflow environment. Unlike a database transaction, a task can encapsulate diverse operations, and the nature of the business process may be forgiving of some errors, so that an undo operation is not required. Therefore, the error handling semantics of traditional transaction processing systems are too rigid for workflow systems.

“Ad-hoc” workflows [28, 29] solve problems case by case. The approach proposed in this paper supports more than “ad-hoc” workflows. In the first place, we use a context-dependent approach to support adaptive exception handling. We support “ad-hoc” workflows in the sense that, when exceptions occur and non-standard exception handlers are not derivable, the CBR-based approach is used as the last resort. In addition to solving problems as in ad-hoc workflows, the CBR-based exception handling system collects exception-handling cases, derives exception-handling patterns from the experiences captured in exception handling, and tries to reuse the previously gained exception handling experiences in the future.

6. Conclusions

Exceptional situations can be very complicated phenomena. Fully automated exception handling processes generally are not possible [30]. Mechanisms based only on “ad-hoc” exception handling are not acceptable either. An integrated human-computer exception handling mechanism should be present to support a systematic way of exception masking and propagation.

In this paper, we have given a three-dimensional perspective of exceptions that gives directions for building exception-aware workflow systems. We have formulated a defeasible workflow based exception handling approach. We identified the requirements for dealing with exceptions in heterogeneous workflow environments. Our exception handling mechanism involves mapping heterogeneous infrastructure and application specific exceptions into workflow exceptions. Exception handlers such as ignore, task retry, alternate tasks, workflow recovery, and workflow modification have been integrated with the exception model to provide a complete solution that involves masking and propagation for dealing with exceptions encountered during workflow enactment. In addition, we propose a CBR-based system, enhanced by ontology-based knowledge management, that learns from experience to provide a system-assisted decision-making method that facilitates the understanding of exceptions and the derivation of acceptable exception handlers. This involves reusing the experiences captured in prior exception handling cases.

Acknowledgments

Special thanks go to Tarcisio Lima for providing background materials in developing ontology. We have also held discussions with our METEOR team members, including Jorge Cardoso and Zhonqiao Li. This work is partially supported by the NIST Advanced Technology Program in Healthcare Information Infrastructure Technology (under the HIIT contract, number 70NANB5H1011) in partnership with Healthcare Open Systems and Trials (HOST) consortium.

References

1. D. Georgakopoulos, M. Hornick, and A. Sheth, “An overview of workflow management: From process modeling to workflow automation infrastructure,” Distributed and Parallel Databases, vol. 3, no. 2, pp. 119–154, 1995.

2. A. Sheth, D. Georgakopoulos, S. Joosten, M. Rusinkiewicz, W. Scacchi, J. Wileden, and A. Wolf, “Report from the NSF workshop on workflow and process automation in information systems,” Technical Report, University of Georgia, UGA-CS-TR-96-003, 1996.

3. S. Jablonski and C. Bussler, Workflow Management: Modeling Concepts, Architecture and Implementation, International Thomson Publishing, 1996.


4. A. Cichocki, A. Helal, M. Rusinkiewicz, and D. Woelk, Workflow and Process Automation: Concepts and Technology, Kluwer Academic Publishers, 1997. ISBN 0-7923-8099-1.

5. A. Aamodt, “Case-based reasoning: Foundational issues, methodological variations, and system approaches,” Artificial Intelligence Communications, vol. 7, no. 1, IOS Press, 1994.

6. P. Meier and J. Paton, Clinical Decision Making in Neonatal Intensive Care, Grune & Stratton: Orlando, Florida, 1984.

7. V. Marek and M. Truszczynski, Non-Monotonic Logic, Context-Dependent Reasoning, Springer-Verlag, 1993.

8. S. Ceri, P. Grefen, and G. Sanchez, “WIDE: A distributed architecture for workflow management,” in Proceedings of RIDE 1997, Birmingham, UK, April 1997.

9. G. Kappel, S. Rausch-Schott, and W. Retschitzegger, Coordination in Workflow Management Systems: A Rule-based Approach, Springer, 1998. LNCS 1364.

10. E. Baralis, S. Ceri, and S. Paraboschi, “Improved rule analysis by means of triggering and activation graphs,” in Proc. of the Second Workshop on Rules in Database Systems, Athens, Greece, September 1995, edited by T. Sellis, vol. LNCS 985, pp. 165–181.

11. N. Paton (ed.), Active Rules in Database Systems, Springer, 1999.

12. D. Worah, A. Sheth, K. Kochut, and J. Miller, “An error handling framework for the ORBWork workflow enactment service of METEOR,” Technical Report, Dept. of Computer Science, Univ. of Georgia, 1997.

13. A. Sheth, K.J. Kochut, J. Miller, D. Worah, S. Das, C. Lin, D. Palaniswami, J. Lynch, and Shevchenko, “Supporting state-wide immunization tracking using multi-paradigm workflow technology,” in Proc. of the 22nd Intnl. Conference on Very Large Data Bases, Bombay, India, September 1996.

14. M. Ansari, L. Ness, M. Rusinkiewicz, and A. Sheth, “Using flexible transactions to support multi-system telecommunication applications,” in Proc. of the 18th Intl. Conference on Very Large Data Bases, Vancouver, Canada, August 1992, pp. 65–76.

15. A. Sheth and S. Joosten (eds.), in Workshop on Workflow Management: Research, Technology, Products, Applications and Experiences, Athens, GA, August 1996.

16. N. Krishnakumar and A. Sheth, “Managing heterogeneous multi-system tasks to support enterprise-wide operations,” Journal of Distributed and Parallel Database Systems, vol. 3, no. 2, 1995.

17. J. Eder and W. Liebhart, “Contributions to exception handling in workflow systems,” in EDBT Workshop on Workflow Management Systems, Valencia, Spain, 1998.

18. D. Worah and A. Sheth, “Transactions in transactional workflows,” in Advanced Transaction Models and Architectures, edited by S. Jajodia and L. Kerschberg, Kluwer Academic Publishers: Boston, 1997.

19. A. Borgida and T. Murata, “Tolerating exceptions in workflows: A unified framework for data and processes,” in Proceedings of the International Joint Conference on Work Activities Coordination and Collaboration, WACC’99, San Francisco, CA, February 22–25, 1999.

20. Joint Workflow Management Facility Revised Submission to BODTF RFP #2 Workflow Management Facility: CoCreate, Concentus, CSE, DAT, DEC, DSTC, EDS, FileNet, Fujitsu, Hitachi, Genesis, IBM, ICL, NIIIP, Oracle, Plexus, SNI, SSA, Xerox, http://www.omg.org/cgi-bin/doc?bom/98-06-07.

21. H. Ludwig and K. Whittingham, “Virtual enterprise coordinator: Agreement-driven gateways for cross-organizational workflow management,” in Proceedings of the International Joint Conference on Work Activities Coordination and Collaboration, WACC’99, San Francisco, CA, February 22–25, 1999.

22. H. Saastamoinen, “On the handling of exceptions in information systems,” Ph.D. Thesis, University of Jyvaskyla, 1995.

23. M. Klein, C. Dellarocas, and A. Bernstein (eds.), Online Proceedings of CSCW98 Workshop Towards Adaptive Workflow Systems, Seattle, WA, 1998.

24. F. Casati, “A discussion on approaches to handling exceptions in workflows,” in CSCW98, Towards Adaptive Workflow Workshop, Seattle, WA, 1998.

25. D. Chiu, K. Karlapalem, and Q. Li, “Exception handling with workflow evolution in ADOME-WFMS: A taxonomy and resolution techniques,” in CSCW98, Towards Adaptive Workflow Workshop, Seattle, WA, 1998.

26. J. Gray and A. Reuter, Transaction Processing: Concepts and Techniques, Morgan Kaufmann Publishers: San Mateo, CA, 1993.

27. C. Hagen and G. Alonso, “Flexible exception handling in the OPERA process support system,” in 18th International Conference on Distributed Computing Systems (ICDCS), Amsterdam, The Netherlands, May 1998.

28. M. Voorhoeve and W. Aalst, “Ad-hoc workflow: Problems and solutions,” in DEXA Workshop, 1997.

29. H. Wedekind, “Specifying indefinite workflow functions in ad-hoc dialogs,” in DEXA Workshop, 1997.

30. D. Strong and S. Miller, “Exceptions and exception handling in computerized information processes,” ACM Trans. Information Systems, vol. 13, no. 2, 1995, pp. 206–233.

Zongwei Luo is a Ph.D. candidate in computer science at the University of Georgia. He works as a research assistant in the Large Scale Distributed Information System Lab. His research interests include e-commerce, enterprise application integration, work coordination and collaboration, and knowledge-based systems. He is a student member of the IEEE and ACM. He received his BS and MS degrees in computer science and technology from Huazhong University of Science and Technology, China.

Amit Sheth directs the Large Scale Distributed Information Systems (LSDIS) Lab and is a full Professor in the Department of Computer Science. He also founded and serves as president of Infocosm Inc. (http://www.infocosm.com). His technical interests include enterprise integration, semantics in global information systems, digital libraries, e-commerce, and digital media. He received his BE in electrical and electronics engineering from the Birla Institute of Technology and Science, Pilani, India, and his MS and Ph.D. in computer and information technology from Ohio State University. He is a member of the IEEE and ACM.

Krys Kochut is an Associate Professor of Computer Science at the University of Georgia. His research focuses on database systems, computer systems for genome mapping, user interfaces, and computational genetics. Dr. Kochut obtained his Ph.D. from Louisiana State University in 1987. His recent courses include Compilers, Software Engineering, and Advanced Software Engineering.


John Miller is an Associate Professor and the Graduate Coordinator in the Department of Computer Science at the University of Georgia. His research interests include Database Systems, Simulation and Workflow as well as Parallel and Distributed Systems. Dr. Miller received the BS degree in Applied Mathematics from Northwestern University in 1980, and the MS and Ph.D. in Information and Computer Science from the Georgia Institute of Technology in 1982 and 1986, respectively. During his undergraduate education, he worked as a programmer at the Princeton Plasma Physics Laboratory. Dr. Miller is the author of over 60 technical papers in the areas of Database, Simulation and Workflow. He has been active in the organizational structures of research conferences in all three of these areas. He has also been a Guest Editor for the International Journal in Computer Simulation as well as IEEE Potentials.