structuring acyclic process models

21
Structuring acyclic process models Artem Polyvyanyy a,n , Luciano Garcı ´a-Ban ˜ uelos b , Marlon Dumas b a Hasso-Plattner-Institute, University of Potsdam, Prof.-Dr.-Helmert-Str. 2-3, D-14482 Potsdam, Germany b Institute of Computer Science, University of Tartu, J. Liivi 2, Tartu 50409, Estonia article info Available online 28 October 2011 Keywords: Process modeling Structured modeling Structuring Model equivalence Petri net unfolding Modular decomposition abstract This article studies the problem of transforming a process model with an arbitrary topology into an equivalent well-structured process model. While this problem has received significant attention, there is still no full characterization of the class of unstructured process models that can be transformed into well-structured ones, nor an automated method for structuring any process model that belongs to this class. This article fills this gap in the context of acyclic process models. The article defines a necessary and sufficient condition for an unstructured acyclic process model to have an equivalent well-structured process model under fully concurrent bisimulation, as well as a complete structuring method. The method has been implemented as a tool that takes process models captured in the BPMN and EPC notations as input. The article also reports on an empirical evaluation of the structuring method using a repository of process models from commercial practice. & 2011 Elsevier Ltd All rights reserved. 1. Introduction In contemporary business process modeling notations, such as the Business Process Model and Notation (BPMN) [1] and Event-driven Process Chain (EPC) [2], a process model is composed of nodes (e.g., tasks, events, gateways) connected by a ‘‘flow’’ relation. Although these notations allow process models to have almost any topology, it is often desirable that models abide by some structural rules. In this respect, a well-known property of process models is that of (well-)structuredness [3], meaning that for every node with multiple outgoing arcs (a split) there is a corresponding node with multiple incoming arcs (a join), and vice versa, such that the set of nodes between the split and the join forms a single-entry-single-exit (SESE) region; otherwise the process model is unstruc- tured. For example, Fig. 1(a) shows an unstructured process model (splits u, v and joins w, x do not have a corresponding node), while Fig. 1(b) shows an equivalent structured model. The models are captured using BPMN language. Fig. 1(b) uses short names for tasks ða, b, c, ...Þ, which appear next to each task in Fig. 1(a). Observe that the equivalent structured model captures the same infor- mation about potential concurrent executions of tasks as the unstructured model. As will be explained later, such a relation on process models corresponds to the notion of fully concurrent bisimulation [4]. This article studies the problem of automatically transforming unstructured process models into equiva- lent well-structured models. The motivations for such a transformation are manifold. Firstly, it has been empiri- cally shown that structured process models are easier to comprehend and less error-prone than unstructured ones [5]. Similarly, it has been shown in [6,7] that structured- ness and modularity (e.g., by usage of procedures) improves software maintainability. Thus, a transforma- tion from an unstructured into a structured process model can be used as a refactoring technique for increasing process model understandability. Secondly, a number of existing process model analysis techniques only work for structured models. For example, a method for calculating Contents lists available at SciVerse ScienceDirect journal homepage: www.elsevier.com/locate/infosys Information Systems 0306-4379/$ - see front matter & 2011 Elsevier Ltd All rights reserved. doi:10.1016/j.is.2011.10.005 n Corresponding author. Tel.: þ49 331 5509 195. E-mail addresses: [email protected] (A. Polyvyanyy), [email protected] (L. Garcı ´a-Ban ˜ uelos), [email protected] (M. Dumas). Information Systems 37 (2012) 518–538

Upload: artem-polyvyanyy

Post on 05-Sep-2016

216 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Structuring acyclic process models

Contents lists available at SciVerse ScienceDirect

Information Systems

Information Systems 37 (2012) 518–538

0306-43

doi:10.1

n Corr

E-m

(A. Poly

marlon.

journal homepage: www.elsevier.com/locate/infosys

Structuring acyclic process models

Artem Polyvyanyy a,n, Luciano Garcıa-Banuelos b, Marlon Dumas b

a Hasso-Plattner-Institute, University of Potsdam, Prof.-Dr.-Helmert-Str. 2-3, D-14482 Potsdam, Germanyb Institute of Computer Science, University of Tartu, J. Liivi 2, Tartu 50409, Estonia

a r t i c l e i n f o

Available online 28 October 2011

Keywords:

Process modeling

Structured modeling

Structuring

Model equivalence

Petri net unfolding

Modular decomposition

79/$ - see front matter & 2011 Elsevier Ltd A

016/j.is.2011.10.005

esponding author. Tel.: þ49 331 5509 195.

ail addresses: [email protected]

vyanyy), [email protected] (L. Garcıa-Banu

[email protected] (M. Dumas).

a b s t r a c t

This article studies the problem of transforming a process model with an arbitrary

topology into an equivalent well-structured process model. While this problem has

received significant attention, there is still no full characterization of the class of

unstructured process models that can be transformed into well-structured ones, nor an

automated method for structuring any process model that belongs to this class. This

article fills this gap in the context of acyclic process models. The article defines a

necessary and sufficient condition for an unstructured acyclic process model to have an

equivalent well-structured process model under fully concurrent bisimulation, as well

as a complete structuring method. The method has been implemented as a tool that

takes process models captured in the BPMN and EPC notations as input. The article also

reports on an empirical evaluation of the structuring method using a repository of

process models from commercial practice.

& 2011 Elsevier Ltd All rights reserved.

1. Introduction

In contemporary business process modeling notations,such as the Business Process Model and Notation (BPMN)[1] and Event-driven Process Chain (EPC) [2], a processmodel is composed of nodes (e.g., tasks, events, gateways)connected by a ‘‘flow’’ relation. Although these notationsallow process models to have almost any topology, it isoften desirable that models abide by some structuralrules. In this respect, a well-known property of processmodels is that of (well-)structuredness [3], meaning thatfor every node with multiple outgoing arcs (a split) thereis a corresponding node with multiple incoming arcs (ajoin), and vice versa, such that the set of nodes betweenthe split and the join forms a single-entry-single-exit(SESE) region; otherwise the process model is unstruc-

tured. For example, Fig. 1(a) shows an unstructuredprocess model (splits u, v and joins w, x do not have a

ll rights reserved.

dam.de

elos),

corresponding node), while Fig. 1(b) shows an equivalentstructured model. The models are captured using BPMNlanguage. Fig. 1(b) uses short names for tasks ða,b,c, . . .Þ,which appear next to each task in Fig. 1(a). Observe thatthe equivalent structured model captures the same infor-mation about potential concurrent executions of tasks asthe unstructured model. As will be explained later, such arelation on process models corresponds to the notion offully concurrent bisimulation [4].

This article studies the problem of automaticallytransforming unstructured process models into equiva-lent well-structured models. The motivations for such atransformation are manifold. Firstly, it has been empiri-cally shown that structured process models are easier tocomprehend and less error-prone than unstructured ones[5]. Similarly, it has been shown in [6,7] that structured-ness and modularity (e.g., by usage of procedures)improves software maintainability. Thus, a transforma-tion from an unstructured into a structured process modelcan be used as a refactoring technique for increasingprocess model understandability. Secondly, a number ofexisting process model analysis techniques only work forstructured models. For example, a method for calculating

Page 2: Structuring acyclic process models

Pay by cash

Pay by cheque

Update account

ApproveR1

P1

Reject payment request

Inform customer

B1

P2

P3

a

b

c

d

e f

i ot

u

v

w

x y z

b

a

P3

i ov w x y z

P1

B1

P2

B2 B3

c

d

e f

Fig. 1. (a) An unstructured acyclic process model and (b) its equivalent

structured version.

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538 519

cycle time and capacity requirements of structured pro-cess models is outlined in [8], while a method for analyz-ing time constraints in structured process models ispresented in [9]. By transforming unstructured processmodels into structured ones, we can extend the applic-ability of these techniques to a larger class of models.Thirdly, a transformation from unstructured to structuredprocess models can be used to implement converters fromgraph-oriented process modeling languages to structuredprocess modeling languages, e.g., from BPMN to BPEL [10].

In the context of flowcharts without parallel splits andjoins it has been shown that any unstructured flowchartcan be transformed into a structured one [11].1 If we addparallel splits and joins, this result no longer holds: Thereexist unstructured process models that do not haveequivalent structured ones [3]. Several authors haveattempted to classify the sources of unstructuredness inprocess models [12–14] and to define automated methodsfor structuring process models [15–17]. However, thesemethods are incomplete: There is currently no full char-acterization of the class of inherently unstructured processmodels, i.e., unstructured process models that have noequivalent structured model. Also, none of the existingstructuring methods is complete. In fact, this problem hasnot been fully solved even for acyclic process models. Thisarticle fills this gap.

To streamline the presentation, we make severalassumptions. Firstly, we consider process models to becomposed of nodes (tasks, events, gateways) and controlflow relations. In terms of BPMN, this means that we

1 In the case of multi-exit cycles, this transformation requires the

use of ‘‘break’’ statements or boolean variables in the transformed

model.

abstract away from other process model elements such asartifacts, annotations, associations, groups, pools, lanes,message flows, sub-process invocations and attributesassociated with sub-process invocations, e.g., repetition.Nonetheless, the proposed method is applicable even ifthese types of elements are present in the input model.Simply put, these ancillary elements and attributes needto be moved along with the tasks or events to which theyare attached. In the same vein, we do not distinguishbetween events and tasks since, for the purpose of thetransformation, both of them can be treated equally.Secondly, we consider only sound process models [18].This restriction is natural since soundness is a widely-accepted correctness criterion for process models. Thirdly,we consider process models in which every node has onlyone incoming or one outgoing arc. This restriction ismerely syntactical because one can trivially split a nodewith multiple incoming and multiple outgoing arcs intotwo nodes: one node with a single outgoing arc and theother with a single incoming arc. Finally, we do not dealwith the following BPMN constructs: or gateways, com-plex gateways, error events and non-interrupting events.Lifting this latter restriction is left as future work.

Given this setting, the main contribution of the articleis a method for transforming an unstructured acyclic

process model into an equivalent structured one when-ever such transformation is possible. The main steps ofthe method are summarized in Fig. 2. The process modelis first decomposed into a ‘‘process structure tree’’: Ahierarchy of so-called process components, where eachcomponent corresponds to a SESE region. Each compo-nent can be seen as a process model by itself and can beclassified either as well-structured or unstructured. Theaim of the method is to transform every unstructuredcomponent into an equivalent structured one, whenpossible. To this end, the process component is firsttranslated into a Petri net [19]. A technique known asPetri net unfolding is then employed in order to discoverthe elementary ordering relations that exist betweentasks in the component. These ordering relations areanalyzed using a technique known as modular decomposi-

tion, which extracts the fundamental ‘‘modules’’ that existin the ordering relations graph. We demonstrate that ifthe resulting modules are all of certain types, the processcomponent can be transformed into an equivalent struc-tured one. Otherwise, the process component is inher-ently unstructured.

This article is an extended and revised version of anearlier conference paper [20]. The main enhancementswith respect to the conference paper are the ability todeal with process models with multiple source and multi-ple sink nodes, and an empirical evaluation of the struc-turing method using process models taken from industrialpractice.

The remainder of the article is structured as follows:The next section defines the notions of process model andrefined process structured tree and reviews related work.Next, Section 3 introduces a semantics of process modelsbased on Petri nets. Section 4 then introduces the beha-vioral equivalence used in this article, viz. fully concurrentbisimulation, and shows that two acyclic process models

Page 3: Structuring acyclic process models

Petri net

Well-structured process model

Process model

Proper complete

prefix unfolding

Ordering relations

graph

Unstructured process

components

Fig. 2. Overview of the proposed structuring method.

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538520

are equivalent under this equivalence notion if and only ifthey have the same set of ordering relations. This result isused in Section 5 to characterize the class of acyclicprocess models that can be structured and to define astructuring algorithm. In Section 5, it is assumed thatprocess models have a single source and a single sinknode. This restriction is lifted in Section 6. Section 7presents an empirical evaluation of the proposed struc-turing method on a process model collection taken fromindustrial practice. Finally, Section 8 concludes the articleand outlines directions for future work.

2. Background and related work

In this section we introduce the notion of processmodel (Section 2.1) and a technique for decomposingprocess models into process components (Section 2.2).Afterwards, in Section 2.3, we analyze related work withrespect to their ability to structure different types ofprocess components.

2.1. Process models

As discussed in the Introduction, we consider processmodels consisting of tasks and gateways, as captured inthe following definition.

Definition 2.1 (Process model). A process model is a tupleW ¼ ðA,G ,G ,C,A,mÞ, where A is a non-empty set of tasks,or activities, G is a set of and gateways, G is a set of xor

gateways (these sets are disjoint). We write G¼ G [ G

for all gateways and V ¼ A [ G for all nodes of the processmodel. CDV � V defines the control flow. A, t 2 A, is a setof names and m : A-A is a function that assigns each taska name.

A node v 2 V is a source node, if �v¼+, and it is a sink

node, if v� ¼+, where �v stands for a set of immediatepredecessors and v� stands for a set of immediate suc-cessors of node v. As mentioned in the Introduction, weassume that process models meet certain structuralrequirements. Every task a 2 A has at most one incomingand at most one outgoing arc, i.e., 9�a9r149a�9r1, whileeach gateway is either a split or a join: A gateway g 2 G isa split, if 9�g9¼ 149g�941. A gateway g 2 G is a join, if9�g94149g�9¼ 1. Source and sink nodes are tasks andevery node is on a path from some source task to somesink task.

We employ a notation similar to BPMN for visualiza-tion of process models. Fig. 1(a) shows a process modelwhich contains eight tasks, i.e., A¼ fi,a,b,c,d,e,f ,og. Notethat i and o are silent tasks, i.e., mðiÞ ¼ t¼ mðoÞ. Wevisualize silent tasks as start, intermediate, or end eventsin BPMN. Note that we use silent tasks for technical

purposes only, e.g., representation of source and sinktasks. An observable task is drawn as a rectangle thathas rounded corners with its name inside. Gateways arevisualized as diamonds: Gateways of type xor, or exclusive

gateways, use a marker which is shaped like an ‘‘� ’’inside the diamond shape. Gateways of type and, orparallel gateways, use a marker which is shaped like a‘‘þ ’’ inside the diamond shape. The set ft,u,v,w,x,y,zg isthe set of gateways of the process model in Fig. 1(a).Gateways t, w, x, and z are exclusive, while gateways u, v,and y are parallel. Gateways t, u, and v are splits, whilegateways w, x, y, and z are joins. Finally, control flow arcsare drawn as directed edges between the nodes of theprocess model.

2.2. Process components

The starting point of the proposed structuring methodis a parsing technique for process models presented in[21] (originally proposed in [22]). This parsing techniquedecomposes a process model into SESE regions, herebycalled process components. In other words, a process com-ponent is a subset of arcs of a process model such that thesubgraph induced by these arcs has a single entry node anda single exit node. These two nodes are called boundary

nodes as they connect the component with the rest of themodel. A process component is canonical, if it does notoverlap (on the set of arcs) with any other process compo-nent, meaning that any two canonical process componentsare either disjoint or one is contained in the other. Canonicalprocess components naturally form a hierarchy, leading tothe following definition.

Definition 2.2 (Refined process structure tree). The refined

process structure tree (RPST) of a process model is the set ofall its canonical process components.

The RPST of a given process model is unique and canbe computed in linear time [21]. Fig. 1(a) and (b) show theRPSTs in the form of dotted boxes superposed on theprocess models. For the sake of presentation, the sameRPSTs are depicted as trees in Fig. 3. Every region inside adotted box defines a canonical process component, whichis composed of arcs that are inside or intersect the region.For instance, component B1 in Fig. 1(a) has two boundarynodes: t and z. Node t is the entry and node z is the exit ofthe component. According to [21–23], every canonicalprocess component belongs to one out of four structuralclasses.

Definition 2.3 (Trivial, polygon, bond, rigid). Let C be aprocess component of a process model.

J

C is a trivial component, iff C is singleton, i.e., C

contains a single arc.

Page 4: Structuring acyclic process models

B1

P2 P3

P1

R1

B1

P3

P1

P2

B2 B3

Fig. 3. The (simplified) RPSTs of process models in Fig. 1.

Process component

Trivial Polygon Bond Rigid

Homogeneous Heterogeneous

XOR AND

Acyclic Cyclic

Acyclic Cyclic

Fig. 4. A taxonomy of components.

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538 521

J

C is a polygon component, iff there exists a sequenceðr0, . . . ,rnÞ, n 2 N, of canonical components of theprocess model, such that C ¼

Si ¼ ni ¼ 0 ri, the entry of C

is the entry of r0, the exit of C is the exit of rn, and theexit of rj is the entry of rjþ1, 0r jon.

J

C is a bond component, iff there exists a set R ofcanonical components of the process model, such thatC ¼

Sr2Rr and every component in R has the same

boundary nodes as C.

J

ia

b

c

d

ow

x

y

zR1

P1

Fig. 5. An inherently unstructured process model.

C is a rigid component, iff C is neither a trivial, nor apolygon, nor a bond component.

For instance in Fig. 1(a), polygon P1 is the root of theRPST and corresponds to the whole process model. Poly-gon P1 is composed of bond B1 and two trivial compo-nents fði,tÞg and fðz,oÞg. In turn, bond B1 contains polygonsP2 and P3. Observe that trivial components and polygonsthat are composed of two trivial components are notexplicitly visualized for simplicity reasons—neither inFig. 1 nor in Fig. 3. Note also that names of componentshint at their structural class, e.g., P1 is a polygon, B1 is abond, and R1 is a rigid component.

The RPST provides a basis for defining the notion ofwell-structuredness.

Definition 2.4 (Well-structured process model). A processmodel is (well-)structured, if and only if its RPST containsno rigid process component; otherwise the process modelis unstructured.

In a process model that satisfies the condition in theabove definition, every split has a corresponding join sothat the region between the split and the join is a SESEregion, and vice versa (by virtue of the definition of atrivial, polygon, and bond component). Every rigid processcomponent contains a node, either a split or a join, whichhas no corresponding node to define a SESE region.

The process model in Fig. 1(a) contains rigid R1 and is,therefore, unstructured. On the other hand, the processmodel in Fig. 1(b) is well-structured—its RPST contains norigid process component. Hence, if we had a method fortransforming every rigid component in the RPST into anequivalent structured component, we would be able totransform any process model into a structured model bytraversing the RPST bottom-up and replacing each rigidby its equivalent structured component. In the runningexample, this corresponds to replacing R1 (highlightedwith grey background in Fig. 3(a)) with components B2

and B3, as hinted in Fig. 3(b). Accordingly, the rest of thearticle focuses on how to structure rigid components.

Existing methods for structuring rigid componentsdiffer depending on the types of gateways present in therigid and whether the rigid contains cycles or not. In thisrespect, it is useful to further classify rigid components asfollows: A homogeneous rigid contains either only xor oronly and gateways. We call these rigids (homogeneous) xor

rigids and (homogeneous) and rigids, respectively. A het-erogeneous rigid contains a mixture of and/xor gateways.Heterogeneous and homogeneous xor rigids are furtherclassified into cyclic, if they contain at least one cyclicpath, or acyclic. Importantly, we do not classify homo-geneous and rigids as cyclic or acyclic as process modelswith cyclic and rigids are unsound [24]. Given this back-ground, a taxonomy of process components is provided inFig. 4.

2.3. Related work

A large body of work on flowcharts and GOTO programtransformation [11,25] has addressed the problem ofstructuring xor rigids. In some cases, these transforma-tions introduce additional boolean variables in order toencode part of the control flow, while in other cases theyrequire certain nodes to be duplicated.

One of the earliest studies on the problem of structur-ing process models with concurrency is that of Kiepus-zewski et al. [3]. The authors showed that not all acyclicand rigids can be structured by putting forward a counter-example, which is essentially the one presented in Fig. 5.The authors showed that there exists no well-structuredprocess model equivalent to this one under a behavioralequivalence notion which preserves the level of observedconcurrency, viz. fully concurrent bisimulation, which weshall discuss later. However, this work does not provide afull characterization of the class of models that canbe structured, nor it defines any automated structuringtechnique. Instead, some causes of unstructuredness are

Page 5: Structuring acyclic process models

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538522

explored. In a similar vein, [12] presents a taxonomy ofunstructuredness in process models, covering cyclic andacyclic rigids. Again, the taxonomy is incomplete, i.e., itdoes not cover all possible cases of models that can bestructured. Also, the authors do not define an automatedstructuring algorithm.

In [13], the authors outline a classification of processcomponents using region trees. Region trees enable anincremental approach to the analysis and structuring ofprocess models. However, the authors do not provide acomplete structuring method for acyclic heterogeneousrigids, e.g., the one in Fig. 1(a). Similar remarks apply to[26]. In [16], the authors propose a method for structuringcyclic xor rigids. In [15], the authors propose a method forstructuring xor rigids based on GOTO program transforma-tions, and extends this method to process graphs where xor

rigids are nested inside bonds. Again, this method cannotdeal neither with and rigids nor with heterogeneous rigids.

The problem of structuring process models is relevantin the context of designing translations between processdefinition languages, e.g., BPMN-to-BPEL transformations.BPMN-to-BPEL transformations, e.g., [17], treat rigid com-ponents as black-boxes which are translated using BPELlinks or event handlers, rather than seeking to transformthese rigids into structured components.

Recently, refactoring process models in large reposi-tories has gained considerable attention [27,28]. In thatcontext, structuring is required, for instance, to reducethe complexity of process models containing redundantelements, such as superfluous control flow arcs, whilepreserving the original behavior. An empirical studyreported in [29] revealed that significant size reductionsand reuse can be achieved by refactoring duplicate com-ponents in process model repositories. It is also suggestedthat further refactoring opportunities could be uncoveredonly after structuring the models in these repositories.

3. Execution semantics of process models

As outlined in the Introduction, the proposed structur-ing method relies on a Petri net semantics of processmodels. For the sake of making the article self-contained,we present below some standard definitions associated toPetri nets. We then introduce a mapping from processmodels to Petri nets.

Definition 3.1 (Petri net). A Petri net, or a net, is a tupleN¼ ðP,T ,FÞ, with P and T as finite disjoint sets of places andtransitions, and FDðP � TÞ [ ðT � PÞ as the flow relation.

We identify F with its characteristic function on the set

ðP � TÞ [ ðT � PÞ. For a node x 2 P [ T , �x¼ fy 2 P [ T 9

Fðy,xÞ ¼ 1g is a preset, whereas x� ¼ fy 2 P [ T 9 Fðx,yÞ ¼ 1g

is a postset of x. A node x 2 P [ T is an input (output) nodeof a node y 2 P [ T, if x 2 �y (x 2 y�). For XDP [ T ,�X ¼

Sx2X�x and X� ¼

Sx2Xx�. A place p 2 P is a source

place, if �p¼+ and it is a sink place, if p� ¼+. We

denote by Fþ the transitive closure of F, and by Fn—thereflexive and transitive closure of F.

Nets have precise execution semantics defined interms of a token game.

Definition 3.2 (Net semantics). Let N¼ ðP,T,FÞ be a net.

J

M : P-N0 is a marking of N assigning each place p 2 P

a number M(p) of tokens. [p] denotes the markingwhere place p contains just one token and all otherplaces contain no tokens. We identify M with themultiset containing M(p) copies of p for every p 2 P.

J

For any transition t 2 T and for any marking M of N, t

is enabled at M, denoted by ðN,MÞ½tS, iff 8p 2 �t :MðpÞZ1.

J

If t 2 T is enabled at M, then it can fire, which leads to anew marking M0, denoted by ðN,MÞ½tSðN,M0Þ. The newmarking M0 is defined by M0ðpÞ ¼MðpÞ�Fðp,tÞþFðt,pÞ,for each place p 2 P.

J

Let M0 be a marking. If ðN,M0Þ½t1SðN,M1Þ yðN,Mn�1Þ

½tnSðN,MnÞ are transition firings, then a sequence oftransitions s¼ ðt1, . . . ,tnÞ is a firing sequence leadingfrom M0 to Mn.

J

For any two markings M and M0 of N, M0 is reachable

from M in N, denoted by M0 2 ½N,MS, iff there exists afiring sequence s leading from M to M0. Note that s canbe the empty sequence. We have M 2 ½N,MS for everyM of N.

J

A net system, or a system, is a pair ðN,M0Þ, where N is anet and M0 is a marking of N. M0 is called the initial

marking of N.

We expect all nets to be T-restricted, i.e., every transi-tion of a net has at least one input place and at least oneoutput place. Otherwise, we assume the natural comple-tion of a net, i.e., the net gets modified so that everytransition without an input (output) place gets a freshinput (output) place. By Min(N) we denote the set ofsource places of net N. In the following, when we mentiona net N in the context of a net system, we assume N withits natural marking, i.e., the marking comprising onetoken at each place in Min(N) and no tokens elsewhere.

Workflow (WF-)nets are a subclass of Petri nets spe-cifically designed to represent workflow procedures [30].A WF-net is a net with two special places: one to mark thestart and the other the end of a workflow execution.

Definition 3.3 (WF-net, short-circuit net, WF-system). APetri net N¼ ðP,T ,FÞ is a workflow net, or a WF-net, iff N hasa dedicated source place i 2 P, N has a dedicated sinkplace o 2 P, and the short-circuit net N%

¼ ðP,T [ ft%g,F [fðo,t%Þ, ðt%,iÞgÞ, t%=2T , of N is strongly connected. A WF-system is a pair ðN,MiÞ, where Mi ¼ ½i�.

Soundness and safeness are basic properties of WF-systems [30]. Soundness states that every execution of aWF-system ends with a token in the sink place, and once atoken reaches the sink place, no other tokens remain inthe net. Safeness refers to the fact that there is never morethan one token in a place.

Definition 3.4 (Liveness, safeness, soundness).

J

A system ðN,M0Þ, N¼ ðP,T ,FÞ, is live, iff for every reach-able marking M 2 ½N,M0S and for every transitiont 2 T, there exists a marking M0 2 ½N,MS, such thatðN,M0Þ½tS.
Page 6: Structuring acyclic process models

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538 523

J

A system ðN,M0Þ is bounded, iff the set ½N,M0S is finite.A system ðN,M0Þ, N¼ ðP,T ,FÞ, is safe, iff 8M 2 ½N,M0S8p 2 P : MðpÞr1.

J

A WF-system ðN,MiÞ is sound, iff the short-circuitsystem ðN%,MiÞ is live and bounded.

Petri nets have a great expressive power. They can be

used to model a large variety of distributed systems. How-ever, it is often sufficient to reduce investigations to a sub-class of nets. In the following, we shall make extensive use ofa structural subclass of free-choice nets [31,32]. In a free-choice net two places that share an output transition maynot have any other output transitions and two transitionsthat share an input place may not have any other inputplaces.

Definition 3.5 (Free-choice net). A net N¼ ðP,T,FÞ is free-

choice, iff 8p 2 P,9p�941 : �ðp�Þ ¼ fpg.

It is often useful to distinguish between observableand silent transitions of a net. Accordingly, the notion of anet must be extended.

Definition 3.6 (Labeled net). A labeled net is a tupleN¼ ðP,T,F,T ,lÞ, where ðP,T ,FÞ is a net, T is a set of labels,such that t 2 T , and l : T-T is a function that assignseach transition a label. If lðtÞat, then t is observable;otherwise, t is silent. l is distinctive, if it is injective on theset of all observable transitions.

Observable transitions are designed to representactions of the distributed system that are visible to theoutside world, while silent transitions encode the internalactions of the system.

The execution semantics of process models is definedby means of a mapping to labeled free-choice Petri nets.

Definition 3.7 (WF-net of a process model). Let W ¼

ðA,G ,G ,C,A,mÞ be a process model. Let I and O be sourcetasks and sink tasks of W, respectively. The labeled netN¼ ðP,T,F,T ,lÞ that corresponds to W is defined by:

J

P¼ fpx9x 2 G g [ fpx,y9ðx,yÞ 2 C4y 2 A [ G g [ fpx9x 2I [ Og.

J

T ¼ ftx9x 2 A [ G g [ ftx,y9ðx,yÞ 2 C4x 2 G g. J F ¼ fðtx,pyÞ9ðx,yÞ 2 C4x 2 A [ G 4y 2 G g [ fðtx,px,yÞ9ðx,yÞ 2 C4x,y 2 A [ G g [ fðtx,y,pyÞ9ðx,yÞ 2 C4x,y 2 G g

[ fðtx,y,px,yÞ9ðx,yÞ 2 C4x 2 G 4y 2 A [ G g [ fðpx,tx,yÞ9ðx,yÞ 2 C4x 2 G g [ fðpx,y,tyÞ9ðx,yÞ 2 C4y 2 A [ G g [

fðpx,txÞ9x 2 Ig [ fðtx,pxÞ9x 2 Og.

J T ¼A, lðtxÞ ¼ mðxÞ, tx 2 T , x 2 A; otherwise lðtÞ ¼ t,t 2 T.

tatt,a

pitb

tt,b

tett,e

pt

pt,a

pt,b

pt,e

tu

tv

pa,u

pb,v

pw

px

tw,c

tx,dR1ti

Fig. 6. A WF-net that corresponds to

the process model in Fig. 1(a). The figure highlights the

For example, Fig. 6 shows the net that corresponds to

subnet that corresponds to rigid component R1 in Fig. 1(a)(inside the box with dotted borderline).

Definition 3.7 states that a task is mapped to a nettransition with a single input and a single output arc.An and gateway maps to a transition with multiple out-going arcs (and split) or multiple incoming arcs (and join).An xor gateway maps to a place with multiple outgoingarcs (xor split) or multiple incoming arcs (xor join).The places corresponding to xor splits are immediatelyfollowed by silent (t) transitions representing the branch-ing conditions, e.g., transitions tt,a, tt,b, and tt,e in Fig. 6.Note that silent transitions of a labeled net are drawn asempty rectangles.

Observe that every process model with a single sourcetask and a single sink task always maps onto a WF-net.A process model is sound, if and only if its correspondingWF-net is sound. In this article, we only consider soundprocess models. A sound free-choice WF-system is guar-anteed to be safe [33]. Hence, the rest of the article dealswith sound and safe process models.

4. Behavioral equivalence of process models

This section serves two purposes: (i) motivates theselection of fully concurrent bisimulation, among othernotions of behavioral equivalence, for structuring of processmodels, (ii) discusses a procedure for checking behavioralequivalence of special nets, viz. occurrence nets, by usingthe notion of ordering relations.

4.1. Fully concurrent bisimulation

An unstructured process model and its structuredversion are structurally different, but behaviorally equiva-lent. There exist many notions of behavioral equivalencefor concurrent systems [34]. A common notion is that ofbisimulation. Related notions are those of weak bisimula-

tion and branching bisimulation, which abstract away fromsilent transitions. These notions have been advocated asbeing suitable for comparing process models [18]. How-ever, we argue that they are not suitable for our purposes.These three notions adopt interleaving semantics, i.e., notwo tasks are executed exactly at the same time. Thus, aconcurrent system and its sequential simulation areconsidered equivalent. For example, Fig. 7 shows thesequential simulation of the net in Fig. 6. The net isstructured and weakly bisimilar with the net in Fig. 6,

tfpe,f

pz

tc

td

pw,c

px,d ty

pc,y

pd,y potz,o pz,o to

the process model in Fig. 1(a).

Page 7: Structuring acyclic process models

ta

tbt''c

t'''d

t''d

t'''c

tc

t'd

td

t'c

te tf

pi popz

px

py

pw

pc,d

pd,c

p'c,d

p'd,c

pe,fpw,e

pw,b

pw,atw,a

tw,b

tw,e

ti totz,o pz,o

px,ctx,c

px,d'

tx,d' py,c''ty,c''

py,d'''ty,d'''

Fig. 7. A sequential simulation of the net in Fig. 6.

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538524

but it contains no parallel branch. We could take anyprocess model, compute its sequential simulation, struc-ture this sequential net using GOTO program transforma-tions, and transform the resulting sequential net into astructured process model. This structuring method iscomplete, but if we start with a process model containingand gateways, we obtain a (much larger) structuredprocess model without any parallel branches.

Accordingly, we adopt a notion of equivalence thatpreserves the level of concurrency of observable transi-tions, viz. fully concurrent bisimulation [4]. Fully concur-rent bisimulation is defined in terms of concurrent runs ofa system, a.k.a. processes in the literature (but not to beconfused with ‘‘business processes’’ or workflows). Everyconcurrent run of a system can be expressed as anothernet with a particular structure, namely a causal net.

Definition 4.1 (Causal net). A net N¼ ðP,T ,FÞ is a causal

net, iff:

J

for each place p 2 P holds 9�p9r1 and 9p�9r1, and J N is acyclic, i.e., Fþ is irreflexive.

We also introduce ordering relations [35], which will beused throughout the article as an instrument for reason-

ing about the behavior of nets.

Definition 4.2 (Ordering relations). Let N¼ ðP,T ,FÞ be anet and let x,y 2 P [ T be two nodes of N.

J

x and y are in causal relation, written x N y, iffðx,yÞ 2 Fþ . We denote by N the inverse of N .

J

x and y are in conflict, written x #N y, iff there existdistinct transitions t1,t2 2 T , such that �t1 \ �t2a+,and ðt1,xÞ,ðt2,yÞ 2 Fn. If x #N x, then x is in self-conflict.

J

x and y are concurrent, written x JN y, iff neitherx N y, nor y N x, nor x #N y.

The set RN ¼ f N , N ,#N ,JNg forms the ordering relations

of N.

In the following, we omit the subscripts of orderingrelations where the context is clear. It is easy to see thatany two nodes in a causal net are either in causal relationor concurrent. In order to define a process, we lack thenotion of a cut. A cut of a net is the maximal set ofpairwise concurrent places with respect to set inclusion.Finally, a process is defined as follows.

Definition 4.3 (Process). A process p¼ ðNp,rÞ of asystem S¼ ðN,M0Þ, N¼ ðP,T ,FÞ, consists of a causal net

Np ¼ ðPp,Tp,FpÞ and a function r : Pp [ Tp-P [ T , suchthat:

J

rðPpÞDP,rðTpÞDT, J MinðNpÞ is a cut, which corresponds to the initial

marking M0, i.e., 8 p 2 P : M0ðpÞ ¼ 9r�1ðpÞ \MinðNpÞ9,and

J

8 t 2 Tp8p 2 P : ðFðp,rðtÞÞ ¼ 9r�1ðpÞ \ �t9Þ4 ðFðrðtÞ,pÞ ¼9r�1ðpÞ \ t�9Þ.

A process p of S is initial, iff Tp ¼+.

A process p0 is an extension of a process p if it ispossible to observe p before one observes p0. Conse-quently, process p is a prefix of p0.

Definition 4.4 (Prefix, process extension). Let p¼ ðNp,rÞ,Np ¼ ðPp,Tp,FpÞ, be a process of S¼ ðN,M0Þ, N¼ ðP,T ,FÞ. Letc be a cut of Np and let ck be the set fx 2 Pp [ Tp9( y 2 c : ðx,yÞ 2 Fn

g. A process pkc is a prefix of p, iff pk

c ¼

ððPp \ ck,Tp \ ck,F \ ðck � ckÞÞ,r9ck Þ. A process p0 is anextension of process p if p is a prefix of p0.

In order to define fully concurrent bisimulation, weneed two auxiliary definitions: l-abstraction of a process,which is a process footprint that ignores silent transitions,and the order-isomorphism of l-abstractions.

Definition 4.5 (l-abstraction). Let S¼ ðN,M0Þ, N¼ ðP,T ,F,T ,lÞ, be a labeled system and let p¼ ðNp,rÞ, Np ¼ ðPp,Tp,FpÞ, be a process of S. The l-abstraction of p, denoted byalðpÞ ¼ ðTp

0,!,l0Þ, is defined by T 0p ¼ ft 2 Tp9lðrðtÞÞatg, !is the causal relation of Np restricted to the transitions inT 0p, i.e., !¼ Np \ ðT

0p � T 0pÞ, and l0 : T 0p-T , such that

l0ðtÞ ¼ lðrðtÞÞ, t 2 T 0p.

Two l-abstractions are order-isomorphic if there existsa one-to-one correspondence between transitions of bothabstractions which also preserves the ordering of thecorresponding transitions in the abstractions.

Definition 4.6 (Order-isomorphism of l-abstractions). Letal1¼ ðT1,!1,l1Þ and al2

¼ ðT2,!2,l2Þ be two l-abstrac-tions, both with labels in T . Then al1

and al2are order-

isomorphic, iff there is a bijection b : T1-T2, such that 8t 2T1 : l1ðtÞ¼l2ðbðtÞÞ and 8t1,t2 2 T1 : t1!1t23bðt1Þ!2bðt2Þ.

Given all of the above, fully concurrent bisimulation isdefined as follows:

Definition 4.7 (Fully concurrent bisimulation). Let S1 ¼

ðN1,M1Þ and S2 ¼ ðN2,M2Þ be labeled systems, N1 ¼

ðP1,T1,F1,T 1,l1Þ and N2 ¼ ðP2,T2,F2,T 2,l2Þ. S1 and S2 arefully concurrent bisimilar, or FCB-equivalent, denoted byS1 � S2, iff there is a set BDfðp1,p2,bÞg, such that:

(i)

p1 is a process of S1, p2 is a process of S2, and b is arelation between the non-t transitions of p1 and p2.

(ii)

If p10 and p2

0 are the initial processes of S1 and S2,respectively, then ðp1

0,p20,+Þ 2 B.

(iii)

If ðp1,p2,bÞ 2 B, then b is an order-isomorphismbetween the l1-abstraction of p1 and the l2-abstraction

of p2.

(iv) 8ðp1,p2,bÞ 2 B:
Page 8: Structuring acyclic process models

A. Polyvyanyy et al. / Information

(a) If p01 is an extension of p1, then (ðp01,p02,b0Þ 2 Bwhere p02 is an extension of p2 and bDb0.

(b) Vice versa.

Fully concurrent bisimulation defines an equivalencerelation on labeled systems that is stricter than weakbisimulation and related notions. The nets in Figs. 6 and 7are not fully concurrent bisimilar. Meanwhile, the twomodels in Fig. 1 are FCB-equivalent (with the under-standing that two process models are FCB-equivalent ifthe corresponding Petri nets are FCB-equivalent).

4.2. Behavioral equivalence and ordering relations

The above definition of fully concurrent bisimulation isabstract and hardly of any use when synthesizing struc-tured nets from unstructured ones. Accordingly, weemploy a more convenient way of reasoning aboutequivalence based on the ordering relations of the specialclass of nets, viz. occurrence nets.

Nets can have forward or backward conflicts, i.e.,places with multiple output or input transitions, respec-tively. This means that a subnet which may cause atransition firing is not unique. An occurrence net is a netof a special kind. Occurrence nets forbid backward con-flicts and, thus, ensure the unique cause of a transitionfiring. Essentially, occurrence nets generalize causal netsby allowing forward conflicts. Note that every causal netis also an occurrence net.

Definition 4.8 (Occurrence net). A net N¼ ðP,T,FÞ is anoccurrence net, iff:

J

for each place p 2 P holds 9�p9r1, J N is acyclic, i.e., Fþ is irreflexive, J for each node x 2 P [ T the set fy 2 P [ T9ðy,xÞ 2 Fþ g is

finite, and

J no t 2 T is in self-conflict, i.e., #N is irreflexive.

Every two nodes of an occurrence net are either incausal, inverse causal, conflict, or concurrent relation [35].Let N¼ ðP,T ,F,T ,lÞ be a labeled occurrence net and letT 0DT be its observable transitions. The l-ordering

relations of N are formed by its ordering relations restrictedto T 0, i.e., Rl ¼ f N \ ðT

0� T 0Þ, N \ ðT

0� T 0Þ,#N \ ðT

0� T 0Þ,

JN \ ðT0� T 0Þg. We say that two ordering relations are

isomorphic, if there exists a mapping between the obser-vable transitions such that every corresponding pair oftransitions is in the same ordering relation.

Definition 4.9 (Isomorphism of ordering relations). LetN1 ¼ ðP1,T1,F1,T 1,l1Þ and N2 ¼ ðP2,T2,F2,T 2,l2Þ be twolabeled occurrence nets. Let T 01DT1 and T 02DT2 be obser-vable transitions of N1 and N2, respectively. Two l-orderingrelations Rl1

of N1 and Rl2of N2 are isomorphic, denoted by

Rl1ffiRl2

, iff there is a bijection g : T 01-T 02, such that:

J

8t 2 T 01 : l1ðtÞ ¼ l2ðgðtÞÞ, and J 8t1,t2 2 T 01 : ðt1 N1

t24gðt1Þ N2gðt2ÞÞ3ðt2 N1

t14gðt2Þ N2

gðt1ÞÞ3ðt1#N1t24gðt1Þ#N2

gðt2ÞÞ3ðt1JN1t24

gðt1ÞJN2gðt2ÞÞ.

morphic ordering relations are FCB-equivalent, and vice

Finally, we show that two occurrence nets with iso-

versa. This result is exploited in the next section.

Theorem 1. Let S1 ¼ ðN1,M1Þ, N1 ¼ ðP1,T1,F1,T 1,l1Þ, and

S2 ¼ ðN2,M2Þ, N2 ¼ ðP2,T2,F2,T 2,l2Þ, be two labeled occur-

rence systems with natural markings and distinctive label-

ings. Let T 01DT1 and T 02DT2 be observable transitions of N1

and N2, respectively, such that there exists a bijection c :

T 01-T 02 for which holds l1ðtÞ ¼ l2ðcðtÞÞ, for all t 2 T 01. LetRl1

and Rl2be the l-ordering relations of N1 and N2, respec-

tively. Then, it holds:

S1 � S2 3 Rl1ffiRl2

:

Proof. We prove each direction of the equality separately.

Systems 37 (2012) 518–538 525

ð)Þ

Let S1 and S2 be FCB-equivalent. We want to showthat Rl1

ffiRl2. Let us assume that S1 � S2 holds,

but Rl1ffiRl2

does not hold. Furthermore, let usconsider transitions t1

i ,t1j 2 T 01 that are in one-to-

one correspondence with transitions t2i ,t2

j 2 T 02, i.e.,cðt1

i Þ ¼ t2i and cðt1

j Þ ¼ t2j . All scenarios can be

reduced to the following two cases:

Case1:

(t1i JN1

t1j or t1

i N1t1

j , and t2i #N2

t2j ). If t1

i JN1t1

j ort1

i N1t1

j , then there exists process p1 of S1 thatcontains t1

i and t1j . If t2

i #N2t2

j , then there exists noprocess p2 of S2 that contains t2

i and t2j .

Case2:

(t1i N1

t1j , and t2

j N2t2

i or t2i JN2

t2j ). Let p1 be a

process of S1 that contains t1i and t1

j , and let p2

be a process of S2 that contains t2i and t2

j . Then,there exists no fDc, such that f is an order-isomorphism between l-abstractions of p1 and p2.

In both cases we reach a contradiction, i.e., sys-tems S1 and S2 cannot be FCB-equivalent if thel-ordering relations are not isomorphic.

ð(Þ

Let Rl1ffiRl2

. We want to show that S1 and S2

are FCB-equivalent. Let us assume that Rl1ffiRl2

holds, but S1 � S2 does not hold. Then, for instance,there exists process p01 of S1, which has no corre-sponding order-isomorphic process of S2. Supposethat p01 has the minimal size among all such pro-cesses, i.e., any prefix of p01 has a correspondingorder-isomorphic process of S2. Let p01 be an exten-sion of process p1 of S1 by exactly one observabletransition t1

j 2 T 01. Let p2 be a process of S2 that isorder-isomorphic with p1. Let t2

j 2 T 02 be in one-to-one correspondence with t1

j , i.e., cðt1j Þ ¼ t2

j . All sce-narios can be reduced to the following three cases:

Case1:

There exists process p02 of S2 that contains t2j and is

an extension of p2 by one observable transition.Moreover, there exists t1

i 2 T 01 in p1, such thatt1

i N1t1

j . However, it holds t2i JN2

t2j , for t2

i 2 T 02,such that cðt1

i Þ ¼ t2i ; otherwise there exists an

order-isomorphism fDc between p01 and p02.

Case2: There exists no process p02 of S2 that contains t2

j

and is an extension of p2. Moreover, there existst1

i 2 T 01 in p1, such that t1i N1

t1j . However, it holds

t2i #N2

t2j , for t2

i 2 T 02, such that cðt1i Þ ¼ t2

i .

Page 9: Structuring acyclic process models

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538526

Case3:

There exists process p02 of S2 that contains t2j and is

an extension of p2, but not by only one observabletransition. Then, there exists t2

k 2 T 02, such thatt2

k N2t2

j but p2 does not contain t2k . However,

t1k 2 T 01, such that cðt1

k Þ ¼ t2k , is not in p01 and, hence,

t1k_N1

t1j .

In all three cases we reach a contradiction, i.e., thel-ordering relations cannot be isomorphic if sys-tems S1 and S2 are not FCB-equivalent. &

5. Synthesis of structured process models

Given an acyclic rigid process component, the mainidea of the proposed structuring technique is to computeits ordering relations and to synthesize, whenever possi-ble, a well-structured process component that exhibitsthe same ordering relations (recall that a well-structuredprocess component is the one whose RPST contains onlytrivial, bond, and polygon components). In order toimplement this idea, we encode ordering relations in adirected graph, which we call an ordering relations graph.Afterwards, we parse the graph into so-called modulesthat show ordering relations of well-structured processcomponents. To this end, we rely on the technique ofmodular decomposition [36]. With this background,below we present the following: (i) the notion of a propercomplete prefix unfolding, which is essential for extract-ing all the needful behavioral information from rigidcomponents, (ii) the notion of an ordering relations graph,which is a convenient abstraction of the information inthe proper complete prefix unfolding, and (iii) the struc-turing algorithm.

ta

tb

Complete prefix unfolding

pt,att,a

pt

pt,btt,b

pa,u

pb,v

tu

tv

pw

px

p'w

p'x

tw,c

tx,dtc

td

pw,c

px,d

t'c

t'd

p'w,c

p'x,d

pc,y

pd,y

p'c,y

p'd,y

ty pz

p'zt'yt'w,c

t'x,d

Fig. 8. The unfolding and a complete prefix unfolding.

5.1. Proper complete prefix unfoldings

According to Theorem 1, two systems are FCB-equiva-lent, if they demonstrate same ordering relations. Givenan (unstructured) process model, the structuring proceedsby computing ordering relations of its corresponding net.Theorem 1 operates on occurrence nets and, hence, mustbe adjusted to the case of the general class of systems. Weaccomplish the adjustment by employing the notion ofcomplete prefix unfolding of a system [37,38]. A completeprefix unfolding of a system is an occurrence net thatexplicitly represents all concurrent runs of the system. Tofit the structuring use case, complete prefix unfoldingsmust be restricted, i.e., they must be proper.

A branching process is a formal way to capture therelation between a system and its complete prefix unfold-ing. The relation technically builds on a homomorphismthat preserves the nature of nodes and the environment oftransitions. Let N1 ¼ ðP1,T1,F1Þ and N2 ¼ ðP2,T2,F2Þ be twonets. A homomorphism from N1 to N2 is a mappingh : P1 [ T1-P2 [ T2, such that: hðP1ÞDP2 and hðT1ÞDT2,and for all t 2 T1, the restriction of h to �t is a bijectionbetween �t in N1 and �hðtÞ in N2; correspondingly for t�

and hðtÞ�.

Definition 5.1 (Branching process). A branching process ofa system S¼ ðN,M0Þ is a pair b¼ ðN0,nÞ, where N0 ¼ ðP,T ,FÞis an occurrence net and n is a homomorphism from N0 toN, such that the restriction of n to MinðN0Þ is a bijectionbetween MinðN0Þ and M0, and for all p1,p2 2 P holds if�p1 ¼ �p2 and nðp1Þ ¼ nðp2Þ, then p1 ¼ p2.

System S is referred to as the originative system of thebranching process. Similar to the prefix relation onprocesses, see Definitions 4.3 and 4.4, a branching processcan be in the prefix relation with another branchingprocess.

Definition 5.2 (Prefix of a branching process). Let b1 ¼

ðN1,n1Þ and b2 ¼ ðN2,n2Þ be two branching processes of asystem. b1 is a prefix of b2, if N1 is a subnet of N2 suchthat: If a place belongs to N1, then its input transition inN2 also belongs to N1. If a transition belongs to N1, then itsinput and output places in N2 also belong to N1. n1 is therestriction of n2 to nodes of N1.

The maximal branching process of a net system withrespect to the prefix relation is called unfolding of thesystem. Every net system has a unique (up to isomorph-ism) unfolding [37].

Fig. 8 exemplifies the unfolding of a net that corre-sponds to a rigid component R1 in Fig. 6. An unfolding cancontain multiple transitions that refer to the same transi-tion in its originative system, e.g., transitions tc and t0c inFig. 8 both refer to transition tc in the originative system.If we used the ordering relations computed from theunfolding to synthesize a well-structured process compo-nent, the component would contain many duplicate tasks.Fortunately, for any safe system there exists a prefix of itsunfolding, called complete prefix unfolding [39], that ismore compact than the unfolding but contains all theinformation about reachable markings of the originativesystem. Complete prefix unfoldings are finite, even forcyclic systems. A complete prefix unfolding of a system isobtained by truncating its unfolding at points where theinformation about reachable markings starts to be redun-dant. In the next definition, we summarize main notionson complete prefix unfoldings from [39].

Definition 5.3 (Complete prefix unfolding). Let b¼ ðN0,nÞ,N0 ¼ ðP,T,FÞ, be a branching process of a system S¼ ðN,M0Þ.

J

A configuration C of b is a set of transitions, CDT , suchthat: (i) t 2 C implies that for all t0 2 T , t0 t implies
Page 10: Structuring acyclic process models

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538 527

t0 2 C, i.e., C is causally closed, and (ii) for all t1,t2 2 C

holds :ðt1#t2Þ, i.e., C is conflict-free.

J A local configuration of a transition t 2 T , denoted bydte, is the set ft0 2 T9t0 tg, i.e., the set of transitions thatprecede t.

J

For a finite configuration C of b, CutðCÞ ¼ ðMinðN0Þ [

C�Þ\�C is a cut, whereas nðCutðCÞÞ is a reachable mark-ing of S, denoted by MarkðCÞ.

J

b is complete if for each reachable marking M of S thereexists a configuration C of b, such that: (i) MarkðCÞ ¼M,i.e., M is represented in b, and (ii) for each transition t

enabled at M in N, there exists a configuration C [ ft0g,t0 2 T , of b such that t0=2C and nðt0Þ ¼ t.

J

A partial order v on the finite configurations of b is anadequate order, if: (i) v is well-founded, (ii) C1 � C2

implies C1 v C2, and (iii) v is preserved by finiteextensions, cf. [39] for details.

J

A transition t 2 T is a cutoff transition of b, induced byv , iff there exists a corresponding transition corrðtÞ 2 T ,such that MarkðdteÞ ¼MarkðdcorrðtÞeÞ and dcorrðtÞev dte.

J

b is a complete prefix unfolding, induced by v , iff b isthe maximal prefix of the unfolding of S that containsno transition after a cutoff transition.

The definition of a complete prefix unfolding is, in fact,

2 Using the Mole tool—http://www.lsv.ens-cachan.fr/ schwoon/

tools/mole/, which implements the algorithm in [39].

the definition of a family of prefixes. Every adequate orderleads to a different prefix with a different set of cutofftransitions. There exist several definitions of adequateorders, cf. [39–41]. Our structuring technique relies on theadequate total order for safe systems proposed in [40],hereafter denoted by v safe. The desired property of acomplete prefix unfolding is minimality. For a givensystem, there exist no smaller complete prefix unfoldingof the system than its minimal one. By employing v safe

adequate order, one always computes the minimal com-plete prefix unfolding of a safe system, if one onlycompares markings corresponding to local configurations[40].

Fig. 8 exemplifies the notion of a complete prefixunfolding. The dotted lines indicate which parts of theunfolding must be truncated. Transition tv is a cutofftransition of the unfolding, whereas transition tu is itscorresponding transition; the relation is visualized by adotted arc. Both local configurations of transitions tv andtu induce the same marking fpw,pxg in the originativesystem. The subnet of the unfolding that follows CutðdtveÞ

is isomorphic with the subnet that follows CutðdtueÞ

(equivalent markings imply equivalent futures ofbranching processes, cf. [39]). As this implies redun-dancy, only one subnet is included in the complete prefixunfolding.

The running time of the algorithm for constructing acomplete prefix unfolding of a safe system when employ-ing an adequate total order, cf. [39], has an upper boundof Oð9T9 Rx

Þ, where T is the set of transitions, R is thenumber of reachable markings, and x is the maximal sizeof the presets or postsets of the transitions in theoriginative system. The complete prefix unfolding con-tains a total number of Oðx RÞ nodes. Observe that in thecase of an and rigid component, the step of computing acomplete prefix unfolding is not required, as the

corresponding WF-net and its complete prefix unfoldingcoincide. Besides, we do not compute a complete prefixunfolding over the whole net, but only on individual rigidcomponents of the net. Tests we have conducted withsample process models show that the complete prefixunfolding computation takes sub-second times.2 Thisfinding is in line with other works that have empiricallyshown that complete prefix unfolding computation isefficient in practice.

In order to allow structuring, we impose an additionalrequirement on complete prefix unfoldings: A completeprefix unfolding must be proper.

Definition 5.4 (Proper complete prefix unfolding). Letb¼ ðN,nÞ, N¼ ðP,T,FÞ, be a branching process of an acyclicsystem S.

J

A cutoff transition t 2 T of b induced by an adequateorder v is healthy, iff CutðdteÞ\t� ¼ CutðdcorrðtÞeÞ\

corrðtÞ�.

J b is a proper complete prefix unfolding, or a proper prefix,

induced by an adequate order v , iff b is the maximalprefix of the unfolding of S that contains no transitionafter a healthy cutoff transition.

Proper prefixes of acyclic systems are clearly complete

and finite. The properness requirement guarantees thatconcurrency is kept encapsulated, i.e., if some branch of acomplete prefix unfolding contains a non-cutoff transitiont that introduces concurrency, i.e., 9t�941, then subse-quently the very same branch must contain a transition t0

that synchronizes this concurrency, i.e., for all p 2 t� holdsp t0. As the restriction for healthy cutoff transitions isdefined in terms of local configurations, a proper com-plete prefix unfolding of a safe acyclic system is alwaysminimal when constructed using v safe adequate order. Inthe following, when we refer to a proper prefix of a safeacyclic system, we always assume the minimal prefixconstructed using v safe adequate order. The only cutofftransition tv in Fig. 8 is healthy and, hence, the completeprefix unfolding is proper.

5.2. Ordering relations graphs

Ordering relations of a proper prefix specify the uniquebehavioral footprint of its originative system. Our idea isto use these relations to synthesize a well-structuredprocess component. For this purpose, it is convenient toencode ordering relations in a directed graph, viz. ordering

relations graph. As will be shown later, such a representa-tion allows for a simple structuring algorithm.

In order to overcome the effects of the proper prefixtruncation at healthy cutoff transitions, the notion of anordering relations graph is founded on proper orderingrelations.

Page 11: Structuring acyclic process models

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538528

Definition 5.5 (Ordering relations graph). Let b¼ ðN,nÞ,N¼ ðP,T ,FÞ, be a proper complete prefix unfolding of alabeled acyclic system S¼ ðN0,M0Þ, N0 ¼ ðP0,T 0,F 0,T ,lÞ.

J

Two nodes x and y of N are in proper causal relation,denoted by x:Ny, iff ðx,yÞ 2 F þ or there exists asequence ðt1, . . . ,tnÞ of healthy cutoff transitions of b,ti 2 T , 1r irn, n 2N, such that ðx,t1Þ 2 Fn, ðcorrðtnÞ,yÞ2 Fþ , and ðcorrðtjÞ,tjþ1Þ 2 Fn, 1r jon. We denote by9N the inverse of :N .

J

Let RN ¼ f N , N ,#N ,JNg be the ordering relations of N.The set 1N ¼#N\ð:N [9NÞ is the proper conflict

relation of N. The set RN ¼ f:N ,9N ,1N ,JNg formsthe proper ordering relations of N. We refer to RN asobservable proper ordering relations, if the relations inRN are restricted to the set of transitions of N thatcorrespond to observable transitions of N0.

J

Let RN ¼ f:N ,9N ,1N ,JNg be the observable properordering relations of N. An ordering relations graph, ororgraph, GN ¼ ðV ,A,B,sÞ of b consists of vertices V DT

defined by transitions of N that correspond to obser-vable transitions of N0, i.e., V ¼ ft 2 T9lðnðtÞÞatg, arcsA¼:N [1N , and the labeling function s : V-B,B¼ T \ftg with sðvÞ ¼ lðnðvÞÞ, v 2 V .

In the following, we omit the subscripts (superscripts)of proper ordering relations and orgraphs where thecontext is clear.

Fig. 9(a) shows the orgraph of the proper complete prefixunfolding in Fig. 8. Orgraphs are visualized as directedgraphs. Due to a design decision, arcs of orgraphs encodeproper causal (one-sided arrows) and proper conflict (two-sided arrows) relations. Such a design allows for uniqueencoding of proper ordering relations. Thus, one can deducefrom the orgraph in Fig. 9(a) that transitions ta and tc of theproper prefix in Fig. 8 are in proper causal relation, ta and tb

are in proper conflict relation, whereas tc and td areconcurrent. Observe that tb is in proper causal relation withtc and td as there exists a sequence ðtvÞ, such that ðtb,tvÞ 2 Fn,ðcorrðtvÞ,tcÞ 2 Fþ , and ðcorrðtvÞ,tdÞ 2 Fþ .

5.3. Structuring

The RPST of a well-structured process model is com-posed of trivial, polygon, and bond (either and or xor)components. In contrast to a rigid component, eachcomponent in a well-structured model has a well-definedand regular structure within the corresponding orgraph,which allows for a precise characterization. The orderingrelations graph of an xor bond is a complete graph, or a

a

b

c

d

a

b

c

d

L1C1 C2

Fig. 9. (a) An orgraph, (b)–(c) the MDT of (a), and (d) the MDT of

clique, whereas the orgraph of an and bond is an edgelessgraph. These topologies are consistent with the intuitionbehind, i.e., all tasks in an xor bond are in conflict, i.e., onlyone is executed, and all tasks in an and bond are concur-rently executed. Fig. 10(a) shows an xor bond with threebranches, whereas Fig. 10(b) shows the correspondingcomplete orgraph. Similarly, Fig. 10(c) and (d) shows anand bond and the corresponding orgraph, respectively. In thecases of a trivial and polygon component, the orgraph is adirect acyclic graph representing the transitive closure, orthe total order, of the proper causal relation. Fig. 10(e) showsa polygon composed of three tasks, whereas Fig. 10(f)presents the corresponding transitive closure over the propercausal relation.

The main idea of our structuring technique is to parsea given orgraph into subgraphs. If one can decompose anorgraph into subgraphs such that every subgraph is eithercomplete, edgeless, or total order, then one can constructa corresponding well-structured process component foreach of the discovered subgraphs and, in this way, tosynthesize a well-structured process model. To performthe parsing, we rely on the modular decomposition ofdirected graphs [36].

The modular decomposition is a technique for parsingdirected graphs into modules. Let G¼ ðV ,A,B,sÞ be anordering relations graph. A module MDV of G is a non-empty subset of vertices of G that are in uniform relationwith vertices V\M, i.e., if v 2 V\M, then v has directed arcsto all members of M or to none of them, and all membersof M have directed arcs to v or none of them do. However,v1,v2 2 V\M, v1av2, can have different relations to mem-bers of M. Moreover, the members of M and V\M can havearbitrary relations to each other [36]. This definition of amodule supports our intent of synthesizing a processcomponent from the ordering relations captured ina module; all tasks inside a SESE process component arein the same ordering relation with a given task outsidethe component.

Two modules M1 and M2 of G overlap, iff they intersectand neither is a subset of the other, i.e., M1\M2, M1 \M2 ,and M2\M1 are all non-empty. M1 is strong, iff there existsno module M2 of G, such that M1 and M2 overlap. Themodular decomposition substitutes each strong module ofa graph by a new vertex and proceeds recursively. Theresult is a unique rooted tree, called the modular decom-

position tree, which can be computed in linear time [36].

Definition 5.6 (Modular decomposition tree). The modular

decomposition tree (MDT) of an ordering relations graph isthe set of all its strong modules.

L1

C1 C2

a

b

c

d

P1

the orgraph computed for the rigid component R1 in Fig. 5.

Page 12: Structuring acyclic process models

a

b

c

a

b c

a

b

c

a

b ca b c a b c

Fig. 10. (a) An xor bond component, (b) a complete orgraph, (c) an and bond component, (d) an edgeless orgraph, (e) a polygon component, and (f) a total

order orgraph.

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538 529

According to [36], each module in the MDT belongs toone out of four classes.

Definition 5.7 (Trivial, linear, complete, primitive). Let M

be a module of an ordering relations graph G.

J

M is a trivial module, iff M is singleton, i.e., M containsa single vertex.

J

M is a linear module, iff there exists a linear orderðx1, . . . ,x9M9Þ of elements of M, such that there is adirected arc from xi to xj in G, iff io j.

J

M is a complete module, iff the subgraph of G inducedby vertices in M is either complete or edgeless.

J

M is a primitive module, iff M is neither a trivial, nor alinear, nor a complete module.

For our purposes, we classify modules further: If acomplete module induces a complete subgraph, the mod-ule is referred to as a xor complete. If a complete moduleinduces an edgeless subgraph, the module is referred to asan and complete. Finally, if a module induces a subgraphwith a pair of distinct vertices which are not connected byan arc, the module is said to be concurrent.

Considering all of the above, the next propositionsummarizes relations between components of a processmodel and modules of an orgraph.

Proposition 5.1. Let C1 be a process component and let M1

be the corresponding ordering relations graph. Let M2 be an

ordering relations graph and let C2 be its corresponding

process component.

1.

If C1 is trivial or polygon, then M1 is linear. 2. If M2 is linear, then there exists C2 that is trivial or

polygon.

3. If C1 is and (xor) bond, then M1 is and (xor) complete. 4. If M2 is and (xor) complete, then there exists C2 that is

and (xor) bond.

Fig. 9(b) shows the MDT of the orgraph in Fig. 9(a).Every region inside a dotted box defines a strong module.Note that module names hint at their class. For instance,module C1 is a complete module composed of twovertices a and b. Observe that every vertex outside C1,i.e., either vertex c or d, is in same relation with everyvertex inside C1, i.e., vertices a and b. As C1 induces acomplete graph, it is an xor complete module. Similarly,C2 is an and complete module. By treating both modulesas singletons, the modular decomposition identifies that

they are in total order and, hence, form linear module L1.Fig. 9(c) visualizes the MDT as a tree (without trivialmodules).

We are now ready to present the main result of thissection.

Theorem 2. Let G be an ordering relations graph. The MDT

of G contains no primitive module, iff there exists a well-

structured process model W such that G is the ordering

relations graph of W.

Proof. Let G¼ ðV ,A,B,sÞ be an ordering relations graph.

ð)Þ

Let G be such that the MDT of G contains noprimitive module. We show now by structuralinduction on the MDT of G that there exists awell-structured process model W such that G isthe ordering relations graph of W. The MDT of G cancontain trivial, linear, or complete modules.

Base:

If the MDT of G consists of a single module M, thenM is a trivial module and W is a process modelcomposed of a single task.

Step:

Let M be a module of the MDT of G such that everychild module of M has a corresponding well-struc-tured process component. If M is linear, then W canbe a trivial or polygon component composed ofchildren of M, cf. (2) in Proposition 5.1. If M iscomplete, then W can be a bond process component,either and or xor, composed of process componentsthat correspond to child modules of M, cf. (4) inProposition 5.1. In both cases, M has a correspondingwell-structured process model (component).

Therefore, there exists a well-structured processmodel W that is composed of process componentswhich correspond to child modules of module V,such that G is the ordering relations graph of W.

ð(Þ

Let W be a well-structured process model such thatG is the ordering relations graph of W. We shownow by structural induction on the RPST of W that theMDT of G has no primitive module. Because W is well-structured, the RPST of W has no rigid component.

Base:

If W is composed of a single task, then the corre-sponding ordering relations graph contains onetrivial module.

Step:

Let C be a process component of the RPST of W suchthat every child process component of C has acorresponding ordering relations graph without a
Page 13: Structuring acyclic process models

Fig. 11

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538530

primitive module. If C is trivial or polygon, then G iseither trivial or linear, cf. (1) in Proposition 5.1. If C

is bond, then G is complete, cf. (3) in Proposition 5.1.In both cases, C has a corresponding ordering relationsgraph without a primitive module.

Therefore, the MDT of the ordering relations graphthat corresponds to W has no primitive module. &

Theorem 2 implicitly specifies a procedure which, given

a process model, synthesizes an FCB-equivalent well-struc-tured model. This procedure is made explicit in Algorithm 1.Note that the algorithm expects as input a process model(component) in which no pair of distinct tasks have thesame label.

Algorithm 1. Structuring acyclic process component.

Input P: A process model (component) in which all tasks are

distinctly labeled and whose RPST consists of an acyclic rigid

component

Output T: An RPST consisting of trivials, bonds, polygons

N’ CorrespondingNet(P) // cf. Definition 3.7

U’ ProperCompletePrefixUnfolding(N) // cf. Definition 5.4

ORG’ OrderingRelationsGraph(U) // cf. Definition 5.5

MDT ’ ModularDecomposition(ORG) // cf. Definition 5.6

T’RPST obtained by traversing each module M of the MDT (in

postorder) and applying the following rules:

J If M is and complete, generate an and bond component in T

J If M is xor complete, generate an xor bond component in T

J If M is linear, generate a trivial or polygon component in T

J If M is non-concurrent primitive, generate a well-structured

component using compiler techniques, e.g., [11]

J If M is concurrent primitive, FAIL

return T

Without loss of generality, Algorithm 1 assumes thatthe RPST of the process model (or process component),taken as input, consists of a single rigid component. Thealgorithm can be trivially extended to the case where theRPST of the process model given as input consists of arigid component with polygons, bonds, or rigids as des-cendants. In this latter case, child components of the rigidcomponent need to be abstracted into atomic task nodes.Also, if we were given a process model whose RPSTcontains several rigid components, we would start bystructuring the rigid components at lower levels of theRPST and collapsing them into atomic task nodes beforeattempting to structure rigids at upper levels. The com-plexity of the algorithm is determined by the complexity

a

b

c

d

Book freight shipment

Send invoice to customer

Handle confidential shipment

Handle regular

shipment

Request security

clearance

Sdelno

i

R1

s

t

u

v

w

x

. (a) An unstructured process model with inherently unstructured rigid p

of the unfolding step (see earlier discussion). The compu-tation of an orgraph has polynomial complexity. All othersteps can be accomplished in linear time. As explainedbefore, in the case of an and rigid, the unfolding step is notrequired as nets which correspond to and rigids and theirproper prefixes coincide. Note that the behavior capturedin non-concurrent primitives can be structured byemploying compiler techniques for GOTO program trans-formations. To accomplish structuring, one first needs tosynthesize a program (process component) from a non-concurrent primitive module. The synthesis can be trivi-ally accomplished by adopting the technique in [42]. Thealgorithm fails if the input process component is inher-ently unstructured, such as component R1 in Fig. 11(a). Inthis particular case, the ordering relations graph of R1forms a single concurrent primitive module, cf. Fig. 11(b).Observe that the unfolding step duplicates the transitionwhich corresponds to task f. As structuring relies onminimal proper prefixes (see the discussion in Section5.1), the proposed technique always operates with theminimum duplication of transitions which is required toallow structuring.

In Fig. 9(d), we show the MDT of the orgraph computedfor process component R1 in Fig. 5. The MDT contains asingle primitive module P1, which hints at inherentunstructured nature of R1. Note that due to the character-istic topology of causal relations, the pattern in Fig. 5 isoften referred to as Z-structure or N-structure.

In light of the above, we conclude that given an acyclicprocess model with an arbitrary topology, one can con-struct an FCB-equivalent well-structured process model,iff the proper complete prefix unfolding of the system thatcorresponds to the given process model is such that themodular decomposition of its orgraph contains no con-current primitive module.

6. Structuring multi-source and multi-sink models

In Section 5, we assumed that process models havesingle source and single sink nodes. However, contem-porary notations for business process modeling, includingBPMN and EPC, support the definition of models withmultiple source and/or multiple sink events, hereaftercalled multi-source and multi-sink models. In this section,we extend the notion of structuredness for multi-sourceand multi-sink models and we extend the structuringmethod accordingly.

e

f

end ivery tice

o

P1

y

z

a

b

c

de

f

f'P1

rocess component R1 and (b) the ordering relations graph of R1 from (a).

Page 14: Structuring acyclic process models

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538 531

6.1. Notion of structuredness

In the context of single-source and single-sink processmodels, we have defined a well-structured process modelas one whose RPST does not contain any rigid compo-nents, cf. Section 2.2. This notion needs to be extended forthe case of multi-source and multi-sink models. To thisend, a process model with multiple source nodes isaugmented with an additional (start) node and an arcfrom this fresh node to each of the source nodes. Thisadditional node is labeled s by convention. Conversely,a model with multiple sink nodes is augmented with anadditional (end) node and an arc from each of the sinknodes of the original model to this fresh node (labeled e

by convention). This augmentation is captured by thefollowing definition.

Definition 6.1 (Augmented process model). Let W be amulti-source and multi-sink process model. The augmen-

ted version of W is constructed from W as follows:

J

If W has more than one source, a new source start isadded and for each source node s of W, an arc fromstart to s is added.

J

If W has more than one sink, a new sink end is addedand for each sink node s of W, an arc from s to end

is added.

A multi-source and multi-sink model is said to be

(well-)structured, iff the RPST of its augmented versioncontains no rigid process component. Note that theaugmentation of a model aligns well with the principlesof the generalized RPST computation, cf. [21] for details.

Given an unstructured multi-source and multi-sinkprocess model P, our goal is to compute an FCB-equivalentstructured process model P0. By definition, this means thatthe labels of the nodes in P0 must coincide with those in P.Hence, the special nodes s and e, which are added for thesake of constructing the RPST, need to be removed at theend of the structuring procedure, thereby yielding astructured multi-source and multi-sink process model asoutput.

e3 e4

c d

e1 e2

a b

V X

e5 e6

w

y

x

z

V X

e1 e2

a b

V

e5

w

y

V

B1

Fig. 12. (a) A multi-source and multi-sink EPC, (b) its

6.2. Instantiation semantics

Given a multi-source and multi-sink process model,we can compute the RPST of the augmented model inorder to separate rigid components from non-rigid ones.Non-rigid components are already structured and thuscan be replaced with a single ‘‘black box’’, so that theirparent node in the RPST can be structured. For example,Fig. 12(a) shows a multi-source and multi-sink model,captured in EPC notation. Fig. 12(b) shows its augmentedversion whose RPST contains rigid component R1, underwhich we can find two bond components B1 and B2. Eachof these bond components in turn contains two polygons,but the polygons are not shown in the figure for the sakeof simplicity. After replacing components B1 and B2 withblack-boxes, we obtain the abstract model depicted inFig. 12(c) consisting of a single rigid component. Theproblem of structuring the original EPC is then reducedto that of structuring this rigid component.

Rigid components can be classified into those thatcontain one of the special nodes s or e, and those thatdo not. The latter type of rigid can be structured using themethod outlined in the previous sections. Therefore, wecan focus our attention on the case where a rigid containss or e. Without loss of generality, we assume below thatwe are given a rigid component that contains both the s

node and the e node introduced during the augmentation.The case where a rigid contains the s node or the e node(but not both) is just a special case.

The first step in structuring a rigid component is tocompute its corresponding net. Definition 3.7 can bedirectly applied to multi-sink process models (and thusto multi-sink rigid components). The resulting net willhave multiple sink places, but this feature does not poseany particular problem. Indeed, the unfolding is definedfor multi-sink nets in the same way as for single-sink nets.On the other hand, the mapping of multi-source processmodels to Petri nets requires special care for two reasons:

1.

s

e

aug

In order to compute an unfolding, we need to define an‘‘initial marking’’. In the case of a single-source net, theinitial marking is the one that contains a single token

e3 e4

c d

X

e6

x

z

X

B2

R1

V X

e5 e6

y z

B1 B2

s

e

R1

mented version, and (c) its abstract version.

Page 15: Structuring acyclic process models

tB1

t{B1}te5

typB1,y py,e5pstart,B1 pe5

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538532

in the source place and no other tokens, but in thecase of multi-source nets, several initial markings arepossible.

pstart

2.

tB2 te6

tz,e6 pz,e6

pz

t{B1,B2}

pstart,B2t{B2}

pe6

Fig. 14. The augmented version of the net that corresponds to the

abstract model in Fig. 12(c).

Different process modeling notations adopt differentsemantics for multi-source models [43].

Hence, we need to consider each process modelingnotation separately in order to determine how to map amulti-source process model (or a process component) inthat notation into a Petri net, and how to determinethe initial marking from which the unfolding will becomputed. Below we address these questions in thecontext of two concrete process modeling notations,namely BPMN and EPC.

6.2.1. Multi-source BPMN models

The notion of process model (as per Definition 2.1)allows us to generically represent models in several graph-oriented process modeling notations, including BPMN andEPC. Below, we consider the case where a process modelrepresents a BPMN model. In this context, a node in aprocess model represents either a BPMN activity, a BPMNevent, or a BPMN gateway. Such process models maycontain multiple source nodes, each one correspondingeither to a ‘‘start event’’ or to an ‘‘event-based gateway’’.As per the BPMN standard specification [1], the instantiationsemantics of such multi-source models is as follows:

J

If a BPMN model starts with multiple events, a newprocess instance is created whenever one of these‘‘start events’’ fires. The mapping of this case to a Petrinet with a single start place is depicted in Fig. 13(a).

J

If a BPMN model starts with multiple event-basedgateways that participate in a common conversation,each of these gateways must receive one token. Thecorresponding Petri net mapping is shown in Fig. 13(b).

J

If a BPMN model starts with a so-called ‘‘parallelevent-based gateway’’, then each one of the eventsconnected to this parallel event-driven gateway mustoccur before the process is instantiated. The corre-sponding Petri net mapping is shown in Fig. 13(c).

In all three cases, we see that a multi-source BPMNmodel – and consequently a rigid component in a BPMNmodel containing the s node – can be mapped to a Petri netwith a single source place, such that the initial markingconsists of exactly one token in this source place. Since thismapping is trivial, as shown in Fig. 13, we omit its formaldefinition. The rest of the section focuses on the EPCsemantics, which requires a more extensive treatment.

Fig. 13. Mapping multi-source

6.2.2. Multi-source EPC models

There is no ‘‘official’’ precise instantiation semantics forEPCs with multiple start events. However, some authorshave adopted the following semantics [43]:

J

BP

An instance of the process requires at least one of thestart events to be triggered.

J

Additional start events may be triggered during theexecution of a process instance.

An EPC can be represented as a process model (as perDefinition 2.1) where each node of the process model

represents either a function, an event, or a connector.In order to capture the instantiation semantics of anEPC process model with multiple source tasks, the netsobtained by applying Definition 3.7 can be augmented toa single source.

Definition 6.2 (Augmented Petri net). Let N¼ ðP,T,FÞ be anet and let S � P be the set of all source places of N. Theaugmented version of N is constructed from N as follows:

J

A fresh place pstart is added in the net. J For every non-empty subset of source places

s 2 PðSÞ\+, a fresh start transition ts and a fresh flowarc ðpstart ,tsÞ is added in the net.

J

For every source place s 2 S and for every non-emptysubset of source places s 2 PðSÞ\+ containing s, i.e.,s 2 s, a fresh flow arc ðts,sÞ is added in the net.

For example, Fig. 14 shows the augmented version ofthe net which corresponds to the abstract EPC model inFig. 12(c) (note the abuse of notation for simplicity reasons).Fresh nodes are highlighted with grey background. Placepstart is the only source place of the resulting net.

If we set the initial marking to be the one that containsa single token in the source place (and no tokens else-where), the augmented net captures all possible instantia-

tions of the process model. The Petri net of a multi-source

MN models to nets.

Page 16: Structuring acyclic process models

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538 533

and multi-sink EPC model (or component of a model)starts with one transition per possible instantiation. Forinstance, the net in Fig. 14 starts with three transitions:transition tfB1g corresponds to the instantiation where onlycomponent B1 is executed, transition tfB2g corresponds tothe instantiation where only component B2 is executed, andtransition tfB1,B2g corresponds to the case where bothcomponents are executed.

6.3. Soundness

An instantiation of a process model does not alwayslead to a successful completion of the process. Someinstantiations may lead to deadlocks or lack of synchroni-

zation. Here, a deadlock is defined as a situation where notransition can fire, but one of the branches is still active.In other words, a deadlock occurs when there is a token ina non-sink place in the net, and no transition in the net isenabled. Lack of synchronization refers to the situation inwhich a transition can fire twice (without any other transi-tion firing in-between these two firings). This corresponds tothe situation where the net reaches a marking where there ismore than one token in a place. Deadlock freeness andproper synchronization, i.e., absence of any state exhibiting alack of synchronization, correspond to the notions of sound-ness and safeness introduced earlier in this article [24].

In light of the above, Decker and Mendling [43] suggest –but do not formally define – a notion of ‘‘correct instantia-tion’’ of an EPC based on the idea that a start event will betriggered, if and only if it is required, meaning that:

J

If the execution of a process instance runs into a dead-lock because one of its join gateways is waiting for one ofits branches to complete, and the completion of thisbranch requires one of the start events to be triggered,this start event will eventually be triggered so that theexecution of the process instance can complete.

J

A start event will not be triggered if this may even-tually cause a lack of synchronization.

In order to illustrate the notions of deadlock and lack

of synchronization, let us go back to the EPC in Fig. 12(a).The execution of the process may start with an occurrenceof event e1. Eventually, the left-hand side branch ofgateway w completes, i.e., a token reaches the left-hand

e3 e4

c d

X

e1 e2

a b

V

V X

e5 e6

w

y

x

z

Fig. 15. Markings of EPCs with (a) ‘‘event must occ

side incoming arc of the gateway. This is the statedepicted in Fig. 15(a). In order to proceed, a token isrequired in the second incoming branch. Otherwise, theexecution would remain deadlocked in gateway w. Thus,event e2 must eventually occur. On the other hand, wenote that neither event e3 nor event e4 may occur at thisstage, since that would lead to a lack of synchronization,depicted in Fig. 15(b), which shows a state where event e6may occur twice.

Coming back to the abstract model in Fig. 12(c), weobserve that the instantiation where either B1 or B2 areexecuted is correct. On the other hand, the instantiationwhere both B1 and B2 are executed leads to the lack ofsynchronization depicted in Fig. 15(b). In other words,start transition tfB1,B2g in Fig. 14 corresponds to anincorrect instantiation of the original model. We notein passing that in the abstract EPC, the deadlock situa-tion depicted in Fig. 15(a) does not manifest itselfbecause the underlying events have been abstractedaway inside B1.

In order to capture the notion of ‘‘correct instantiation’’introduced by Decker and Mendling [43], we proceed asfollows: we start with the augmented version of a net thatcorresponds to a multi-source process model, as perDefinition 6.2. Based on this net, we compute an unfold-ing starting from the initial marking that puts one tokenin the source place and no tokens elsewhere. This unfold-ing captures both the correct and incorrect instantiations.Accordingly, we ‘‘prune’’ the unfolding in order to removethose branches that represent incorrect instantiations.To this end, we formally capture – at the level of theunfolding – the notion of incorrect instantiation, i.e., lackof synchronization and deadlock. The following definition– based on similar definitions by Fahland [44] – capturesthese notions. Here, the term ‘‘locally unsafe place’’ isused in lieu of ‘‘lack of synchronization’’.

Definition 6.3 (Local safeness, local deadlock). Let b¼ðN,nÞ, N¼ ðP,T ,FÞ, be the unfolding of an acyclic systemS¼ ðN0,M0Þ.

J

ur’’

A place p 2 P is locally safe in b, iff there exists no placeq 2 P, qap, such that p is concurrent to q and bothcorrespond to the same place in N0, i.e., ) q 2 P,qap :

ðp JN qÞ4ðnðpÞ ¼ nðqÞÞ; otherwise p is locally unsafe.

e3 e4

c d

X

e1 e2

a b

V

V X

e5 e6

w

y

x

z

situation and (b) lack of synchronization.

Page 17: Structuring acyclic process models

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538534

J

A place p 2 P is a local deadlock in b, iff one of thefollowing holds:2 p� ¼+ and nðpÞ�a+.2 There exist transition t 2 p�, place p0 2 �t, and place

p00 2 P, such that: p00� ¼+, p00 #N p, and p00 JN p0.

A locally unsafe place in a branching process of a net

system clearly signals that the system is unsafe, cf.Proposition 6.3 in [39]. Every local deadlock hints atexistence of a deadlock in the system. Definition 6.3specifies the deadlock condition in the special case ofacyclic systems. In case of the general class of systems,one can deduce deadlock conditions by following theprinciples described in [45,46]. Accordingly, the sound

unfolding of a system is a prefix of its unfolding thatexcludes incorrect instantiations.

Fig. 16. The sound unfolding of the net in Fig. 14.

Definition 6.4 (Sound unfolding). Let b¼ ðN,nÞ, N¼

ðP,T ,FÞ, be the unfolding of an acyclic system S. The sound

unfolding of S is the maximal prefix of b that contains notransition that is in causal relation either with a locallyunsafe place or a local deadlock in b.

e5

B2 e6'

e6B1

e5

B2 e6'

e6B1

C1

L1

L2

C2

e3

c

e1 e2

a b

V

e5

xz

V

e6

V

V

1P

B1

B3

X

X

Fig. 17. Structuring of the EPC in Fig. 12(a): (a) the orgraph

Fig. 16 exemplifies the sound unfolding of the net inFig. 14. The subnet of the unfolding below the dashed linemust be pruned out because it contains a lack of synchro-nization. Indeed, places p00z , p00z,e6, p00e6, p000z , p000z,e6, and p000e6

(highlighted with grey background) are locally unsafe.Accordingly, transition tfB1,B2g (highlighted with blackbackground) is the transition that is causal with all locallyunsafe places of the unfolding.

In certain cases, some transitions of a system might benot represented in its sound unfolding. This happenswhen a transition does not occur in any execution thatstarts with a correct instantiation. Such systems areunsound and are excluded from further consideration.

Definition 6.5 (Soundness of acyclic nets). Let b¼ ðN,nÞ,N¼ ðP,T ,FÞ, be the sound unfolding of the augmentedversion N0 ¼ ðP0,T 0,F 0Þ of an acyclic net N00. Let S0DT 0 bethe set of all start transitions of N0. N00 is said to be sound,iff for every t0 2 T 0\S0 there exists t 2 T , such that nðtÞ ¼ t0.

Specifically, we say that a multi-source and multi-sinkprocess model is sound, if every non-start transition of theaugmented version of the corresponding net appears in atleast one execution that starts with a correct instantia-tion. The sound unfolding in Fig. 16 represents all thenon-start transitions of its originative net in Fig. 14 and,hence, the model in Fig. 12 is sound.

6.4. Structuring

Given a sound multi-source and multi-sink processmodel, one can construct an equivalent structured modelusing Algorithm 1 with the following changes:

J

e6'

X

X

B2

an

The corresponding net must be augmented accordingto Definition 6.2.

J

One must construct a proper prefix based on the soundunfolding, i.e., as a prefix of the sound unfolding.

Note that the sound unfolding of the net in Fig. 14 andits proper prefix coincide, see in Fig. 16.

e4

d

y

B4

2P

e3 e4

c d

e1 e2

a b

V

e5 e6'

x

z

y

V X

e6

d its MDT, (b)–(c) structured versions of the EPC.

Page 18: Structuring acyclic process models

e1 e2

V

a

V

e3 e4

V

V

b

e5

e1 e2

V

a

e3 e4

V

b

e5

Fig. 18. Structuring of a homogeneous and rigid.

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538 535

Fig. 17(a) shows the orgraph of the proper prefixin Fig. 16, along with its MDT. The MDT contains noconcurrent primitive and, therefore, the well-structuredprocess model exhibiting given ordering relations exists.The resulting structured model has a single source andsingle sink, and starts and ends with gateways thatencode instantiations and completions of the model, asshown in Fig. 17(b), which is the well-structured versionof the EPC in Fig. 12(a). The figure also visualizes theprocess components: B1 and B2 are the components fromFig. 12(b). Polygons P1 and P2 are synthesized frommodules L1 and L2, respectively, cf. the MDT. Finally,bonds B3 and B4 correspond to modules C1 and C2,respectively. The gateways at the start and at the endcan be trivially removed through a post-processing step,thereby yielding a structured multi-source and multi-sinkEPC model, cf. Fig. 17(c).

7. Evaluation

The proposed structuring method has been implemen-ted as a tool, namely bpstruct, publicly available athttps://code.google.com/p/bpstruct/. Using this imple-mentation, we conducted an empirical evaluation of theproposed method with the aim of addressing the follow-ing questions in the context of a repository of processmodels taken from commercial practice:

Q1.

What proportion of unstructured process models areinherently unstructured and what proportion of mod-els are structurable?

Q2.

Is the exponential worst-case complexity of the struc-turing method (particularly the unfolding method)problematic in practice?

Q3.

In theory, the structuring method may lead to nodeduplication. To what extent does this duplication leadto larger process models?

Q4.

The method for structuring multi-source models maylead to disconnected models. How often does thisphenomenon occur?

In the following, we present the dataset which we used

for the evaluation (Section 7.1), and discuss answers tothe questions proposed above (Section 7.2), which wederived from the evaluation.

7.1. Dataset

For the evaluation, we used the SAP Reference Model[47]—a collection consisting of 604 EPCs capturing busi-ness processes supported by the SAP R/3 enterprisesystem. In the first stage, we discarded models that arealready structured and models that contain cycles. Wealso discarded models that contained or joins in a rigid,since these models cannot be processed by the proposedmethod. Note that or joins at the boundaries of bonds donot pose any problem to bpstruct, since bonds areseparated from rigids during the computation of the RPST,and bpstruct only needs to deal with rigids; a descrip-tion of the execution semantics of or gateways can befound in [48]. After this pre-processing, we were left with

78 models, 40 of which are sound. As many models in theSAP Reference Model are multi-source and multi-sink,we checked soundness using the technique proposedin Section 6.3. Coincidently, each of the 40 soundmodels contained exactly one rigid component. Thus,the number of rigids that needed to be structured wasalso 40.

Among these 40 rigids, six are homogeneous xor rigids,19 are homogeneous and rigids, and 15 are heterogeneousrigids. All homogeneous xor rigids can be structured, sincethe only source of inherent unstructuredness in acyclicprocess models stems from concurrency. On the otherhand, all but one of the 19 homogeneous and rigids areinherently unstructured. The only and rigid that is struc-turable is a case of a rigid that contained redundanttransitive arcs; the core structure of the rigid is summar-ized in Fig. 18(a). The arc going from the and split afterevent e2 to the and join after event e4 is redundant and,thus, can be removed; in doing so, the split and the joinbecome redundant and can also be removed. The struc-tured version of the EPC in Fig. 18(a) is proposed inFig. 18(b). All other 18 homogeneous and rigids containthe structure depicted in Fig. 5. Finally, among the 15heterogeneous rigids, only three are inherently unstruc-tured. The complete characterization of the datasetemployed for the evaluation is depicted in Fig. 19. Fromthe total of 604 models, we did not address 31 EPCs withcycles and 96 EPCs with or gateways; these are depictedby empty circles in the figure. We plan to address thesecases as a part of our future work.

7.2. Results

With reference to question Q1 above, the evaluationsuggests that unstructured homogeneous and rigids arehighly prone to being inherently unstructured, whileheterogeneous rigids are prone to be structurable. Withreference to question Q2 above, Fig. 20(a) plots theexecution time relative to the size of the input model.The figure also plots two trendlines of the linear regres-sion analysis: one for the heterogeneous and one for thehomogeneous case. The plot shows that, although intheory the complexity of the structuring algorithm isexponential to the size of the input (due to the unfolding

Page 19: Structuring acyclic process models

All models(604)

Structured(406)

Unstructured(198)

Cyclic(31)

Acyclic(167)

With ORs (96)

W/o ORs (78)

Unsound(38)

Sound(40)

Structurable (19)

Inherently unstructured

(21)

Homogeneous XOR(6)

Homogeneous AND(1)

Heterogeneous(12)

Homogeneous AND(18)

Heterogeneous(3)

Fig. 19. Structural characterization of models in the SAP Reference Model.

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538536

step—cf. Section 5), this worst-case exponential complex-ity does not manifest itself in the dataset at hand. Theobserved execution times are rather linear relative to thesize of the input models. In the given dataset, we foundone rigid component that contained two xor gatewaysintermingled with eight and gateways. The structuringalgorithm took 1.5 s to execute for this model and con-cluded that the model is inherently unstructured. These1.5 s were spent mainly on the unfolding and the pruningsteps. Putting this case aside, the average execution timefor the remaining models is 12 ms (standard deviation:7 ms). These execution times exclude the time to computethe RPST, but this step has a linear complexity.

With reference to question Q3, Fig. 20(b) plots the sizeof the structured process models relative to the size of theoriginal models (only for models that were originallyunstructured but not inherently unstructured). The figureshows that, except for the model with a homogeneous and

rigid (located slightly below the diagonal), the size of thestructured model is at least equal and in most cases larger

Fig. 20. Experimental results: (a) average structuring time and (b) duplica-

tion ratio.

than that of the input model. Specifically, we observedthat the size of the output model (measured in terms ofnumber of nodes) was up to 1.625 times that of the inputmodel. On average, the size of the models produced bybpstruct was 1.22 times the size of the input models(standard deviation: 0.2). These results emphasize the factthat structuring a process model involves a tradeoffbetween modularity and size. Note that duplicationresults were obtained for flat models, i.e., models withoutsub-processes. Structured models may also contain dupli-cates of whole process components, which leaves oppor-tunities for modularization, e.g., by employing techniqueslike [29].

Finally, with reference to question Q4, we found that15 of the 19 structurable models led to disconnectedstructured models. Two of the rigids whose structuredversions were connected originally had a single source,while two of them were multi-source. And as expected, allrigids whose structured versions are disconnected wereoriginally multi-source models. This result suggests thatwhen structuring multi-source models, one is likely toobtain disconnected models. This phenomenon is specificto EPCs. It does not occur in the case of multi-sourceBPMN models, since in BPMN, the multiple disconnectedfragments would be re-connected with a common event-driven or parallel event-driven gateway.

8. Conclusion

We conclude that a sound acyclic process model isinherently unstructured, if and only if its RPST contains arigid process component for which the modular decom-position of its orgraph contains a concurrent primitive.In all other cases, Algorithm 1 applied to every rigid processcomponent of a process model constructs an FCB-equivalentwell-structured process model. We have thus provided acharacterization of the class of well-structured acyclicprocess models under fully concurrent bisimulation, and acomplete structuring method, which has been implementedin the bpstruct tool.

Our method can also be used to structure models withSESE cycles, even if these cycles contain unstructuredcomponents. In this case, the unstructured componentsand the cycles are in different nodes of the RPST. However,the proposed method cannot deal with models with arbitrarycycles. We believe that the notion of the proper completeprefix unfolding, cf. Definition 5.4, can be applied to unfoldall cycles in the model to the point where structuring can be

Page 20: Structuring acyclic process models

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538 537

addressed as a combination of the compiler techniques forstructuring sequential programs and the approach proposedin this article. We plan to investigate this option in futurework. Also, the proposed method does not apply to modelswith or joins and complex gateways, except in the casewhere these gateways appear at the boundaries of a bond.When or joins are present inside a rigid, the structuringproblem becomes conceptually similar to the problem oftransforming a process model with or joins into FCB-equiva-lent process model with xor and and gateways only, so thatthe unfolding techniques can be applied. Dealing withcomplex gateways would require different techniques sincecomplex gateways lead to unsafe models, i.e., ones that havemore than one token in the same place. Future work will aimat addressing these and other restrictions of the presentedtechnique, such as the ability to deal with BPMN modelswith attached events (both interrupting and non-interrupt-ing ones).

As mentioned in the Introduction, one of the motivat-ing reasons for structuring process models is to be able toapply existing analysis techniques which only work forstructured process models. For example, existing techni-ques for calculating Quality of Service (QoS) properties ofbusiness processes [8] are only applicable for structuredprocess models. In a separate work [49], we have shownhow these techniques can be extended to models witharbitrary topology by applying bpstruct in conjunctionwith additional post-processing steps to deal with inher-ently unstructured rigid components.

Another potential benefit of structuring a process modelis that the resulting structured model may be easier tocomprehend and less error-prone to maintain thanks to itshigher degree of modularity [5]. However, the empiricalevaluation reported in this article has put into evidencethat the structured process models produced by bpstructare often larger than the original ones (by an average of 22%in the case of the SAP Reference Model). Larger models aregenerally more difficult to comprehend and maintain thansmaller models. An interesting direction for future work isto conduct empirical evaluations with end-users in order todetermine whether the complexity reduction due to themodularity of the structured model outweighs the increasein complexity due to the larger size of the structured model.

Acknowledgments

This research is partly funded by ERDF via the EstonianCenter of Excellence in Computer Science and the EstonianScience Foundation.

References

[1] Object Management Group (OMG), Business Process Model andNotation (BPMN) Version 2.0. URL /http://www.omg.org/spec/BPMN/2.0/PDF/S.

[2] A.-W. Scheer, O. Thomas, O. Adam, Process modeling using event-driven process chains, in: Process-Aware Information Systems,Wiley InterScience, 2005, pp. 119–145 (Chapter 6).

[3] B. Kiepuszewski, A.H.M. ter Hofstede, C. Bussler, On structuredworkflow modelling, Conference on Advanced Information SystemsEngineering (CAiSE), Lecture Notes in Computer Science, vol. 1789,Springer, 2000, pp. 431–445.

[4] E. Best, R.R. Devillers, A. Kiehn, L. Pomello, Concurrent bi-simulations in Petri nets, Acta Informatica (ACTA) 28 (3) (1991)231–264.

[5] R. Laue, J. Mendling, The impact of structuredness on error prob-ability of process models, Information Systems Technology and itsApplications (UNISCON), Lecture Notes in Business InformationProcessing, vol. 5, Springer, 2008, pp. 585–590.

[6] H.D. Rombach, A controlled experiment on the impact of softwarestructure on maintainability, IEEE Transactions on Software Engi-neering (TSE) 13 (3) (1987) 344–354.

[7] V.R. Gibson, J.A. Senn, System structure and software maintenanceperformance, Communications of the ACM (CACM) 32 (3) (1989)347–358.

[8] M. Laguna, J. Marklund, Business Process Modeling, Simulation, andDesign, Prentice Hall, 2005.

[9] C. Combi, R. Posenato, Controllability in temporal conceptual work-flow schemata, Business Process Management (BPM), Lecture Notesin Computer Science, vol. 5701, Springer, 2009, pp. 64–79.

[10] C. Ouyang, M. Dumas, A.H.M. ter Hofstede, W.M.P. van der Aalst,From BPMN process models to BPEL web services, in: International/European Conference on Web Services (ICWS), IEEE ComputerSociety, 2006, pp. 285–292.

[11] G. Oulsnam, Unravelling unstructured programs, The ComputerJournal (CJ) 25 (3) (1982) 379–387.

[12] R. Liu, A. Kumar, An analysis and taxonomy of unstructured work-flows, Business Process Management (BPM), Lecture Notes inComputer Science, vol. 3649, Springer, 2005, pp. 268–284.

[13] R. Hauser, M. Friess, J.M. Kuster, J. Vanhatalo, An incrementalapproach to the analysis and transformation of workflows usingregion trees, IEEE Transactions on Systems, Man, and CyberneticsPart C (TSMC) 38 (3) (2008) 347–359.

[14] A. Polyvyanyy, L. Garcıa-Banuelos, M. Weske, Unveiling hiddenunstructured regions in process models, OTM Conferences,Lecture Notes in Computer Science, vol. 5870, Springer, 2009,pp. 340–356.

[15] R. Hauser, J. Koehler, Compiling process graphs into executablecode, Generative Programming and Component Engineering(GPCE), Lecture Notes in Computer Science, vol. 3286, Springer,2004, pp. 317–336.

[16] J. Koehler, R. Hauser, Untangling unstructured cyclic flows—a solutionbased on continuations, CoopIS/DOA/ODBASE, Lecture Notes in Com-puter Science, vol. 3290, Springer, 2004, pp. 121–138.

[17] C. Ouyang, M. Dumas, W.M.P. van der Aalst, A.H.M. ter Hofstede, J.Mendling, From business process models to process-oriented soft-ware systems, ACM Transactions on Software Engineering andMethodology (TOSEM) 19(1) (2009).

[18] B. Kiepuszewski, A.H.M. ter Hofstede, W.M.P. van der Aalst, Funda-mentals of control flow in workflows, Acta Informatica (ACTA) 39(3) (2003) 143–209.

[19] T. Murata, Petri nets: properties, analysis and applications, Pro-ceedings of the IEEE 77 (4) (1989) 541–580.

[20] A. Polyvyanyy, L. Garcıa-Banuelos, M. Dumas, Structuring acyclicprocess models, Business Process Management (BPM), Lecture Notesin Computer Science, vol. 6336, Springer, 2010, pp. 276–293.

[21] A. Polyvyanyy, J. Vanhatalo, H. Volzer, Simplified computation andgeneralization of the refined process structure tree, Web Servicesand Formal Methods (WS-FM), Lecture Notes in Computer Science,vol. 6551, Springer, 2010, pp. 25–41.

[22] J. Vanhatalo, H. Volzer, J. Koehler, The refined process structuretree, Business Process Management (BPM), Lecture Notes in Com-puter Science, vol. 5240, Springer, 2008, pp. 100–115.

[23] J. Vanhatalo, H. Volzer, J. Koehler, The refined process structuretree, Data & Knowledge Engineering (DKE) 68 (9) (2009) 793–818.

[24] J. Vanhatalo, H. Volzer, F. Leymann, Faster and more focusedcontrol-flow analysis for business process models through SESEdecomposition, International Conference on Service Oriented Com-puting (ICSOC), Lecture Notes in Computer Science, vol. 4749,Springer, 2007, pp. 43–55.

[25] F. Zhang, E.H. D’Hollander, Using hammock graphs to structureprograms, IEEE Transactions on Software Engineering (TSE) 30 (4)(2004) 231–245.

[26] W. Zhao, R. Hauser, K. Bhattacharya, B.R. Bryant, F. Cao, Compilingbusiness processes: untangling unstructured loops in irreducibleflow graphs, International Journal of Web and Grid Services(IJWGS) 2 (1) (2006) 68–91.

[27] B. Weber, M. Reichert, J. Mendling, H.A. Reijers, Refactoring largeprocess model repositories, Computers in Industry (CII) 62 (5)(2011) 467–486.

Page 21: Structuring acyclic process models

A. Polyvyanyy et al. / Information Systems 37 (2012) 518–538538

[28] R. Dijkman, B. Gfeller, J. Kuster, H. Volzer, Identifying refactoringopportunities in process model repositories, Information and Soft-ware Technology 53 (9) (2011) 937–948.

[29] R. Uba, M. Dumas, L. Garcıa-Banuelos, M. La Rosa, Clone detectionin repositories of business process models, in: Business ProcessManagement (BPM), Lecture Notes in Computer Science, Springer,2011, pp. 248–264.

[30] W.M.P. van der Aalst, Verification of workflow nets, Applicationsand Theory of Petri Nets (ICATPN/APN), Lecture Notes in ComputerScience, vol. 1248, Springer, 1997, pp. 407–426.

[31] E. Best, Structure theory of Petri nets: the free choice hiatus,Advances in Petri Nets, Lecture Notes in Computer Science, vol.254, Springer, 1987, pp. 168–205.

[32] J. Desel, J. Esparza, Free Choice Petri Nets, Cambridge Tracts inTheoretical Computer Science, Cambridge University Press, 1995.

[33] W.M.P. van der Aalst, Workflow verification: finding control-flowerrors using Petri-net-based techniques, Business Process Manage-ment (BPM), Lecture Notes in Computer Science, vol. 1806,Springer, 2000, pp. 161–183.

[34] R.J. van Glabbeek, The linear time-branching time spectrum(extended abstract), International Conference on Concurrency The-ory (CONCUR), Lecture Notes in Computer Science, vol. 458,Springer, 1990, pp. 278–297.

[35] M. Nielsen, G.D. Plotkin, G. Winskel, Petri nets, event structures anddomains, Part I, Theoretical Computer Science (TCS) 13 (1981) 85–108.

[36] R.M. McConnell, F. de Montgolfier, Linear-time modular decom-position of directed graphs, Discrete Applied Mathematics (DAM)145 (2) (2005) 198–209.

[37] J. Engelfriet, Branching processes of Petri nets, Acta Informatica(ACTA) 28 (6) (1991) 575–591.

[38] J. Esparza, K. Heljanko, Unfoldings—a partial-order approach tomodel checking, EATCS Monographs in Theoretical ComputerScience, Springer, 2008.

[39] J. Esparza, S. Romer, W. Vogler, An improvement of McMillan’sunfolding algorithm, Formal Methods in System Design (FMSD) 20(3) (2002) 285–310.

[40] J. Esparza, S. Romer, W. Vogler, An improvement of McMillan’sunfolding algorithm, Tools and Algorithms for Construction andAnalysis of Systems (TACAS), Lecture Notes in Computer Science,vol. 1055, Springer, 1996, pp. 87–106.

[41] V. Khomenko, M. Koutny, W. Vogler, Canonical prefixes of Petri netunfoldings, Acta Informatica (ACTA) 40 (2) (2003) 95–118.

[42] F. Elliger, A. Polyvyanyy, M. Weske, On separation of concurrencyand conflicts in acyclic process models, in: EMISA, Lecture Notes inInformatics, Gesellschaft fur Informatik, vol. 172, pp. 25–36.

[43] G. Decker, J. Mendling, Process instantiation, Data & KnowledgeEngineering (DKE) 68 (9) (2009) 777–792.

[44] D. Fahland, From Scenarios to Components, Ph.D. Thesis, Hum-boldt-Universitat zu Berlin and Technische Universiteit Eindhoven,2010.

[45] K.L. McMillan, Using unfoldings to avoid the state explosionproblem in the verification of asynchronous circuits, ComputerAided Verification (CAV), Lecture Notes in Computer Science, vol.663, Springer, 1992, pp. 164–177.

[46] S. Melzer, S. Romer, Deadlock checking using net unfoldings,Computer Aided Verification (CAV), Lecture Notes in ComputerScience, vol. 1254, Springer, 1997, pp. 352–363.

[47] G. Keller, T. Teufel, SAP R/3 Process Oriented Implementation:Iterative Process Prototyping, Addison-Wesley, 1998.

[48] W.M.P. van der Aalst, A.H.M. ter Hofstede, B. Kiepuszewski,A.P. Barros, Workflow patterns, Distributed and Parallel Databases(DPD) 14 (1) (2003) 5–51.

[49] M. Dumas, L. Garcıa-Banuelos, A. Polyvyanyy, Y. Yang, L. Zhang,Aggregate quality of service computation for composite services,International Conference on Service Oriented Computing (ICSOC),Lecture Notes in Computer Science, vol. 6470, 2010, pp. 213–227.