Auditing 2.0 Using Process Mining to Support Tomorrow's Auditor prof.dr.ir. Wil van der Aalst www.processmining.org

PAGE 1

IEEE Computer, vol. 43, no. 3, pp. 90-93, Mar. 2010

Auditing

PAGE 2

“The term auditing refers to the evaluation of organizations and their processes. Audits are performed to ascertain the validity and reliability of information about these organizations and associated processes. This is done to check whether business processes are executed within certain boundaries set by managers, governments, and other stakeholders.”

Growth of data

PAGE 3

PAGE 4

Process Mining = Data Mining + Process Analysis

[Figure, Data Mining: decision tree over the attributes Smoker, Drinker, and Weight (<81.5 vs. ≥81.5), with leaves Short (91/10), Long (30/1), Long (150/20), and Short (321/25).]

[Figure, Process Analysis: Petri net with places c1 to c17 and the node labels start, register, initial conditions, check_A needed?, check_A, modify conditions, check_B needed?, check_B, check_C needed?, check_C, assess risk, decline, make offer, handle response, handle payment, send insurance documents, timeout1, timeout2, and withdraw offer; annotated with pairs such as (RM,RD), (E,SD), (E,RD), (SM,SD), (E,FD), (YE,RD), and (FE,FD).]

PAGE 5

Process Mining

• Process discovery: "What is really happening?"

• Conformance checking: "Do we do what was agreed upon?"

• Performance analysis: "Where are the bottlenecks?"

• Process prediction: "Will this case be late?"

• Process improvement: "How to redesign this process?"

• Etc.

PAGE 6

Process mining: Linking events to models

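[Figure: the “world” (people, machines, organizations, components, business processes) is supported/controlled by a software system, which records events (e.g., messages, transactions, etc.) in event logs. A (process) model models and analyzes the world and specifies, configures, implements, and analyzes the software system. Event logs and models are linked by discovery, conformance, and extension.]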

Process Discovery Example

PAGE 7

α

PAGE 8

>,→,||,# relations

• Direct succession: x>y iff for some case x is directly followed by y.

• Causality: x→y iff x>y and not y>x.

• Parallel: x||y iff x>y and y>x.

• Choice: x#y iff not x>y and not y>x.

Event log (in order of occurrence):

case 1 : task A
case 2 : task A
case 3 : task A
case 3 : task B
case 1 : task B
case 1 : task C
case 2 : task C
case 4 : task A
case 2 : task B
case 2 : task D
case 5 : task A
case 5 : task E
case 4 : task C
case 1 : task D
case 3 : task C
case 3 : task D
case 4 : task B
case 5 : task D
case 4 : task D

Direct succession: A>B, A>C, A>E, B>C, B>D, C>B, C>D, E>D

Causality: A→B, A→C, A→E, B→D, C→D, E→D

Parallel: B||C, C||B

Traces: ABCD, ACBD, AED
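The four relations can be derived mechanically from the log. The following is a minimal Python sketch, not the ProM implementation: it groups the slide's events per case and applies the definitions of >, →, || and # literally. The function and variable names (`traces_by_case`, `footprint`, `EVENTS`) are illustrative assumptions.

```python
from collections import defaultdict

# Event log from the slide: (case id, task) pairs in order of occurrence.
EVENTS = [
    (1, "A"), (2, "A"), (3, "A"), (3, "B"), (1, "B"), (1, "C"), (2, "C"),
    (4, "A"), (2, "B"), (2, "D"), (5, "A"), (5, "E"), (4, "C"), (1, "D"),
    (3, "C"), (3, "D"), (4, "B"), (5, "D"), (4, "D"),
]

def traces_by_case(events):
    """Group events per case, preserving the order of occurrence."""
    traces = defaultdict(list)
    for case, task in events:
        traces[case].append(task)
    return dict(traces)

def footprint(traces):
    """Derive the >, →, || and # relations from a collection of traces."""
    tasks = sorted({t for trace in traces for t in trace})
    # x > y: x is directly followed by y in at least one case.
    succ = {(x, y) for trace in traces for x, y in zip(trace, trace[1:])}
    # x → y: x > y and not y > x.
    causal = {(x, y) for (x, y) in succ if (y, x) not in succ}
    # x || y: x > y and y > x.
    parallel = {(x, y) for (x, y) in succ if (y, x) in succ}
    # x # y: neither x > y nor y > x.
    choice = {(x, y) for x in tasks for y in tasks
              if (x, y) not in succ and (y, x) not in succ}
    return succ, causal, parallel, choice

traces = traces_by_case(EVENTS)
succ, causal, parallel, choice = footprint(traces.values())
print("traces:  ", sorted({"".join(t) for t in traces.values()}))
print("causal:  ", sorted(causal))
print("parallel:", sorted(parallel))
```

Note that # as defined here also relates every task to itself (for example A#A), because no task in this log directly follows itself.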

PAGE 9

Basic Idea Used by α Algorithm (1)

(a) sequence pattern: a→b

PAGE 10

Basic Idea Used by α Algorithm (2)

(b) XOR-split pattern: a→b, a→c, and b#c

(c) XOR-join pattern: a→c, b→c, and a#b

PAGE 11

Basic Idea Used by α Algorithm (3)

(d) AND-split pattern: a→b, a→c, and b||c

(e) AND-join pattern: a→c, b→c, and a||b
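The split and join patterns (b) to (e) translate directly into checks on the relations. The sketch below is an illustration under assumptions: the helper names `split_type` and `join_type` are hypothetical, and the traces ABCD, ACBD, AED are taken from the earlier example log.

```python
def relations(traces):
    """Return the direct-succession (>) and causality (→) relations."""
    succ = {(x, y) for t in traces for x, y in zip(t, t[1:])}
    causal = {(x, y) for (x, y) in succ if (y, x) not in succ}
    return succ, causal

def _branch_kind(succ, x, y):
    """'AND' if x || y, 'XOR' if x # y, otherwise None."""
    if (x, y) in succ and (y, x) in succ:
        return "AND"
    if (x, y) not in succ and (y, x) not in succ:
        return "XOR"
    return None

def split_type(traces, a, b, c):
    """Patterns (b) and (d): a→b and a→c, then b#c or b||c."""
    succ, causal = relations(traces)
    if (a, b) in causal and (a, c) in causal:
        kind = _branch_kind(succ, b, c)
        return kind + "-split" if kind else None
    return None

def join_type(traces, a, b, c):
    """Patterns (c) and (e): a→c and b→c, then a#b or a||b."""
    succ, causal = relations(traces)
    if (a, c) in causal and (b, c) in causal:
        kind = _branch_kind(succ, a, b)
        return kind + "-join" if kind else None
    return None

LOG = ["ABCD", "ACBD", "AED"]
print(split_type(LOG, "A", "B", "C"))  # B and C are parallel
print(split_type(LOG, "A", "B", "E"))  # B and E exclude each other
print(join_type(LOG, "B", "C", "D"))
print(join_type(LOG, "B", "E", "D"))
```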

Example Revisited

PAGE 12

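[Figure: Petri net with transitions A, B, C, D, E and places start, p1, p2, p3, p4, end.]

B#E, C#E, …

Result produced by α algorithm

Direct succession: A>B, A>C, A>E, B>C, B>D, C>B, C>D, E>D

Causality: A→B, A→C, A→E, B→D, C→D, E→D

Parallel: B||C, C||B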
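To make the step from relations to places concrete, here is a compact sketch of the core of the α algorithm. It is a brute-force illustration that assumes a small log with non-empty traces, not an efficient or complete implementation: it enumerates pairs of task sets (inputs, outputs) in which all inputs are causally followed by all outputs and the members of each set are mutually in the # relation, then keeps the maximal pairs as places. The function name `alpha` and the representation of places as pairs of frozensets are assumptions made for this example.

```python
from itertools import combinations

def alpha(traces):
    """Return (places, start_tasks, end_tasks) for non-empty traces.

    Each place is a pair (input tasks, output tasks) of frozensets.
    """
    tasks = sorted({x for t in traces for x in t})
    succ = {(x, y) for t in traces for x, y in zip(t, t[1:])}
    causal = {(x, y) for (x, y) in succ if (y, x) not in succ}

    def unrelated(group):
        # All members are pairwise in the # relation (including x # x).
        return all((x, y) not in succ for x in group for y in group)

    groups = [frozenset(c) for n in range(1, len(tasks) + 1)
              for c in combinations(tasks, n) if unrelated(c)]
    # Candidate places: every input task causally precedes every output task.
    candidates = [(a, b) for a in groups for b in groups
                  if all((x, y) in causal for x in a for y in b)]
    # Keep only the maximal candidates.
    places = [(a, b) for (a, b) in candidates
              if not any((a, b) != (c, d) and a <= c and b <= d
                         for (c, d) in candidates)]
    start = {t[0] for t in traces}   # tasks fed by the start place
    end = {t[-1] for t in traces}    # tasks feeding the end place
    return places, start, end

places, start, end = alpha(["ABCD", "ACBD", "AED"])
for inputs, outputs in places:
    print(sorted(inputs), "->", sorted(outputs))
```

For this log the sketch yields four internal places, matching the four places p1 to p4 in the figure, plus A as the only start task and D as the only end task.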

PAGE 13

Where did we apply process mining?

• Municipalities (e.g., Alkmaar, Heusden, Harderwijk, etc.)
• Government agencies (e.g., Rijkswaterstaat, Centraal Justitieel Incasso Bureau, Justice department)
• Insurance related agencies (e.g., UWV)
• Banks (e.g., ING Bank)
• Hospitals (e.g., AMC hospital, Catharina hospital)
• Multinationals (e.g., DSM, Deloitte)
• High-tech system manufacturers and their customers (e.g., Philips Healthcare, ASML, Ricoh, Thales)
• Media companies (e.g., Winkwaves)
• ...

Let’s eat …

PAGE 14

Example: WMO Harderwijk

• Process related to the execution of the “Wet Maatschappelijke Ondersteuning” (WMO, the Dutch Social Support Act) in Harderwijk
• Handling WMO applications
• WMO: supporting citizens of municipalities (illness, handicaps, elderly, etc.)
• Examples:
  • wheelchair, mobility scooter, ...
  • adaptation of house (elevator), ...
  • household help, ...

PAGE 16

Event log (796 applications, 5187 events)

PAGE 17

Helicopter view of 1.5 years

PAGE 18

Huge variance in durations

PAGE 19

Process discovered using Genetic Miner

PAGE 20

Various representations

PAGE 21

Fuzzy Miner

PAGE 22

Seamless abstraction

PAGE 23

[Figure: range of views from more detailed to more abstract]

Fuzzy Replay

PAGE 24

Conformance checking using Replay

PAGE 25

= should not have happened but did

= should have happened but did not

Performance analysis using Replay

PAGE 26

Performance information

PAGE 27

Prediction based on Replay

PAGE 28

PAGE 30

How can process mining help?

• Detect bottlenecks
• Detect deviations
• Performance measurement
• Suggest improvements
• Decision support (recommendation and prediction)
• Provide mirror
• Highlight important problems
• Avoid ICT failures
• Avoid management by PowerPoint
• From “politics” to “analytics”

PAGE 35

PAGE 37

Business Intelligence Tools?

• Business Objects (SAP)
• Cognos Business Intelligence (IBM)
• Oracle Business Intelligence
• Hyperion (Oracle)
• SAS Business Intelligence
• Microsoft Business Intelligence
• SAP Business Intelligence (SAP BI)
• Jaspersoft (Open Source Business Intelligence)
• Pentaho BI Suite (Open Source)
• ...

• Dashboards, reports, scorecards
• Slicing and dicing, data mining, ...

PAGE 38

Process Mining Software

• ARIS Process Performance Manager
• Interstage Automated Business Process Discovery & Visualization
• Process Discovery Focus
• Futura Reflect
• Enterprise Visualization Suite
• Comprehend
• BPM|one
• fluxicon/nitro
• ProcessGold
• ProM 6

PAGE 39

PAGE 40

Starting point: event logs

event logs, audit trails, databases, message logs, etc. www.xes-standard.org

PAGE 41

extensions loaded

every trace has a name

every event has a name and a transition

classifier = name + transition

start of trace (i.e. process instance)

name of trace

name of event (activity name)

resource

transition

timestamp

PAGE 42

start of trace

name of trace

name of event (activity name)

resource

data associated to event

timestamp

end of trace (i.e. process instance)
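The annotations above describe the structure of an XES log: traces with a name, and events carrying a name, a resource, a transition, and a timestamp. The following Python sketch reads such a structure with the standard library. The sample log and its values are hypothetical, and a real XES file also declares extensions, classifiers, and usually an XML namespace, all of which are omitted here. The attribute keys `concept:name`, `org:resource`, `lifecycle:transition`, and `time:timestamp` are the ones defined by the standard XES extensions.

```python
import xml.etree.ElementTree as ET

# Hypothetical one-trace log in a simplified, namespace-free XES layout.
SAMPLE = """<log>
  <trace>
    <string key="concept:name" value="case 1"/>
    <event>
      <string key="concept:name" value="task A"/>
      <string key="org:resource" value="resource 1"/>
      <string key="lifecycle:transition" value="complete"/>
      <date key="time:timestamp" value="2010-03-01T09:00:00"/>
    </event>
    <event>
      <string key="concept:name" value="task B"/>
      <string key="org:resource" value="resource 2"/>
      <string key="lifecycle:transition" value="complete"/>
      <date key="time:timestamp" value="2010-03-01T09:30:00"/>
    </event>
  </trace>
</log>"""

def attributes(element):
    """Collect the key/value attributes attached directly to a trace or event."""
    return {child.get("key"): child.get("value")
            for child in element if child.tag != "event"}

def read_traces(xml_text):
    """Return a list of (trace name, list of event attribute dicts)."""
    log = ET.fromstring(xml_text)
    return [(attributes(trace)["concept:name"],
             [attributes(event) for event in trace.findall("event")])
            for trace in log.findall("trace")]

def classify(event):
    """Classifier = name + transition, as on the slide."""
    return event["concept:name"] + "+" + event["lifecycle:transition"]

for name, events in read_traces(SAMPLE):
    print(name, [classify(e) for e in events])
```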

Framework

PAGE 44

[Figure: process mining framework. The “world” (people, machines, organizations, business processes, documents) is supported by information system(s), whose provenance data forms the event logs, split into current data (“pre mortem”) and historic data (“post mortem”). Models are either de jure models or de facto models, each covering control-flow, data/rules, and resources/organization. Ten activities connect event logs and models: explore, predict, recommend (navigation); detect, check, compare, promote (auditing); discover, enhance, diagnose (cartography). The earlier overview figure (world, software system, event logs, (process) model, linked by discovery, conformance, and extension) is shown alongside.]

PAGE 45

[Figure: process mining framework restricted to control-flow: information system(s), “world”, event logs (current data, “pre mortem”; historic data, “post mortem”), de jure and de facto control-flow models, and the activities explore, predict, recommend, detect, check, compare, promote, discover, enhance, diagnose, grouped under navigation, auditing, and cartography.]

[Figure: Petri net with transitions A, B, C, D, E and places start, p1, p2, p3, p4, end.]

Play Out (Classical use of models)

PAGE 47

A B C D
A C B D
A B C D
A E D
A C B D
A C B D
A E D
A E D
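Play out means generating behavior from a model. The sketch below enumerates all complete firing sequences of the example Petri net. It rests on two assumptions that are not confirmed by the transcript: the assignment of the names p1 to p4 to specific places (here p1 and p2 follow A, p3 and p4 precede D), and the use of plain sets as markings, which only works because this net never holds more than one token per place.

```python
# Petri net as: transition -> (input places, output places).
# The mapping of p1..p4 to positions in the net is an assumption.
NET = {
    "A": ({"start"}, {"p1", "p2"}),
    "B": ({"p1"}, {"p3"}),
    "C": ({"p2"}, {"p4"}),
    "E": ({"p1", "p2"}, {"p3", "p4"}),
    "D": ({"p3", "p4"}, {"end"}),
}

def play_out(net, marking=frozenset({"start"}),
             final=frozenset({"end"}), prefix=""):
    """Yield every firing sequence from the marking to the final marking."""
    if marking == final:
        yield prefix
        return
    for task, (inputs, outputs) in net.items():
        if inputs <= marking:  # the transition is enabled
            next_marking = (marking - inputs) | outputs
            yield from play_out(net, next_marking, final, prefix + task)

print(sorted(play_out(NET)))
```

The enumeration terminates because the net is acyclic; a model with loops would need a bound on the trace length.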

Play In (Process Discovery)

PAGE 48

[Figure: from an event log (ABCD, ACBD, AED, ACBD, AED, ABCD, …), a process discovery algorithm like the α algorithm produces a Petri net with transitions A, B, C, D, E and places start, p1, p2, p3, p4, end.]

Replay

PAGE 49

A B C D

[Figure: the trace is replayed on the Petri net with transitions A, B, C, D, E and places start, p1, p2, p3, p4, end.]

Replay can detect problems

PAGE 50

A C D

Problem! Missing token

Problem! Token left behind

[Figure: Petri net with transitions A, B, C, D, E and places start, p1, p2, p3, p4, end.]
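The two problems can be reproduced with a small token-replay sketch. It forces every event of the trace to fire, adds a token artificially whenever one is missing, and reports what is left at the end. The net encoding and the mapping of p1 to p4 to concrete places are assumptions made for this illustration, and the counting convention (the initial token counts as produced, the final token as consumed) follows the usual description of token replay rather than anything stated on the slide.

```python
from collections import Counter

# Petri net as: transition -> (input places, output places).
# The mapping of p1..p4 to positions in the net is an assumption.
NET = {
    "A": ({"start"}, {"p1", "p2"}),
    "B": ({"p1"}, {"p3"}),
    "C": ({"p2"}, {"p4"}),
    "E": ({"p1", "p2"}, {"p3", "p4"}),
    "D": ({"p3", "p4"}, {"end"}),
}

def replay(net, trace):
    """Replay a trace and report produced/consumed counts and problems."""
    marking = Counter({"start": 1})
    produced, consumed = 1, 0          # the initial token counts as produced
    missing = []
    for task in trace:
        inputs, outputs = net[task]
        for place in sorted(inputs):
            if marking[place] == 0:    # token missing: add it artificially
                missing.append(place)
                marking[place] += 1
            marking[place] -= 1
            consumed += 1
        for place in sorted(outputs):
            marking[place] += 1
            produced += 1
    if marking["end"] == 0:            # the case should finish in "end"
        missing.append("end")
        marking["end"] += 1
    marking["end"] -= 1                # the environment consumes the final token
    consumed += 1
    remaining = sorted(p for p, n in marking.items() for _ in range(n))
    return {"p": produced, "c": consumed,
            "missing": missing, "remaining": remaining}

print(replay(NET, "ABCD"))
print(replay(NET, "ACD"))
```

Under these assumptions, replaying A C D reports one missing token (in the place that B or E should have filled before D) and one token left behind (in the place after A that nobody emptied).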

Replay can extract timing information

PAGE 51

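Trace with timestamps: A at 5, B at 8, C at 9, D at 13

[Figure: Petri net annotated with event timestamps and the durations derived from them by replay.]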
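Timing information falls out of replay by remembering when each token was produced and subtracting that from the time it is consumed. The sketch below does this for the timestamped trace A at 5, B at 8, C at 9, D at 13. It assumes a trace that fits the model, a net with at most one token per place, and the same hypothetical mapping of p1 to p4 to places as a convenience for this example; the time unit is left unspecified, as on the slide.

```python
# Petri net as: transition -> (input places, output places).
# The mapping of p1..p4 to positions in the net is an assumption.
NET = {
    "A": ({"start"}, {"p1", "p2"}),
    "B": ({"p1"}, {"p3"}),
    "C": ({"p2"}, {"p4"}),
    "E": ({"p1", "p2"}, {"p3", "p4"}),
    "D": ({"p3", "p4"}, {"end"}),
}

def place_durations(net, timed_trace):
    """Time each token waits in a place, for a trace that fits the net.

    timed_trace is a list of (task, timestamp) pairs.
    """
    produced_at = {"start": None}      # creation time of the first token is unknown
    durations = {}
    for task, time in timed_trace:
        inputs, outputs = net[task]
        for place in inputs:
            since = produced_at.pop(place)
            if since is not None:
                durations[place] = time - since
        for place in outputs:
            produced_at[place] = time
    return durations

trace = [("A", 5), ("B", 8), ("C", 9), ("D", 13)]
print(place_durations(NET, trace))
```

Aggregating such durations over all cases in a log gives the waiting-time statistics that point to bottlenecks.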

Using Replay

PAGE 52

[Figure: process mining framework (repeated): information system(s), “world”, event logs (current data, “pre mortem”; historic data, “post mortem”), de jure and de facto models (control-flow, data/rules, resources/organization), and the activities explore, predict, recommend, detect, check, compare, promote, discover, enhance, diagnose, grouped under navigation, auditing, and cartography.]

PAGE 54

Conformance checker (Anne Rozinat et al.)

How to quantify this?

PAGE 55

Fitness by replay

m=missing,r=remaining,c=consumed,p=produced
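With these four counts, replay can be turned into a number. A common token-based definition, assumed here because the slide's own formula is not part of the transcript, is f = ½(1 − m/c) + ½(1 − r/p); for a whole log the counts are summed over all cases first. The sketch below implements that definition; the function names and the example counts are illustrative assumptions.

```python
def fitness(p, c, m, r):
    """Token-based fitness for one case: 1.0 when nothing is missing or remaining."""
    return 0.5 * (1 - m / c) + 0.5 * (1 - r / p)

def log_fitness(cases):
    """Fitness for a log, given one (p, c, m, r) tuple per case."""
    p, c, m, r = (sum(values) for values in zip(*cases))
    return fitness(p, c, m, r)

# A fitting case: 6 tokens produced, 6 consumed, none missing or remaining.
print(fitness(p=6, c=6, m=0, r=0))
# A deviating case: 5 produced, 5 consumed, 1 missing, 1 remaining.
print(fitness(p=5, c=5, m=1, r=1))
# Both cases together.
print(log_fitness([(6, 6, 0, 0), (5, 5, 1, 1)]))
```

The first half of the formula penalizes tokens that had to be added during replay, the second half penalizes tokens that were never consumed.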

PAGE 56

No problem (m=0, r=0)

PAGE 57

Another (impossible) trace

PAGE 58

PAGE 59

Fitness calculation

PAGE 60

Examples

f=1.000 f=0.995 f=0.540

PAGE 61

Diagnostics

PAGE 62

Other Metrics

• Fitness is not sufficient: other metrics are needed, such as behavioral and structural appropriateness.
• These metrics cover aspects such as:
  • Punishing for "too much" behavior.
  • Punishing for "overly complex" models.

LTL checker

PAGE 64

Split cases

PAGE 65

DECLARE

An alternative approach based on constraints ...

[Figure labels: forbidden behavior; deviations from the prescribed model; IMPERATIVE MODEL; constraint (four times).]

Basic idea

A B

Declarative notation (e.g., ConDec, DecSerFlow)

LTL semantics

?

Example: "existence response" (constraint between A and B)

• OK:
  • [ ]
  • [A,B,C,D,E]
  • [A,A,A,C,D,E,B,B,B]
  • [B,B,A,A,C,D,E]
  • [B,C,D,E]
• NOK:
  • [A]
  • [A,A,C,D,E]

Example: "response" (constraint between A and B)

• OK:
  • [ ]
  • [A,B,C,D,E]
  • [A,A,A,B,C,D,E]
  • [B,B,A,A,B,C,D,E]
  • [B,C,D,E]
• NOK:
  • [A]
  • [B,B,B,B,A,A]

Example: "precedence" (constraint between A and B)

• OK:
  • [ ]
  • [A,B,C,D,E]
  • [A,A,A,C,D,E,B,B,B]
  • [A,A,C,D,E]
• NOK:
  • [B]
  • [B,A,C,D,E]
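The three constraint templates can be checked on finished traces with a few lines of code. The sketch below is a plain trace checker written to agree with the OK and NOK examples on the slides; it is not how the DECLARE tool evaluates its LTL semantics, and the function names are assumptions.

```python
def existence_response(trace, a="A", b="B"):
    """If a occurs anywhere, b must occur somewhere too (before or after)."""
    return a not in trace or b in trace

def response(trace, a="A", b="B"):
    """Every a must eventually be followed by a b."""
    pending = False
    for x in trace:
        if x == a:
            pending = True
        elif x == b:
            pending = False
    return not pending

def precedence(trace, a="A", b="B"):
    """b may occur only after at least one a has occurred."""
    seen_a = False
    for x in trace:
        if x == a:
            seen_a = True
        elif x == b and not seen_a:
            return False
    return True

for trace in (list("AAACDEBBB"), list("BBBBAA"), list("BACDE")):
    print(trace, existence_response(trace), response(trace), precedence(trace))
```

The empty trace satisfies all three constraints, which is why [ ] appears under OK in each example.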

Example

Model with constraints

• (C.1) Always start with activity register client data.
• (C.2) Activity bill must be executed at least once.
• (C.3) Every room service must be billed.
• (C.4) Every laundry service must be billed.
• (C.5) If the client checks out, she/he must be charged.
• (C.6) Sometimes it is recommended that additional cleaning also be billed. (optional)

[Figure: Declare model annotated with the constraints C.1 to C.6.]

Constraints

• constraints can be:
  • mandatory
    − imposed by DECLARE
    − can be fulfilled or temporarily violated
  • optional
    − used as warnings for users
    − can be fulfilled or temporarily violated or permanently violated
• at the end of the execution all mandatory constraints have to be fulfilled

(a) initial state (b) after "register client data"

(c) after "room service" (d) after "bill"

Possible ways of using Declare

• Discover Declare models
• Check Declare models offline
• Check Declare models online
• Quantify “health”
• Drill-down

Process Mining !

Framework

PAGE 78

[Figure: process mining framework (repeated): information system(s), “world”, event logs (current data, “pre mortem”; historic data, “post mortem”), de jure and de facto models (control-flow, data/rules, resources/organization), and the activities explore, predict, recommend, detect, check, compare, promote, discover, enhance, diagnose, grouped under navigation, auditing, and cartography.]

More Information

PAGE 79

IEEE Task Force on Process Mining

• ProM Software: prom.sourceforge.net
• Process mining: www.processmining.org
• ProM 5 series nightly builds: prom.win.tue.nl/tools/prom/nightly5/
• ProM 6 series nightly builds: prom.win.tue.nl/tools/prom/nightly/
• Converting logs (MXML-based): promimport.sourceforge.net
• XES: www.xes-standard.org and www.openxes.org
• Papers et al.: vdaalst.com
• IEEE Task Force on Process Mining: www.win.tue.nl/ieeetfpm/