Auditing 2.0 Using Process Mining to Support Tomorrow's Auditor prof.dr.ir. Wil van der Aalst www.processmining.org


  • Auditing 2.0: Using Process Mining to Support Tomorrow's Auditor

    prof.dr.ir. Wil van der Aalst
    www.processmining.org


  • PAGE 1

    IEEE Computer, vol. 43, no. 3, pp. 90-93, Mar. 2010

  • Auditing

    PAGE 2

    “The term auditing refers to the evaluation of organizations and their processes. Audits are performed to ascertain the validity and reliability of information about these organizations and associated processes. This is done to check whether business processes are executed within certain boundaries set by managers, governments, and other stakeholders.”

  • Growth of data

    PAGE 3

  • PAGE 4

    Data Mining

    [Figure: decision tree with nodes Smoker, Drinker, and Weight, branch labels Yes/No, and leaves Short (91/10), Long (30/1), Long (150/20), and Short (321/25)]

  • PAGE 5

    Process Mining

    • Process discovery: "What is really happening?"

    • Conformance checking: "Do we do what was agreed upon?"

    • Performance analysis: "Where are the bottlenecks?"

    • Process prediction: "Will this case be late?"

    • Process improvement: "How to redesign this process?"

    • Etc.

  • PAGE 6

    Process mining: Linking events to models

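    [Figure: the “world” (people, machines, organizations, components, business processes) is supported/controlled by a software system, which records events (e.g., messages, transactions, etc.) in event logs. A (process) model models/analyzes the “world” and specifies, configures, implements, and analyzes the software system. Event logs and models are linked by discovery, conformance, and extension.]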

  • Process Discovery Example

    PAGE 7

    α

  • PAGE 8

    >,→,||,# relations

    • Direct succession: x>y iff for some case x is directly followed by y.

    • Causality: x→y iff x>y and not y>x.

    • Parallel: x||y iff x>y and y>x.

    • Choice: x#y iff not x>y and not y>x.

    Event log (in order of occurrence):
    case 1 : task A
    case 2 : task A
    case 3 : task A
    case 3 : task B
    case 1 : task B
    case 1 : task C
    case 2 : task C
    case 4 : task A
    case 2 : task B
    case 2 : task D
    case 5 : task A
    case 5 : task E
    case 4 : task C
    case 1 : task D
    case 3 : task C
    case 3 : task D
    case 4 : task B
    case 5 : task D
    case 4 : task D

    Direct succession: A>B, A>C, A>E, B>C, B>D, C>B, C>D, E>D

    Causality: A→B, A→C, A→E, B→D, C→D, E→D

    Parallel: B||C, C||B

    Traces: ABCD, ACBD, AED
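The four relations can be derived mechanically from the event log on this slide. The following is a minimal Python sketch, not the ProM implementation; the function and variable names (`traces_per_case`, `footprint`, `EVENTS`) are made up for illustration.

```python
from itertools import product

# Event log from the slide, in order of occurrence: (case id, task).
EVENTS = [
    (1, "A"), (2, "A"), (3, "A"), (3, "B"), (1, "B"), (1, "C"), (2, "C"),
    (4, "A"), (2, "B"), (2, "D"), (5, "A"), (5, "E"), (4, "C"), (1, "D"),
    (3, "C"), (3, "D"), (4, "B"), (5, "D"), (4, "D"),
]


def traces_per_case(events):
    """Group the events by case, keeping their order of occurrence."""
    traces = {}
    for case, task in events:
        traces.setdefault(case, []).append(task)
    return traces


def footprint(traces):
    """Compute the >, ->, || and # relations as sets of (x, y) pairs."""
    tasks = sorted({task for trace in traces for task in trace})
    # x > y: x is directly followed by y in at least one case.
    succ = {(x, y) for trace in traces for x, y in zip(trace, trace[1:])}
    causal = {(x, y) for (x, y) in succ if (y, x) not in succ}
    parallel = {(x, y) for (x, y) in succ if (y, x) in succ}
    choice = {
        (x, y)
        for x, y in product(tasks, repeat=2)
        if (x, y) not in succ and (y, x) not in succ
    }
    return succ, causal, parallel, choice


CASES = traces_per_case(EVENTS)
SUCC, CAUSAL, PARALLEL, CHOICE = footprint(CASES.values())
print(sorted(x + ">" + y for x, y in SUCC))
```

Grouping by case yields the three distinct traces ABCD, ACBD, and AED, and the computed sets match the relations listed on the slide.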

  • PAGE 9

    Basic Idea Used by α Algorithm (1)

    (a) sequence pattern: a→b

  • PAGE 10

    Basic Idea Used by α Algorithm (2)

    (b) XOR-split pattern: a→b, a→c, and b#c

    (c) XOR-join pattern: a→c, b→c, and a#b

  • PAGE 11

    Basic Idea Used by α Algorithm (3)

    (d) AND-split pattern: a→b, a→c, and b||c

    (e) AND-join pattern: a→c, b→c, and a||b
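The split and join patterns can be read off the relations directly. The sketch below hard-codes the direct-succession pairs of the running example (tasks A to E) and classifies a candidate split or join; the helper names `split_type` and `join_type` are hypothetical, and this is only the pattern test, not the full α algorithm.

```python
# Direct-succession pairs (x > y) of the running example.
SUCC = {
    ("A", "B"), ("A", "C"), ("A", "E"), ("B", "C"),
    ("B", "D"), ("C", "B"), ("C", "D"), ("E", "D"),
}


def causal(x, y):
    """x -> y: x > y and not y > x."""
    return (x, y) in SUCC and (y, x) not in SUCC


def parallel(x, y):
    """x || y: x > y and y > x."""
    return (x, y) in SUCC and (y, x) in SUCC


def choice(x, y):
    """x # y: neither x > y nor y > x."""
    return (x, y) not in SUCC and (y, x) not in SUCC


def split_type(a, b, c):
    """Classify a split from a to b and c, or return None if there is none."""
    if not (causal(a, b) and causal(a, c)):
        return None
    if choice(b, c):
        return "XOR-split"
    if parallel(b, c):
        return "AND-split"
    return None


def join_type(a, b, c):
    """Classify a join from a and b into c, or return None if there is none."""
    if not (causal(a, c) and causal(b, c)):
        return None
    if choice(a, b):
        return "XOR-join"
    if parallel(a, b):
        return "AND-join"
    return None


print(split_type("A", "B", "C"), split_type("A", "B", "E"))
print(join_type("B", "C", "D"), join_type("B", "E", "D"))
```

In the example, A splits into B and C in parallel (AND), while E is an alternative to B and C (XOR).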

  • Example Revisited

    PAGE 12

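    Result produced by α algorithm

    [Figure: Petri net with places start, p1, p2, p3, p4, and end and transitions A, B, C, D, and E]

    Direct succession: A>B, A>C, A>E, B>C, B>D, C>B, C>D, E>D

    Causality: A→B, A→C, A→E, B→D, C→D, E→D

    Parallel: B||C, C||B

    Choice: B#E, C#E, …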

  • PAGE 13

    Where did we apply process mining?

    • Municipalities (e.g., Alkmaar, Heusden, Harderwijk, etc.)
    • Government agencies (e.g., Rijkswaterstaat, Centraal Justitieel Incasso Bureau, Justice department)
    • Insurance related agencies (e.g., UWV)
    • Banks (e.g., ING Bank)
    • Hospitals (e.g., AMC hospital, Catharina hospital)
    • Multinationals (e.g., DSM, Deloitte)
    • High-tech system manufacturers and their customers (e.g., Philips Healthcare, ASML, Ricoh, Thales)
    • Media companies (e.g., Winkwaves)
    • ...

  • Let’s eat …

    PAGE 14

  • Example of a Lasagna Process


  • Example: WMO Harderwijk

    • Process related to the execution of the “Wet Maatschappelijke Ondersteuning” (WMO, the Dutch Social Support Act) in Harderwijk
    • Handling WMO applications
    • WMO: supporting citizens of municipalities (illness, handicaps, elderly, etc.).
    • Examples:
      • wheelchair, scootmobiel (mobility scooter), ...
      • adaptation of house (elevator), ...
      • household help, ...

    PAGE 16

  • Event log (796 applications, 5187 events)

    PAGE 17

  • Helicopter view of 1.5 years

    PAGE 18

  • Huge variance in durations

    PAGE 19

  • Process discovered using Genetic Miner

    PAGE 20

  • Various representations

    PAGE 21

  • Fuzzy Miner

    PAGE 22

  • Seamless abstraction

    PAGE 23

    [Figure: views ranging from more detailed to more abstract]

  • Fuzzy Replay

    PAGE 24

  • Conformance checking using Replay

    PAGE 25

    = should not have happened but did

    = should have happened but did not

  • Performance analysis using Replay

    PAGE 26

  • Performance information

    PAGE 27

  • Prediction based on Replay

    PAGE 28

  • Spaghetti Processes


  • PAGE 30

  • How can process mining help?

    • Detect bottlenecks
    • Detect deviations
    • Performance measurement
    • Suggest improvements
    • Decision support (recommendation and prediction)
    • Provide mirror
    • Highlight important problems
    • Avoid ICT failures
    • Avoid management by PowerPoint
    • From “politics” to “analytics”

    PAGE 35

  • Business Intelligence Tools?


  • PAGE 37

    Business Intelligence Tools?

    • Business Objects (SAP)
    • Cognos Business Intelligence (IBM)
    • Oracle Business Intelligence
    • Hyperion (Oracle)
    • SAS Business Intelligence
    • Microsoft Business Intelligence
    • SAP Business Intelligence (SAP BI)
    • Jaspersoft (Open Source Business Intelligence)
    • Pentaho BI Suite (Open Source)
    • ....

    • Dashboards, reports, scorecards
    • Slicing and dicing, data mining, ...

  • PAGE 38

    Process Mining Software

    ARIS Process Performance Manager

    Interstage Automated Business Process Discovery & Visualization

    Process Discovery Focus

    Futura Reflect

    Enterprise Visualization Suite

    Comprehend

    BPM|one

    fluxicon/nitro

    ProcessGold

    http://www.ids-scheer.com/en/index.html
    http://www.iontas.com/index.php
    http://www.oc.com/
    http://www.pallas-athena.com/
    http://fluxicon.com/
    http://twitter.com/account/profile_image/ProcessGold?hreflang=de

  • ProM 6

    PAGE 39

  • PAGE 40

    Starting point: event logs

    event logs, audit trails, databases, message logs, etc. www.xes-standard.org

  • PAGE 41

    extensions loaded

    every trace has a name

    every event has a name and a transition

    classifier = name + transition

    start of trace (i.e. process instance)

    name of trace

    name of event (activity name)

    resource

    transition

    timestamp

  • PAGE 42

    start of trace

    name of trace

    name of event (activity name)

    resource

    data associated to event

    timestamp

    end of trace (i.e. process instance)
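The annotated elements above (extensions, classifier, trace name, event name, resource, transition, timestamp) can be illustrated with a small XES-style fragment. The sketch below is a simplified assumption: the fragment omits the XML namespace, all attribute values (case name, resource names, timestamps) are made up, and the helper names `attributes`, `read_log`, and `event_class` are hypothetical. It uses only Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified XES fragment (no XML namespace, made-up values).
XES_SNIPPET = """\
<log xes.version="1.0">
  <extension name="Concept" prefix="concept" uri="http://www.xes-standard.org/concept.xesext"/>
  <extension name="Lifecycle" prefix="lifecycle" uri="http://www.xes-standard.org/lifecycle.xesext"/>
  <extension name="Organizational" prefix="org" uri="http://www.xes-standard.org/org.xesext"/>
  <extension name="Time" prefix="time" uri="http://www.xes-standard.org/time.xesext"/>
  <classifier name="Activity classifier" keys="concept:name lifecycle:transition"/>
  <trace>
    <string key="concept:name" value="case 1"/>
    <event>
      <string key="concept:name" value="A"/>
      <string key="org:resource" value="Pete"/>
      <string key="lifecycle:transition" value="complete"/>
      <date key="time:timestamp" value="2010-03-01T09:00:00.000+01:00"/>
    </event>
    <event>
      <string key="concept:name" value="B"/>
      <string key="org:resource" value="Sue"/>
      <string key="lifecycle:transition" value="complete"/>
      <date key="time:timestamp" value="2010-03-01T10:30:00.000+01:00"/>
    </event>
  </trace>
</log>
"""


def attributes(element):
    """Collect the key/value attributes that are direct children of an element."""
    return {
        child.get("key"): child.get("value")
        for child in element
        if child.get("key") is not None
    }


def read_log(xml_text):
    """Return the classifier keys and a list of (trace name, events)."""
    root = ET.fromstring(xml_text)
    classifier_keys = root.find("classifier").get("keys").split()
    log = []
    for trace in root.iter("trace"):
        name = attributes(trace).get("concept:name")
        events = [attributes(event) for event in trace.iter("event")]
        log.append((name, events))
    return classifier_keys, log


def event_class(event, keys):
    """Apply a classifier: join the values of the classifier keys."""
    return "+".join(event[key] for key in keys)


KEYS, LOG = read_log(XES_SNIPPET)
for trace_name, events in LOG:
    print(trace_name, [event_class(event, KEYS) for event in events])
```

The classifier combines the event name and the transition, so the two events are classified as `A+complete` and `B+complete`.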

  • Process Mining/ Auditing Framework


  • Framework

    PAGE 44

    [Figure: process mining framework. The “world” (people, machines, organizations, business processes, documents) is supported by information system(s) whose recorded events (provenance) form the event logs, split into current data (“pre mortem”) and historic data (“post mortem”). Models are either de jure models or de facto models, each covering the control-flow, data/rules, and resources/organization perspectives. Ten activities connect event logs and models, grouped as cartography (discover, enhance, diagnose), auditing (detect, check, compare, promote), and navigation (explore, predict, recommend). The slide also repeats the diagram “Process mining: Linking events to models” (software system, (process) model, event logs; discovery, conformance, extension).]

  • PAGE 45

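    [Figure: the process mining framework restricted to the control-flow perspective. The “world” (people, machines, organizations, business processes, documents) is supported by information system(s) whose recorded events (provenance) form the event logs, split into current data (“pre mortem”) and historic data (“post mortem”). The de jure and de facto models show control-flow only. The activities are grouped as cartography (discover, enhance, diagnose), auditing (detect, check, compare, promote), and navigation (explore, predict, recommend).]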

  • Using Replay


  • Play Out (Classical use of models)

    PAGE 47

    [Figure: Petri net with places start, p1, p2, p3, p4, and end and transitions A, B, C, D, and E]

    Generated traces: A B C D, A C B D, A B C D, A E D, A C B D, A C B D, A E D, A E D
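Play-out can be sketched as an exhaustive search over the firing sequences of the Petri net. The transcript only preserves the place and transition names, so the arcs in `NET` below are an assumption reconstructed from the causality, parallel, and choice relations of the running example; the function name `play_out` is hypothetical, and the marking is modeled as a set because this net never holds more than one token per place.

```python
# Transition -> (input places, output places).
# ASSUMPTION: arc structure reconstructed from the relations of the example.
NET = {
    "A": ({"start"}, {"p1", "p2"}),
    "B": ({"p1"}, {"p3"}),
    "C": ({"p2"}, {"p4"}),
    "E": ({"p1", "p2"}, {"p3", "p4"}),
    "D": ({"p3", "p4"}, {"end"}),
}


def play_out(net, marking, final, prefix=()):
    """Yield every firing sequence that leads from `marking` to `final`."""
    if marking == final:
        yield prefix
        return
    for task, (inputs, outputs) in sorted(net.items()):
        if inputs <= marking:  # the transition is enabled
            next_marking = (marking - inputs) | outputs
            yield from play_out(net, next_marking, final, prefix + (task,))


TRACES = sorted(
    "".join(sequence)
    for sequence in play_out(NET, frozenset({"start"}), frozenset({"end"}))
)
print(TRACES)
```

Under this assumed net, the only complete firing sequences are ABCD, ACBD, and AED, which are exactly the traces shown on the slide.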

  • Play In (Process Discovery)

    PAGE 48

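    Observed traces: ABCD, ACBD, AED, ACBD, AED, ABCD, …

    a process discovery algorithm like the α algorithm

    [Figure: discovered Petri net with places start, p1, p2, p3, p4, and end and transitions A, B, C, D, and E]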

  • Replay

    PAGE 49

    [Figure: Petri net with places start, p1, p2, p3, p4, and end and transitions A, B, C, D, and E]

    Replayed trace: A B C D

  • Replay can detect problems

    PAGE 50

    [Figure: Petri net with places start, p1, p2, p3, p4, and end and transitions A, B, C, D, and E]

    Replayed trace: A C D

    Problem! missing token

    Problem! token left behind
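The two problems can be reproduced with a small token-replay sketch. The arcs in `NET` are an assumption reconstructed from the relations of the running example (the transcript preserves only the node names), and `token_replay` is a hypothetical helper rather than the ProM conformance checker. It counts produced, consumed, missing, and remaining tokens, treating the initial token as produced and the final token as consumed.

```python
from collections import Counter

# Transition -> (input places, output places).
# ASSUMPTION: arc structure reconstructed from the relations of the example.
NET = {
    "A": (("start",), ("p1", "p2")),
    "B": (("p1",), ("p3",)),
    "C": (("p2",), ("p4",)),
    "E": (("p1", "p2"), ("p3", "p4")),
    "D": (("p3", "p4"), ("end",)),
}


def token_replay(trace, net, initial="start", final="end"):
    """Replay a trace and count produced/consumed/missing/remaining tokens."""
    marking = Counter({initial: 1})
    produced, consumed = 1, 0  # the initial token counts as produced
    missing = []
    for task in trace:
        inputs, outputs = net[task]
        for place in inputs:
            if marking[place] == 0:
                missing.append(place)  # add the missing token artificially
                marking[place] += 1
            marking[place] -= 1
            consumed += 1
        for place in outputs:
            marking[place] += 1
            produced += 1
    # At the end, the token in the final place is consumed.
    if marking[final] == 0:
        missing.append(final)
        marking[final] += 1
    marking[final] -= 1
    consumed += 1
    remaining = sorted(marking.elements())  # tokens left behind
    return {
        "p": produced, "c": consumed,
        "m": len(missing), "r": len(remaining),
        "missing": sorted(missing), "remaining": remaining,
    }


OK_RESULT = token_replay("ABCD", NET)
BAD_RESULT = token_replay("ACD", NET)
print(OK_RESULT)
print(BAD_RESULT)
```

Under the assumed net, replaying A C D reports one missing token (in p3, because B never fired) and one token left behind (in p1).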

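  • Replay can extract timing information

    PAGE 51

    [Figure: Petri net with places start, p1, p2, p3, p4, and end and transitions A, B, C, D, and E, annotated with timing values collected during replay]

    Replayed trace with timestamps: A (5), B (8), C (9), D (13)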
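Timing information falls out of replay once each event carries a timestamp: the time a token spends in a place is the difference between the timestamps of the transition that produced it and the one that consumed it. The sketch below assumes the arc structure in `NET` (reconstructed from the relations of the running example), a fitting trace, and at most one token per place; `sojourn_times` is a hypothetical helper name.

```python
# Transition -> (input places, output places).
# ASSUMPTION: arc structure reconstructed from the relations of the example.
NET = {
    "A": (("start",), ("p1", "p2")),
    "B": (("p1",), ("p3",)),
    "C": (("p2",), ("p4",)),
    "E": (("p1", "p2"), ("p3", "p4")),
    "D": (("p3", "p4"), ("end",)),
}


def sojourn_times(timed_trace, net):
    """Return, per place, how long tokens waited there during replay."""
    produced_at = {}  # place -> timestamp at which its current token arrived
    waits = {}
    for task, timestamp in timed_trace:
        inputs, outputs = net[task]
        for place in inputs:
            # The initial token has no timestamp, so "start" is skipped.
            if place in produced_at:
                wait = timestamp - produced_at.pop(place)
                waits.setdefault(place, []).append(wait)
        for place in outputs:
            produced_at[place] = timestamp
    return waits


# Trace with timestamps from the slide: A at 5, B at 8, C at 9, D at 13.
WAITS = sojourn_times([("A", 5), ("B", 8), ("C", 9), ("D", 13)], NET)
print(WAITS)
```

For this trace the tokens wait 3 time units in p1, 4 in p2, 5 in p3, and 4 in p4. Repeating this over all traces gives a distribution of waiting times per place.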

  • Using Replay

    PAGE 52

    [Figure: process mining framework. The “world” (people, machines, organizations, business processes, documents) is supported by information system(s) whose recorded events (provenance) form the event logs, split into current data (“pre mortem”) and historic data (“post mortem”). Models are either de jure models or de facto models, each covering the control-flow, data/rules, and resources/organization perspectives. Ten activities connect event logs and models, grouped as cartography (discover, enhance, diagnose), auditing (detect, check, compare, promote), and navigation (explore, predict, recommend).]

  • Example: Conformance Checker


  • PAGE 54

    Conformance checker (Anne Rozinat et al.)

    How to quantify this?

  • PAGE 55

    Fitness by replay

    m = missing, r = remaining, c = consumed, p = produced

  • PAGE 56

    No problem (m=0, r=0)

  • PAGE 57

    Another (impossible) trace

  • PAGE 58

  • PAGE 59

    Fitness calculation
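The formula on this slide is not preserved in the transcript. As an assumption, a common definition of replay fitness from the conformance checking literature uses the counters m, r, c, and p per trace, with a hypothetical index i over the k distinct traces and n_i the number of cases following trace i:

```latex
f = \frac{1}{2}\left(1 - \frac{\sum_{i=1}^{k} n_i\, m_i}{\sum_{i=1}^{k} n_i\, c_i}\right)
  + \frac{1}{2}\left(1 - \frac{\sum_{i=1}^{k} n_i\, r_i}{\sum_{i=1}^{k} n_i\, p_i}\right)
```

For a single trace this reduces to f = ½(1 − m/c) + ½(1 − r/p). With m = 0 and r = 0 the fitness is 1; with hypothetical counts m = r = 1 and c = p = 5 it is ½(1 − 1/5) + ½(1 − 1/5) = 0.8.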

  • PAGE 60

    Examples

    f=1.000 f=0.995 f=0.540

  • PAGE 61

    Diagnostics

  • PAGE 62

    Other Metrics

    • Fitness is not sufficient: other metrics are needed, such as behavioral and structural appropriateness.

    • These metrics cover aspects such as:
      • Punishing for "too much" behavior.
      • Punishing for "overly complex" models.

  • Declarative models


  • LTL checker

    PAGE 64

  • Split cases

    PAGE 65

  • DECLARE: An alternative approach based on constraints ...

    [Figure: an IMPERATIVE MODEL contrasted with a region bounded by four constraints; labels: forbidden behavior, deviations from the prescribed model]

  • Basic idea

    A B

    Declarative notation (e.g., ConDec, DecSerFlow)

    LTL semantics

    ?

  • Example: "existence response"

    • OK:
      • [ ]
      • [A,B,C,D,E]
      • [A,A,A,C,D,E,B,B,B]
      • [B,B,A,A,C,D,E]
      • [B,C,D,E]

    • NOK:
      • [A]
      • [A,A,C,D,E]

    [Figure: "existence response" constraint drawn between A and B]

  • Example: "response"

    • OK:
      • [ ]
      • [A,B,C,D,E]
      • [A,A,A,B,C,D,E]
      • [B,B,A,A,B,C,D,E]
      • [B,C,D,E]

    • NOK:
      • [A]
      • [B,B,B,B,A,A]

    [Figure: "response" constraint drawn between A and B]

  • Example: "precedence"

    • OK:
      • [ ]
      • [A,B,C,D,E]
      • [A,A,A,C,D,E,B,B,B]
      • [A,A,C,D,E]

    • NOK:
      • [B]
      • [B,A,C,D,E]

    [Figure: "precedence" constraint drawn between A and B]
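The three constraint templates can be checked on complete traces with a few lines of Python. The semantics below are inferred from the OK/NOK examples on these slides (they are not taken from the Declare tool's LTL formulas), and the function names are made up for illustration.

```python
def existence_response(trace, a="A", b="B"):
    """If a occurs anywhere, b must occur somewhere (before or after)."""
    return a not in trace or b in trace


def response(trace, a="A", b="B"):
    """Every occurrence of a must eventually be followed by b."""
    pending = False
    for task in trace:
        if task == a:
            pending = True
        elif task == b:
            pending = False
    return not pending


def precedence(trace, a="A", b="B"):
    """b may only occur after a has occurred at least once."""
    seen_a = False
    for task in trace:
        if task == a:
            seen_a = True
        elif task == b and not seen_a:
            return False
    return True


# Example traces from the "response" slide.
for trace in (list("AAABCDE"), list("BBBBAA")):
    print(trace, response(trace))
```

Each checker accepts all OK traces and rejects all NOK traces listed on the corresponding slide.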

  • Example

  • Model with constraints

    • (C.1) Always start with activity register client data.
    • (C.2) Activity bill must be executed at least once.
    • (C.3) Every room service must be billed.
    • (C.4) Every laundry service must be billed.
    • (C.5) If the client checks out, she/he must be charged.
    • (C.6) Sometimes it is recommended that additional cleaning also be billed. (optional)

    [Figure: Declare model annotated with the constraints C.1 to C.6]

  • Constraints

    • constraints can be:
      • mandatory
        − imposed by DECLARE
        − can be fulfilled or temporarily violated
      • optional
        − used as warnings for users
        − can be fulfilled or temporarily violated or permanently violated
    • at the end of the execution all mandatory constraints have to be fulfilled

  • (a) initial state (b) after "register client data"

    (c) after "room service" (d) after "bill"
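The constraint states in the four snapshots can be sketched by evaluating constraints on a growing prefix of the execution. The code below is an illustrative assumption, not the Declare engine: it covers only C.1, C.2, and C.3, the function names are hypothetical, and the state labels follow the wording of the Constraints slide (fulfilled, temporarily violated, permanently violated).

```python
def init_state(prefix, a):
    """C.1 style: the execution must start with activity a."""
    if not prefix:
        return "temporarily violated"  # can still be satisfied
    return "fulfilled" if prefix[0] == a else "permanently violated"


def existence_state(prefix, a):
    """C.2 style: activity a must be executed at least once."""
    return "fulfilled" if a in prefix else "temporarily violated"


def response_state(prefix, a, b):
    """C.3 style: every a must eventually be followed by b."""
    pending = False
    for task in prefix:
        if task == a:
            pending = True
        elif task == b:
            pending = False
    return "temporarily violated" if pending else "fulfilled"


def states(prefix):
    """Evaluate C.1, C.2 and C.3 on the activities executed so far."""
    return {
        "C.1": init_state(prefix, "register client data"),
        "C.2": existence_state(prefix, "bill"),
        "C.3": response_state(prefix, "room service", "bill"),
    }


RUN = ["register client data", "room service", "bill"]
SNAPSHOTS = [states(RUN[:n]) for n in range(len(RUN) + 1)]
for snapshot in SNAPSHOTS:
    print(snapshot)
```

After "room service" the response constraint C.3 is temporarily violated, and it becomes fulfilled again after "bill"; an execution may only end when no mandatory constraint is violated.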

  • Possible ways of using Declare

    • Discover Declare models
    • Check Declare models offline
    • Check Declare models online
    • Quantify “health”
    • Drill-down

  • Conclusion


  • Process Mining !

  • Framework

    PAGE 78

    [Figure: process mining framework. The “world” (people, machines, organizations, business processes, documents) is supported by information system(s) whose recorded events (provenance) form the event logs, split into current data (“pre mortem”) and historic data (“post mortem”). Models are either de jure models or de facto models, each covering the control-flow, data/rules, and resources/organization perspectives. Ten activities connect event logs and models, grouped as cartography (discover, enhance, diagnose), auditing (detect, check, compare, promote), and navigation (explore, predict, recommend).]

  • More Information

    PAGE 79

    IEEE Task Force on Process Mining

    • ProM Software: prom.sourceforge.net
    • Process mining: www.processmining.org
    • ProM 5 series nightly builds: prom.win.tue.nl/tools/prom/nightly5/
    • ProM 6 series nightly builds: prom.win.tue.nl/tools/prom/nightly/
    • Converting logs (MXML-based): promimport.sourceforge.net
    • XES: www.xes-standard.org and www.openxes.org
    • Papers et al.: vdaalst.com
    • IEEE Task Force on Process Mining: www.win.tue.nl/ieeetfpm/
