serene 2014 workshop: paper "combined error propagation analysis and runtime event detection in...

1. Quanopt Ltd. Combined Error Propagation Analysis and Runtime Event Detection in Process-driven Systems Gábor Urbanics, László Gönczy, Balázs Urbán, János Hartwig, Imre Kocsis

Upload: sereneworkshop

Post on 06-Jul-2015

DESCRIPTION

SERENE 2014 - 6th International Workshop on Software Engineering for Resilient Systems http://serene.disim.univaq.it/ Session 4: Monitoring Paper 3: Combined Error Propagation Analysis and Runtime Event Detection in Process-driven Systems

TRANSCRIPT

Page 1: SERENE 2014 Workshop: Paper "Combined Error Propagation Analysis and Runtime Event Detection in Process-driven Systems"

1. Quanopt Ltd.

Combined Error Propagation Analysis and Runtime Event Detection in Process-driven Systems

Gábor Urbanics, László Gönczy, Balázs Urbán, János Hartwig, Imre Kocsis

Page 2

Motivation and our contributions

Approach

Motivational example

Design time analysis

Runtime analysis

Future work and conclusion

Page 3

Motivation

Analyse complex IT systems
  o During development
  o During integration
  o At runtime
  o Based on system models

Generate analysis for huge systems

Extendable

Page 4

Process modelling

Business process:
  o Directly executed models (e.g. BPMN)

In a complex system there are many supporting resources
  o We present a method for modelling the business process and its supporting resources together
  o Only general tools:
    • Markov chains, event trees
    • Too general, modelling could be hard
  o Development tools
    • Basic performance analysis
    • Business activity monitoring

Page 5

Contributions

Multi-aspect modelling of complex (IT) systems
  o Custom, general process and resource model

Qualitative error propagation analysis
  o Root cause and sensitivity analysis
  o Using a finite-domain constraint satisfaction problem

Runtime process monitoring

Page 7

Approach

Figure: approach overview. Elements: Process model, Resource model, Annotation model, System model, Error Propagation Analysis, Monitoring. Edge labels: [New Monitoring Rule], [New Constraint]. Callouts: "Physical and Logical"; "Can be imported"; "Failure modes"; "Error propagation behavior"; "Extra annotations for analysis".

Page 9

Motivational example

Design time analysis capabilities
  o SPOF (single point of failure) analysis
  o Process-level effects of resource faults
  o Propagating process errors to the resource layer

Page 10

Case study

Figure: case-study process in the Business Processes Layer, with a Client element. Activities: Form processing, Money takeover, Record transaction, Receipt, Manual laundering check, Perform full check, Flag & report, Pay to $. Decision points (Y/N branches): Large transaction?, Client checked earlier?, Laundering suspected?. A Timeout element is also shown. Legend: Activity, Execution Path.

Page 11

Process with resources

Figure: the case-study process extended with its supporting resources, drawn in three layers: Business Processes Layer, Supporting Applications Layer, Physical Resources Layer. Process elements: Client, Form processing, Money takeover, Record transaction, Receipt, Manual laundering check, Perform full check, Flag & report, Pay to $, Timeout; decision points Large transaction?, Client checked earlier?, Laundering suspected?. Resources: Customer & Account Identification, Cashier Module, Compliance DB, DB, Application Server cluster (AppServ1, AppServ2), AppServ3 VM, AppServ4, DB1, DB2, Backend Server 1, Backend Server 2, Backend Server 3, Single Hypervisor, Blade Server. Legend: Activity, Resource, Dependency, Execution Path.

Page 12

Single fault in physical layer

Figure: the case-study process and resource diagram (Business Processes Layer, Supporting Applications Layer, Physical Resources Layer; resource setup: Single Hypervisor, Blade Server), annotated for use case 1. Annotations shown: Single Fault 1, Outage 1 (three times), Stuck 1 (twice). Legend: annotation fields Resource Setup Identifier, Failure Mode, Use Case Id (example: "Outage 1"); Activity, Resource, Dependency, Execution Path.

Page 13

Effects of a single fault

Figure: the case-study process and resource diagram (resource setup: Virtualized HA Cluster, Blade Server Farm), annotated for use case 2. Annotations shown: Single Fault 2, Failover 2, Degraded 2 (twice), Delayed 2 (twice), Delay-incurred Cost 2 (twice). Legend: annotation fields Resource Setup Identifier, Failure Mode, Use Case Id (example: "Outage 1"); Activity, Resource, Dependency, Execution Path.

Page 14

Backwards error propagation

Figure: the case-study process and resource diagram (resource setup: Virtualized HA Cluster, Blade Server Farm), annotated for use case 3. Annotations shown: SQLInjected 3 (three times), OK 3 (three times). Legend: annotation fields Resource Setup Identifier, Failure Mode, Use Case Id (example: "Outage 1"); Activity, Resource, Dependency, Execution Path.

Page 15

Motivational example

Design time analysis capabilities
  o SPOF (single point of failure) analysis
  o Process-level effects of resource faults
  o Propagating process errors to the resource layer
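
The SPOF analysis named above can be illustrated with a small sketch. It is not the authors' tool: the redundancy groups below are hypothetical (the resource names are borrowed from the case study, but the dependency edges are assumed), and the check is a plain recursive availability test rather than a CSP.

```python
# Hypothetical setup (assumed, not taken from the slides): each element lists
# redundancy groups; it works if every group has at least one working member.
REQUIRES = {
    "Form processing": [["AppServ1", "AppServ2"]],  # clustered: one server suffices
    "Record transaction": [["DB"]],
    "AppServ1": [["Blade Server"]],
    "AppServ2": [["Blade Server"]],
    "DB": [["Blade Server"]],
    "Blade Server": [],
}


def works(element, failed):
    """True if the element is not failed and all its redundancy groups are served."""
    if element in failed:
        return False
    return all(
        any(works(member, failed) for member in group)
        for group in REQUIRES[element]
    )


def single_points_of_failure(activities):
    """Resources whose individual failure stops at least one activity."""
    resources = [e for e in REQUIRES if e not in activities]
    return sorted(
        r for r in resources
        if any(not works(a, {r}) for a in activities)
    )


spofs = single_points_of_failure(["Form processing", "Record transaction"])
print(spofs)
```

In this assumed setup the clustered application servers mask each other's failure, while the shared database and the single blade server do not.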

Page 17

Design time analysis

Error propagation rules
  o Through the process's execution path
  o Through dependencies

Translate the model to a constraint satisfaction problem (CSP)

The solutions of the CSP provide the results
  o Of root cause analysis
  o Of sensitivity analysis

Figure: approach overview (Process model, Resource model, Annotation model, System model, Error Propagation Analysis, Monitoring).
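
As a rough illustration of propagation through dependencies and of root cause analysis, the sketch below uses plain graph reachability instead of the CSP encoding the slides describe. The dependency edges are hypothetical; only the element names come from the case study.

```python
# Hypothetical dependency edges (assumed): element -> elements it depends on.
DEPENDS_ON = {
    "Record transaction": ["Cashier Module"],
    "Perform full check": ["Compliance DB"],
    "Cashier Module": ["AppServ1"],
    "Compliance DB": ["Backend Server 3"],
    "AppServ1": [],
    "Backend Server 3": [],
}


def propagate(faulty):
    """Forward propagation: all elements that transitively depend on a faulty one."""
    affected = set(faulty)
    changed = True
    while changed:  # iterate until no new element becomes affected
        changed = False
        for element, deps in DEPENDS_ON.items():
            if element not in affected and any(d in affected for d in deps):
                affected.add(element)
                changed = True
    return affected


def root_causes(symptom):
    """Single faults (other than the symptom itself) whose effects reach the symptom."""
    return sorted(
        e for e in DEPENDS_ON
        if e != symptom and symptom in propagate({e})
    )


print(propagate({"AppServ1"}))
print(root_causes("Perform full check"))
```

Sensitivity analysis could reuse `propagate` by comparing the affected sets of different candidate faults; that extension is left out here.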

Page 18

What is CSP?

Constraint satisfaction problem
  o Problems defined mathematically
    • A set of variables
    • Constraints between them

A general solver can find the solution
  o A single variable assignment or a list of assignments
  o All constraints satisfied
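
To make the definition concrete, here is a minimal brute-force solver for finite-domain problems. It is a sketch for illustration only; a real solver would prune the search instead of enumerating every combination, and the toy variables `db` and `app` are assumptions, not part of the presented model.

```python
from itertools import product


def solve_csp(domains, constraints):
    """Return every assignment that satisfies all constraints.

    domains: dict mapping variable name -> list of allowed values
    constraints: list of functions taking an assignment dict and returning bool
    """
    names = list(domains)
    solutions = []
    for values in product(*(domains[n] for n in names)):
        assignment = dict(zip(names, values))
        if all(check(assignment) for check in constraints):
            solutions.append(assignment)
    return solutions


# Hypothetical toy problem: two elements, each either "ok" or "failed".
domains = {"db": ["ok", "failed"], "app": ["ok", "failed"]}
constraints = [
    # The application fails whenever the database it depends on fails.
    lambda a: a["db"] != "failed" or a["app"] == "failed",
]

solutions = solve_csp(domains, constraints)
for s in solutions:
    print(s)
```

The only assignment ruled out is a failed database with a healthy application, so three of the four combinations remain.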

Page 19

Sample mapping to CSP

Figure: Business Processes Layer with two activities, Customer login and Form processing, joined by an execution path. Legend: Activity, Execution Path.

(Customer_login_run)

(Form_processing_run)

Page 20

Sample mapping to CSP

Figure: Business Processes Layer with two activities, Customer login and Form processing, joined by an execution path. Legend: Activity, Execution Path.

(Customer_login_delay & Customer_login_run)

(Form_processing_delay)
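
One way to read the two sample mappings is as implications along the execution path. The connective between the parenthesised expressions is not visible in the transcript, so the implication reading below is an assumption; the variable names are the ones shown on the slides.

```python
from itertools import product

VARS = [
    "Customer_login_run", "Customer_login_delay",
    "Form_processing_run", "Form_processing_delay",
]


def implies(p, q):
    return (not p) or q


def consistent(a):
    """Assumed constraints: run and delay propagate along the execution path."""
    return (
        implies(a["Customer_login_run"], a["Form_processing_run"])
        and implies(
            a["Customer_login_delay"] and a["Customer_login_run"],
            a["Form_processing_delay"],
        )
    )


def solutions(fixed):
    """Enumerate all Boolean assignments that extend `fixed` and are consistent."""
    free = [v for v in VARS if v not in fixed]
    result = []
    for values in product([False, True], repeat=len(free)):
        assignment = {**fixed, **dict(zip(free, values))}
        if consistent(assignment):
            result.append(assignment)
    return result


# Forward question: login ran and was delayed; what follows downstream?
forward = solutions({"Customer_login_run": True, "Customer_login_delay": True})
# Backward question: form processing was not delayed although login ran.
backward = solutions({"Customer_login_run": True, "Form_processing_delay": False})
print(forward)
print(backward)
```

Fixing different variables turns the same constraint set into either a forward (effect) query or a backward (cause) query, which is the appeal of the CSP formulation.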

Page 22

Runtime process monitoring

Runtime monitoring based on the same model

Rule-based online event processing
  o Events captured during execution
  o Each time a rule is satisfied
    • A notification can be recorded
    • Rule-specific process metrics are updated

Coverage checks

Annotation-based rule synthesis

Figure: approach overview (Process model, Resource model, Annotation model, System model, Error Propagation Analysis, Monitoring).
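
The rule-based processing described above could look roughly like the following sketch. The event fields (`activity`, `duration_s`, `status`), the thresholds and the rule names are all hypothetical, and `unfired_rules` is only one possible reading of a coverage check; the prototype's actual rule language is not shown in the slides.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    matches: Callable[[dict], bool]  # predicate evaluated on a single event


class Monitor:
    def __init__(self, rules):
        self.rules = rules
        self.notifications = []   # (rule name, activity) pairs
        self.metrics = Counter()  # number of hits per rule

    def on_event(self, event):
        """Evaluate every rule on the event; record a notification and update metrics."""
        for rule in self.rules:
            if rule.matches(event):
                self.notifications.append((rule.name, event["activity"]))
                self.metrics[rule.name] += 1

    def unfired_rules(self):
        """Rules that never matched: a crude, assumed form of coverage check."""
        return [r.name for r in self.rules if self.metrics[r.name] == 0]


# Hypothetical rules and execution log.
monitor = Monitor([
    Rule("delayed", lambda e: e["duration_s"] > 30),
    Rule("timeout", lambda e: e["status"] == "timeout"),
    Rule("outage", lambda e: e["status"] == "failed"),
])
log = [
    {"activity": "Form processing", "duration_s": 12, "status": "done"},
    {"activity": "Perform full check", "duration_s": 45, "status": "done"},
    {"activity": "Manual laundering check", "duration_s": 90, "status": "timeout"},
]
for event in log:
    monitor.on_event(event)

print(monitor.notifications)
print(dict(monitor.metrics))
print(monitor.unfired_rules())
```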

Page 23

Architecture of the prototype

• Process Model • Resource Model • Fault model

• Process Execution Log

• Diagnostic Rules • Propagation Rules • Tagging

• Dependability bottleneck • Process hotspots

• Runtime diagnostic metrics • Runtime alerts

Page 25

Future work

System model and fault model "libraries"

Hierarchical modelling

Hierarchical/incremental CSP evaluation

Uncertain failure modes

Back-annotation of monitoring results
  o Qualitative abstraction

Precise modelling frontend

Connection with optimisation methods

Page 26

Conclusion

Design time analysis of business processes
  o With the use of a resource model
  o Root cause analysis
  o Determine weak points

Rule-based runtime diagnostics
  o Process monitoring based on event processing
  o Rule synthesis
  o Coverage test