
1

Model-based Methods to Make Distributed Services Fault-Tolerant and Dependable

Humberto Nicolás Castejón Martínez

Institutt for Telematikk, NTNU

2

Outline

• Definition of Distributed System and Service

• Characteristics of Model-Driven Development (MDD)

• Dependability overview

◊ Means to achieve dependability

◊ Benefits from MDD

◊ Literature approaches

• Summary

3

Distributed Systems and Services

• A Distributed System consists of separate autonomous components that operate concurrently and interact with each other, by message passing, in order to provide some service to the system’s environment/end-users

◊ E.g. Public Switched Telephone Network (PSTN)

• A Service is an identified functionality, with value for the end-users of the system, that results from a collaboration between components of the system

4

“Elaboration” Development

Problem domain/Requirements

Developing a service amounts to writing executable code that fulfills the user requirements

• Incomplete high-level descriptions of functionality

• Source code becomes the only complete view of the system, and the only one maintained

Implementation

5

Tackling Development Complexity

Two golden rules for tackling development complexity:

• Separation of concerns:

Identify aspects that are as independent as possible and describe them separately.

• Conceptual abstraction:

Replace low-level concepts representing technical detail by more high-level abstract concepts better suited to describe and understand the problem at hand

Use models!

6

Model Driven Development

• The goal is to reduce the gap between problem and implementation domains through the use of models describing the system at multiple levels of abstraction and from different perspectives, and through automated techniques for model transformation and analysis

[Figure: Implementation Oriented vs. Model Oriented development. Labels: Problem domain/Requirements, Specification Models, Design Models, V&V, Automatic Model Transformation, Automatic Code Generation]

7

Dependability

• Dependability of a system is

“the ability to avoid service failures that are more frequent and more severe than is acceptable” [Avizienis2004]

• Dependability is a property of the system in its environment!

8

Dependability Tree

• Dependability

◊ Attributes: Availability, Reliability, Safety

◊ Threats: Faults, Errors, Failures

◊ Means: Fault Prevention, Fault Removal, Fault Tolerance, Fault Prediction

IFIP WG 10.4

9

Dependability Attributes

• A dependable system must be

◊ Available: ready to be used when needed

◊ Reliable: works properly and continuously in a time interval

◊ Safe: operates without catastrophic consequences on the environment

• More emphasis on one or another attribute depending on the particular system/service
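The slide defines availability and reliability qualitatively. As a rough illustration only, the sketch below uses two common textbook measures that are not taken from the slides: steady-state availability as MTTF / (MTTF + MTTR), and reliability over an interval under an assumed constant failure rate (exponential model). The function and parameter names are hypothetical.

```python
import math


def steady_state_availability(mttf_hours: float, mttr_hours: float) -> float:
    """Fraction of time the system is ready to be used: MTTF / (MTTF + MTTR)."""
    return mttf_hours / (mttf_hours + mttr_hours)


def reliability(failure_rate_per_hour: float, t_hours: float) -> float:
    """Probability of continuous correct service over [0, t].

    Assumption: constant failure rate, i.e. R(t) = exp(-lambda * t).
    """
    return math.exp(-failure_rate_per_hour * t_hours)


# Hypothetical numbers: fails on average every 999 h, takes 1 h to repair.
availability = steady_state_availability(999.0, 1.0)
one_week_reliability = reliability(1.0 / 999.0, 168.0)
```

The two measures can diverge: a system that fails often but recovers quickly may be highly available and still unreliable over long intervals.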

10

Dependability Threats

• Failure: The delivered service deviates from the correct service, as observed by the end-user

◊ Non-compliance with the specification

◊ Non-compliance with the requirements/user’s needs

• Error: Deviation from correctness in the internal system state that may lead to failure

• Fault: The cause of an error

11

Dependability Threats: Causality Chain

... → Fault → Error → Failure → Fault → Error → Failure → ...

[Figure: error propagation between Component C1 and Component C2. Labels: Internal Dormant Fault, External Fault, Error, Error propagation]

Failure of C1 = Fault for C2
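The causality chain can be illustrated with a toy simulation. This is a minimal sketch under strong simplifying assumptions: the `Component` class and its fields are hypothetical, and every error is assumed to reach the service interface, whereas in general an error only may lead to a failure.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    has_dormant_fault: bool = False
    state_is_erroneous: bool = False  # an error is a corrupted internal state

    def serve(self, input_is_faulty: bool = False) -> bool:
        """Return True if the delivered service is correct, False on failure."""
        # A dormant internal fault is activated by use, or an external
        # fault arrives as a bad input; either one produces an error.
        if self.has_dormant_fault or input_is_faulty:
            self.state_is_erroneous = True
        # Simplification: the error always propagates to the service
        # interface and becomes a failure.
        return not self.state_is_erroneous


c1 = Component("C1", has_dormant_fault=True)
c2 = Component("C2")

c1_ok = c1.serve()
# The failure of C1 is an external fault from the point of view of C2.
c2_ok = c2.serve(input_is_faulty=not c1_ok)
```

C2 contains no fault of its own, yet it fails because it consumes the failed service of C1.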

12

Fault Prevention and Removal with MDD

• Fault Prevention: Avoid the occurrence or introduction of faults

◊ Abstraction and separation of concerns: better understanding, hence fewer specification mistakes

◊ Automatic model transformations and code generation: compliance between source and target models

• Fault Removal: Reduce the number or severity of faults

◊ Formal model verification and validation

◊ Model animation/simulation

◊ Automatic generation of test cases. E.g. The “Model-based Generation of Tests for Dependable Embedded Systems” (MOGENTES) EU project
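Automatic test generation from a model can be sketched as follows. This is an illustrative example only, not the method of MOGENTES or of any cited work: the state machine, its states, and its events are hypothetical, and the coverage criterion chosen here is one test per transition.

```python
from collections import deque

# Hypothetical service model: (state, event) -> next state.
TRANSITIONS = {
    ("Idle", "request"): "Connecting",
    ("Connecting", "accept"): "Connected",
    ("Connecting", "timeout"): "Idle",
    ("Connected", "release"): "Idle",
}


def shortest_event_path(transitions, start, target):
    """Breadth-first search for the shortest event sequence from start to target."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, events = queue.popleft()
        if state == target:
            return events
        for (src, event), dst in transitions.items():
            if src == state and dst not in visited:
                visited.add(dst)
                queue.append((dst, events + [event]))
    return None  # target is not reachable from start


def generate_transition_tests(transitions, initial):
    """One test (event sequence, expected final state) per reachable transition."""
    tests = []
    for (src, event), dst in transitions.items():
        prefix = shortest_event_path(transitions, initial, src)
        if prefix is not None:
            tests.append((prefix + [event], dst))
    return tests


tests = generate_transition_tests(TRANSITIONS, "Idle")
```

Each generated pair can be replayed against the implementation: feed the events in order and compare the resulting state with the expected one.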

13

Fault Prediction with MDD

• Fault Prediction: Estimate the present and future number of faults, and their likely consequences, by means of qualitative and quantitative evaluation

• Traditional approach: based on the system description, a dependability expert builds one or more dependability models

◊ Big gap between system design process and dependability modeling and analysis

• MDD approach: A dependability expert annotates the system design models with dependability-related information, and dependability models are automatically constructed
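The idea of deriving a dependability model from an annotated design model can be sketched in a few lines. Everything here is an assumption made for illustration: the `Node` annotation format, the two-level fault tree shape, and the independence of basic events. It does not reproduce any of the transformations from the literature.

```python
from dataclasses import dataclass
from math import prod


@dataclass(frozen=True)
class Node:
    """Design-model element with hypothetical dependability annotations."""
    name: str
    failure_probability: float  # supplied by the dependability expert
    replicas: int = 1           # redundancy read from the deployment model


def build_fault_tree(nodes):
    """Derive a two-level fault tree from the annotated model.

    The system fails (OR) if any node group fails; a group fails (AND)
    only if all of its replicas fail.
    """
    return ("OR", [("AND", [n.failure_probability] * n.replicas) for n in nodes])


def evaluate(tree):
    """Probability of the top event, assuming independent basic events."""
    if isinstance(tree, (int, float)):
        return float(tree)
    gate, children = tree
    probabilities = [evaluate(child) for child in children]
    if gate == "AND":
        return prod(probabilities)
    return 1.0 - prod(1.0 - p for p in probabilities)  # OR gate


model = [Node("web", 0.1, replicas=2), Node("db", 0.05)]
system_failure_probability = evaluate(build_fault_tree(model))
```

The designer only edits the annotations on `model`; the fault tree is rebuilt and re-evaluated automatically, which is the gap the MDD approach aims to close.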

14

Some MDD Approaches for Fault Prediction

• [Addouche2006] - Extended UML state machines and communication diagrams are converted into Probabilistic Timed Automata for verification of probabilistic temporal properties related to the dependability of real-time systems

• [Huszerl2002] - UML state machines annotated with timing and probabilistic properties are transformed into Stochastic Reward Nets (SRNs)

• [Leangsuksun2003] - UML deployment models are mapped into Fault Tree and Markov Chain models for the detection of hardware failures

• [Pai2002] and [Majzik2002] transform annotated UML structural diagrams (e.g. class and deployment diagrams) into Dynamic Fault Trees and Timed Petri Nets, respectively

15

Fault Tolerance

• Fault Tolerance: Deliver correct service despite the occurrence of faults

• Mainly achieved by means of redundancy (in hardware and software)

• Three types of redundancy

◊ Static Redundancy: Tries to mask a fault by using redundant components/services

◊ Dynamic Redundancy: Based on error detection and error recovery

◊ Hybrid Redundancy: Combination of static and dynamic redundancy

16

Dynamic Redundancy

• Once an error is detected, appropriate actions are taken to return the system to a valid state

• Two types of error recovery

◊ Backward error recovery (BER): Restores the system to a previous valid state (i.e. to a saved recovery point)

• Can be used to mask unanticipated faults

• Not useful with highly interactive systems

◊ Forward error recovery (FER): Continues from an erroneous state by making selective corrections to the system state (e.g. by means of exception handling mechanisms)

• Depends on accurate identification of the cause of errors
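The two recovery styles can be contrasted on a toy example. This is a minimal sketch: the `Inventory` class, the deliberately faulty operation, and the consistency check used for error detection are all hypothetical.

```python
import copy


class Inventory:
    def __init__(self):
        self.items = []
        self.count = 0  # redundant field, must always equal len(items)

    def is_consistent(self):
        """Error detection: check the redundancy in the state."""
        return self.count == len(self.items)


class InconsistentState(Exception):
    pass


def faulty_add(inv, item):
    """Operation with a deliberate fault: it forgets to update the counter."""
    inv.items.append(item)


def add_with_backward_recovery(inv, item):
    checkpoint = copy.deepcopy(inv.__dict__)  # save a recovery point
    faulty_add(inv, item)
    if not inv.is_consistent():
        inv.__dict__ = checkpoint  # roll back to the previous valid state
        return False
    return True


def add_with_forward_recovery(inv, item):
    try:
        faulty_add(inv, item)
        if not inv.is_consistent():
            raise InconsistentState()
    except InconsistentState:
        # Selective correction: only the field known to be wrong is fixed,
        # and execution continues from the corrected state.
        inv.count = len(inv.items)
    return True
```

Backward recovery needs no knowledge of what went wrong, but the item is lost. Forward recovery keeps the item, but only because the handler knows which part of the state is erroneous.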

17

Fault Tolerance with MDD

• Software-based fault-tolerance mechanisms are just software, so their implementation can certainly benefit from the MDD approach

◊ Separate models/views for normal behavior and fault-tolerant mechanisms + model composition/weaving

◊ Refinement for adding e.g. exception handling behavior

◊ Early test of fault-tolerance solutions

◊ Deployment models for specification of hardware redundancy and static software redundancy
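A code-level analogy of keeping normal behavior and a fault-tolerance mechanism separate, and then composing them, is a decorator. This is only an analogy to model weaving, not a modeling technique from the cited literature: `with_retry` and `lookup_subscriber` are hypothetical names, and the sketch assumes `max_attempts >= 1` and that `ConnectionError` marks a transient fault.

```python
import functools


def with_retry(max_attempts):
    """Fault-tolerance concern, described independently of any service logic."""
    def weave(normal_behavior):
        @functools.wraps(normal_behavior)
        def composed(*args, **kwargs):
            last_error = None
            for _ in range(max_attempts):
                try:
                    return normal_behavior(*args, **kwargs)
                except ConnectionError as exc:
                    last_error = exc  # transient fault: try again
            raise last_error  # all attempts failed
        return composed
    return weave


calls = {"n": 0}


def lookup_subscriber(number):
    """Normal behavior only; fails twice to simulate transient faults."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient network fault")
    return f"subscriber:{number}"


# Composition step: the two concerns are woven together here.
resilient_lookup = with_retry(3)(lookup_subscriber)
result = resilient_lookup("555")
```

The normal behavior can be tested on its own, and the retry policy can be changed or replaced without touching it.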

18

Some MDD approaches for Fault Tolerance

• [Reddy2005] - Fault-tolerant mechanisms are described by aspect models (with parameterized class and sequence diagrams) and automatically composed with the overall system model. Analysis of the integrated model is also provided.

• [Domokos2005] - Aspect oriented modeling is used to design the architecture of fault tolerant systems. A model weaver generates both an integrated design model and an associated dependability model based on SPNs.

• [Bucchiarone2007] - System architecture is modeled with a UML 2 component diagram, following the pattern dictated by the idealized fault tolerant component, i.e. with differentiated parts for normal and exceptional behaviors. Each part has its own state machine (in an extended version). Test cases are automatically created.

19

Modeling Dependable Sys. with Patterns

[Figure, from [Sand2006]. Labels: Pattern Repository, Static System Model, Binding, Automatic Expansion, Concrete Scenarios, Dynamic System Model, Verification]

20

Summary

• Model Driven Development can positively contribute to all four means of achieving dependability

◊ By helping to reduce the number of faults in services

◊ By automatically detecting service faults through model V&V

◊ By allowing an integrated and precise development of normal and fault-tolerant behaviors at different levels of abstraction

◊ By automatically constructing dependability models through model transformations

• Some approaches that exploit MDD for achieving dependability already exist, but more work has to be done (since this is also true for MDD itself)

Thank you!

21

References

• [Addouche2006] – N. Addouche, C. Antoine, J. Montmain, “Methodology for UML Modeling and Formal Verification of Real-Time Systems”, Intl. Conf. on Computational Intelligence for Modelling Control and Automation, and Intl. Conf. on Intelligent Agents, Web Technologies and Internet Commerce (CIMCA-IAWTIC'06), IEEE CS, 2006

• [Avizienis2004] – A. Avizienis, J-C. Laprie, B. Randell, C. Landwehr, “Basic Concepts and Taxonomy of Dependable and Secure Computing”, IEEE Transactions on Dependable and Secure Computing, vol. 1, no. 1, 2004

• [Bucchiarone2007] – A. Bucchiarone, H. Muccini and P. Pelliccione, “Architecting Fault-tolerant Component-based Systems: from requirements to testing”, Electronic Notes in Theoretical Computer Science, vol. 168, Elsevier, 2007

• [Domokos2005] – P. Domokos and I. Majzik, “Design and Analysis of Fault Tolerant Architectures by Model Weaving”, 9th IEEE Intl. Symposium on High-Assurance Systems Engineering (HASE’05), IEEE CS, 2005

• [Huszerl2002] – G. Huszerl, I. Majzik, A. Pataricza, K. Kosmidis, M. Dal Cin, “Quantitative Analysis of UML Statechart Models of Dependable Systems”, The Computer Journal, Vol 45(3), May 2002

• [Leangsuksun2003] – C. Leangsuksun, H. Song, L. Shen, “Reliability Modeling Using UML”, Int. Conf. on Software Engineering Research and Practice (SERP'03), CSREA Press, 2003

• [Majzik2002] – I. Majzik, A. Pataricza, A. Bondavalli, “Stochastic Dependability Analysis of System Architecture Based on UML Models”, ICSE 2002 Workshop on Software Architectures for Dependable Systems, LNCS 2677, Springer, 2002

• [Pai2002] – G. J. Pai, J. B. Dugan, “Automatic Synthesis of Dynamic Fault Trees from UML System Models”, 13th Intl. Symposium on Software Reliability Engineering (ISSRE’02), IEEE CS, 2002

22

References (II)

• [Reddy2005] – R. Reddy, R. France, G. Georg, “An Aspect Oriented Approach to Analyzing Dependability Features”, Workshop on Aspect Oriented Modeling at Intl. Conf. on Aspect Oriented Software Development (AOM-AOSD 2005)

• [Sand2006] – M. Sand, “Patternbasierte Verifikation objektorientierter Modelle - Methodik, Semantik und Verfahren”, PhD Thesis, University of Erlangen-Nürnberg, 2006

23

Static Redundancy: N-version Programming

• The elements of n-version programming are:

◊ Variants: modules with different design but providing the same service

◊ Controller: responsible for the coordinated execution of the variants

◊ Adjudicator: responsible for checking the results offered by the variants

• Useless if not combined with hardware redundancy!

[Figure: a Controller coordinates Variant 1, Variant 2, ..., Variant n, whose results are checked by an Adjudicator]
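The three elements can be sketched as follows. This is a minimal illustration, assuming a majority-vote adjudicator (other adjudication schemes exist) and three hypothetical variants that run sequentially in one process, one of them deliberately faulty.

```python
from collections import Counter


def variant_1(x):
    return x * x


def variant_2(x):
    return x ** 2


def variant_3(x):
    return x * x + 1  # deliberately faulty design


def adjudicate(results, n_variants):
    """Majority vote; returns None when no result has a strict majority."""
    if not results:
        return None
    value, votes = Counter(results).most_common(1)[0]
    return value if votes > n_variants / 2 else None


def run_n_version(variants, x):
    """Controller: executes every variant and hands the results to the adjudicator."""
    results = []
    for variant in variants:
        try:
            results.append(variant(x))
        except Exception:
            pass  # a crashed variant contributes no vote
    return adjudicate(results, len(variants))


masked = run_n_version([variant_1, variant_2, variant_3], 3)
undecided = run_n_version([variant_1, variant_3], 3)
```

With three variants the faulty result is outvoted and masked. With only two disagreeing variants the adjudicator cannot decide, so the fault is detected but not masked.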

24

Coordinated Atomic Actions (CAAs)

• Support for error recovery of multiple interacting components in a distributed system

• A CAA is designed as a set of participants cooperating inside the CAA and a set of resources accessed by those participants

• The CAA starts when all participants have been activated and finishes when all of them reach the end of the CAA (i.e. produce a normal outcome)

• If an error is detected inside a CAA (i.e. a participant raises an exception), all participants are involved in recovery

• If recovery is successful, the action completes normally. Otherwise, a failure exception is propagated to the containing CAA
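The CAA life cycle described above can be approximated in code. This is a rough sketch under loud assumptions: participants are plain functions executed sequentially rather than concurrent components, shared resources are a dictionary, and all class and function names are hypothetical.

```python
class CAAFailure(Exception):
    """Propagated to the containing CAA when recovery does not succeed."""


class CoordinatedAtomicAction:
    def __init__(self, name, participants, handlers):
        self.name = name
        self.participants = participants  # role name -> callable(resources)
        self.handlers = handlers          # role name -> callable(resources, exc)

    def run(self, resources):
        try:
            # Normal outcome: every participant reaches the end of the action.
            for participant in self.participants.values():
                participant(resources)
        except Exception as exc:
            # An exception raised by one participant involves all of them
            # in recovery.
            try:
                for handler in self.handlers.values():
                    handler(resources, exc)
            except Exception as recovery_error:
                raise CAAFailure(self.name) from recovery_error


def debit(resources):
    resources["balance"] -= 50
    if resources["balance"] < 0:
        raise ValueError("overdraft")


def log(resources):
    resources["log"].append("debited")


def undo_debit(resources, exc):
    resources["balance"] = 0


def note_recovery(resources, exc):
    resources["log"].append("recovered")


shared = {"balance": 10, "log": []}
action = CoordinatedAtomicAction(
    "pay",
    {"debit": debit, "log": log},
    {"debit": undo_debit, "log": note_recovery},
)
action.run(shared)  # recovery succeeds, so the action completes normally
```

Because `CAAFailure` is an ordinary exception, a nested action that fails inside a participant triggers recovery in the containing action in the same way.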

25

Dependability Annotations in UML models

• Several proposals in the literature to annotate UML models with dependability-related information

◊ Each one covers only certain dependability aspects

• UML profile for Modeling Quality of Service & Fault Tolerance Characteristics & Mechanisms (QoS&FT)

◊ Flexible, but heavy-weight mechanisms

◊ May require the creation of extra objects just for annotation purposes

• Dependability Analysis Modelling (DAM) profile [Bernardi2008]

◊ Emphasis on quantitative analysis

◊ Aims at unifying best practices reported in literature

◊ Compliant with MARTE profile

• Dependability-specific data types defined with MARTE’s mechanisms (i.e. Non-Functional Properties framework and Value Specification Language)

• Specializes concepts from MARTE’s generic quantitative analysis model

26

DAM’s Conceptual Model

• Represents the main dependability concepts from literature

• System Core: Concepts for the description of the system to be analyzed, and for the description of redundancy structures

• Threats: Concepts for modeling threats and their relationships

• Maintenance: Concepts for modeling repair/recovery actions

27

DAM’s Conceptual Model: Core

• Structural view: System as set of components interconnected via connectors

• Behavioral view:

◊ System delivers high-level services (i.e. behavior as observed by the users) upon user service requests

◊ Components interact to deliver the high-level services, by providing and requesting basic services to each other

◊ Service = sequence of steps (i.e. component states and actions)

28

DAM’s Conceptual Model: Redundancy

• Represents redundancy structures that may characterize a system

• Components can play different roles within a redundant structure:

◊ Variants: modules with different design but providing the same service, and allocated over different spares

◊ Controller: responsible for the coordinated execution of the variants

◊ Adjudicator: responsible for checking the results offered by the variants

29

DAM’s Conceptual Model: Threats
