Development of safety critical CPS: some applicable dependability concepts & safety assessment techniques from the aeronautic domain
Christel Seguin, ONERA/DCSD
Direction - Conférence
Lecture scope: Cyber Physical Systems & Dependability
• CPS:
  • "complex engineering systems that rely on the integration of physical, computation, and communication processes to function"
  • Holistic view addressed in the lecture
Presentation scope: Cyber Physical Systems & Dependability
• Dependability concepts: [Avizienis-al2004]
  • "ability to deliver service that can justifiably be trusted"
  • It encompasses cyber security and safety
Presentation scope: Cyber Physical Systems & Dependability
• Dependability practices
  • are application dependent
  => Lessons learnt from aeronautics
  • Certification process
  • Safety assessment methods & tools
General dependability concepts
Generic system definition
• System =
  • a set of interacting items, forming an integrated whole
  • examples of various complexity:
    • air traffic control, aircraft + pilot, flight-control system, computers, sensors, actuators ...
[Figure: system hierarchy]
  • aircraft: A380, Rafale, B787
  • aircraft systems: flight control, hydraulic, electrical, flight warning, …
  • equipment: flight control computers, flight warning computers, …
A leading « simple » example: A320 hydraulic system
• Architecture overview:
  • About 20 components of 8 classes: reservoirs, pumps, pipes, valves, controllers
  • Safety barriers: 3 redundant independent lines, valves for load management and fault containment
[Figure: A320 hydraulic architecture with three lines (green, yellow, blue). Component labels: eng1, eng2, elec1, elec2, RAT, EDPg, EDPy, EMPy, EMPb, PVg, PVy, PVb, Pdistg, Pdisty, Pdistb, NPdistg, NPdisty, NPdistb, PTU, rsvg, rsvy, rsvb]
Legend:
  • EDP: Engine Driven Pump
  • EMP: Electrical Motor Pump
  • RAT: Ram Air Turbine
  • PV: Priority Valve
  • PTU: Power Transfer Unit
  • Pdist: Priority distribution
  • NPdist: Non-Priority distribution
  • rsv: Reservoir
  • eng1: Engine #1
  • elec1: From electrical system side #1
A more complex example: Remotely Piloted Aircraft
[Figure: remotely piloted aircraft system. Labels: UAV, Avionics, Perception, Control, Communication, On-ground Pilot, Air Traffic Control, Other users]
• RPAS: a challenging system mixing organizational, human and technical concerns
System from dependability perspective
Failure: deviation of the service provided by the considered system with respect to the expectations.
Failure rate: the probability of failure per unit of time of items in operation.
Failure mode: the way in which a failure manifests itself (e.g. fail-silent, erroneous value, …).
Fault: cause of a potential failure.
Undesirable event: any adverse event or situation that could be due to the considered system and its potential failures. Also called Failure Condition.
Bathtub curve of the failure rate
Hydraulic example
• Nominal function: hydraulic power delivery
• Failure: no delivery of hydraulic power.
• Expected failure rate of "no delivery of hydraulic power" is less than 10^-9 per flight hour.
• Failure modes:
  • Total loss of delivery of hydraulic power (loss of the three lines)
  • Partial loss of delivery of hydraulic power (loss of one line)
  • Loss of delivery of hydraulic power on one pipe
  • Loss of valve control
  • Inadvertent (untimely) valve closure
• Faults
  • For pipe loss:
    • Primary (intrinsic) cause: pipe wear
    • Secondary (extrinsic) cause: pipe subjected to excessive fluid pressure
  • For inadvertent valve closure:
    • Controller design error
Dependability properties
• Dependability = [Avizienis-al2004] "ability to deliver service that can justifiably be trusted"
• It encompasses:
  • Reliability: continuity of correct service
  • Availability: readiness for correct service
  • Maintainability: ability to undergo modifications and repair
  • Safety: absence of catastrophic consequences on humans, equipment and the environment
  • Confidentiality: absence of unauthorized disclosure of information
  • ...
Dependability measures - 1
• Reminder:
  • Reliability: continuity of correct service
  • Availability: readiness for correct service
  • Maintainability: ability to undergo modifications and repair
• Mathematical definitions for a system S
  • R(t) = Prob(S non faulty during [0, t])
    function decreasing from 1 to 0 for t in [0, +∞[
  • A(t) = Prob(S non faulty at t)
    if the system cannot be repaired, R(t) = A(t)
  • M(t) = 1 – Prob(S non repaired during [0, t])
    function increasing from 0 to 1 for t in [0, +∞[
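The three measures can be illustrated with a minimal Python sketch. It assumes constant failure and repair rates (an exponential model that the slides do not state); the function and parameter names are hypothetical. Under that assumption R(t) = exp(-λt), M(t) = 1 - exp(-μt), and A(t) follows from a two-state working/failed model.

```python
import math

def reliability(t, failure_rate):
    """R(t): probability of no failure during [0, t] (assumed constant failure rate)."""
    return math.exp(-failure_rate * t)

def maintainability(t, repair_rate):
    """M(t): probability that a repair is completed within [0, t] (assumed constant repair rate)."""
    return 1.0 - math.exp(-repair_rate * t)

def availability(t, failure_rate, repair_rate):
    """A(t): probability that S works at time t, starting from a working state.

    Two-state model (working/failed) with constant rates, an assumption of this sketch.
    """
    total = failure_rate + repair_rate
    if total == 0.0:
        return 1.0  # the item never fails
    steady = repair_rate / total
    return steady + (failure_rate / total) * math.exp(-total * t)
```

With `repair_rate = 0` the availability expression reduces to `reliability(t, failure_rate)`, which matches the remark that R(t) = A(t) for a non-repairable system.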
Some other reliability attributes
• Development Assurance Level (DAL):
  • Level of care taken in the specification and the design of one item
  • Qualitative indicator: A, B, C, D, E
Standard safety assessment process for systems of civil aircraft
Overview of means to build safety
Source: Dassault Aviation
Processes: certification / safety assessment / V&V
[Figure: Safety assessment / Safety analysis process (ARP4754). Recoverable content:]
• Certification activity
  • Certification preparation process
    - Aircraft airworthiness requirement assignment
    - CRI for novelties and associated means of compliance
    - Certification plan
  • Certification readiness
  • Final certification. Acceptable means of compliance deliverables: safety dossier (safety synthesis / safety case, FHA, SSA, ASA, PRA, etc.)
• Safety assessment/analysis activity
  • Safety Plan
  • Functional Hazard Assessment (FHA): Failure Condition identification and allocation of safety objectives (qualitative and quantitative). Aircraft level FHA and System level FHA
  • Function and implementation oriented safety assessment: multi-system and system safety assessment
    - Multi systems (aircraft level): PASA/ASA (Preliminary and final Aircraft Safety Assessment)
    - System level: PSSA/SSA (Preliminary and final System Safety Assessment)
  • Common Cause Analysis (CCA)
  • Particular Risk Analysis (PRA): e.g. engine burst, bird impact.
  • Human Error Analysis (HEA): crew error, maintenance error.
  • Common Mode Analysis (CMA)
  • Zonal Safety Analysis (ZSA): installation review
• V/V and assurance process activities
  • Safety Validation/Verification and safety assurance process
• Safety Program Plan
Parts addressed in the lecture
[Figure: same Safety assessment / Safety analysis process (ARP4754) diagram as on the previous slide]
Safety requirements for aircraft systems - 1
• Another general definition of dependability:
  • "ability to avoid service failures that are more frequent and more severe than is acceptable"
• Meaning of
  • severe?
  • frequent?
  • acceptable?
  depends on the kind of system!
Safety requirements for aircraft systems - 2
• Interpretation of the definition when considering the safety of civil aircraft
• Kind of service failure = Failure Condition (FC) =
  • a condition with an effect on the aircraft and its occupants, both direct and consequential,
  • caused or contributed to by one or more failures,
  • considering relevant adverse operational or environmental conditions.
• In terms of commercial airplane airworthiness, a Failure Condition is classified according to the severity of its effects as defined in FAA AC 25.1309-1A or JAR AMJ 25.1309.
Classification table for the failure conditions of systems in civil aircraft - 1
severity class | effects description | DAL | acceptable frequency of FC
catastrophic | prevents continued safe flight and landing: aircraft loss and loss of crew and passengers | A | FC occurrence < 10^-9 per flight hour + no single failure leads to the FC
hazardous | large reduction in safety margins or functional capabilities, or physical distress or high crew workload, or serious or fatal injuries to a relatively small number of passengers | B | < 10^-7 per flight hour
Classification table for the failure conditions of systems in civil aircraft - 2
severity class | effects description | DAL | acceptable frequency of FC
major | significant reduction in safety margins or functional capabilities, or significant increase in crew workload, or discomfort to occupants possibly including injuries | C | < 10^-5 per flight hour
minor | no significant reduction in aircraft safety; may include: slight reduction in safety margins or functional capabilities, slight increase in crew workload, some inconvenience to the occupants | D | no objectives
no safety effect | | E |
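The classification table can be encoded as a small lookup structure. The following Python sketch copies the DAL letters and quantitative objectives from the two tables; the helper `meets_objective`, its parameters, and the use of `None` for "no quantitative objective" are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class SeverityClass:
    dal: str
    max_rate_per_fh: Optional[float]  # None: no quantitative objective (assumption of this sketch)
    forbid_single_failure: bool = False

# Values taken from the classification tables of the slides.
SEVERITY_CLASSES = {
    "catastrophic": SeverityClass("A", 1e-9, forbid_single_failure=True),
    "hazardous": SeverityClass("B", 1e-7),
    "major": SeverityClass("C", 1e-5),
    "minor": SeverityClass("D", None),
    "no safety effect": SeverityClass("E", None),
}

def meets_objective(severity, rate_per_fh, smallest_cut_set_size):
    """Hypothetical check of one failure condition against its safety objectives."""
    cls = SEVERITY_CLASSES[severity]
    if cls.max_rate_per_fh is not None and rate_per_fh >= cls.max_rate_per_fh:
        return False  # quantitative objective not met
    if cls.forbid_single_failure and smallest_cut_set_size < 2:
        return False  # a single failure leads to the failure condition
    return True
```

For example, a catastrophic failure condition estimated at 5e-10 per flight hour still fails the check if one of its cut sets contains a single failure.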
Hazard Classification for CPS ?
• Classification rules such as the ones used for civil aircraft are a must
• They depend strongly on the concept of operation of the system, and they do not exist for the newest systems
• E.g.: there is not yet a predefined classification of failure conditions for remotely piloted aircraft
  • A UAV crash in an unpopulated area is not catastrophic
  • The severity also depends on the aircraft energy (weight, speed, ...)
• More research is required to help define safety assessment procedures proportionate to the operational risk
  • Cf. for instance http://easa.europa.eu/newsroom-and-events/news/easa-presents-new-regulatory-approach-remotely-piloted-aircraft-rpas
Example of FC and related safety requirements
"Total loss of hydraulic power is classified Catastrophic; the probability rate of this failure condition shall be less than 10^-9/FH.
No single event shall lead to this failure condition."
• In practice, how to find all meaningful safety requirements for aircraft items (functions, systems, equipment)?
FHA - Principles
• Functional Hazard Assessment (FHA)
  • a systematic, comprehensive examination of functions to identify and classify FCs of those functions according to their severity
  • process:
    1. identification of all the functions associated with the system under study (internal functions and exchanged functions)
    2. identification and description of FCs associated with these functions, considering single and multiple failures in normal and degraded environments
    3. determination of the effects of the FC
    4. classification of FC effects on the aircraft (cat, haz, maj, min, no safety effect)
FHA – Model of FHA table extracted from ARP 4761
FHA and CPS
• FHA input is the overall functional system breakdown
  • It can be applied to CPS
• However, for highly integrated systems like CPS, it is also meaningful to review combinations of multiple failures or integration / control failures
  • Cf. STAMP http://sunnyday.mit.edu/STAMP-publications.html
  • Cf. Bowtie methods http://www.caa.co.uk/default.aspx?catid=2786&pagetype=90
PASA / PSSA – ASA/SSA
• A Preliminary Aircraft / System Safety Assessment (PASA/PSSA):
  • systematic examination of the proposed architecture(s)
  • to determine how failures could cause the Failure Conditions identified by the FHA.
  • Outputs:
    • consolidation of the safety requirements allocation, or
    • need for alternative protective strategies (e.g., partitioning, built-in test, monitoring, independence, safety maintenance task intervals, etc.).
• An Aircraft / System Safety Assessment (ASA/SSA):
  • systematic, comprehensive evaluation of the implemented aircraft and system(s) to show that the relevant safety requirements are satisfied.
  • Outputs: judgement on the compliance of the design with the safety requirements as defined in the PASA and PSSA.
• How to perform safety assessment? Models and tools for failure propagation analysis are defined in ARP 4761
Methods for failure propagation analysis
Analysis of failure propagation: FMEA principles
• Failure Mode and Effects Analysis (FMEA)
  • Inductive analysis of the local and global effects of all component failures
  • Example: effect of a leakage in the Green reservoir
[Figure: simplified hydraulic architecture. Green line: eng1, elec, EDPg, EMPg, rsvg, distg. Yellow line: eng2, EDPy, rsvy, disty. Blue line: RAT, elec, EMPb, rsvb, distb]
    • Local effect: loss of fluid
    • Global effect: loss of Green power
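The inductive style of FMEA can be sketched in Python: each component failure is injected alone and its global effect is read off a propagation rule. The rule below (the Green line delivers power if its reservoir and distribution are healthy and at least one of its two pumps works) is a simplifying assumption made for illustration, not the actual A320 logic; the function names are hypothetical.

```python
GREEN_LINE = ("rsvg", "EDPg", "EMPg", "distg")

def green_power(failed):
    """Assumed propagation rule: True when the Green line still delivers power."""
    pumps_ok = "EDPg" not in failed or "EMPg" not in failed
    return "rsvg" not in failed and "distg" not in failed and pumps_ok

def fmea_single_failures(components, delivers_power):
    """Return one (component, global effect) row per single failure."""
    rows = []
    for comp in components:
        lost = not delivers_power({comp})
        rows.append((comp, "loss of Green power" if lost else "no global effect"))
    return rows

if __name__ == "__main__":
    for comp, effect in fmea_single_failures(GREEN_LINE, green_power):
        print(f"{comp}: {effect}")
```

Under this assumed rule, a reservoir failure alone leads to the loss of Green power, while the loss of a single pump is masked by the other pump.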
FMEA of the hydraulic example
Fault Tree Analysis principles
• Fault tree analysis (FTA)
  • A tree decomposing a top level event (a system failure) to exhibit all its root causes
  • Mathematical model: a Boolean formula
  • Computation on the model:
    • Extraction of the minimal combinations of atomic faults leading to the top level event
    • Computation of the probability of occurrence of the top event knowing the probabilities of the tree leaves
[Figure: fault tree for "unannunciated loss of wheel braking"]
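Both computations can be sketched in Python on a toy tree. The tree below (top event = failure of a distribution pipe OR failure of both redundant pumps) and all names are hypothetical; the brute-force enumeration is only practical for small trees, and the probability computation assumes independent leaves.

```python
from itertools import combinations, product

COMPONENTS = ("pump_a", "pump_b", "dist")

def top_event(failed):
    """Boolean formula of the toy tree: True when the top level event occurs."""
    return "dist" in failed or ("pump_a" in failed and "pump_b" in failed)

def minimal_cut_sets(components, top):
    """Enumerate fault combinations by increasing size, keeping only minimal ones."""
    cuts = []
    for size in range(1, len(components) + 1):
        for combo in combinations(components, size):
            candidate = frozenset(combo)
            if top(candidate) and not any(cut <= candidate for cut in cuts):
                cuts.append(candidate)
    return cuts

def top_probability(components, top, p_fail):
    """Exact top event probability, summing over all failure states (independent leaves)."""
    total = 0.0
    for states in product((False, True), repeat=len(components)):
        failed = frozenset(c for c, f in zip(components, states) if f)
        if top(failed):
            prob = 1.0
            for c, f in zip(components, states):
                prob *= p_fail[c] if f else 1.0 - p_fail[c]
            total += prob
    return total
```

On this toy tree the minimal cut sets are {dist} and {pump_a, pump_b}; dedicated tools use far more efficient algorithms than this exhaustive enumeration.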
Drawbacks of the classical Safety Assessment Approaches
• Fault Tree, FMEA
  – Give failure propagation paths without referring explicitly to a commonly agreed system architecture / nominal behavior =>
    – Misunderstanding between safety analysts and designers
    – Potential discrepancies between working hypotheses
• Exhaustive consideration of all failure propagations becomes more and more difficult, due to:
  – increased interconnection between systems,
  – integration of functions that are often performed jointly across multiple systems,
  – increased interrelations between hardware and software.
Model based safety assessment rationales
• Goals
  • Propose formal failure propagation models closer to design models
  • Develop tools to
    • Assist model construction
    • Analyze complex models automatically
  • For various purposes
    • FTA, FMEA, Common Cause Analysis, Human Error Analysis, …
    • from the earliest phases of the system development
• Approaches
  • Extend design models (Simulink, SysML, AADL...) with failure modes
  • Build dedicated failure propagation models (Figaro, AltaRica, Slim...)
  • Transform into analyzable formalisms (Boolean formulae, automata, ...)
  • Develop specialized analysis tools
A leading example: the basic block component
• Let Block be a basic system component that
  • receives
    • one Boolean input I,
    • an activation signal A and
    • a resource signal R.
  • produces
    • a Boolean output O
• Block nominally performs the following transfer law
  • O is true iff I, A and R are true.
• Block may fail.
  • In this case, the output O is false.
• Initially, the block performs the nominal function
[Figure: Block with inputs I, A, R, output O and event fail]
Mode automaton of a Boolean block – Graphical view and concrete syntax
[Graphical view: mode "ok=true, O=(I and A and R)" --fail--> mode "ok=false, O=false"]
node Block
  flow
    O : Bool : out;
    I, A, R : Bool : in;
  state
    ok : Bool;
  event
    fail;
  trans
    ok |- fail -> ok := false;
  assert
    O = (I and A and R and ok);
  init
    ok := true;
edon
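For readers unfamiliar with the AltaRica notation, the same mode automaton can be mimicked in Python. This is a minimal sketch, not a translation produced by any tool: the class and method names are hypothetical, and each method is annotated with the clause of the node it mirrors.

```python
from dataclasses import dataclass

@dataclass
class Block:
    ok: bool = True  # init ok := true

    def output(self, i, a, r):
        """assert O = (I and A and R and ok)"""
        return i and a and r and self.ok

    def can_fail(self):
        """Guard of the transition: ok |- fail -> ..."""
        return self.ok

    def fail(self):
        """Effect of the transition: ok := false"""
        if not self.can_fail():
            raise ValueError("event 'fail' is not enabled: block already failed")
        self.ok = False

if __name__ == "__main__":
    block = Block()
    print(block.output(True, True, True))  # nominal mode
    block.fail()
    print(block.output(True, True, True))  # failed mode
```

The guard is checked explicitly so that firing `fail` twice is rejected, as it would be in the automaton, where the event is only enabled while `ok` holds.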
Kripke structure derived from the mode automaton of the Boolean block
Kripke structure = (Configurations, Data assignment in configurations, Relations between configurations)
Runs of a mode automaton are paths of the derived Kripke structure that start from one possible initial configuration
[Graphical view: the event fail links 8 possible initial configurations to 8 possible final configurations, e.g.:]
• (ok=true, I=A=R=O=true) --fail--> (ok=false, I=A=R=true, O=false)
• (ok=true, I=A=true, R=O=false) --fail--> (ok=false, I=A=true, R=O=false)
• (ok=true, I=A=R=O=false) --fail--> (ok=false, I=A=R=O=false)
• ...
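The count of 8 initial and 8 final configurations can be checked by enumeration. The Python sketch below assumes that the three inputs I, A and R are unconstrained Booleans and that O is fully determined by the assertion of the block; the function name is hypothetical.

```python
from itertools import product

def configurations():
    """All assignments of (ok, I, A, R, O) that satisfy O = (I and A and R and ok)."""
    configs = []
    for ok, i, a, r in product((True, False), repeat=4):
        o = i and a and r and ok
        configs.append({"ok": ok, "I": i, "A": a, "R": r, "O": o})
    return configs

# Initial configurations satisfy "init ok := true"; the others are reached after fail.
initial = [c for c in configurations() if c["ok"]]
final = [c for c in configurations() if not c["ok"]]

if __name__ == "__main__":
    print(len(initial), len(final))
```

Each group contains one configuration per valuation of the three inputs, and the output is false in every configuration of the failed mode.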
Internal operations on mode automata
• Parallel composition: free product of mode automata
  • preserves all states, variables, transitions, assertions
  • interleaving parallelism (only one transition at a time)
• Ex: two parallel Boolean blocks (Block 1 // Block 2)
[Graphical view: four modes linked by the events fail1 and fail2]
  • block1.ok=block2.ok=true: block1.O=(block1.I and block1.A and block1.R), block2.O=(block2.I and block2.A and block2.R)
  • block1.ok=false, block2.ok=true: block1.O=false, block2.O=(block2.I and block2.A and block2.R)
  • block1.ok=true, block2.ok=false: block1.O=(block1.I and block1.A and block1.R), block2.O=false
  • block1.ok=block2.ok=false: block1.O=false, block2.O=false
  • fail1 leads from block1.ok=true to block1.ok=false; fail2 leads from block2.ok=true to block2.ok=false
Internal operations on mode automata
• Interconnection: mapping an input of an automaton to an output of another automaton
  • preserves all states, variables, transitions, assertions
  • introduces new assertions: Block2.I = Block1.O for all pairs of connected interfaces
  • interleaving parallelism (only one transition at a time)
  • ! allowed only if variables are not circularly defined
• Ex: two blocks in series (Block 1 -> Block 2)
[Graphical view: four modes linked by the events fail1 and fail2]
  • block1.ok=block2.ok=true: block1.O=block2.I=(block1.I and block1.A and block1.R), block2.O=(block2.I and block2.A and block2.R)
  • block1.ok=false, block2.ok=true: block1.O=block2.I=false, block2.O=false
  • block1.ok=true, block2.ok=false: block1.O=block2.I=(block1.I and block1.A and block1.R), block2.O=false
  • block1.ok=block2.ok=false: block1.O=block2.I=false, block2.O=false
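The series interconnection and the interleaving of the two fail events can be sketched in Python. All names are hypothetical; the sketch hard-codes the connection assertion Block2.I = Block1.O and explores the modes reachable from the initial one, firing one event at a time.

```python
def block_output(i, a, r, ok):
    """Transfer law of one block: O = (I and A and R and ok)."""
    return i and a and r and ok

def series_outputs(i1, a1, r1, a2, r2, ok1, ok2):
    """Outputs of two blocks in series, with the new assertion Block2.I = Block1.O."""
    o1 = block_output(i1, a1, r1, ok1)
    o2 = block_output(o1, a2, r2, ok2)
    return o1, o2

def reachable_modes():
    """Explore (ok1, ok2) modes from (True, True) with interleaved fail1 / fail2 events."""
    start = (True, True)
    seen = {start}
    frontier = [start]
    edges = []
    while frontier:
        ok1, ok2 = frontier.pop()
        successors = []
        if ok1:
            successors.append(("fail1", (False, ok2)))
        if ok2:
            successors.append(("fail2", (ok1, False)))
        for event, target in successors:
            edges.append(((ok1, ok2), event, target))
            if target not in seen:
                seen.add(target)
                frontier.append(target)
    return seen, edges
```

The exploration yields the four modes of the graphical view, and `series_outputs` shows that a failure of Block 1 forces the output of Block 2 to false even when Block 2 itself is healthy.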
Safety Assessment Techniques
• Interactive simulation = user driven exploration of the Kripke structure
  → play simple combinations of failures (in the style of FMEA)
Safety Assessment Techniques
• OCAS Fault-Tree generation
  • The fault tree can be exported to other tools (Simtree, Arbor, ...) to compute minimal cut sets and probabilities
Safety Assessment Techniques
• OCAS Sequence Generator
  • Automatic generation of failure sequences that lead to the observation of the failure conditions
  • Limit on the number of failures to be considered

/*
orders(MCS('hydrau_total_loss.O.true')) =
orders product-number
3 35
4 56
total 91
end
*/
products(MCS('hydrau_total_loss.O.true')) =
{'EDPg.fail_loss', 'EMPb.fail_loss', 'disty.fail_loss'}
{'EDPg.fail_loss', 'EMPb.fail_loss', 'rsvy.fail_loss'}
{'EDPg.fail_loss', 'Elec2.fail_loss', 'disty.fail_loss'}
{'EDPg.fail_loss', 'Elec2.fail_loss', 'rsvy.fail_loss'}
...
MBSA application for CPSD
• Application range:
  • from detailed CPS architecture
Control architecture of one ONERA medium size UAV
MBSA application for CPSD
• Application range:
  • from detailed CPS architecture to their CONOPS
Procedure to ensure separation of aircraft trajectories in controlled airspace
Safety Assessment techniques and CPS
• CPS are complex integrated systems
  • MBSA is better suited to CPS than FTA / FMEA
• CPS support new operation concepts (CONOPS)
  • Safety assessment of the CONOPS is needed to identify accurate system safety requirements
  • Easier with MBSA
• Open issue:
  • efficient safety assessment of the largest CPS (e.g. very large networks of sensors).
Conclusion
• CPS are systems:
  • general dependability concepts remain applicable to CPS
• CPS uses are versatile whereas dependability standards are application dependent
  • Focus on the practices in aviation, where safety is a must
  • Key ideas / process steps can be adapted to CPS of other domains
  • => need for convergence of safety standards to build a dependability culture of CPS
• CPS are complex systems
  • The most advanced safety assessment techniques are needed to support the analysis of highly integrated systems
  • => need for more research to address very large, highly reconfigurable systems
Bibliography
[Avizienis-al2004] "Basic Concepts and Taxonomy of Dependable and Secure Computing", Algirdas Avizienis, Jean-Claude Laprie, Brian Randell, and Carl Landwehr, IEEE Transactions on Dependable and Secure Computing, vol. 1, no. 1, January-March 2004
[ARP4754-2010] "Certification Considerations for Highly-Integrated or Complex Aircraft Systems", Aerospace Recommended Practice 4754, SAE
[ARP4761-1996/2013?] "Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment", Aerospace Recommended Practice 4761, SAE
[Bieber-Seguin2013] "Safety Analysis of the Embedded Systems with the AltaRica Approach", in the book "Industrial Use of Formal Methods: Formal Verification", editor J.-L. Boulanger, DOI: 10.1002/9781118561829.ch3