2016-05-30 risk driven design

Risk Driven Development [email protected]

Upload: jaap-van-ekris

Post on 27-Jan-2017


Page 1: 2016-05-30 risk driven design

Risk Driven Development

[email protected]

Page 2: 2016-05-30 risk driven design

My Projects

Page 3: 2016-05-30 risk driven design

Reliability

Availability

Maintainability

Safety

Page 4: 2016-05-30 risk driven design

IEC 61508: Required activities for safety related systems

Page 5: 2016-05-30 risk driven design

Risk and the design process

• Each design step includes the refinement of the risk analysis

• Each design solution has to be measured against the risk analysis

• Constant design questions:

– Is the design balanced?

– Can it be made?

– Can it be done simpler?

Page 6: 2016-05-30 risk driven design

Simplicity is prerequisite for reliability

Edsger W. Dijkstra

Page 7: 2016-05-30 risk driven design

Risk management process

Slide 7 15 June 2016

Page 8: 2016-05-30 risk driven design

Failure definitions

• What can go wrong exactly?

• When do we consider the system to be failed?

Page 9: 2016-05-30 risk driven design

An example…

• Landing gear does not extend when commanded, and no error is indicated

• Landing gear extends spontaneously and irreversibly while travelling overseas

Page 10: 2016-05-30 risk driven design

Top-down vs. Bottom-Up analysis

• Bottom-up: structured brainstorm about everything that could happen given a specific scope

• Top-down: think about your biggest fears first, then find out what could cause them.

Page 11: 2016-05-30 risk driven design

FME(C)A: bottom-up thinking

• Failure Mode and Effect (Criticality) Analysis

• Reasoning from failure of the components, thinking about the consequences

Page 12: 2016-05-30 risk driven design

Risk: System does not perform trick?

Page 13: 2016-05-30 risk driven design

Guide words…

Look at every component and investigate what happens if:

– It doesn’t work

– It is very slow

– Does the wrong thing

– Sends messages spontaneously

– Loses messages/state

– Leaks information
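The guide words can be applied mechanically to produce the questions for a blank FME(C)A worksheet. A minimal sketch in Python; the component names are borrowed from the example table in this deck, and the function name `worksheet_rows` is an illustrative assumption.

```python
from itertools import product

# Guide words from the slide above.
GUIDE_WORDS = [
    "does not work",
    "is very slow",
    "does the wrong thing",
    "sends messages spontaneously",
    "loses messages/state",
    "leaks information",
]


def worksheet_rows(components):
    """Return one (component, guide word) pair per question to analyse."""
    return list(product(components, GUIDE_WORDS))


# Illustrative component list (an assumption, not a complete system).
rows = worksheet_rows(["Inwin", "Process"])
for component, guide_word in rows:
    print(f"What happens if {component} {guide_word}?")
```

Every component gets every guide word, so nothing is skipped by accident; the analysis effort goes into answering the questions, not into remembering them.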

Page 14: 2016-05-30 risk driven design

Structured FMECA approach

Function | Failure Mode | Causes | Local Effects | System Effects | Criticality
Inwin | Wrong output | Logical error | Unjustified open | No closure | Catastrophic
Inwin | Delayed output | PLC error | Delayed closure | Closure delayed | Limited
Inwin | No output | Application hang | No closure | No closure | Catastrophic
Inwin | Spontaneous output | Switching error | Unjustified open | Unjustified closure | False Positive
Process | Wrong output | Logical error | Unjustified open | No closure | Catastrophic
Process | Delayed output | PLC error | Delayed closure | Closure delayed | Limited
Process | No output | Application hang | No closure | No closure | Catastrophic
Process | Spontaneous output | Switching error | Power failure | No closure | Catastrophic
… | … | … | … | … | …

Page 15: 2016-05-30 risk driven design

Certainty…

Rank beliefs not according to their plausibility but by the harm they may cause.

Nassim Nicholas Taleb


Page 16: 2016-05-30 risk driven design

Identifying measures

• Risk = Chance * Impact

• Moments allowing measures:

– Preventive

– Detection

– Repression

– Correction

– Ignore

– Accept
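The formula Risk = Chance * Impact can be made concrete with a small calculation. A sketch in Python with purely hypothetical figures (none of the numbers come from the slides): a preventive measure is modelled as lowering the chance, a repressive measure as lowering the impact.

```python
def risk(chance_per_year, impact):
    """Risk = Chance * Impact, expressed as expected loss per year."""
    return chance_per_year * impact


# Hypothetical figures, for illustration only.
baseline = risk(chance_per_year=1 / 1_000, impact=2_000_000)

# A preventive measure lowers the chance of the event.
with_prevention = risk(chance_per_year=1 / 10_000, impact=2_000_000)

# A repressive measure limits the damage once the event happens.
with_repression = risk(chance_per_year=1 / 1_000, impact=500_000)

print(baseline, with_prevention, with_repression)
```

Comparing the reduction in risk with the cost of each measure gives a first indication of which measures are worth taking and which risks are better accepted.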


Page 17: 2016-05-30 risk driven design

You can’t mitigate everything…

• You can’t prevent everything

• You can’t plan for everything

• You can’t predict everything

• If you tried, you couldn’t do any business

• But, you can’t ignore everything either

Page 18: 2016-05-30 risk driven design

Structured FMECA approach

Function | Failure Mode | Causes | Local Effects | System Effects | Criticality | Detection | Mitigating Measures
Inwin | Wrong output | Logical error | Unjustified open | No closure | Catastrophic | None | Multiprogramming
Inwin | Delayed output | PLC error | Delayed closure | Closure delayed | Limited | None | -
Inwin | No output | Application hang | No closure | No closure | Catastrophic | None | Failsafe behaviour Process
Inwin | Spontaneous output | Switching error | Unjustified open | Unjustified closure | False Positive | None | -
Process | Wrong output | Logical error | Unjustified open | No closure | Catastrophic | None | Multiprogramming
Process | Delayed output | PLC error | Delayed closure | Closure delayed | Limited | None | -
Process | No output | Application hang | No closure | No closure | Catastrophic | None | Deadlock detection
Process | Spontaneous output | Switching error | Power failure | No closure | Catastrophic | None | Safety relay
… | … | … | … | … | … | … | …

New functional and design requirements!
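The worksheet can also be kept as structured data, which makes it easy to list the failure modes that still lack a mitigating measure. A minimal sketch in Python, assuming a reduced set of columns and only the Inwin and Process rows of the table; the class and field names are illustrative.

```python
from dataclasses import dataclass


@dataclass
class FmecaRow:
    function: str
    failure_mode: str
    criticality: str
    mitigation: str = ""  # empty string: no measure listed in the table


rows = [
    FmecaRow("Inwin", "Wrong output", "Catastrophic", "Multiprogramming"),
    FmecaRow("Inwin", "Delayed output", "Limited"),
    FmecaRow("Inwin", "No output", "Catastrophic", "Failsafe behaviour Process"),
    FmecaRow("Inwin", "Spontaneous output", "False Positive"),
    FmecaRow("Process", "Wrong output", "Catastrophic", "Multiprogramming"),
    FmecaRow("Process", "Delayed output", "Limited"),
    FmecaRow("Process", "No output", "Catastrophic", "Deadlock detection"),
    FmecaRow("Process", "Spontaneous output", "Catastrophic", "Safety relay"),
]

# Failure modes without a measure: candidates for "accept" or further design work.
open_items = [r for r in rows if not r.mitigation]

# Sanity check: no catastrophic failure mode may be left without a measure.
uncovered = [r for r in rows if r.criticality == "Catastrophic" and not r.mitigation]

for r in open_items:
    print(f"{r.function}: {r.failure_mode} ({r.criticality}) has no measure")
```

Each filled-in measure in the last column is in effect a new requirement on the design, which is what the slide's closing remark points at.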

Page 19: 2016-05-30 risk driven design

Disadvantages of FME(C)A

• It is impossible to calculate an overall risk exposure

• Relation between risks is missing

– Common mode failures usually aren’t modelled

• Complex scenarios are hard to model

– Multiple failures aren’t modelled

– Are there root causes that could trigger multiple failures?

• Usually identifies irrelevant risks

Page 20: 2016-05-30 risk driven design

Top-Down Risk analysis

• Start with a dominant concern

• Identify potential causes

• Detail further
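A fault tree puts numbers on this top-down reasoning by combining basic events through AND and OR gates. A minimal sketch in Python, assuming independent basic events; the top event, the event names and all probabilities are hypothetical and not taken from the slides.

```python
from math import prod


def p_and(*probabilities):
    """AND gate: all inputs must fail (independent events assumed)."""
    return prod(probabilities)


def p_or(*probabilities):
    """OR gate: at least one input fails (independent events assumed)."""
    return 1 - prod(1 - p for p in probabilities)


# Hypothetical basic-event probabilities per demand.
p_power = 1e-3
p_sensor_a = 1e-2
p_sensor_b = 1e-2
p_software = 1e-4

# Hypothetical top event: "barrier does not close".
# Both redundant sensors must fail (AND); any single branch suffices (OR).
p_no_measurement = p_and(p_sensor_a, p_sensor_b)
p_top = p_or(p_power, p_no_measurement, p_software)

print(f"P(top event) = {p_top:.6f}")
```

The independence assumption is exactly what common mode failures break, so shared causes such as power supply or shared software need their own basic event higher up in the tree.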

Page 21: 2016-05-30 risk driven design

A small FTA

Page 22: 2016-05-30 risk driven design

Typical risks identified

• Components making the wrong decisions

• Power failure

• Hardware failure of PLCs/servers

• Software failures

• Network failure

• External factors

• Human maintenance error


Page 23: 2016-05-30 risk driven design

Breaking a cut-set

Alternate component

Alternate service
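A cut set is a combination of basic events that together cause the top event; breaking it means adding an alternative so that a single failure no longer suffices. A sketch in Python that derives the minimal cut sets of a small tree before and after adding alternates; the tree encoding and the event names are assumptions made for illustration.

```python
from itertools import product


def cut_sets(node):
    """Return the minimal cut sets of a fault tree as a set of frozensets.

    A node is a basic-event name (str) or a tuple (gate, children),
    where gate is "AND" or "OR".
    """
    if isinstance(node, str):
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":
        result = set().union(*child_sets)
    else:  # AND: combine one cut set from every child
        result = {frozenset().union(*combo) for combo in product(*child_sets)}
    # Keep only minimal cut sets: drop any that strictly contain another one.
    return {s for s in result if not any(other < s for other in result)}


# Before: either failure alone causes the top event (two single points of failure).
before = ("OR", ["power", "controller"])

# After: each component has an alternate, so two failures are needed per branch.
after = ("OR", [
    ("AND", ["power", "backup_power"]),
    ("AND", ["controller", "alternate_controller"]),
])

print(sorted(sorted(s) for s in cut_sets(before)))
print(sorted(sorted(s) for s in cut_sets(after)))
```

The size of the smallest cut set goes from one to two, which is the structural effect an alternate component or alternate service is meant to have.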

Page 24: 2016-05-30 risk driven design

Measures and FTA


Before After

Page 25: 2016-05-30 risk driven design

Design decisions…

• Every design decision is accompanied by a risk analysis focusing on RAMS aspects

• In the end the cost, RAMS effects and other trade-off aspects will determine which design option will be used

Page 26: 2016-05-30 risk driven design

Option 1

Page 27: 2016-05-30 risk driven design

FTA Option 1

Page 28: 2016-05-30 risk driven design

Option 2

Page 29: 2016-05-30 risk driven design

FTA Option 2

Page 30: 2016-05-30 risk driven design

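Design Option 1 (diagram; labels translated from Dutch)

Blocks in the diagram: Info, Height determination (Hoogtebepaling), Control (Aansturing), Height measurement (Hoogtemeting), Flood barrier (Waterkering), Diesels, Measure A (Meeta), Measure B (Meetb), Control A (Stuura), Control B (Stuurb)

Failure chances annotated in the diagram, in order of appearance:

• Software failure: 1/1,000 per year

• Measurement error: (1/1,000,000 per year)^3

• Software failure: 1/1,000,000 per year

• Software failure: 1/1,000 per year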

Page 31: 2016-05-30 risk driven design

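Design Option 2 (diagram; labels translated from Dutch)

Blocks in the diagram: Info, Height determination (Hoogtebepaling), Control (Aansturing), Height measurement (Hoogtemeting), Flood barrier (Waterkering), Diesels, Measure A (Meeta), Measure B (Meetb), Control A (Stuura), Control B (Stuurb)

Failure chances annotated in the diagram, in order of appearance:

• Software failure: 1/10,000 per year

• Measurement error: (1/1,000,000 per year)^3

• Software failure: 1/100 per year

• Software failure: 1/10,000 per year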
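The failure chances annotated for the two options can be compared numerically. The transcript does not show how the failures combine in each fault tree, so the sketch below makes a strong assumption: every listed failure alone causes system failure (a pure OR structure), so the chances simply add up. With AND gates in the real trees the outcome would differ.

```python
# Failure chances per year as annotated on the two design-option slides.
option_1 = [1 / 1_000, (1 / 1_000_000) ** 3, 1 / 1_000_000, 1 / 1_000]
option_2 = [1 / 10_000, (1 / 1_000_000) ** 3, 1 / 100, 1 / 10_000]


def series_failure_rate(rates_per_year):
    """Sum of rates: valid only if any single failure fails the system
    and the rates are small (rare-event approximation). ASSUMPTION."""
    return sum(rates_per_year)


rate_1 = series_failure_rate(option_1)
rate_2 = series_failure_rate(option_2)

print(f"Option 1: about once per {1 / rate_1:.0f} years")
print(f"Option 2: about once per {1 / rate_2:.0f} years")
```

Under this assumption the total is dominated by the least reliable element in each option, which shows why improving two components does not help if a third one becomes much worse.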

Page 32: 2016-05-30 risk driven design

IEC 61508: Required activities for safety related systems

Page 33: 2016-05-30 risk driven design

Testing

Function | Impact wrong/not functioning | Impact spontaneous functioning
Function 1 | Small | Medium
Function 2 | Disastrous | Huge
Function 3 | Serious | Huge
Function 3 | Serious | Small
Function 4 | Serious | Serious
Function 5 | Serious | Small
Function 6 | Huge | Huge

Page 34: 2016-05-30 risk driven design

Test depth and acceptable risk

• Level A: Thorough endurance test aiming to prove function reliability with high accuracy.

• Level B: Thorough endurance test aiming to prove function reliability with medium accuracy.

• Level C: Thorough endurance test aiming to prove function reliability with low accuracy.

• Level D: Test to verify if the function works once.

• Level E: Function tested alongside other functions, might leave paths untested.

Test effort

Level | #Tests | Effort
Level A | 50,000 | 120 hours
Level B | 10,000 | 24 hours
Level C | 1,000 | 4 hours
Level D | 1 | 1 hour
Level E | - | PM
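The link between the number of tests and the accuracy of a reliability claim can be illustrated with a standard statistical bound: after n failure-free, independent tests, the per-demand failure probability p is below 1 - (1 - C)^(1/n) with confidence C. A sketch in Python using the test counts from the table; the 95% confidence level and the assumption that all tests pass and are independent are mine, not the slides'.

```python
def max_failure_probability(n_tests, confidence=0.95):
    """Upper bound on the per-demand failure probability after
    n_tests independent tests that all passed."""
    return 1 - (1 - confidence) ** (1 / n_tests)


# Test counts per level as listed in the test effort table.
levels = {"A": 50_000, "B": 10_000, "C": 1_000, "D": 1}

for level, n in levels.items():
    bound = max_failure_probability(n)
    print(f"Level {level}: {n} tests, p < {bound:.2e} at 95% confidence")
```

A single passing test (Level D) says almost nothing about reliability, while each tenfold increase in tests tightens the bound by roughly a factor of ten.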

Page 35: 2016-05-30 risk driven design

Test depth…

Function | Not functioning | Spontaneous functioning
Function 1 | Level E | Not tested
Function 2 | Level A | Level A
Function 3 | Level A | Level A
Function 3 | Level B | Level B
Function 4 | Level A | Level A
Function 5 | Level E | Not tested
Function 6 | Level A | Level A
… | … | …
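Combining this assignment with the effort per test level gives a rough total test budget. A sketch in Python, assuming the effort applies once per function and per failure type, that Level E ("PM") adds no separate effort, and that "NOT" means the case is not tested; only the rows shown on the slide are counted.

```python
# (number of tests, hours) per level, from the test effort table.
EFFORT = {"A": (50_000, 120), "B": (10_000, 24), "C": (1_000, 4), "D": (1, 1)}

# (function, level for "not functioning", level for "spontaneous functioning").
assignment = [
    ("Function 1", "E", "NOT"),
    ("Function 2", "A", "A"),
    ("Function 3", "A", "A"),
    ("Function 3", "B", "B"),
    ("Function 4", "A", "A"),
    ("Function 5", "E", "NOT"),
    ("Function 6", "A", "A"),
]

total_tests = 0
total_hours = 0
for _function, *levels in assignment:
    for level in levels:
        # Levels without an entry (E, NOT) are counted as zero: an assumption.
        tests, hours = EFFORT.get(level, (0, 0))
        total_tests += tests
        total_hours += hours

print(f"{total_tests} tests, {total_hours} hours")
```

Under these assumptions the listed rows add up to 420,000 tests and 1,008 hours, nearly all of it spent on the Level A functions, so the test effort follows the risk rather than being spread evenly.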