dependable software development

Dependable Software Development Lecture 7

Upload: didier

Post on 26-Feb-2016



Page 1: Dependable  Software Development

Dependable Software Development

Lecture 7

Page 2: Dependable  Software Development

Adv Software Engg, by Asst Prof Athar Mohsin, MSCS 19, MCS-NUST

System dependability

• For many computer-based systems, the most important system property is the dependability of the system.
• The dependability of a system reflects:
  – The user’s degree of trust in that system.
  – The extent of the user’s confidence that it will operate as users expect.
  – That it will not ‘fail’ in normal use.
• Dependability covers the related system attributes of reliability, availability and security. These are all inter-dependent.

Page 3: Dependable  Software Development

Importance of dependability

• System failures may have widespread effects, with large numbers of people affected by the failure.
  – Systems that are not dependable and are unreliable, unsafe or insecure may be rejected by their users.
  – The costs of system failure may be very high if the failure leads to economic losses or physical damage.
  – Undependable systems may cause information loss with a high consequent recovery cost.
• Causes of failure:
  – Hardware failure: poor design and manufacturing errors.
  – Software failure: errors in its specification, design or implementation.
  – Operational failure: perhaps the largest single cause of system failures in socio-technical systems.

Page 4: Dependable  Software Development

Principal dependability properties


Page 5: Dependable  Software Development

Principal properties

• Availability
  – The probability that the system will be up and running and able to deliver useful services to users.
• Reliability
  – The probability that the system will correctly deliver services as expected by users.
• Safety
  – A judgment of how likely it is that the system will cause damage to people or its environment.
• Security
  – A judgment of how likely it is that the system can resist accidental or deliberate intrusions.

Page 6: Dependable  Software Development

Other dependability properties

• Repairability
  – Reflects the extent to which the system can be repaired in the event of a failure.
• Maintainability
  – Reflects the extent to which the system can be adapted to new requirements.
• Survivability
  – Reflects the extent to which the system can deliver services whilst under hostile attack.
• Error tolerance
  – Reflects the extent to which user input errors can be avoided and tolerated.

Page 7: Dependable  Software Development

Software dependability

• In general, software customers expect all software to be dependable.
  – However, for non-critical applications, they may be willing to accept some system failures.
• Some applications have very high dependability requirements, and special software engineering techniques may be used to achieve this.
• Dependability achievement
  – Fault avoidance
    • The system is developed in such a way that human error is avoided and thus system faults are minimised.
    • The development process is organised so that faults in the system are detected and repaired before delivery to the customer.
  – Fault detection
    • Verification and validation techniques are used to discover and remove faults in a system before it is deployed.
  – Fault tolerance
    • The system is designed so that faults in the delivered software do not result in system failure.
• Software fault avoidance approaches include:
  – Formal or precise specification practices
  – Programming disciplines such as information hiding and encapsulation
  – Extensive and repeated reviews and formal analyses during the development process
  – Rigorous testing
• Software fault detection and removal approaches include:
  – Verification and validation, software testing, and proof methodology
• Formal methods are fault avoidance techniques that aim to increase dependability by eliminating errors at the requirements specification and design stages of development.
• Fault tolerance techniques try to keep the system operational despite the presence of faults.
  – Since complete fault avoidance or elimination is not possible, a critical system always employs fault tolerance techniques to guarantee high system reliability and availability.

Page 8: Dependable  Software Development

Critical Systems

• Software failure is common. Sometimes a failure causes inconvenience but no serious damage; sometimes it endangers human life.
  – Systems whose failure has such serious consequences are known as “critical systems”.
• Three types of critical systems are:
  – Safety-critical systems
    • Failure may result in loss of life, injury or damage to the environment.
    • Example: chemical plant protection system.
  – Mission-critical systems
    • Failure results in failure of some goal-directed activity.
    • Example: spacecraft navigation system.
  – Business-critical systems
    • Failure results in high economic losses.
    • Example: customer accounting system in a bank.
• For critical systems, the most important system property is the dependability of the system.

Page 9: Dependable  Software Development

Safety-critical systems

• Safety-critical systems:
  – Systems whose failure could result in loss of life, cause significant property damage or cause damage to the environment.
  – These systems must be designed in such a way as to guarantee system stability during all of the system’s operational modes.
    • When a fatal fault occurs, the system safely shuts down.
• Applications
  – Computer-based systems used in avionics, chemical process plants and nuclear power plants.
    • A failure in the system endangers human lives directly or through environmental pollution, and its impact is on a large economic scale.

Page 10: Dependable  Software Development

Safety-Critical Systems - Present

• Transportation systems, from flight to automobiles
  – New airplanes contain advanced avionics, such as inertial guidance systems and GPS receivers, that also have considerable safety requirements.
  – Automobiles, electric vehicles and hybrid vehicles are increasingly using embedded systems to maximize efficiency and reduce pollution.
    • Other automotive safety systems include anti-lock braking systems, Electronic Stability Control and automatic four-wheel drive.
• Medical equipment is continuing to advance with more embedded systems
  – Vital signs monitoring
  – Electronic stethoscopes for amplifying sounds
  – Various medical imaging for non-invasive internal inspections

Page 11: Dependable  Software Development

Can We Trust the Computer?

Case Study: The Therac-25

Based on an article in IEEE Computer, July 1993.

Page 12: Dependable  Software Development

Opening the case

• One of the most widely reported accidents involved the Therac-25
  – A radiation therapy machine
  – The accidents occurred between June 1985 and January 1987
• Six known accidents, all involving massive overdoses
  – Causing deaths and serious injuries
• Worst accidents in the 35-year history of medical accelerators
• “A significant amount of SW for life-critical systems comes from small firms, especially in the medical industry; firms that fit the profile of those resistant to or uninformed of the principles of either system safety or software engineering.”

Page 13: Dependable  Software Development

Therac-25

• Massive overdoses of radiation were given
  – Medical accelerator to treat tumors
  – 6 known accidents resulting in death or serious injury
    • June 1985 – January 1987
  – Caused severe and painful injuries and the death of three patients

Airbag sensory system in automobiles

“--- this thing will probably have to work only once in 10 years, but it better work then, otherwise the result will be catastrophic.”

Page 14: Dependable  Software Development

Background of the case

• Medical linear accelerators accelerate electrons to create high-energy beams that can destroy tumors with minimal impact on surrounding healthy tissue.
• Shallow tissue is treated with accelerated electrons.
  – Deeper tissue requires converting the electron beam into X-ray photons.
• The Therac-25 is a medical linear accelerator.
  – A linear accelerator ("linac") is a particle accelerator, a device that increases the energy of electrically charged atomic particles.
  – The charged particles are accelerated by the introduction of an electric field, producing beams of particles which are then focused by magnets.

Page 15: Dependable  Software Development

Case study – Therac-25

• Linacs are used to treat cancer patients.
  – A patient is exposed to beams of particles, or radiation, in doses designed to kill a tumor.
  – Since malignant tissues are more sensitive than normal tissues to radiation exposure, a treatment plan can be developed that permits the absorption of an amount of radiation that is fatal to tumor cells but causes relatively minor damage to normal tissue.
  – Shallow tissue is treated with electrons, but to reach deeper tissue, X-ray photons are needed.

Page 16: Dependable  Software Development

Development of Therac-25

• Developed from the Therac-6
  – A 6-MeV accelerator producing only X-rays
• Evolved to the Therac-20
  – A 20-MeV dual-mode (X-rays or electrons) accelerator
• SW functionality was limited in both machines; it added convenience to existing hardware
  – Industry-standard hardware safety features and interlocks were retained
• Therac-25
  – Dual-mode linear accelerator
  – More compact and versatile than the Therac-20
  – Takes advantage of computer control from the outset, while the Therac-6 and Therac-20 were designed around machines that already had histories of clinical use without computer control
  – Therac-25 SW has more responsibility for maintaining safety than the SW in the previous machines

Page 17: Dependable  Software Development

Therac-25's software

• One programmer, over several years, revised the Therac-6 software into the Therac-25 software.
  – An important difference between the Therac-20 software and the Therac-25 software is the overall role that each plays in the machine.
  – In the Therac-20, the role of software is limited.
    • The software simply adds convenience to the hardware.
  – In the Therac-25, software exclusively performs many of the critical safety checks of the system.
    • These safety checks are also included in the hardware of the Therac-20, but were not included in the Therac-25 hardware.

Page 18: Dependable  Software Development

How it Operates

• SW is responsible for monitoring machine status
• Accepts input about the treatment desired and sets the machine up for treatment
• Turns the beam on when activated by operator command
• Turns the beam off when treatment is completed, when the operator commands it, or when a malfunction is detected
• The unit has an interlock system designed to remove power to the unit when there is a HW malfunction
• The computer monitors the interlock system and provides diagnostic messages
• Depending on the fault, the computer either prevents a treatment from starting or, if treatment is in progress, creates a pause or suspension of treatment

Page 19: Dependable  Software Development

The Safety Analysis Report (before release of product)

• Programming errors have been reduced by extensive testing on a HW simulator and under field conditions on teletherapy units.
  – Any residual SW errors are not included in the analysis
  – Program SW does not degrade due to wear, fatigue, or reproduction process
• Computer execution errors are caused by faulty HW components and by “soft” (random) errors induced by alpha particles and electromagnetic noise.
• The fault tree does include computer failure, but only hardware failures

Page 20: Dependable  Software Development

Therac-25 SW Testing

• Manufacturer said the HW and SW were “tested and exercised separately or together over many years”
  – In a deposition, the QA manager explained that testing was done in two parts
    • A “small amount” of SW testing was done on a simulator
    • Most was done on the system
• Reports indicate that unit and SW testing was minimal
• Most testing effort was directed to the integrated system test
• The same QA manager stated at a Therac-25 users meeting that the SW was tested for 2,700 hours
• Under questioning by users, this was clarified as “2700 hours of use”
• The programmer left AECL in 1986; nothing is known about the programmer
• AECL employees could not provide any information about the programmer’s educational background or experience

Page 21: Dependable  Software Development

Therac-25

• Software was carried over from earlier projects where it had seemingly worked well
  – Therac-6, Therac-20
    • Computer control was added to earlier machines
    • Still capable of stand-alone (no computer) operation
    • Had all standard hardware safety mechanisms
  – Therac-25
    • Software defects in earlier machines were hidden by hardware safeguards
    • No real software development process
    • Apparently no serious evaluation of the risks involved in using software in lieu of hardware safeguards
  – Single programmer
    • Operating system was developed by one programmer using assembly language in the 1970s
    • SW “evolved” from the Therac-6 (which was started in 1972)
    • Very little SW documentation was produced during development

When designing dependable systems we must deal with dependability issues from the beginning, by addressing fault-tolerance mechanisms within the system design and by employing appropriate fault-avoidance approaches in the design process. Adding dependability later can be both expensive and less effective than designing it in from the beginning.

Fault avoidance, fault removal and fault tolerance represent three successive lines of defense against the contingency of faults in software systems and their impact on system reliability.

Page 22: Dependable  Software Development

The Software Errors

• Each bug contained in the Therac-25 software was also found in the software of the Therac-20.
  – However, the hardware safety interlocks in the Therac-20 prevented any accidents from occurring in that machine.
• The Therac-25 software errors that caused radiation overexposures can be reduced to interface errors.

Page 23: Dependable  Software Development

Fault-free software

• Fault-free software means software which conforms to its specification.
  – It does NOT mean software which will always perform correctly, as there may be specification errors.
• Therac-25
  – The 1983 safety analysis, in effect, assumed that software had no errors!
    • “Programming errors have been reduced by extensive testing ... Any residual software errors are not included in the analysis.”
    • “Computer execution errors are caused by faulty hardware components and by ‘soft’ (random) errors induced by alpha particles and electromagnetic noise.”

Page 24: Dependable  Software Development

Diversity and Redundancy

• Redundancy - Where availability is critical
  – e.g. in e-commerce systems, companies normally keep backup servers and switch to these automatically if failure occurs.
    • Keep more than one version of a critical component available so that if one fails then a backup is available.
• Diversity - To provide resilience against external attacks
  – Different servers may be implemented using different operating systems (e.g. Windows and Linux)
    • Provide the same functionality in different ways so that they will not fail in the same way.
• However, adding diversity and redundancy adds complexity and this can increase the chances of error.
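The redundancy idea on this slide (keep a backup of a critical component and switch to it automatically when the primary fails) can be sketched in a few lines of Python. This is only an illustration: the names `call_with_failover`, `primary`, `backup` and `AllReplicasFailed` are hypothetical and do not come from the lecture.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised when every redundant component has failed."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each redundant component in order and return the first success."""
    failures = []
    for replica in replicas:
        try:
            return replica(request)
        except Exception as exc:  # any failure triggers a switch to the next replica
            failures.append(exc)
    raise AllReplicasFailed(f"{len(failures)} replica(s) failed")


# Hypothetical components: the primary is down, the backup still works.
def primary(request: str) -> str:
    raise ConnectionError("primary server is down")


def backup(request: str) -> str:
    return f"served by backup: {request}"


result = call_with_failover([primary, backup], "GET /cart")
print(result)  # served by backup: GET /cart
```

Note that the failover loop is itself extra code that can contain faults, which is the slide's closing point about added complexity.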

Page 25: Dependable  Software Development

Rigorous Software Development

• Addresses quality and productivity by emphasizing the early stages in the development process
  – Concentrates on developing an early, precise understanding of the required behavior of the system
  – Think carefully about what you want to do and get it right the first time.
• Underlying the rigorous approach are formal specification languages
  – These are mathematically based languages that provide support for abstract and precise descriptions of software systems.

Page 26: Dependable  Software Development

Therac-25

• Overconfidence in Software
  – Safety analysis did not include software, even though it was responsible for the safety of the system
  – When problems did occur, they were assumed to be hardware failures
  – Software was designed for a small memory footprint
  – Self-checks, error detection, error handling and auditing were left out
  – Risk assessment did not include software

Page 27: Dependable  Software Development

Static and Dynamic verification

• Software inspections (static verification)
  – Concerned with analysis of the static system representation to discover problems
    • May be supplemented by tool-based document and code analysis
• Software testing (dynamic verification)
  – Concerned with exercising and observing product behaviour
    • The system is executed with test data and its operational behaviour is observed

Page 28: Dependable  Software Development

Stages of static analysis

• Control flow analysis
  – Checks for loops with multiple exit or entry points, finds unreachable code, etc.
• Data use analysis
  – Detects uninitialized variables, variables written twice without an intervening use, variables which are declared but never used, etc.
• Interface analysis
  – Checks the consistency of routine and procedure declarations and their use
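As a rough illustration of data use analysis, the sketch below uses Python's standard `ast` module to flag names that are assigned but never read. It is a deliberately simplified check, not how any particular analysis tool works: it treats the whole source as one scope and ignores functions, classes and attribute access. The function name `unused_variables` and the sample program are assumptions made for this example.

```python
import ast


def unused_variables(source: str) -> set:
    """Return names that are written somewhere in `source` but never read."""
    tree = ast.parse(source)
    assigned, used = set(), set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                assigned.add(node.id)   # name appears as an assignment target
            elif isinstance(node.ctx, ast.Load):
                used.add(node.id)       # name is read somewhere
    return assigned - used


# Hypothetical program under analysis: `unused_total` is written, never read.
SAMPLE = """
dose = 5
limit = 10
unused_total = dose + limit
print(dose)
"""

print(unused_variables(SAMPLE))  # {'unused_total'}
```

The analysis never executes `SAMPLE`; it only inspects its structure, which is what distinguishes static from dynamic verification.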

Page 29: Dependable  Software Development

Therac-25

• The Operator Interface
  – At first, the operator needed to enter information at the treatment table, and then re-enter it at a console in the control room
    • Operators complained; the safeguard was removed
  – Error codes are reported on the screen with no English explanation
    • Example: (East Texas Cancer Center) “Malfunction 54” reported, caused by “dose input 2”.
    • An AECL technician testified that “dose input 2” means the dose delivered was either too high or too low (!)
  – “Treatment Pause” after a non-critical error, which the operator can ignore by pressing “P”
    • Causes operators to become insensitive to errors

Page 30: Dependable  Software Development

Therac-25

• Example Bugs
  – Data Entry Bug
    • Setting the bending magnets takes 8 seconds
      – “Delay” subroutine uses shared memory with the data entry subroutine
      – So data changes within 8 seconds will be wiped out when Delay exits!
    • Causes bugs that only show up with proficient users who do data entry in <8 seconds
  – Set-Up Test Bug
    • On every 256th pass through Set-Up (one-byte counter), the upper collimator is not checked
    • Problem if the operator hits “set” exactly when the counter rolls over to 0
  – These kinds of bugs are notoriously difficult to track down
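The Set-Up Test bug is a one-byte overflow. The following Python sketch simulates it under stated assumptions: the variable name `check_flag` and the function `skipped_passes` are hypothetical, and the real software was written in assembly language, so this models only the arithmetic, not the original code. The flag is incremented on each pass and a value of zero is read as "no check needed".

```python
def skipped_passes(total_passes: int) -> list:
    """Return the pass numbers on which the collimator check would be skipped."""
    check_flag = 0   # one-byte variable; nonzero means "check the collimator"
    skipped = []
    for pass_number in range(1, total_passes + 1):
        # Incrementing (instead of setting a constant) lets the byte wrap 255 -> 0.
        check_flag = (check_flag + 1) & 0xFF
        if check_flag == 0:
            skipped.append(pass_number)   # flag reads as "no check needed"
    return skipped


print(skipped_passes(600))  # [256, 512]
```

Setting the flag to a fixed nonzero value on each pass, rather than incrementing it, removes the rollover. The failure appears only once in 256 passes, which is why ordinary testing is unlikely to reveal it.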

Page 31: Dependable  Software Development

Level of Concern

• For critical systems, the major, moderate and minor safety concerns must be identified
• Therac-25
  – Major:
    • Device directly affects the patient or operator and failure could result in death or serious injury
  – Moderate:
    • Device directly affects the patient and failure could result in non-serious injury
  – Minor:
    • Failures will not result in injury

Page 32: Dependable  Software Development

Levels of Concern

• Does the software
  – Control a life support device?
  – Control delivery of harmful energy?
  – Control treatment delivery?
  – Provide diagnosis as a basis for treatment?
  – Monitor vital signs?
• If the answer to all these questions is no, then the concern is minor

Page 33: Dependable  Software Development

Safety

• System property that reflects the system’s ability to operate (normally or abnormally) without danger to the system environment
  – As more devices become software controlled, safety becomes a greater concern
  – Safety requirements are exclusive (they exclude undesirable situations rather than specify required system services)
• Safety criticality
  – Primary safety-critical systems
    • Embedded software systems whose failure can cause associated hardware to fail and directly threaten people
  – Secondary safety-critical systems
    • Systems whose faults can cause other systems to fail, which can then threaten people

Page 34: Dependable  Software Development

Safety and Reliability

• They are related, but not identical
  – Reliability
    • Concerned with conformance to a specification and delivery of a service
  – Safety
    • Concerned with ensuring a system cannot cause damage, regardless of its conformance (or nonconformance) to its specification
• Safety achievement
  – Hazard avoidance
    • The system is designed so that some hazard cases cannot arise
  – Hazard detection and removal
    • The system is designed so that hazards are detected and removed before they result in an accident
  – Damage limitation
    • The system includes protection features that minimize the damage that may result from an accident

Page 35: Dependable  Software Development

Case study: Insulin Pump

• The system measures the level of blood sugar every 10 minutes and, if this level is above a certain value and is increasing, the dose of insulin to counteract the increase is computed and injected into the diabetic.
• The system can also detect abnormally low levels of blood sugar and, if these occur, an alarm is sounded to warn the diabetic that they should take some action.

Page 36: Dependable  Software Development

Dependability requirements

• The system shall be available to deliver insulin when required to do so.
• The system shall perform reliably and deliver the correct amount of insulin to counteract the current level of blood sugar.
• The essential safety requirement is that excessive doses of insulin should never be delivered, as this is potentially life threatening.

Page 37: Dependable  Software Development

Dependability attributes

• Availability
  – The pump should have a high level of availability, but the nature of diabetes is such that continuous availability is unnecessary
• Reliability
  – Intermittent demands for service are made on the system
• Safety
  – The key safety requirement is that the operation of the system should never result in a very low level of blood sugar. A fail-safe position is for no insulin to be delivered
• Security
  – Not really applicable in this case

Page 38: Dependable  Software Development


Sample Requirement Specifications

Page 39: Dependable  Software Development

General dependability requirements

– SR1:
  • The system shall not deliver a single dose of insulin that is greater than a specified maximum dose for a system user.
– SR2:
  • The system shall not deliver a daily cumulative dose of insulin that is greater than a specified maximum for a system user.
– SR3:
  • The system shall include a hardware diagnostic facility that should be executed at least 4 times per hour.
– SR4:
  • The system shall include an exception handler for all of the exceptions that are identified in Table …..
– SR5:
  • The audible alarm shall be sounded when any hardware anomaly is discovered and a diagnostic message as defined in Table ……. should be displayed.
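SR1 and SR2 lend themselves to a small executable check. The sketch below is one possible way to enforce both limits in software; the class `DoseLimiter`, its field names and the numeric limits are assumptions chosen for illustration and are not part of the requirements themselves.

```python
from dataclasses import dataclass


@dataclass
class DoseLimiter:
    """Caps each dose (SR1) and the cumulative daily dose (SR2)."""
    max_single_dose: int      # SR1 limit, in insulin units (value is an assumption)
    max_daily_dose: int       # SR2 limit, in insulin units (value is an assumption)
    delivered_today: int = 0

    def authorise(self, requested: int) -> int:
        """Return the dose that may actually be delivered for this request."""
        if requested <= 0:
            return 0
        remaining_today = max(self.max_daily_dose - self.delivered_today, 0)
        dose = min(requested, self.max_single_dose, remaining_today)
        self.delivered_today += dose
        return dose

    def reset_day(self) -> None:
        """Called once every 24 hours to start a new daily total."""
        self.delivered_today = 0


limiter = DoseLimiter(max_single_dose=4, max_daily_dose=10)
doses = [limiter.authorise(requested) for requested in (3, 6, 5, 2)]
print(doses)  # [3, 4, 3, 0]
```

The second request is cut to the single-dose maximum, the third to what remains of the daily allowance, and the fourth is refused outright. Refusing to deliver matches the fail-safe position stated earlier in the lecture, where no insulin is delivered.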

Page 40: Dependable  Software Development

Insulin Pump System Design

• The important design decisions made during the production of the insulin pump software and the simulator.
  – The approach used to produce the insulin pump software was to emulate the hardware organization by producing separate software objects (classes) for each distinguishable hardware object
    • Controller::
    • Clock::
    • Display::
    • Simulator::

[Figure: System architecture, insulin pump components. Needle assembly, sensor, display 1, display 2, alarm, pump, clock, controller, power supply, insulin reservoir.]

Page 41: Dependable  Software Development

The software objects

• The controller object: the bulk of the computation within the system is carried out here
  – It is within the controller that the dose of insulin to be delivered is computed and where the self-tests are performed
• Clock object: works together with the controller object
  – Constantly determines how much time has elapsed since the software was started or the timer was reset (which happens every 24 hours).
    • Periodically, at every specified interval, the clock triggers certain events that the system is required to perform

Page 42: Dependable  Software Development

The software objects

• Display object: used to create a graphical user interface (GUI)
  – The data is then presented to the user via text boxes positioned on the GUI
• The remaining software objects model the peripheral hardware units
  – The software contained within these objects simply records the current state of the hardware unit and, for the purpose of simulation, provides the functionality to change that state.

Page 43: Dependable  Software Development

The software objects

• The simulator software object:
  – Provides the user with the functionality to perform a simulation of real-world events that would affect the pump software in differing manners
• The simulator facilitates the testing process
  – Making it quicker and easier to perform the testing required to determine whether the insulin pump system is adequately safe.

Page 44: Dependable  Software Development


Object Interaction – Object classes

Page 45: Dependable  Software Development


Object Interaction – Sequence Diagrams

Page 46: Dependable  Software Development


Object Interaction – Sequence Diagrams

Page 47: Dependable  Software Development

Insulin delivery system

• Data flow model of software-controlled insulin pump

[Figure: data flow model. Blood → blood sugar sensor → (blood parameters) → blood sugar analysis → (blood sugar level) → insulin requirement computation → (insulin requirement) → insulin delivery controller → (pump control commands) → insulin pump → insulin.]

Page 48: Dependable  Software Development

Concept of operation

• Using readings from the embedded sensor, the system automatically measures the level of glucose in the sufferer’s body
  – Consecutive readings are compared and, if they indicate that the level of glucose is rising, then insulin is injected to counteract this rise
• The ideal situation is a consistent level of sugar that is within some ‘safe’ band

Page 49: Dependable  Software Development

Sugar levels

• Unsafe
  – A very low level of sugar (arbitrarily, we will call this 3 units) is dangerous and can result in hypoglycaemia, which can result in a diabetic coma and ultimately death.
• Safe
  – Between 3 units and about 7 units, the levels of sugar are ‘safe’ and are comparable to those in people without diabetes. This is the ideal band.
• Undesirable
  – A sugar level above 7 units is undesirable, but high levels are not dangerous in the short term. Continuous high levels, however, can result in long-term side-effects.

Page 50: Dependable  Software Development

Injection scenarios

• Level of sugar is in the unsafe band
  – Do not inject insulin;
  – Initiate warning for the sufferer.
• Level of sugar is falling
  – Do not inject insulin if in safe band. Inject insulin if rate of change of level is decreasing.
• Level of sugar is stable
  – Do not inject insulin if level is in the safe band;
  – Inject insulin if level is in the undesirable band to bring down glucose level;
  – Amount injected should be proportionate to the degree of undesirability, i.e. inject more if level is 20 rather than 10.
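The unsafe-band and stable-level scenarios can be written down as a small decision function. This sketch deliberately leaves out the falling-level scenario, and everything numeric in it is an assumption: the band limits reuse the "3 units" and "about 7 units" figures from the previous slide, the treatment of a reading of exactly 3 as safe is a guess, and the proportionality constant `DOSE_PER_EXCESS_UNIT` is invented for illustration.

```python
SAFE_MIN = 3.0               # below this the level is treated as unsafe (assumed boundary)
SAFE_MAX = 7.0               # above this the level is treated as undesirable
DOSE_PER_EXCESS_UNIT = 0.5   # hypothetical constant, not from the lecture


def band(level: float) -> str:
    """Classify a blood sugar reading into the three bands."""
    if level < SAFE_MIN:
        return "unsafe"
    if level <= SAFE_MAX:
        return "safe"
    return "undesirable"


def decide_stable(level: float) -> tuple:
    """Return (dose, warn_user) for a reading whose level is stable."""
    current = band(level)
    if current == "unsafe":
        return (0.0, True)    # never inject; warn the sufferer
    if current == "safe":
        return (0.0, False)   # nothing to correct
    # Undesirable band: dose is proportionate to how far above the safe band we are.
    return ((level - SAFE_MAX) * DOSE_PER_EXCESS_UNIT, False)


for reading in (2, 5, 10, 20):
    print(reading, band(reading), decide_stable(reading))
```

With these assumed constants a stable level of 20 yields a larger dose than a stable level of 10, as the last bullet requires. Any real dose would still have to pass the SR1 and SR2 limits before delivery.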

Page 51: Dependable  Software Development

System availability

• In specifying the availability, issues that must be considered are:
  – The machine does not have to be continuously available, as failure to deliver insulin on a single occasion is not a problem
  – However, no insulin delivery over a few hours would have an effect on the patient’s health
  – The machine software can be reset by switching it off and on, hence recovery from software errors is possible without compromising the usefulness of the system
  – Hardware failures can only be repaired by return to the manufacturer. This means, in practice, a loss of availability of at least 3 days

Page 52: Dependable  Software Development

Probability of Failure on Demand

• The probability that the system will fail when a service request is made
• Useful when requests are made on an intermittent or infrequent basis
• Appropriate for protection systems, where service requests may be rare and the consequences can be serious if the service is not delivered
• Relevant for many safety-critical systems with exception handlers

Page 53: Dependable  Software Development

System failures

• Transient failures
  – Can be repaired by user actions such as resetting or recalibrating the machine.
    • For these types of failure, a relatively low value of POFOD (0.002) may be acceptable.
    • This means that one failure may occur in every 500 demands made on the machine. This is approximately once every 3.5 days.
• Permanent failures
  – Require the machine to be repaired by the manufacturer.
    • The probability of this type of failure should be much lower.
    • Roughly once a year is the minimum figure, so POFOD should be no more than 0.00002.
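The two POFOD figures can be reproduced with a short calculation. The sketch assumes that each 10-minute blood sugar measurement counts as one demand on the system, which gives 144 demands per day; the slide does not state this explicitly, but it is consistent with the "3.5 days" figure. The function names are hypothetical.

```python
MEASUREMENTS_PER_HOUR = 6                     # one reading every 10 minutes
DEMANDS_PER_DAY = MEASUREMENTS_PER_HOUR * 24  # 144 demands per day (assumption)


def days_between_failures(pofod: float, demands_per_day: int) -> float:
    """Expected number of days between failures for a given POFOD."""
    demands_per_failure = 1 / pofod
    return demands_per_failure / demands_per_day


def pofod_for_interval(days: float, demands_per_day: int) -> float:
    """POFOD that corresponds to one failure every `days` days."""
    return 1 / (days * demands_per_day)


transient_days = days_between_failures(0.002, DEMANDS_PER_DAY)
permanent_pofod = pofod_for_interval(365, DEMANDS_PER_DAY)

print(f"Transient: one failure about every {transient_days:.2f} days")   # about 3.47
print(f"Permanent: POFOD of about {permanent_pofod:.7f} for one per year")  # about 0.0000190
```

A POFOD of 0.002 means one failure per 500 demands, or roughly 3.5 days at 144 demands per day. One permanent failure per year corresponds to one failure in 52,560 demands, which rounds to the 0.00002 bound on the slide.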

Page 54: Dependable  Software Development

Safety Processes

• Hazard and risk analysis
  – Assess the hazards and risks associated with the system
    • Startup, alarm, low battery, needle, reservoir
• Safety requirements specification
  – Specify system safety requirements
    • Power off, reset, hardware simulator
• Designation of safety-critical systems
  – Identify sub-systems whose incorrect operation can compromise entire system safety
    • Controller, display, clock, sensor
• Safety validation
  – Check overall system safety

Page 55: Dependable  Software Development

System hazard analysis

• Physical hazards
  – Hazards that result from some physical failure of the system
• Electrical hazards
  – Hazards that result from some electrical failure of the system
• Biological hazards
  – Hazards that result from some system failure that interferes with biological processes

Page 56: Dependable  Software Development

Insulin system hazards

• Insulin overdose or underdose (biological)
• Power failure (electrical)
• Machine interferes electrically with other medical equipment such as a heart pacemaker (electrical)
• Parts of machine break off in patient’s body (physical)
• Infection caused by introduction of machine (biological)
• Allergic reaction to the materials or insulin used in the machine (biological)

Page 57: Dependable  Software Development

Risk Assessment

• Assess the hazard severity, hazard probability, and accident probability
  – Outcome of risk assessment is a statement of acceptability
    • Intolerable (must never occur)
    • ALARP, as low as reasonably practical (as low as possible given cost and schedule constraints)
    • Acceptable (consequences are acceptable and no extra cost should be incurred to reduce it further)

Page 58: Dependable  Software Development

Risk analysis example

Identified hazard            Hazard probability   Hazard severity   Estimated risk   Acceptability
Insulin overdose             Medium               High              High             Intolerable
Insulin underdose            Medium               Low               Low              Acceptable
Power failure                High                 Low               Low              Acceptable
Machine incorrectly fitted   High                 High              High             Intolerable
Machine breaks in patient    Low                  High              Medium           ALARP
Machine causes infection     Medium               Medium            Medium           ALARP
Electrical interference      Low                  High              Medium           ALARP
Allergic reaction            Low                  Low               Low              Acceptable
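The risk analysis table can be held as data so that the acceptability class of each hazard is looked up rather than retyped. This is only a sketch: the rule mapping estimated risk to acceptability (High to Intolerable, Medium to ALARP, Low to Acceptable) is inferred from the rows of the table and is not stated as a rule in the lecture, and the names `RISK_TABLE`, `acceptability` and `hazards_with` are hypothetical.

```python
# hazard -> (hazard probability, hazard severity, estimated risk),
# copied from the risk analysis example table.
RISK_TABLE = {
    "Insulin overdose": ("Medium", "High", "High"),
    "Insulin underdose": ("Medium", "Low", "Low"),
    "Power failure": ("High", "Low", "Low"),
    "Machine incorrectly fitted": ("High", "High", "High"),
    "Machine breaks in patient": ("Low", "High", "Medium"),
    "Machine causes infection": ("Medium", "Medium", "Medium"),
    "Electrical interference": ("Low", "High", "Medium"),
    "Allergic reaction": ("Low", "Low", "Low"),
}

# Assumption: this mapping is inferred from the table rows.
ACCEPTABILITY = {"High": "Intolerable", "Medium": "ALARP", "Low": "Acceptable"}


def acceptability(hazard: str) -> str:
    """Return the acceptability class for a hazard in RISK_TABLE."""
    _probability, _severity, risk = RISK_TABLE[hazard]
    return ACCEPTABILITY[risk]


def hazards_with(level: str) -> list[str]:
    """List hazards whose acceptability class equals `level`, sorted by name."""
    return sorted(h for h in RISK_TABLE if acceptability(h) == level)


print(hazards_with("Intolerable"))
```

Listing the intolerable hazards first gives the set that the safety requirements must eliminate outright.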

Page 59: Dependable  Software Development

Fault-tree Analysis

• Hazard analysis method that starts with an identified fault and works backwards to the cause of the fault
  – Can be used at all stages of hazard analysis
• Hazard Analysis Steps
  – Identify hazard
  – Identify potential causes of hazards
  – Link combinations of alternative causes using “OR” or “AND” symbols as appropriate
  – Continue process until “root” causes are identified (the result will be an and/or tree or a logic circuit); the causes are the “leaves”

Page 60: Dependable  Software Development

Insulin pump fault tree

Incorrect insulin dose administered (OR)
  – Incorrect sugar level measured (OR)
      – Sensor failure
      – Sugar computation error (OR)
          – Algorithm error
          – Arithmetic error
  – Correct dose delivered at wrong time
      – Timer failure
  – Delivery system failure (OR)
      – Insulin computation incorrect (OR)
          – Algorithm error
          – Arithmetic error
      – Pump signals incorrect
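A fault tree of this kind can be sketched as a small and/or tree and evaluated against a set of basic events. The gate structure below is an assumption reconstructed from the node labels on this slide; the class `Node` and the functions `occurs` and `root_causes` are hypothetical. The two "Algorithm error" and "Arithmetic error" leaves are given distinct names so they can be told apart.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    gate: str = "LEAF"                      # "OR", "AND" or "LEAF"
    children: list = field(default_factory=list)


def occurs(node: Node, failed: set) -> bool:
    """True if the event at `node` occurs given the failed basic events."""
    if node.gate == "LEAF":
        return node.name in failed
    results = [occurs(child, failed) for child in node.children]
    return any(results) if node.gate == "OR" else all(results)


def root_causes(node: Node) -> list:
    """Collect the leaves (root causes) below `node`, left to right."""
    if node.gate == "LEAF":
        return [node.name]
    causes = []
    for child in node.children:
        causes.extend(root_causes(child))
    return causes


# Assumed structure, reconstructed from the slide labels.
TREE = Node("Incorrect insulin dose administered", "OR", [
    Node("Incorrect sugar level measured", "OR", [
        Node("Sensor failure"),
        Node("Sugar computation error", "OR", [
            Node("Sugar algorithm error"),
            Node("Sugar arithmetic error"),
        ]),
    ]),
    Node("Correct dose delivered at wrong time", "OR", [
        Node("Timer failure"),
    ]),
    Node("Delivery system failure", "OR", [
        Node("Insulin computation incorrect", "OR", [
            Node("Insulin algorithm error"),
            Node("Insulin arithmetic error"),
        ]),
        Node("Pump signals incorrect"),
    ]),
])

print(occurs(TREE, {"Timer failure"}))
print(root_causes(TREE))
```

Because every gate in this tree is an OR gate, any single root cause is enough to trigger the top-level hazard, which is why each leaf needs its own safeguard.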

Page 61: Dependable  Software Development

Software problems

• Arithmetic error
  – Some arithmetic computation causes a representation failure (overflow or underflow)
    • Specification may state that arithmetic error must be detected and an exception handler included for each arithmetic error.
  – The action to be taken for these errors should be defined
    • The insulin dose is computed incorrectly because of some failure of the computer arithmetic
• Algorithmic error
  – Difficult to detect anomalous situation
  – May use ‘realism’ checks on the computed dose of insulin
    • The dose computation algorithm is incorrect
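A ‘realism’ check of the kind mentioned above might look like the following sketch. The limits `MAX_SINGLE_DOSE` and `MAX_DAILY_DOSE` are invented placeholder values, not clinical figures and not taken from the lecture, and the function `realistic_dose` is hypothetical. Treating a negative computed dose as zero is also an assumption.

```python
# Hypothetical limits, in whole units of insulin. Placeholders only.
MAX_SINGLE_DOSE = 4
MAX_DAILY_DOSE = 25


def realistic_dose(computed_dose: int, delivered_today: int) -> int:
    """Clamp a computed dose to limits that do not depend on the dose algorithm.

    The check is deliberately independent of how `computed_dose` was
    produced, so it can catch an incorrect dose computation.
    """
    if computed_dose < 0:
        return 0                                    # assumption: never deliver a negative dose
    dose = min(computed_dose, MAX_SINGLE_DOSE)      # cap a single delivery
    remaining_today = max(MAX_DAILY_DOSE - delivered_today, 0)
    return min(dose, remaining_today)               # cap the cumulative daily total


print(realistic_dose(10, 0))
print(realistic_dose(3, 24))
```

Keeping the check separate from the dose algorithm means an algorithmic error has to get past two independent pieces of code before it can cause an overdose.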

Page 62: Dependable  Software Development

Arithmetic errors

• Use language exception handling mechanisms to trap errors as they arise
• Use explicit error checks for all errors which are identified
• Avoid error-prone arithmetic operations (multiply and divide).
  – Replace with add and subtract
• Never use floating-point numbers
• Shut down system if exception detected (safe state)
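The guidelines on this slide can be combined in a short sketch: integer-only arithmetic, an explicit range check on every addition, and an exception handler that moves the system to a safe state. The range `DOSE_MIN` to `DOSE_MAX` and the names `checked_add` and `deliver` are hypothetical. Python integers do not overflow by themselves, so the range check stands in for the representation limit of the target hardware.

```python
# Hypothetical representable range for a dose register.
DOSE_MIN, DOSE_MAX = 0, 255


def checked_add(a: int, b: int) -> int:
    """Add two integers and raise OverflowError if the result is out of range."""
    result = a + b
    if not DOSE_MIN <= result <= DOSE_MAX:
        raise OverflowError(f"{a} + {b} is outside [{DOSE_MIN}, {DOSE_MAX}]")
    return result


def deliver(base_dose: int, correction: int) -> tuple:
    """Return ("DELIVER", dose), or ("SHUTDOWN", 0) on any arithmetic error."""
    try:
        dose = checked_add(base_dose, correction)
    except ArithmeticError:
        # OverflowError is a subclass of ArithmeticError.
        # Safe state: deliver nothing and stop.
        return ("SHUTDOWN", 0)
    return ("DELIVER", dose)


print(deliver(3, 2))
print(deliver(200, 100))
```

Shutting down is a safe state here because, as noted for system availability, missing a single insulin delivery is not a problem, whereas delivering a wrong dose may be.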

Page 63: Dependable  Software Development

Safety validation

• Design validation
  – Checking the design to ensure that hazards do not arise or that they can be handled without causing an accident.
• Code validation
  – Testing the system to check the conformance of the code to its specification and to check that the code is a true implementation of the design.
• Run-time validation
  – Designing safety checks that run while the system is in operation to ensure that it does not reach an unsafe state.