©2015 All Rights Reserved 12 May 2015
Introduction to System & Software Safety
Class 1-1
12 May 2015
International System Safety Society Washington DC Chapter
Organized 1962 | Incorporated 1973
David J. Shampine WDC Past-President
(386) [email protected]
Donne M. DiFiglia WDC 2015 President-Elect
(540) [email protected]
Introduction to System Safety
Learning Objectives for System Safety:
- Understand the definition and objective of System Safety
- Become familiar with System Safety Guidance and documentation
- Become familiar with the eight elements of the System Safety Process identified in MIL-STD-882E
- Become familiar with the Hazard Analyses conducted throughout the Integrated Defense Acquisition Cycle
- Understand the basics of Risk Assessment, Risk Mitigation, Risk Reduction & Verification, and Risk Acceptance
Reference/Guidance Documents
- MIL-STD-882E: DoD Standard Practice: System Safety, 2012
- SW020-AH-SAF-010: Weapon System Safety Guidelines Handbook, 2006
- DoD Joint Software System Safety Engineering Handbook (JSSSEH), 2010
- Department of Defense Architecture Framework (DoDAF) Version 2.0
- DoD Directive (DoDD) 5000.01 & DoD Instruction (DoDI) 5000.02: Operation of the Defense Acquisition System
Definition: System Safety
MIL-STD-882E defines System Safety as:
“The application of engineering and management principles, criteria and techniques to achieve acceptable mishap risk within the constraints of operational effectiveness and suitability, time, and cost throughout all phases of the system lifecycle”
System Safety Purpose & Objective
System Safety Purpose:
- To identify the risk of potential Mishaps
- Reduce the risk through engineering rigor
System Safety Objective: To Influence Design:
- As early as possible in the system lifecycle
- Build Safety into the system or product (vice adding later)
- To eliminate or control Hazards
- To reduce mishap risk to acceptable level
- Use best standard practices (MIL-STD-882 or equivalent)
Define the Need for Change
Define the System
Approve the System
Implement the System
Test, Verify and Validate the System
Operate and Maintain the System
MIL-STD-882E - Eight Elements of the System Safety Process
1. Document the System Safety Approach
2. Identify & Document Hazards
3. Assess & Document Risk
4. Identify & Document Risk Mitigation Measures
5. Reduce Risk
6. Verify, Validate & Document Risk Reduction
7. Accept Risk & Document
8. Manage Life-Cycle Risk
From MIL-STD-882E Section 4.3
Document the System Safety Approach: System Safety Management Plan
- The Hazard Management Plan (HMP) in MIL-STD-882E corresponds to the System Safety Management Plan (SSMP) described in the JSSSE Handbook
[Figure: Hazard Management Plan (HMP); System Requirements Hazard Analysis (SRHA)]
1. Document the System Safety Approach
Document the System Safety Approach: System Safety Program Plan (SSPP)
System Safety Program Plan (SSPP):
- MIL-STD-882E, Task 102 (“what to do”)
- SW020-AH-SAF-010, Section II, Chapter 5 (“how to do it”)
- Documents system safety methodology for identification, classification, and mitigation of safety hazards
- Integral part of the Systems Engineering Management Plan (SEMP)
- Details tasks & activities required to implement a systematic approach of hazard analysis, risk assessment, and risk management
From the Joint Software Systems Safety Engineering Handbook
1. Document the System Safety Approach
Identify & Document Hazards: Hazard Tracking System (HTS)
MIL-STD-882E Task 106 - Hazard Tracking System (HTS):
- Tracks hazards and associated mishaps, causal factors & mitigation measures throughout the life-cycle
- Provides traceability to safety requirements, code & trouble reports
- Provides traceability to safety test cases
Closed-Loop Hazard Tracking System:
- Hazard Identified and Entered into HTS
- Hazard Eliminated or Reduced to an Acceptable Level?
- NO: HCR Remains Open; Identify, Recommend, and Implement New Hazard Controls
- YES: HCR Closed; Return to Database and Close HCR
1. Document the System Safety Approach
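The closed loop above can be sketched in a few lines of Python. This is only an illustration: the `HazardControlRecord` class, its fields, and the `review` function are hypothetical names, not part of MIL-STD-882E or any real tracking tool, and the "acceptable" decision is passed in as a flag rather than computed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class HazardControlRecord:
    """Hypothetical HCR entry in a hazard tracking system."""
    hazard: str
    controls: List[str] = field(default_factory=list)
    is_open: bool = True  # a record is open from the moment it is entered


def review(record: HazardControlRecord,
           acceptable: bool,
           new_control: Optional[str] = None) -> HazardControlRecord:
    """Apply one pass of the closed loop to a record."""
    if acceptable:
        # YES branch: hazard eliminated or reduced to an acceptable level
        record.is_open = False
    else:
        # NO branch: the record stays open and a new control is recorded
        record.is_open = True
        if new_control is not None:
            record.controls.append(new_control)
    return record


hcr = HazardControlRecord("Inadvertent launch")
review(hcr, acceptable=False, new_control="Hardware interlock")
review(hcr, acceptable=True)
```

The record only closes on the YES branch, so every open record keeps cycling through control identification until the risk is judged acceptable.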
Document the System Safety Approach: SSPP Requirement Identification
[Figure: Safety Program Plan; Prescribed Requirements and Derived Requirements feed Requirements; IV&V]
• Prescribed Requirements:
• Treaties
• Component Spec. Sheets
• MSDSs
• CONOPS
• Field Experience
• Lessons Learned
• Past Trouble Reports
• Derived Requirements:
• Hazard Analysis Types: PHL, PHA, SSHA, O&SHA, SHA, etc.
• Hazard Analysis Techniques: BPA, FTA, SRHA, FuHA, CCFA, SCA, etc.
• Design Change Reviews
• TECHEVAL, OPEVAL
• IV&V
1. Document the System Safety Approach
Document the System Safety Approach: Safety Analyses
[Figure: Safety Analyses throughout the Acquisition Cycle]
ACRONYMS
- FHA: Functional Hazard Analysis
- FOC: Full Operational Capability
- HHA: Health Hazard Analysis
- IOC: Initial Operational Capability
- NEPA: National Environmental Policy Act
- O&SHA: Operating and Support Hazard Analysis
- PDR: Preliminary Design Review
- PESHE: Programmatic Environment, Safety, and Occupational Health Evaluation
- PHA: Preliminary Hazard Analysis
- PHL: Preliminary Hazard List
- SRHA: System Requirements Hazard Analysis
- SRVM: System Requirement Verification Matrix
- SHA: System Hazard Analysis
- SSHA: Subsystem Hazard Analysis
1. Document the System Safety Approach
Identify & Document Hazards: Hazard Analyses Types
Hazard Analyses Type:
- Establishes a general category of analysis type
- Defines detail, timing, coverage (what, when, where to analyze)
- Establishes a specific analysis task for a specific time in program lifecycle
- Provides a specific design focus
No. | MIL-STD-882E Task | Prime Hazard Analyses Types
1 | Task 201 | Preliminary Hazard List (PHL)
2 | Task 202 | Preliminary Hazard Analysis (PHA)
7 | Task 203 | System Requirements Hazard Analysis (SRHA)
3 | Task 204 | Subsystem Hazard Analysis (SSHA)
4 | Task 205 | System Hazard Analysis (SHA)
5 | Task 206 | Operations & Support Hazard Analysis (O&SHA)
6 | Task 207 | Health Hazard Analysis (HHA)
2. Identify & Document Hazards
Identify & Document Hazards: Hazard Analyses Techniques
Hazard Analyses Technique:
- Establishes a specific and unique analysis methodology
- Defines the principles and procedures of hazard analysis inquiry
- Establishes how to perform the analysis
- Satisfies the intent of a specific hazard analysis type
- There are dozens of techniques which include the 7 prime analysis types
Hazard Analysis Techniques:
- Fault Tree Analysis (FTA)
- Failure Modes & Effects Analysis (FMEA)
- Functional Hazard Analysis (FHA)
- Failure Mode, Effects and Criticality Analysis (FMECA)
- Fault Hazard Analysis (FHA)
- Sneak Circuit Analysis (SCA)
- Bent Pin Analysis (BPA)
- Threat Hazard Assessment (THA)
- Cause Consequences Analysis (CCA)
- Common Cause Failure Analysis (CCFA)
- Event Tree Analysis (ETA)
- Petri Net Analysis (PNA)
- Hazard & Operability Study (HAZOP)
- Etc.
2. Identify & Document Hazards
Identify & Document Hazards: Preliminary Hazard List (PHL)
- Identify general system-level hazards and mishaps
- Identify safety guidelines, precepts, TLMs & SCFs
2. Identify & Document Hazards
Identify & Document Hazards: Preliminary Hazard Analysis (PHA)
- Identify system hazards, causal factors and risk
- Identify safety guidelines, SSRs, precepts, SCFs & TLMs
From the Joint Software Systems Safety Engineering Handbook
2. Identify & Document Hazards
System Requirements Hazard Analysis (SRHA)
- Ensure all hazards are covered by SSRs
- Ensure all SSRs are in design and test specifications and successfully pass testing
From the Joint Software Systems Safety Engineering Handbook
2. Identify & Document Hazards
Identify & Document Hazards: Subsystem Hazard Analysis (SSHA)
- Identify hazards at subsystem level and interface level
- Identify detailed causal factors within the subsystem
From the Joint Software Systems Safety Engineering Handbook
2. Identify & Document Hazards
Identify & Document Hazards: System Hazard Analysis (SHA)
- Assess the risk of the total system design, specifically of the subsystem interfaces
- Include HW, SW, HSI, COTS, etc.
From the Joint Software Systems Safety Engineering Handbook
2. Identify & Document Hazards
Operating and Support Hazard Analysis (O&SHA)
- Identify hazards at operations and support level considering human error and design
- Evaluate adequacy of risk mitigations
From the Joint Software Systems Safety Engineering Handbook
2. Identify & Document Hazards
Health Hazard Analysis (HHA)
- Identify hazards to humans resulting from design, operation and manufacturing (e.g., noise, ergonomics, HazMat)
From the Joint Software Systems Safety Engineering Handbook
2. Identify & Document Hazards
Identify & Document Hazards: Evaluate Safety Impact
- Identify System hazards by evaluating the safety impact of Hardware, Software, or Humans failing to operate, operating incorrectly, or operating at the wrong time
- When a failure can be determined hazardous, the causal factors of the malfunction are identified and investigated in greater detail
[Figure: Performance and Safety; Function (Software, Hardware), Message, and Operator Action evaluated by Time and Sequence]
Time | Sequence
Right | Right
Right | Wrong
Wrong | Right
Wrong | Wrong
2. Identify & Document Hazards
Identify & Document Hazards: Identify System Properties
System | Performance | Reliability | Safety
Clothes Dryer | Dries Clothes | Fails to operate | Overheats & ignites; burns house down
Missile Launching System | Missile hits target when launched | Fails to launch; Fails to hit target | Inadvertently launches; Hits wrong target
Radar System | Detects target | Fails to operate; Fails to detect | RF energy injures personnel or ignites explosives or fuel
Warhead Fuze System | Initiates explosives when criteria met | Fails to initiate explosives | Initiates explosives inadvertently or using wrong criteria (wrong time, wrong distance)
Success / Failure
2. Identify & Document Hazards
Identify & Document Hazards: Identify System-Related Safety Concerns
System Element | Safety Concern
Hardware | Hazardous Elements/Components; Hazardous Operations; Inadvertent Operation; Failures; Combinations of Faults
Software | Inadvertent Operation; Incorrect Control
Humans | Incorrect Operation; Incorrect Decisions
Functions | Fail to Perform; Malfunction (incorrect); Unintended Function
Interface | No Data; Incorrect Data; Data Latency
Environment | Severe Weather; Hazardous Environments
Procedures | Hazardous Tasks; Incorrect Instructions
2. Identify & Document Hazards
Assess & Document Risk: Severity Categories
- Determine the Severity Category for a given hazard at a given point in time
- Identify the potential for death or injury, environmental impact, or monetary loss
3. Assess & Document Risk
MIL-STD-882E SEVERITY CATEGORIES
Description | Severity Category | Mishap Result Criteria
Catastrophic | 1 | Could result in one or more of the following: death, permanent total disability, irreversible significant environmental impact, or monetary loss equal to or exceeding $10M.
Critical | 2 | Could result in one or more of the following: permanent partial disability, injuries or occupational illness that may result in hospitalization of at least three personnel, reversible significant environmental impact, or monetary loss equal to or exceeding $1M but less than $10M.
Marginal | 3 | Could result in one or more of the following: injury or occupational illness resulting in one or more lost work day(s), reversible moderate environmental impact, or monetary loss equal to or exceeding $100K but less than $1M.
Negligible | 4 | Could result in one or more of the following: injury or occupational illness not resulting in a lost work day, minimal environmental impact, or monetary loss less than $100K.
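As a small worked example, the monetary-loss thresholds in the severity table can be expressed as a lookup. This sketch covers only the dollar criterion; the function name `severity_from_loss` is hypothetical, and in practice the assigned category would be the worst one reached by any criterion (injury, environmental impact, or loss).

```python
def severity_from_loss(loss_usd: float) -> int:
    """Return the severity category (1-4) implied by monetary loss alone."""
    if loss_usd >= 10_000_000:   # equal to or exceeding $10M
        return 1                 # Catastrophic
    if loss_usd >= 1_000_000:    # $1M up to, but not including, $10M
        return 2                 # Critical
    if loss_usd >= 100_000:      # $100K up to, but not including, $1M
        return 3                 # Marginal
    return 4                     # Negligible: less than $100K
```

Note that the boundaries are inclusive on the lower end, matching the "equal to or exceeding" wording in the table.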
Assess & Document Risk: Probability Levels
- Determine the Probability Level for a given hazard at a given point in time
- Assess the likelihood of occurrence of a mishap
- Probability level F:
- Hazard must be eliminated or designed out
- No amount of doctrine, training, warning, caution, or personal protective equipment can move a mishap probability to level F
3. Assess & Document Risk
MIL-STD-882E PROBABILITY LEVELS
Description | Level | Specific Individual Item | Fleet or Inventory
Frequent | A | Likely to occur often in the life of an item. | Continuously experienced.
Probable | B | Will occur several times in the life of an item. | Will occur frequently.
Occasional | C | Likely to occur sometime in the life of an item. | Will occur several times.
Remote | D | Unlikely, but possible to occur in the life of an item. | Unlikely, but can reasonably be expected to occur.
Improbable | E | So unlikely, it can be assumed occurrence may not be experienced in the life of an item. | Unlikely to occur, but possible.
Eliminated | F | Incapable of occurrence. This level is used when potential hazards are identified and later eliminated. | Incapable of occurrence. This level is used when potential hazards are identified and later eliminated.
Assess & Document Risk: Risk Assessment Matrix
- Determine the Severity Category and determine the Probability Level
- Assess the Risk Assessment Code (RAC)
3. Assess & Document Risk
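Combining a Severity Category with a Probability Level is a table lookup, sketched below. The matrix itself is not reproduced in this transcript, so the cell values are an assumption based on the risk assessment matrix in MIL-STD-882E (Table III) and should be checked against the standard before use. The names `RISK_MATRIX` and `risk_level` are hypothetical.

```python
# Rows: probability level A-E; columns: severity category 1-4.
# ASSUMPTION: values recalled from MIL-STD-882E Table III; verify before use.
RISK_MATRIX = {
    "A": ("High", "High", "Serious", "Medium"),
    "B": ("High", "High", "Serious", "Medium"),
    "C": ("High", "Serious", "Medium", "Low"),
    "D": ("Serious", "Medium", "Medium", "Low"),
    "E": ("Medium", "Medium", "Medium", "Low"),
}


def risk_level(severity: int, probability: str) -> str:
    """Look up the risk level for a severity (1-4) and probability (A-F)."""
    if severity not in (1, 2, 3, 4):
        raise ValueError("severity must be 1, 2, 3, or 4")
    if probability == "F":
        return "Eliminated"  # level F: hazard is incapable of occurrence
    return RISK_MATRIX[probability][severity - 1]
```

For example, `risk_level(1, "D")` looks up a Catastrophic, Remote hazard.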
Identify & Document Risk Mitigation Measures: Order of Precedence
MIL-STD-882E defines the System Safety Design Order Of Precedence which identifies alternative mitigation approaches and lists them in order of decreasing effectiveness
↓ Eliminate hazards through design selection
↓ Reduce risk through design alteration
↓ Incorporate engineered features or devices
↓ Provide warning devices
↓ Incorporate signage, procedures, training, and Personal Protective Equipment (PPE)
NOTE: Warnings and cautions by themselves cannot be used to lower “High” and “Serious” category risks
4. Identify & Document Risk Mitigation Measures
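One way to make the order of precedence concrete is an ordered enumeration, where a lower number means a more effective approach. This is a minimal sketch; the `Mitigation` enum and the `preferred` helper are hypothetical names, and deciding which approaches are feasible is left to the caller.

```python
from enum import IntEnum
from typing import Iterable


class Mitigation(IntEnum):
    """Design order of precedence, most effective first."""
    ELIMINATE_BY_DESIGN_SELECTION = 1
    REDUCE_BY_DESIGN_ALTERATION = 2
    ENGINEERED_FEATURES_OR_DEVICES = 3
    WARNING_DEVICES = 4
    SIGNAGE_PROCEDURES_TRAINING_PPE = 5


def preferred(feasible: Iterable[Mitigation]) -> Mitigation:
    """Return the highest-precedence approach among the feasible ones.

    Raises ValueError if no approach is feasible.
    """
    return min(feasible)


choice = preferred([Mitigation.WARNING_DEVICES,
                    Mitigation.ENGINEERED_FEATURES_OR_DEVICES])
```

Here `choice` is the engineered-feature option, since it sits higher in the order than a warning device.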
Reduce Risk: Implement Mitigation Measures
Implement Mitigation Measures to achieve an acceptable risk level
Consider and evaluate the cost, feasibility, and effectiveness of candidate mitigation methods as part of the Systems Engineering (SE) and Integrated Product Team (IPT) processes
Re-evaluate risk once the mitigation measure is incorporated into the system
Present the current hazards, their associated severity and probability assessments, and status of risk reduction efforts at SSWGs and technical reviews
Provide Objective Quality Evidence
5. Reduce Risk
Verify, Validate & Document Risk Reduction: Safety Testing
Test Mitigation Measures
- Verify implementation and validate effectiveness of all selected risk mitigation measures through appropriate analysis, testing, demonstration, or inspection
6. Verify, Validate & Document Risk Reduction
Verify, Validate & Document Risk Reduction: Safety Test Planning
[Figure: Systems being Tested]
- Understand the System being tested
- Understand the Requirements that drive the safety test plans
- Become familiar with the Test Guidance
- Complete the RTVM
6. Verify, Validate & Document Risk Reduction
Accept Risk & Document: Safety Assessment Report (SAR)
- Include hazards that were identified and eliminated and specific procedural controls and precautions to be followed to mitigate the risks of hazards that could not be eliminated
- Prepare for Risk Acceptance
From the Joint Software Systems Safety Engineering Handbook
7. Accept Risk & Document
Accept Risk & Document
- Obtain risk acceptance from the appropriate authority
- The system configuration and associated documentation that supports the formal risk acceptance decision is retained through the life of the system.
- Definitions in the Severity & Probability Charts, RACs, and the software criticality tables define risk at the time of acceptance
- Mishap reports, user feedback, and experience with similar systems or other sources may reveal new hazards or demonstrate that the risk for a known hazard is higher or lower than previously recognized
- Revise and accept newly evaluated risk
Risk Level | Acceptance Authority
High | Assistant Secretary of the Navy (ASN DA)
Serious | Program Executive Officer (PEO)
Medium | Program Manager
Low | Program Manager
7. Accept Risk & Document
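The acceptance-authority table maps directly to a lookup, sketched here with the values from the table above. The names `ACCEPTANCE_AUTHORITY` and `acceptance_authority` are hypothetical, and the authorities shown are those on this slide; other programs may assign them differently.

```python
ACCEPTANCE_AUTHORITY = {
    "High": "Assistant Secretary of the Navy (ASN DA)",
    "Serious": "Program Executive Officer (PEO)",
    "Medium": "Program Manager",
    "Low": "Program Manager",
}


def acceptance_authority(risk_level: str) -> str:
    """Return who must accept a risk of the given level (case-insensitive)."""
    key = risk_level.strip().capitalize()
    if key not in ACCEPTANCE_AUTHORITY:
        raise ValueError(f"unknown risk level: {risk_level!r}")
    return ACCEPTANCE_AUTHORITY[key]
```

If a fielded hazard is later reassessed at a higher level, the same lookup identifies the new authority who must accept the revised risk.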
Manage Lifecycle Risk
After the System is fielded, the Safety Engineer uses the System Safety Process to identify hazards & maintain the HTDB throughout the system’s life-cycle
This life-cycle effort considers any changes that include, but are not limited to:
• Interfaces
• Users
• Hardware and software
• Mishap data
• Missions
• System health data
8. Manage Lifecycle Risk
Manage Lifecycle Risk: Maintain Effective Communications
Program Office and user community maintain effective communications to collaborate, identify, and manage new hazards and modified risks
If a new hazard is discovered or a known hazard is determined to have a higher risk level than previously assessed, the new or revised risk will need to be formally accepted in accordance with DoDI 5000.02
The System Safety Team will provide analyses of hazards that contributed to the mishap and recommendations for risk mitigation measures, especially those that minimize human errors
8. Manage Lifecycle Risk
Introduction to Software Safety
Learning Objectives for Software Safety:
- Become familiar with Software Safety Methodology and Terminology
- Determine Software Criticality Index (SwCI) based on the level of severity of potential mishaps with software contributors and control category of the software
- Determine Level of Rigor (LOR) tasks based on SwCI
- Become familiar with recommended LOR analyses, tests and verification
- Become familiar with software hazard analysis techniques
- Understand how to develop technical evidence for LOR compliance
- Assess Software Contribution to overall System Risk
Integrated Approach to Safer Software
Total software safety program consists of two distinct separate but overlapping processes
Processes implemented and integrated together to produce software as safe as reasonably practical
Integrated approach uses the strengths and skills of each individual team member to carry out specific tasks within their individual domain expertise
From the Joint Software Systems Safety Engineering Handbook
Software Safety Tasks
Step | Task | Description
1 | Determine the Safety Significant Software Functions (SSSFs) | Determine the SSSFs in the System. Conduct an FHA. Reference the SSHA for mature systems.
2 | Assign a Software Control Category (SCC) | Assign an SCC based upon the level of control and authority assigned to each SSSF: 1. Autonomous; 2. Semi-Autonomous; 3. Redundant Fault Tolerant; 4. Influential; 5. No Safety Impact
3 | Assign a Software Criticality Index (SwCI) | The SwCI is a mechanism to assess software impact to the system in the event of a failure, based upon command, control, and autonomy authority for a specific SSSF. Assign a SwCI for each SSSF mapped to the software design architecture.
4 | Define LOR Tasks | Use the SwCI to determine the LOR Tasks. The LOR specifies the amount and type of analyses and testing required to assess the Software Contributions to the System Level Risk.
5 | Tailor LOR Tasks | Tailor the LOR tasks to clarify the LOR required in the software development and test activities to ensure the software’s safety integrity within the system context.
6 | Develop Technical Evidence | Develop the technical evidence through analysis and test that supports successful completion of the LOR for the safety-related functions.
7 | Evaluate System Risk | Define the inherent risk of not accomplishing the LOR tasks. Evaluate software’s contribution to the System Risk.
Process flow: Identify SSSFs, Assign SCC, Assign SwCI, Define LOR Tasks, Tailor LOR Tasks, Develop Technical Evidence, Evaluate System Risk
Identify Safety-Significant Software Functions (SSSFs)
Identify system functions for categorization & prioritization of Safety-Significant Software Functions (SSSFs), both Safety-Critical and Safety-Related
Software that is Safety-Significant will have to be evaluated to determine if it is Safety-Critical or Safety-Related
Identify SSSFs
MIL-STD-882E Definitions:
- Safety-significant. A term applied to a condition, event, operation, process, or item that is identified as either safety-critical or safety-related.
- Safety-related. A term applied to a condition, event, operation, process, or item whose mishap severity consequence is either Marginal or Negligible.
- Safety-critical. A term applied to a condition, event, operation, process, or item whose mishap severity consequence is either Catastrophic or Critical (e.g., safety-critical function, safety-critical path, and safety-critical component).
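The two definitions split the four severity categories in half, which a short sketch makes explicit. The function name `safety_significance` is hypothetical; it assumes the severity category (1 to 4) of the worst credible mishap has already been determined.

```python
def safety_significance(severity: int) -> str:
    """Classify a safety-significant item by its mishap severity category."""
    if severity in (1, 2):       # Catastrophic or Critical
        return "safety-critical"
    if severity in (3, 4):       # Marginal or Negligible
        return "safety-related"
    raise ValueError("severity must be 1, 2, 3, or 4")
```

Both results count as safety-significant; the split only determines which of the two labels applies.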
Functional Hazard Analysis (FHA)
Identify and classify system functions and hazards associated with functional failure or malfunction
Provide the engineer with a better understanding of the physical attributes of the system and its intended functionality, logical structure and data attributes
Identify SSSFs
From the Joint Software Systems Safety Engineering Handbook
System Requirements Traceability
- Documents the requirement throughout its lifecycle
- Traces to the Test Cases
- Traces to the Design, Code, Enhancements, Fixes & Modifications
- Requirements & Historical Data captured in DOORS Database
Identify SSSFs
Safety Requirements Traceability
System Safety Requirements (SSRs):
- Provide design guidance for intentionally designing safety into the system
- SSRs captured in Hazard Tracking System as risk mitigation or causal factors
- Eliminate or mitigate hazards
- Identified hazards cannot be closed unless their mitigating SSRs are successfully verified and validated
- Trace to the Safety Test Cases
Identify SSSFs
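The closure rule for hazards can be sketched as a simple check over the SSRs that mitigate a hazard. The `SSR` class and `can_close` function are hypothetical names; treating a hazard with no linked SSRs as not closable is an assumption made here, not something the slide states.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class SSR:
    """Hypothetical record of one safety requirement and its V&V status."""
    ident: str
    verified: bool = False
    validated: bool = False


def can_close(mitigating_ssrs: Sequence[SSR]) -> bool:
    """A hazard may close only if every mitigating SSR is verified and validated.

    ASSUMPTION: a hazard with no mitigating SSRs is treated as not closable.
    """
    if not mitigating_ssrs:
        return False
    return all(s.verified and s.validated for s in mitigating_ssrs)
```

A single SSR that has been verified but not yet validated is enough to keep the hazard open.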
Assign Software Control Category
Each SSSF is assessed and categorized against the Software Control Category (SCC) to determine the Level of Control that the Software exercises over safety-significant functionality
SOFTWARE CONTROL CATEGORIES (Level, Name, Description)
Level 1, Autonomous (AT):
- Software functionality that exercises autonomous control authority over potentially safety-significant hardware systems, subsystems, or components without the possibility of predetermined safe detection and intervention by a control entity to preclude the occurrence of a mishap or hazard. (This definition includes complex system/software functionality with multiple subsystems, interacting parallel processors, multiple interfaces, and safety-critical functions that are time critical.)
Level 2, Semi-Autonomous (SAT):
- Software functionality that exercises control authority over potentially safety-significant hardware systems, subsystems, or components, allowing time for predetermined safe detection and intervention by independent safety mechanisms to mitigate or control the mishap or hazard. (This definition includes the control of moderately complex system/software functionality, no parallel processing, or few interfaces, but other safety systems/mechanisms can partially mitigate. System and software fault detection and annunciation notifies the control entity of the need for required safety actions.)
- Software item that displays safety-significant information requiring immediate operator entity to execute a predetermined action for mitigation or control over a mishap or hazard. Software exception, failure, fault, or delay will allow, or fail to prevent, mishap occurrence. (This definition assumes that the safety-critical display information may be time-critical, but the time available does not exceed the time required for adequate control entity response and hazard control.)
Level 3, Redundant Fault Tolerant (RFT):
- Software functionality that issues commands over safety-significant hardware systems, subsystems, or components requiring a control entity to complete the command function. The system detection and functional reaction includes redundant, independent fault tolerant mechanisms for each defined hazardous condition. (This definition assumes that there is adequate fault detection, annunciation, tolerance, and system recovery to prevent the hazard occurrence if software fails, malfunctions, or degrades. There are redundant sources of safety-significant information, and mitigating functionality can respond within any time-critical period.)
- Software that generates information of a safety-critical nature used to make critical decisions. The system includes several redundant, independent fault tolerant mechanisms for each hazardous condition, detection and display.
Level 4, Influential:
- Software generates information of a safety-related nature used to make decisions by the operator, but does not require operator action to avoid a mishap.
Level 5, No Safety Impact (NSI):
- Software functionality that does not possess command or control authority over safety-significant hardware systems, subsystems, or components and does not provide safety-significant information. Software does not provide safety-significant or time sensitive data or information that requires control entity interaction. Software does not transport or resolve communication of safety-significant or time sensitive data.
From MIL-STD-882E Section 4.4.1.b
Assign SCC
Assess Autonomy
Assessing Autonomy is an integral part of determining safety criticality for software that controls or resides in systems
When assessing autonomy of the system the Safety Engineer should consider how the system:
– Perceives the world and itself
– Communicates with other systems
– Is controlled:
• Fully autonomous
• Man-in-the-Loop
• Manual
Assign SCC
SCC 1 - Autonomous
SOFTWARE CONTROL CATEGORIES (Level, Name, Description)
Level 1, Autonomous (AT):
- Software functionality that exercises autonomous control authority over potentially safety-significant hardware systems, subsystems, or components without the possibility of predetermined safe detection and intervention by a control entity to preclude the occurrence of a mishap or hazard. (This definition includes complex system/software functionality with multiple subsystems, interacting parallel processors, multiple interfaces, and safety-critical functions that are time critical.)
Example: Small Diameter Bomb
- Autonomous control over safety hardware system, and
- No possibility of detection and intervention by control entity
Assign SCC
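SCC 2 – Semi-Autonomous
SOFTWARE CONTROL CATEGORIES (Level, Name, Description)
Level 2, Semi-Autonomous (SAT):
- Software functionality that exercises control authority over potentially safety-significant hardware systems, subsystems, or components, allowing time for predetermined safe detection and intervention by independent safety mechanisms to mitigate or control the mishap or hazard. (This definition includes the control of moderately complex system/software functionality, no parallel processing, or few interfaces, but other safety systems/mechanisms can partially mitigate. System and software fault detection and annunciation notifies the control entity of the need for required safety actions.)
- Software item that displays safety-significant information requiring immediate operator entity to execute a predetermined action for mitigation or control over a mishap or hazard. Software exception, failure, fault, or delay will allow, or fail to prevent, mishap occurrence. (This definition assumes that the safety-critical display information may be time-critical, but the time available does not exceed the time required for adequate control entity response and hazard control.)
Summary:
- Control over safety hardware system, and
- Time for detection and intervention by independent safety mechanism
- S/W displays Safety-Significant data for immediate, pre-determined operator action to prevent a mishap, or
- S/W fault or delay that will allow, or fail to prevent, the mishap
Assign SCC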
SCC 3 – Redundant Fault Tolerant
SOFTWARE CONTROL CATEGORIES (Level, Name, Description)
Level 3, Redundant Fault Tolerant (RFT):
- Software functionality that issues commands over safety-significant hardware systems, subsystems, or components requiring a control entity to complete the command function. The system detection and functional reaction includes redundant, independent fault tolerant mechanisms for each defined hazardous condition. (This definition assumes that there is adequate fault detection, annunciation, tolerance, and system recovery to prevent the hazard occurrence if software fails, malfunctions, or degrades. There are redundant sources of safety-significant information, and mitigating functionality can respond within any time-critical period.)
- Software that generates information of a safety-critical nature used to make critical decisions. The system includes several redundant, independent fault tolerant mechanisms for each hazardous condition, detection and display.
Summary:
- S/W issues commands over safety hardware system, and
- Requires control entity to complete command function, and
- System includes REDUNDANT, INDEPENDENT fault tolerant mechanisms
- Generates information of safety-critical nature to make decisions, and
- System includes several REDUNDANT, INDEPENDENT fault tolerant mechanisms
Assign SCC
SCC 4 – Influential
SOFTWARE CONTROL CATEGORIES (Level, Name, Description)
Level 4, Influential:
- Software generates information of a safety-related nature used to make decisions by the operator, but does not require operator action to avoid a mishap.
Summary:
- Generates information of safety-related nature which is used by the operator to make decisions, but
- Does not require operator action to avoid a mishap
Example display: Safety Warning: Spare Battery Overheating; Shutdown Automatically Activated; No Operator Action Required
Assign SCC
SCC 5 – No Safety Impact
SOFTWARE CONTROL CATEGORIES (Level, Name, Description)
Level 5, No Safety Impact (NSI):
- Software functionality that does not possess command or control authority over safety-significant hardware systems, subsystems, or components and does not provide safety-significant information. Software does not provide safety-significant or time sensitive data or information that requires control entity interaction. Software does not transport or resolve communication of safety-significant or time sensitive data.
Summary:
- S/W does not possess command or control authority over safety-significant hardware systems, subsystems, or components and does not provide safety-significant information.
- S/W does not provide safety-significant or time sensitive data or information that requires control entity interaction.
- S/W does not transport or resolve communication of safety-significant or time sensitive data.
The Safety Engineer should review ALL software functionality to determine safety-significance and designate NSI or SwCI 1-4 in HTDB
Assign SCC
57©2015 All Rights Reserved 12 May 2015
Assign SwCI
From MIL-STD-882E Section 4.4.1
One Severity Category is combined with one SCC to derive a Software Criticality Index (SwCI)
A SwCI is assigned to each SSSF mapped to the software design architecture
Software with SwCI 2 through SwCI 4 typically requires progressively less design, analysis, and test rigor than high-criticality (SwCI 1) software
SOFTWARE SAFETY CRITICALITY MATRIX
                             SEVERITY CATEGORY
SOFTWARE CONTROL CATEGORY    Catastrophic (1)  Critical (2)  Marginal (3)  Negligible (4)
1                            SwCI 1            SwCI 1        SwCI 3        SwCI 4
2                            SwCI 1            SwCI 2        SwCI 3        SwCI 4
3                            SwCI 2            SwCI 3        SwCI 4        SwCI 4
4                            SwCI 3            SwCI 4        SwCI 4        SwCI 4
5                            SwCI 5            SwCI 5        SwCI 5        SwCI 5
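The matrix above is a pure table lookup, so it can be sketched in C as a small table-driven function. This is a minimal illustration, not text from MIL-STD-882E; the names `SWCI_MATRIX` and `swci_lookup`, and the choice of 0 as the "input out of range" result, are assumptions made for the example.

```c
#include <assert.h>

/* Rows are Software Control Categories 1..5; columns are Severity
 * Categories 1 (Catastrophic) .. 4 (Negligible). Values are SwCI levels,
 * copied from the Software Safety Criticality Matrix. */
static const int SWCI_MATRIX[5][4] = {
    {1, 1, 3, 4},
    {1, 2, 3, 4},
    {2, 3, 4, 4},
    {3, 4, 4, 4},
    {5, 5, 5, 5},
};

/* Returns the SwCI (1..5) for one SCC and one severity category.
 * Returns 0 (an assumed sentinel) if either input is out of range. */
int swci_lookup(int scc, int severity)
{
    int swci;

    if (scc < 1 || scc > 5 || severity < 1 || severity > 4)
        return 0;

    swci = SWCI_MATRIX[scc - 1][severity - 1];
    assert(swci >= 1 && swci <= 5); /* guards against a mistyped table */
    return swci;
}
```

For example, `swci_lookup(3, 1)` (SCC 3 with Catastrophic severity) returns 2, matching the SwCI 2 cell in the matrix.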
Define Level of Rigor Tasks
The SwCI is used in Software System Safety Analysis to define the Level of Rigor (LOR) Tasks which specify the amount and type of analyses and testing required to assess the Software’s Contribution to the System Level Risk
From MIL-STD-882E Section 4.4.1
SwCI LEVEL OF RIGOR TASKS
SwCI 1 Program shall perform analysis of requirements, architecture, design and code; and conduct in-depth safety-specific testing.
SwCI 2 Program shall perform analysis of requirements, architecture, and design; and conduct in-depth safety-specific testing.
SwCI 3 Program shall perform analysis of requirements and architecture; and conduct in-depth safety-specific testing.
SwCI 4 Program shall conduct safety-specific testing.
SwCI 5 Once assessed by safety engineering as Not Safety, then no safety specific analysis or verification is required.
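As a hedged sketch, the LOR task table above can be encoded as a set of bit flags per SwCI, which makes it easy to check whether a given analysis applies to a given piece of software. The flag names, the `LOR_INVALID` sentinel, and both function names are hypothetical and exist only for this example.

```c
#include <assert.h>

/* One bit per analysis or test activity named in the LOR task table. */
enum lor_task {
    LOR_REQUIREMENTS_ANALYSIS = 1 << 0,
    LOR_ARCHITECTURE_ANALYSIS = 1 << 1,
    LOR_DESIGN_ANALYSIS       = 1 << 2,
    LOR_CODE_ANALYSIS         = 1 << 3,
    LOR_IN_DEPTH_SAFETY_TEST  = 1 << 4,
    LOR_SAFETY_TEST           = 1 << 5,
    LOR_INVALID               = 1 << 6  /* sentinel: SwCI outside 1..5 */
};

/* Returns the set of required tasks for a SwCI, as OR-ed lor_task flags. */
unsigned lor_tasks_for_swci(int swci)
{
    switch (swci) {
    case 1:
        return LOR_REQUIREMENTS_ANALYSIS | LOR_ARCHITECTURE_ANALYSIS |
               LOR_DESIGN_ANALYSIS | LOR_CODE_ANALYSIS |
               LOR_IN_DEPTH_SAFETY_TEST;
    case 2:
        return LOR_REQUIREMENTS_ANALYSIS | LOR_ARCHITECTURE_ANALYSIS |
               LOR_DESIGN_ANALYSIS | LOR_IN_DEPTH_SAFETY_TEST;
    case 3:
        return LOR_REQUIREMENTS_ANALYSIS | LOR_ARCHITECTURE_ANALYSIS |
               LOR_IN_DEPTH_SAFETY_TEST;
    case 4:
        return LOR_SAFETY_TEST;
    case 5:
        return 0u; /* no safety-specific analysis or verification */
    default:
        return LOR_INVALID;
    }
}

/* True (non-zero) if the given SwCI calls for the given task. */
int lor_requires(int swci, enum lor_task task)
{
    assert(task != LOR_INVALID); /* the sentinel is not a real task */
    return (lor_tasks_for_swci(swci) & (unsigned)task) != 0;
}
```

With this encoding, `lor_requires(2, LOR_CODE_ANALYSIS)` is false, reflecting that code analysis is called for only at SwCI 1.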
Tailored LOR Tasks
MIL-STD-882E states the LOR Tasks in general terms so that a program can tailor them, adding detail and clarity to specific LOR Tasks
The Safety Engineer performs the required analyses & tests as identified in the tailored LOR Tasks
From AOP 52 Table 3-5 Level of Rigor Matrix
SwCI 1 – High Risk
SwCI 2 – Serious Risk
SwCI 3 – Moderate Risk
SwCI 4 – Low Risk
SwCI 5 – No Safety Risk
Tailored LOR Tasks: SwCI 1 – High Risk
From AOP 52 Table 3-5 Level of Rigor Matrix
DESIGN
• Design Team Review
• Safety Review
• SCF Linked to SW Reqmts
• SCF Linked to Design Architecture
• Safety Fault Tolerant Design
CODE
• Design Code Walkthrough
• Independent Code Review
• Safety Code Analysis
• SCF Code Review
• Safety Fault Detection, Fault Tolerance
UNIT TEST
• Test Case Review
• Independent Test Review
• Failure Mode Effect Testing
• Safety Test Result Review
INTEGRATING UNIT TEST
• Test Case Review
• Independent Test Review
• Failure Mode Effect Testing
• 100% Regression Testing
• Safety Test Result Review
SYSTEM INTEGRATION
• Test Case Review
• Independent Test Review
• Failure Mode Effect Testing
• 100% Regression Testing
• Safety Test Result Review
Tailored LOR Tasks: SwCI 2 – Serious Risk
From AOP 52 Table 3-5 Level of Rigor Matrix
DESIGN
• Design Team Review
• Prioritizing Safety Review
• SCF Linked to SW Reqmts
• SCF Linked to Design Architecture
CODE
• Design Code Walkthrough
• Independent Code Review
• Safety Code Analysis to Prioritized Modules
• SCF Code Review
• Safety Fault Detection, Fault Tolerance
UNIT TEST
• Test Case Review
• Independent Test Review
• Failure Mode Effect Testing
• 100% Thread Testing
• Safety Test Result Review
INTEGRATING UNIT TEST
• Test Case Review
• Independent Test Review
• Failure Mode Effect Testing
• 100% Regression Testing
• Safety Test Result Review
SYSTEM INTEGRATION
• Test Case Review
• Independent Test Review
• Failure Mode Effect Testing
• 100% Regression Testing
• Safety Test Result Review
Tailored LOR Tasks: SwCI 3 – Moderate Risk
From AOP 52 Table 3-5 Level of Rigor Matrix
DESIGN
• Design Team Review
• Minimal Safety Review
• SCF Linked to SW Reqmts
• SCF Linked to Design Architecture
CODE
• SCF Code Review
• Safety Fault Detection, Fault Tolerance
UNIT TEST
• Test Case Review
• Independent Test Review
• Failure Mode Effect Testing
• Safety Test Result Review
INTEGRATING UNIT TEST
• Test Case Review
• Independent Test Review
• Failure Mode Effect Testing
• Safety Test Result Review
SYSTEM INTEGRATION
• Test Case Review
• Independent Test Review
• Failure Mode Effect Testing
• Safety Test Result Review
Tailored LOR Tasks: SwCI 4 – Low Risk
From AOP 52 Table 3-5 Level of Rigor Matrix
DESIGN and CODE
• Design Team Review
• Minimal Safety Review
• Normal Software Design Process IAW Software Development Plan (SDP)
• No specific tasks
UNIT TEST
• Test Case Review
• Independent Test Review
• Safety Test Result Review
INTEGRATING UNIT TEST
• Test Case Review
• Independent Test Review
• Safety Test Result Review
SYSTEM INTEGRATION
• Test Case Review
• Independent Test Review
• Safety Test Result Review
Tailored LOR Tasks: SwCI 5 – No Safety Risk
From AOP 52 Table 3-5 Level of Rigor Matrix
DESIGN
• Normal Software Design Process IAW Software Development Plan (SDP)
CODE
• Normal Software Design Process IAW SDP
UNIT TEST
• Normal Software Design Process IAW SDP
INTEGRATING UNIT TEST
• Normal Software Design Process IAW SDP
SYSTEM INTEGRATION
• Normal Software Design Process IAW SDP
Develop Technical Evidence: Verify & Validate
Identified Hazards cannot be closed until their mitigating SSRs are successfully verified and validated
• The Verification and Validation (V&V) process is a key factor in allowing hazards to be closed
Validate SSRs
• Evaluate the System and Software components during the development process to determine whether SSRs are met
Verify SSRs
• Verify that the SSRs successfully prevent or mitigate their related hazards through:
  – Examination or Inspection
  – Analysis
  – Demonstration
  – Testing
Update the Requirements Traceability Verification Matrix (RTVM)
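The closure rule stated on this slide (a hazard stays open until every mitigating SSR is verified and validated) can be sketched as a simple gate. This is an illustrative sketch only; the record layout, the identifiers, and the decision to keep a hazard open when it has no mitigating SSRs at all are assumptions, not requirements taken from the slide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Verification methods listed on the slide; VV_NONE marks an SSR
 * that has not been verified by any method yet. */
enum vv_method { VV_NONE, VV_INSPECTION, VV_ANALYSIS, VV_DEMONSTRATION, VV_TEST };

/* V&V status of one mitigating SSR (hypothetical record layout). */
struct ssr_status {
    const char    *id;        /* hypothetical identifier, e.g. "SSR-A" */
    enum vv_method method;    /* method used to verify the SSR */
    bool           verified;
    bool           validated;
};

/* Returns true only if every mitigating SSR is verified and validated. */
bool hazard_can_close(const struct ssr_status *ssrs, size_t count)
{
    if (count == 0)
        return false; /* assumption: no mitigating SSRs means the hazard stays open */

    assert(ssrs != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (ssrs[i].method == VV_NONE || !ssrs[i].verified || !ssrs[i].validated)
            return false;
    }
    return true;
}
```

A single SSR with `VV_NONE`, or with either flag still false, is enough to keep the hazard open, which mirrors the "cannot be closed until" wording.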
Develop Technical Evidence: Safety Code Analysis
The Safety Engineer evaluates the high-criticality software for the items in the table below using available analysis tools & visual inspection
When a problem is found, a Trouble Report is written & corrective action taken
Item: Concern
• Adherence to SRRs: Verify that the code corresponds to the SRRs & Design Requirements
• Syntax: Ensure correct use of syntax for specific language
• Logic: Identify logic & structure errors
• Data: Review data tables & variables in safety-critical threads within code modules
• Interfaces: Review message traffic & data transfers to ensure correct & timely transfer of information
• Measurement: Run software metrics to identify design concerns – deep nesting, long functions, etc.
• Tagging: Verify Safety Significant Code modules are tagged
• Requirements: Ensure all SSRs, SSSFs, and Safety Critical code have been successfully tested
• Constraints: Evaluate if design constraints are reasonable; e.g., exception handling; bounds on data structures; race conditions; control flow
• Safe Coding Practices: Evaluate for known unsafe techniques such as multi-threading
• Coding Standards: Evaluate adherence to safety coding standards (e.g., STANAG 4404)
FHA: Develop Technical Evidence: Compliance to Coding Standards
Throughout development, the Software and Safety Engineers analyze the code to ensure that it meets the performance and safety criteria in DoD-accepted standards (e.g., AOP 52)
Example: For a program written in ‘C’, emphasis is placed on inspection of the following:
Item: Concern
• Unexpected Jumps: Ensure no unexpected jumps to arbitrary locations (e.g., Go To). Note: In ‘C’, if no case label matches the switch expression and there is no default, then no action is performed.
• Overwrites: Since store access is implicitly via pointers & pointers can be formed without restriction, very few forms of access & assignment can be shown to be safe.
• Semantics: Consider the validity of the logic statement. Note: in ‘C’, some semantics cannot be determined at compile-time and must be evaluated at run-time.
• Math: Check math results. E.g., unsigned integer arithmetic is modulo 2^N (N = word length in bits) without overflow detection; the sign of the result of integer division with a negative operand is implementation-defined in C89/C90 (C99 and later truncate toward zero); division by zero is undefined, so the programmer must guard against it or handle it (e.g., with a signal).
• Operational Arithmetic: In ‘C’ the action taken on signed integer overflow is undefined. Ensure overflow is handled (e.g., using a signal in recovery logic).
• Data Typing: Check variable initialization; ensure no overwriting of data by casting (i.e., converting one data type to another).
• Memory Allocation: Ensure memory exhaustion is handled. Note that recovery handling is programmer-specified since in ‘C’ there is no specific signal for overflow.
• Exception Handling: ‘C’ defines 6 signals for recovery from a detected event (e.g., divide by 0). There is no requirement that the corresponding event is detected, so use of the signal will depend on implementation of the detection.
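A few of the ‘C’ concerns listed on this slide can be illustrated with short defensive routines. The sketch below is an assumption-laden example, not code from AOP 52 or any coding standard: the command values and the function names `fire_permitted`, `checked_add`, and `checked_div` are hypothetical. It shows a switch with an explicit fail-safe default, and arithmetic that tests for overflow and divide-by-zero before the operation instead of relying on a signal.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command codes used only for this illustration. */
enum { CMD_WEAPON_SAFE = 0, CMD_NO_FIRE = 1, CMD_PERMIT_FIRE = 2 };

/* Unexpected Jumps: the explicit default makes an unrecognized value
 * fail safe instead of silently skipping the whole switch body. */
bool fire_permitted(int command)
{
    switch (command) {
    case CMD_PERMIT_FIRE:
        return true;
    case CMD_WEAPON_SAFE:
    case CMD_NO_FIRE:
        return false;
    default:
        return false; /* unknown command: treat as not permitted */
    }
}

/* Operational Arithmetic: detect signed overflow before it happens,
 * because the overflowing addition itself would be undefined behavior. */
bool checked_add(int a, int b, int *sum)
{
    assert(sum != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *sum = a + b;
    return true;
}

/* Math: reject division by zero and the one quotient that cannot be
 * represented (INT_MIN / -1) instead of depending on a signal. */
bool checked_div(int num, int den, int *quotient)
{
    assert(quotient != NULL);
    if (den == 0 || (num == INT_MIN && den == -1))
        return false;
    *quotient = num / den;
    return true;
}
```

Each arithmetic helper reports failure through its return value, so the caller's recovery logic is explicit in the code under review rather than hidden in a signal handler.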
Develop Technical Evidence: Map Hazards to Software Safety Tests
• Safety Critical code and requirements mapped to associated Test Plan and Test Procedures
  – Test Plan and Test Procedures reviewed
    • Inputs provided to Test Team
  – Documented in Verification Report (VR)
• Verify Test Case adequately addresses the requirement from a mishap/hazard perspective
• If tests are insufficient, a Trouble Report is written and an existing test is changed or a new test is created
• Review outcome of testing to ensure no anomalies exist that contribute to mishap/hazard
  – All test results reviewed with no safety critical failures occurring during FQT
Map Hazards to Safety Tests:
• Database: Contains Safety Requirements
• Test Plan: Identifies Test Cases used to validate each Requirement
• Test Procedures: Contains detailed steps for performing each Test Case
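One mechanical part of this mapping is checking that every safety requirement has at least one test case traced to it. The following is a minimal sketch under stated assumptions: the record layout, the identifier strings, and the function name `count_untested` are hypothetical, and a real program would draw this data from its own requirements database and Test Plan.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One Test Plan entry: a test case and the requirement it validates
 * (hypothetical layout for illustration). */
struct test_case {
    const char *id;             /* hypothetical test case identifier */
    const char *requirement_id; /* safety requirement it traces to */
};

/* Counts safety requirements that no test case traces to. A non-zero
 * result indicates a coverage gap to be raised before testing starts. */
size_t count_untested(const char *const *req_ids, size_t n_reqs,
                      const struct test_case *cases, size_t n_cases)
{
    size_t untested = 0;

    assert(req_ids != NULL || n_reqs == 0);
    assert(cases != NULL || n_cases == 0);

    for (size_t i = 0; i < n_reqs; ++i) {
        int covered = 0;
        for (size_t j = 0; j < n_cases; ++j) {
            if (strcmp(cases[j].requirement_id, req_ids[i]) == 0) {
                covered = 1;
                break;
            }
        }
        if (!covered)
            ++untested;
    }
    return untested;
}
```

A check like this only shows that a trace exists; whether the test case adequately addresses the requirement from a mishap/hazard perspective still needs the manual review described above.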
Develop Technical Evidence: (Objective Quality Evidence)
Columns: SSR; SCF; SwCI; Verification Method (RA, AA, DA, CA, Test); Comments
• SSR: While in PSuM Search, the mount position commands shall be limited to not exceed the AAW Search slew rate or the maximum EO Stab slew rate.
  SCF: Mount Movement; SwCI: 1
  Verification Method: RA X; AA NP*; DA X; CA P; Test P
  Comments: Requirement Modified SRR #4; Safety Test Case SA-324 being developed; * No change to existing Architecture
• SSR: The System shall synchronize to the network time server before allowing the external connections.
  SCF: External Interfaces; SwCI: 1
  Verification Method: RA X; AA X; DA X; CA X; Test X
  Comments: IDD Analyzed & Ada & C Code Analyzed 1/13-1/17/2015; Safety Test Case SA-192 Passed 3/16/2015
• SSR: The System shall accept Weapon State Control for both Anti-Air Warfare (AAW) and Surface Modes: a) Weapon Safe (AAW Only) b) No Fire c) Permit Fire
  SCF: Fire Interlocks; SwCI: 1
  Verification Method: RA X; AA X; DA X; CA X; Test X
  Comments: Ada & C Code Analyzed 11/10-11/13/2014; Safety Test Case SA-131 Passed 2/13/2015
• SSR: The WCC software shall contain a Periodic System Operability Test titled Search Antenna Beam Control Periodic System Operability Test 17 (PSOT 17) that is initiated by using a SOT select number of 17 with the standard CODE 30 entry method.
  SCF: System Radiation; SwCI: 2
  Verification Method: RA X; AA X; DA P; CA NP; Test P
  Comments: Requirement Modified SRR #5; Safety Test Case SA-327 being developed
Key: RA – Requirements Analysis; AA – Architecture Analysis; DA – Design Analysis; CA – Code Analysis; SCF – Safety Critical Function; SwCI – Software Criticality Index; X – Completed; P – Planned; NP – Not Planned. Tasks in Green are Completed.
Evaluate System Risk: LOR Tasks Unspecified or Incomplete
FHA: Evaluate System Risk: Software Contribution to System Risk
From MIL-STD-882E Section B.2.2.5.d(2)
Risks associated with system hazards that have software causes and controls may be acceptable based on evidence that hazards, causes, and mitigations have been identified, implemented, and verified
If the software design does not meet safety requirements, then there is a contribution to risk associated with inadequately verified software hazard causes and controls
Evaluate System Risk: Summary
After completion of all software safety engineering analysis, software development, and LOR tasks, the results are used as evidence or input to assign the software’s contribution to the System’s risk
Safety, Systems Engineering and Development Team shall:
- Evaluate results of safety verification activities
- Perform an assessment of confidence for each SSR and SSSF
[Figure: relationship between the software system safety activities, system hazards, and risk]
Recommend Estimating Severity, Probability, and Risk Assessment Code (RAC) using processes described in the SSPP and MIL-STD-882E going forward
From MIL-STD-882E Appendix B.2.2.5.d(1)
The End