LOG 211 Supportability Analysis Student Guide, Lesson 6: Failure Mode Effects and Criticality Analysis (FMECA)/Fault Tree Analysis (FTA). January 2013, Final v1.3.

LOG 211 Supportability Analysis

Student Guide

Failure Mode Effects and Criticality Analysis (FMECA)/Fault Tree Analysis (FTA)

Content

Slide 6-1. Failure Mode Effects and Criticality Analysis (FMECA)/Fault Tree Analysis (FTA)

Welcome to Lesson 6 on Failure Mode Effects and Criticality Analysis (FMECA) and Fault Tree Analysis (FTA).

Introduction

Content

Slide 6-2. Topic 1: Introduction

Technology Maturation & Risk Reduction

Slide 6-3. Life Cycle Management Framework: Where Are You? What Influence Do You Have?

Failure Mode Effects and Criticality Analysis (FMECA) and Fault Tree Analysis (FTA) are critical for effective system design that meets Reliability, Maintainability, and performance requirements. Both analyses identify system failures and their causes, and recommend mitigation strategies to reduce the risk of failure.

The FMECA and FTA are fundamental in validating the design. Failures, their consequences, and their mitigation are essential to influencing the design for Supportability. The maximum benefit of completing FMECA and FTA is realized when the investigation of failures is conducted during the Technology Maturation and Risk Reduction (TMRR) and Engineering and Manufacturing Development (EMD) phases of a system’s life cycle rather than after the system’s design is finalized.

Failure modes and their mitigation are validated through the following reviews:

Alternative Systems Review (ASR)

System Functional Review (SFR)

Preliminary Design Review (PDR)

Critical Design Review (CDR)

Developmental Test and Evaluation (DT&E)

Functional Configuration Audit (FCA)

Production Readiness Review (PRR)

Physical Configuration Audit (PCA)

Operational Test and Evaluation (OT&E)

Where Are You?

FMECA/FTA analyses occur continuously as a system’s design matures and operational data is gathered from the field.

For competitive prototypes, the initial analysis of system failures, failure mechanisms, and criticality begins in the Technology Development Phase. The earlier these analyses are conducted, the greater the opportunity to eliminate or mitigate failures through design.

FMECA/FTA are then conducted again during Engineering & Manufacturing Development, as more data become available with system maturity.

Finally, FMECA/FTA are revisited, when required, during Operations & Support, when additional fault data is collected or critical incidents occur which require further investigation into root causes.

What Influence Do You Have?

The Reliability Engineers conduct FMECA and FTA. The Life Cycle Logistician (LCL) plays a prominent role in reviewing, for effectiveness and suitability, the maintenance planning recommendations and modifications that result from these analyses. The better the LCL understands each analysis and how the analyses are interrelated, the more impact the LCL will have on achieving an effective and affordable Product Support Strategy.

This role is detailed in Lesson 9: The Maintenance Task Analysis (MTA).

Content

Slide 6-4. FMECA/FTA Lesson Approach

The Set Up, Analyze, and Report Findings approach, as shown on this slide, will frame the discussion on FMECA/FTA. This lesson will provide a detailed description of each of these three process steps.

FMECA Key Questions

How can the system fail?

What are the consequences of failure?

FTA Key Questions

Given a single, undesirable event (usually a failure with serious or catastrophic consequences), what is the cause or combination of causes?

What is the probability of that critical event?

What design or maintenance changes will increase system Reliability and prevent the critical failure?

Content

Slide 6-5. Topics and Objectives

Overview of FMECA and FTA

Content

Slide 6-6. Topic 2: Overview of FMECA and FTA

Slide 6-7. What Are FMECA and FTA?

The Failure Mode and Effects Analysis (FMEA) is a Reliability evaluation and design review technique that examines the potential failure modes within a system to determine the effects of failures on equipment or system performance. Each hardware and software failure mode is classified according to its impact on system operating success and personnel safety. The ‘C’ in FMECA stands for Criticality: the analysis assigns each failure mode a criticality rating based on severity of impact and frequency of occurrence. Some level of expert judgment is required to assign criticality rankings.

FMECA is a “bottom up” system analysis. This approach begins by looking at the effects of failure at the lowest level of the system hierarchy, and traces upward to determine the end effect of each failure on system performance.

Fault Tree Analysis (FTA) is a systematic, “top down” methodology for defining a single undesirable event and determining all possible reasons (combinations of failures) that could cause the event to occur. The FTA focuses on a select subset of failures, specifically those that can cause a catastrophic “top event”, while the FMECA progresses sequentially through all possible system failure modes regardless of severity.

Content

Slide 6-8. FMECA/FTA: Process Map

FMECA and FTA promote greater understanding of the system design, from identifying design deficiencies to improving maintenance process effectiveness.

Content

Slide 6-9. What Are FMECA/FTA? Influencing Design

FMECA and FTA provide uniform methods for analyzing failures and their effects before finalizing the design. The goal is to improve the system to achieve Reliability and safety requirements effectively and affordably.

Specifically, FMECA and FTA evaluate the system against:

Design requirements

Design criteria

Performance requirements

The Reliability, safety, and design analyses in FMECA/FTA assess the validity of design enhancements to assure that Reliability and critical safety issues are appropriately mitigated or eliminated.

FMECA/FTA are conducted continuously as part of the closed loop Systems Engineering process defined in Lesson 5: R&M.

Content

Slide 6-10. What Are FMECA/FTA? Promoting Supportability & Process Efficiency

In addition to recommending design changes to eliminate or mitigate failure modes, FMECA/FTA map failures to corrective and preventive maintenance strategies that reduce the likelihood and mitigate the impact of system failures.

FMECA/FTA provide data for:

Reliability and Maintainability Analyses (e.g., reliability block diagrams)

Reliability Centered Maintenance (RCM) Analysis

Maintenance Task Analysis (MTA)

Level of Repair Analysis (LORA)

Additional FMECA/FTA refinements

Root failure analysis (diagnostic routines for fault detection and fault isolation)

Determining useful life of a system

Developing built-in test, troubleshooting, and quality assurance methods

Developing maintenance manuals and troubleshooting guides

Content

Slide 6-11. What Are FMECA/FTA? Inputs and Outputs

This diagram provides a high-level view of the inputs, process, and outputs of both FMECA and FTA.

Content

Slide 6-12. FMECA/FTA and the ASOE Model

FMECA and FTA are the foundation of the Affordable System Operational Effectiveness (ASOE) Model, performing the following functions:

Determining what drives system failures

Assessing failure criticality/impact on system Availability and safety

Recommending remediating action

These FMECA and FTA attributes contribute to ASOE by exposing and prioritizing design flaws early to assure design optimization and mission effectiveness, while reducing Life Cycle Cost/Total Ownership Cost.

Content

Slide 6-13. ASOE Trade-off: Capability vs. Maintenance

FMECA and FTA serve to balance design effectiveness and process efficiency by mitigating failures early in the design process to achieve an affordable solution:

Does the design meet all requirements in the CDD? Does the design meet the KPPs?

What redesign efforts should be undertaken to mitigate failure modes that prohibit achieving technical performance and mission requirements? Note that reliance on a Maintainability-focused maintenance strategy may not mitigate failure modes.

Trade-off considerations:

The cost of redesign vs. the risk/probability of mission failure

The cost of proactive maintenance vs. the probability of system failure or safety hazard to personnel

Set Up – Preparing for FMECA and FTA

Content

Slide 6-14. Topic 3: Set Up – Preparing for FMECA and FTA

Slide 6-15. Set Up – FMECA & FTA

Set Up is similar for both FMECA and FTA: each requires up-front planning and selection of an appropriate tool to conduct the analyses. Additionally, FMECA and FTA draw from similar data inputs.

Content

Slide 6-16. Build a Plan: Process and Data Management

SAE GEIA-STD-0007

Planning for FMECA/FTA should include the phases of Set Up, Analysis, and Report Findings, and should consider initial and iterative analyses based on design updates and field data.

Failure Mode Effects and Criticality Analysis Planning

FMECA planning includes:

Ground rules & assumptions

FMECA approach (hardware, software, functional, combination)

Lowest indenture level for analysis. Guidelines:

Lowest level specified in LSA candidate list

Lowest level assigned Level I (Catastrophic) and Level II (Critical) severity category

Specified/intended maintenance and repair level for items assigned Level III (Marginal) or Level IV (Minor) severity

Contractor’s procedures for implementing requirements

General statements on what constitutes a failure (performance parameters and allowable limits)

Use of analysis to provide design guidance

Contractor’s procedures for updating FMECA with design changes

FMECA worksheet formats (organization and documentation of FMECA methods)

Coordination of effort (FMECA results are inputs into other analyses)

Failure rate data sources

Coding system (identification of system functions/equipment for tracking failure modes)

Fault Tree Analysis Planning

FTA uses a planning methodology similar to that of FMECA. However, the FTA is geared toward the most significant or catastrophic failure events. Planning should incorporate provisions of the DoD RAM Guide and MIL-STD-882D Standard Practice for System Safety, with particular emphasis on Appendix A (Guidance for Implementation of a System Safety Effort).

By keeping the safety program in view, the FTA will naturally link to the safety performance requirements, to include:

Quantitative requirements

Mishap risk requirements

Safety design requirements—interlocks, redundancy, fail safe and fire suppression

Unacceptable condition elimination

Reduction of mishap risk to acceptable level

FTA planning should include considerations for:

Functional analysis of highly complex systems

Observation of combined effects on the top event

Evaluation of safety requirements and specifications

Evaluation of system Reliability, human and software interfaces

Evaluation of potential corrective actions

Simplification of maintenance and troubleshooting

Logical elimination of causes for an observed failure

Role of the Integrated Product Team (IPT)

Members of the IPT include engineering, design, logistics, and maintenance professionals, who contribute their expertise to the FMECA/FTA. During Set Up, the IPT:

Identifies roles: Who is doing what?

Defines analysis goal

Defines schedule/timeline

Establishes Working-level Integrated Product Team (WIPT) expectations, roles and objectives

Establishes report processes: FMECA/FTA worksheets, preliminary updates and final reports.

Coordinates SAE GEIA-STD-0007 Logistics Product Database update process

Content

SAE GEIA-STD-0007

Slide 6-17. Determine Data Inputs and Analysis Tools

Analysis Inputs for FMECA and FTA:

1. System configuration and design characteristics

Identify system functions down to lowest indenture identified

Identify each item/configuration and its performance requirements

Types of data:

Engineering data, studies, drawings

Technical specifications/development plans

Design reports, data

Functional block diagrams/schematics

Commercial off-the-shelf (COTS)/Government Furnished Equipment (GFE): Vendor information

COTS/GFE: Original equipment manufacturer (OEM)

Developmental Testing results

Test result reports

Engineering investigation reports

Failure investigation reports

Modeling and simulation data

Reliability inputs—Reliability Analyses

Reliability characteristics of system

Mean Time Between Failure (MTBF)

Failure characteristics: PF curve, wear out, random

Time to Failure (calculated or estimated) for non-repairable items

Failure mode occurring within service life of equipment

Reliability Block Diagrams (RBDs)

Reliability data

MIL-HDBK-217 prediction

Operational data/test data (given similar conditions/items)

Safety and Hazard Analysis (MIL-STD-882D) (Human Systems Integration)

Troubleshooting guides/charts for existing equipment

Subject Matter Experts with knowledge of equipment and operating context

Operator

Maintainer

In-service engineering agent – The activity that performs sustaining engineering requirements

Technical representative – Called a ‘Tech Rep’; normally a master-level technician from the OEM or in-service engineering organization who troubleshoots complex faults and updates troubleshooting procedures for the entire agency.

Program Manager

COTS/GFE Only: Maintenance history

Existing/previous maintenance plans/tasks

Existing/previous maintainer/operator manuals

In-service performance data

Age exploration data

Item repair histories

Failure reporting/corrective action system reports

Computerized Maintenance Management System (CMMS) data

Previous FMECA, FTA, RCM analyses

Failure Reporting, Analysis, and Corrective Action System (FRACAS)

FRACAS is a system for reporting and analyzing failures and recommending corrective action

Developed from Test & Evaluation (T&E) events and field failure/repairs

Common data captured in FRACAS include field MTTR, MTBF, Reliability growth, failure analysis (incident, type, location, root cause, etc.)

Production inspection records after the system is fielded

FMEA/FMECA/FTA Tool Sets

Spreadsheet template (FMEA/FMECA)

LSAR (SAE GEIA-STD-0007 compliant tools): SLICwave, powerLOG-J, EAGLE, Omega (FMEA/FMECA)

Data management and reporting

Item analysis and failure criticality calculation

Windchill Quality Solutions—(FMEA/FMECA/FTA)

Data management and reporting

FMECA functionality to identify failures and plan for mitigation

RCM++ (FMEA/FMECA/RCM)

Data management and reporting for RCM Analysis

Full-featured FMEA/FMECA functionality

Maintenance task selection

Optimal interval calculation for preventive repairs/replacement

Cost comparison

Supports industry standards for RCM (e.g., ATA, MSG-3, SAE JA1011 and SAE JA1012)

MPC: Maintenance Program Creation Software (FMEA/FMECA/RCM)

MSG-3-compliant maintenance creator tool for aircraft/aerospace industry

Analyses included for significant items, functions, failure modes, effects, causes, and tasks

Analysis – FMECA

Content

Slide 6-18. Topic 4: Analysis – FMECA

Content

Slide 6-19. Analysis – FMECA

FMECA primarily examines hardware failures, both critical and non-critical. Analysis candidates include components (parts), systems/subsystems, processes, and functions.

A person knowledgeable about the application and operation of the system, such as a design or Reliability Engineer, typically conducts the analysis, because experience-based judgment is required to assign the criticality factors effectively.

Content

Slide 6-20. FMECA Analysis: Process Map

FMECA consists of two analyses:

Failure Mode Effects Analysis (FMEA)

Analytical Process

Functions: Defines the intended purpose of the system under analysis

Functional Failure: Defines what constitutes a failure of the system to perform its function

Failure Modes: Identifies potential ways that functional failure may occur (failure modes) and the root causes for the failure modes (failure mechanisms)

Effect: Assesses impact (effects) of each failure mode on equipment and entire system performance (higher-level systems)

Analysis begins at lowest level of indenture, then works up to successively higher system levels

Examines single-point failures (versus impact of multiple/simultaneous/combined failures)

Criticality Analysis (CA)

Analyzes severity of effects of the failure mode

Analyzes probability of occurrence of the failure mode

Ranks failure modes by severity and probability

FMECA may approach analysis in two ways:

Hardware analysis: The FMECA evaluates individual hardware items and their failure modes.

Functional analysis: In this approach, the function and outputs of each item are evaluated. Often, this approach is used when individual hardware items cannot be uniquely identified.

Note: Complex systems may use both hardware and functional analyses.
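
The bottom-up character of the FMEA can be sketched in a few lines of code. This is a minimal illustration only: the indenture hierarchy, the item names, and the helper `trace_upward` are hypothetical and are not taken from any FMECA tool.

```python
# Hypothetical indenture hierarchy: each item maps to its next higher assembly.
PARENT = {
    "thermostat": "cooling system",
    "water pump": "cooling system",
    "cooling system": "engine",
    "engine": "car",
}


def trace_upward(item: str, parent: dict[str, str]) -> list[str]:
    """Return the chain of indenture levels from the failed item up to the end item."""
    chain = [item]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain


# A failure at the lowest level is traced through every higher level it can affect.
chain = trace_upward("thermostat", PARENT)
print(" -> ".join(chain))
```

Each level in the returned chain is a place where the analyst records the effect of the single failure under study.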

Content

Slide 6-21. Define System to Analyze: FMECA

In order to conduct FMECA, clearly and thoroughly define the system under analysis, including:

Mission functions (tasks and outputs) and operational mode

Environment, mission, times, equipment utilization, functions and outputs of each item

System restraints

Internal and interface functions for each item

Lowest indenture level to be analyzed

Performance requirements down to lowest indenture level to be analyzed

Failure definitions (in general vs. specific failures)

System definition also includes constructing functional block diagrams, which illustrate the operation, interrelationships, and interdependencies between functions of a system. In short, they illustrate the functional flow of a system, which is then used to determine failure impact on the various levels of indenture. Diagrams may be functional or reliability block diagrams.

Content

Slide 6-22. Define Functions: What Should the System Do?

The first step in the FMEA portion of FMECA is to define the functions of the system or component under review.

What is the desired capability of the system (task)?

How well must the system perform, based on user needs (upper and lower limits)?

Under what circumstances must the system perform?

When describing functions, identify primary and secondary functions:

Primary function: Main reason the item exists

Secondary function: Additional functions the item is required to perform, such as:

Warning or status indicators

Safety functions

Fluid containment

Comfort and aesthetics

Environmental protections

Controlling features

“Do not combine” functions

When describing functions:

Define operating context/scenarios

Use clear, concise language

Use verb, direct object, and specific limits

Descriptions of functions are found in:

Performance specifications

Operating and Maintenance manuals

Engineering Drawings and Lists

Reliability Block Diagrams

Content

Slide 6-23. Define Functional Failures: How Does the System Fail to Perform?

Functional failure is performance that falls outside specified parameters. This failure may be total or partial.

When describing functional failures:

Restate defined function

Define all possible functional failures for each system function

Give upper and lower limits of failure, if different from functional criteria

Include compensating provisions for failure, which are used to determine failure effects, severity, and consequences:

Redundant systems

Safety devices

Operator actions to mitigate failure
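
As a rough sketch of how a function stated with specific limits leads to a functional failure definition, the code below checks a measured output against upper and lower limits. The pump function, its limits, and the rule that zero output counts as a total failure are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FunctionSpec:
    """A function written as verb + direct object + specific limits."""
    description: str
    lower_limit: float
    upper_limit: float


def classify_performance(spec: FunctionSpec, measured: float) -> str:
    """Classify a measured output as 'ok', 'partial failure', or 'total failure'."""
    if measured <= 0:
        # Assumption: no output at all is treated as a total functional failure.
        return "total failure"
    if spec.lower_limit <= measured <= spec.upper_limit:
        return "ok"
    # Output exists but falls outside the specified parameters.
    return "partial failure"


# Hypothetical function statement and limits.
pump = FunctionSpec("Pump coolant at 30 to 40 liters per minute", 30.0, 40.0)
for flow in (35.0, 25.0, 0.0):
    print(flow, classify_performance(pump, flow))
```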

Content

Slide 6-24. Define Failure Modes & Causes: Why Does the Failure Occur?

Failure modes are all the ways in which a functional failure may occur. Failure mechanisms are the possible root causes of each failure mode.

Failure Modes (Failure Conditions)

Typical failure conditions, or modes, include:

Failure to operate at required time

Failure to stop operating

Operating before or after required time

Inconsistent operation

Degraded capability

Keep the following in mind when identifying failure modes:

Be descriptive and specific (e.g., failure, part, location, event, timing, mission/operational phase, etc.)

List failure modes separately when they vary by effects, rates, detection methods, possible failure management strategies

When combining similar failure modes, design preventive maintenance around the most severe consequence and combined rates

Failure Mechanisms (Root Causes of Failure Modes)

List all possible causes of failure mode:

Why does the component fail to operate at required time?

What causes the component to stop operating?

What causes the component to operate before the required time, or after the required time?

Why is operation inconsistent?

What may cause degraded capability?

Note: Diagram displayed on slide is an Ishikawa, or fishbone, diagram. Its purpose is to show causes of a specific event.

Content

Slide 6-25. Analyze Failure Effects: What Are Impacts on the System?

Failure effects describe the impact of a failure mode on the functional capability of the system under analysis. In other words, what happens when a component or system fails to function, and how serious are those consequences?

The impacts of primary failures and their secondary effects are assessed at three levels of indenture:

Local

Effect of failure mode on the item under analysis

This item is the focus of compensating provisions and other corrective and preventive maintenance actions

Next Higher

Effect on next higher level of indenture

Effect on system/subsystem

End Item

Effect on the system/asset, or the “System of Systems”

Keep in mind the following when describing failure effects:

Include description of effect severity

Include detail to accurately assess the consequences of the failure

Describe effects on personnel safety, environment, mission, assets, economics

Describe operating context (e.g., mission usage/profiles)

List different effects based on usage scenarios

Describe operator/maintainer methods to detect failure occurrence, including means (e.g., visual/audible warnings, sensors, Built-In-Test)

Describe operator/maintainer actions to restore function (assuming no existing preventive maintenance tasks)

Describe existing compensating provisions, if applicable

Content

Slide 6-26. Failure Impact: Strike Talon RBD Example

This slide presents indenture levels B and C of the Strike Talon UAV. Using these reliability block diagrams, what is the impact of a failure of one Card Crypto on the UAV systems?

Content

Slide 6-27. Determine System Effects: powerLOG-J

FMECA results are documented directly in the SAE GEIA-STD-0007 Logistics Product Database, powerLOG-J in the Strike Talon case study.

Content

Slide 6-28. Qualitative Criticality Analysis: How Severe Are the Failure Effects?

Criticality of a failure mode is based on the severity of the effect of that mode on the end item and the probability, or frequency, of that failure’s occurrence (Mean Time Between Failure).

The purpose of criticality analysis is twofold:

Measure worst case effect of a failure or design error

Determine priority for correcting issues (design changes or corrective/preventive maintenance to mitigate critical failures)

While criticality is defined by your specific organization’s policy and contract terms, general categories of severity are:

Category I – Catastrophic

Death, destruction, significant breach of environmental regulation, damage over $1 million, downtime > 2 days

Category II – Critical

Severe personal injury, major property/system damage > $100K, inability to perform critical mission (mission loss), downtime of 24 hours to 2 days

Category III – Marginal

Minor injury, minor property/system damage of $1K to $100K, degraded ability to perform a critical mission, downtime of 8 to 24 hours

Category IV – Minor

No personal injury, property/system damage < $1K, unscheduled maintenance/repair, downtime < 8 hours

Notes:

Categorize the same failure mode differently, based on operating context/phase/scenario.

Involve Human Systems Integration Safety representative (where applicable) to assist in recognizing/classifying events having harmful consequences to people, to equipment, and to the mission.
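
The general severity categories above can be expressed as a simple classification rule. The sketch below is illustrative only: the function name, the injury labels, and the handling of values that fall exactly on a boundary are assumptions, and an organization’s own policy and contract terms take precedence.

```python
def severity_category(damage_usd: float, downtime_hours: float, injury: str = "none") -> str:
    """Return the severity category; the worst single criterion decides.

    injury is one of "none", "minor", "severe", "death" (hypothetical labels).
    """
    if injury == "death" or damage_usd > 1_000_000 or downtime_hours > 48:
        return "I - Catastrophic"
    if injury == "severe" or damage_usd > 100_000 or downtime_hours >= 24:
        return "II - Critical"
    if injury == "minor" or damage_usd >= 1_000 or downtime_hours >= 8:
        return "III - Marginal"
    return "IV - Minor"


print(severity_category(damage_usd=5_000, downtime_hours=2))    # damage drives the result
print(severity_category(damage_usd=500, downtime_hours=72))     # downtime drives the result
```

Because the same failure mode can be categorized differently by operating context, a rule like this would be applied once per context or scenario rather than once per failure mode.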

Criticality Matrix: Severity vs. Frequency

| Severity | Frequent (> 1 per 1,000 miles) | Probable (> 1 per 20,000 miles) | Occasional (> 1 per 50,000 miles) | Remote (> 1 per 80,000 miles) | Improbable (< 1 per 100,000 miles) |
| --- | --- | --- | --- | --- | --- |
| Catastrophic | High (red) | High (red) | High (red) | Medium (yellow) | Acceptable (green) |
| Critical | High (red) | High (red) | Medium (yellow) | Low (light green) | Acceptable (green) |
| Marginal | Medium (yellow) | Medium (yellow) | Low (light green) | Acceptable (green) | Acceptable (green) |
| Minor | Acceptable (green) | Acceptable (green) | Acceptable (green) | Acceptable (green) | Acceptable (green) |
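
A criticality matrix of this kind is a lookup from a severity row and a frequency column to a risk level. The sketch below encodes the matrix shown in this section; the names `FREQUENCIES`, `MATRIX`, and `risk_level` are hypothetical.

```python
# Column order of the criticality matrix, from most to least frequent.
FREQUENCIES = ["Frequent", "Probable", "Occasional", "Remote", "Improbable"]

# One row per severity category; entries follow the FREQUENCIES column order.
MATRIX = {
    "Catastrophic": ["High", "High", "High", "Medium", "Acceptable"],
    "Critical": ["High", "High", "Medium", "Low", "Acceptable"],
    "Marginal": ["Medium", "Medium", "Low", "Acceptable", "Acceptable"],
    "Minor": ["Acceptable"] * 5,
}


def risk_level(severity: str, frequency: str) -> str:
    """Look up the qualitative risk level for a severity/frequency pair."""
    return MATRIX[severity][FREQUENCIES.index(frequency)]


print(risk_level("Critical", "Occasional"))
```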

Content

Slide 6-29. Quantitative Criticality Analysis: What is the Risk Priority Number?

The Risk Priority Number (RPN) is a quantitative ranking approach used in many FMECA and FTA tool sets. The RPN is useful in determining the most significant failure events that are most appropriate for further modeling in the Fault Tree Analysis.

Car Cooling System Risk Priority Number Matrix

| Item / Functional Description | Potential Failure Mode | Mode % | Potential Local Effect(s) | Potential End Effect | Severity (S) | Potential Cause(s) of Failure | Occurrence (O) | Current Controls Prevention | Current Controls Prevention | Detection (D) | Risk Priority Number (S*O*D) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Car Cooling System (Provides Fluid around Engine, Maintains Fluid Temperature within Operating Parameters) | Water Pump Degraded Operation | 15.00 | Reduced Coolant Fluid Flow | Engine Over Heats | 9 | Failed Water Pump Belt | 5 | Check Belts for Proper Tension | Replace Water Pump 60k Miles | 8 | 360 |
| Car Cooling System | Radiator Degraded Operation | 15.00 | Reduced Coolant Flow; Hot Coolant | Engine Over Heats | 6 | Clogged Radiator | 5 | Clean Radiator Every 5 years | Change Fluid Periodically | 9 | 270 |
| Car Cooling System | Fluid Temperature Loss of Control | 30.00 | Hot Coolant | Engine Over Heats | 7 | Stuck Thermostat | 6 |  |  | 7 | 294 |
| Car Cooling System | Cooling Fan Does not Spin | 10.00 | Hot Coolant | Engine Over Heats | 7 | Defective Cooling Fan | 4 |  |  | 4 | 112 |
| Car Cooling System | Leaking Radiator Fluid | 30.00 | Radiator Fluid Low | Engine Over Heats | 8 | Radiator Corrosion | 6 | Change Radiator Fluid Periodically | Clean Radiator Every 5 years | 1 | 48 |
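
The RPN column is the product of the Severity, Occurrence, and Detection ratings. The short sketch below recomputes it for the five car cooling system rows and ranks the failure modes from highest to lowest RPN; the variable and function names are illustrative.

```python
# (failure mode, Severity, Occurrence, Detection) from the car cooling system matrix.
rows = [
    ("Water Pump Degraded Operation", 9, 5, 8),
    ("Radiator Degraded Operation", 6, 5, 9),
    ("Fluid Temperature Loss of Control", 7, 6, 7),
    ("Cooling Fan Does not Spin", 7, 4, 4),
    ("Leaking Radiator Fluid", 8, 6, 1),
]


def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number = S * O * D."""
    return severity * occurrence * detection


# Highest RPN first: these are the candidates for further modeling in the FTA.
ranked = sorted(
    ((name, rpn(s, o, d)) for name, s, o, d in rows),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, value in ranked:
    print(f"{value:4d}  {name}")
```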

Content

Slide 6-30. Determine Criticality: powerLOG-J

Content

SAE GEIA-STD-0007

Slide 6-31. Analyze & Allocate Failure Modes: powerLOG-J

The Analyze and Allocate task links faults to their maintenance strategies.

A failure mode may have several different root causes, each with varying probabilities. The SAE GEIA-STD-0007 tool allocates the likelihood of each failure mechanism. As a result, a single failure mode may have different triggers, corrective actions, and preventive maintenance tasks, depending on the individual cause.

An individual maintenance task, such as remove and replace a tire, may have several failure modes that would trigger that task. These triggers may be corrective (flat tire) or preventive (replace every 50,000 miles).
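
One way to picture the allocation of a failure mode across its root causes is to split the failure mode’s rate by the share attributed to each cause. This is a sketch under stated assumptions, not the method of any SAE GEIA-STD-0007 tool: the function `allocate_mode_rate`, the failure rate, and the cause shares are all hypothetical.

```python
def allocate_mode_rate(mode_rate: float, cause_shares: dict[str, float]) -> dict[str, float]:
    """Split a failure mode's rate among its root causes; shares must sum to 1."""
    total = sum(cause_shares.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("cause shares must sum to 1")
    return {cause: mode_rate * share for cause, share in cause_shares.items()}


# Hypothetical numbers for the flat tire example.
flat_tire_rate = 2.0e-5  # failures per mile (illustrative value)
shares = {"puncture": 0.7, "valve leak": 0.2, "sidewall damage": 0.1}

per_cause = allocate_mode_rate(flat_tire_rate, shares)
for cause, rate in per_cause.items():
    print(f"{cause}: {rate:.2e} per mile")
```

Each cause can then carry its own trigger, corrective action, and preventive task, which is why a single failure mode may map to more than one maintenance strategy.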

Content

Slide 6-32. Failure Modes Map to Maintenance Tasks

Analysis – FTA

Content

Slide 6-33. Topic 5: Analysis – FTA

Content

Slide 6-34. Analysis – FTA

Unlike FMECA, which examines an entire system, FTA focuses on a specific part of the design or a single undesirable or catastrophic event in order to determine the lower level contributors.

FTA:

Is useful with complex functional paths

Is used with software, hardware, and human interface systems

Considers mission profile/operational mode/environment, which impact hardware configuration, functional paths, application stresses, and critical interfaces

Results may include design change or redundancy to mitigate or prevent failure

Content

Slide 6-35. FTA Analysis: Process Map

Content

Slide 6-36. Define Undesirable Event

The first step in an FTA is to identify the undesired or catastrophic event to undergo analysis. The undesired event is determined by:

Critical Event

Safety, such as loss of life or aircraft

Operations, such as loss of production or mission

FMECA Results

The FMEA is unable to identify all effects of a failure mode and, therefore, is unable to determine criticality.

FMECA determines that a failure mode is serious, but further analysis is required to determine if the failure is caused by multiple failures, or to determine what combinations of lower level events lead to top event.

Maintenance

Troubleshooting is complex

Engineers with knowledge of the system, or systems analysts with engineering backgrounds, define the event. Examples are:

Design: Flight safety, munitions handling safety, safety of operating/maintenance personnel

Event: Crash of commercial airliner with no survivors

Event: Loss of spacecraft and astronauts on space exploration mission

Event: Vehicle does not start when ignition key is turned

Event: No spray when demanded from containment spray injection system in a nuclear reactor

Content

Slide 6-37. Define Undesirable Event: Family Car: Critical Failures

This slide presents the criticality of several failure modes of the family car, identified through FMECA.

Content

Slide 6-38. Construct Fault Tree

Unlike the tabular approach of FMECA, Fault Tree Analysis is graphical. FTA builds a logic diagram depicting parallel and sequential failure events (causes) and their probabilities that result in the top level event.

The top level event is the single undesired or critical event under analysis. Consider the scope of that event when building the diagram:

If the event is too broad, the tree becomes unmanageable

If the event is too narrow, the tree fails to provide managers/engineers with sufficient data to make cost-effective decisions

Describe level of risk or circumstances where event becomes intolerable

Next, identify first level, second level, and third level contributors (causes) to that top event. System analysts/system designers with full knowledge of the system complete a list of causes (faults) to study through the fault tree, numbering and sequencing the faults in order of occurrence.

Faults are the state of the system or component, and can be hardware, human, or other faults. Fault descriptions include what occurs, when, and how.

Primary fault: fails within qualified environment

Secondary fault: fails outside qualified environment

Command fault: human operation of component

Note: Only causes with a probability greater than 0 of affecting the top event are included in the FTA. Exact probabilities are often impractical to obtain (due to cost/time); therefore, computer software is often used to conduct the analysis.

Logic gates and event symbols represent the relationship between events, linking branches together.

Event Symbols

Illustrate the different types of events (e.g., no fault scenarios)

Symbols include: Rectangle, circle, diamond, triangle, house, oval

Gate symbols

Illustrate the relationship between lower events that lead to the higher event in the sequence

AND Gate: All input events must occur for the output event to happen

OR Gate: At least one input event must occur for the output event to happen

Gate inputs are the lower level fault events

Gate outputs are the higher level fault events
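
For independent input events, the gate logic translates directly into probability arithmetic. The sketch below assumes independence, which the lesson does not state explicitly, and uses illustrative function names.

```python
from math import prod


def and_gate(probabilities: list[float]) -> float:
    """Output occurs only if every input occurs (independent inputs assumed)."""
    return prod(probabilities)


def or_gate(probabilities: list[float]) -> float:
    """Output occurs if at least one input occurs (independent inputs assumed)."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Two hypothetical lower level fault events with probabilities 0.1 and 0.2.
print(and_gate([0.1, 0.2]))
print(or_gate([0.1, 0.2]))
```

The AND gate output is always no larger than its smallest input, which is why adding redundancy (an AND relationship between redundant items) lowers the probability of the higher level event.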

Source of FTA image: www.edrawsoft.com

Content

Slide 6-39. Constructing Fault Tree: Family Car: Engine Overheats

Content

Slide 6-40. FTA Analysis: Qualitative Analysis

Once the fault tree is complete, identify all possible direct and indirect hazards impacting the system and evaluate for possible system improvement.

Qualitative analysis identifies all credible, single and multiple lower level failure modes (causes) that lead to the top level event.

Analyzes multiple failures/combinations of failures

Analyzes events in parallel and in sequence

Drills down to lowest required fault levels

Describes each fault and when it occurs

Identifies Minimal Cut Sets (MCS) – The shortest paths to failure indicate where system is most vulnerable

Smallest number of basic event combinations that cause the top event

Includes only those failures which are realistic

In an MCS, all failures are needed to create top event (if one event does not occur, top event does not occur)

Ranks failures

1st: Single-point failures (one failure causes top level event)

2nd: Dual-point failures (two failures in combination cause top level event)

3rd: Three-point failures, etc. (three or more failures in combination cause top level event)
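
The search for Minimal Cut Sets and their ranking can be illustrated with a short Python sketch. The tree, the event names, and the functions cut_sets and minimal_cut_sets are hypothetical; the sketch uses a simple top-down expansion and is not the algorithm of any particular FTA tool.

```python
from itertools import product


def cut_sets(node):
    """Return the cut sets (frozensets of basic events) for a node.

    A node is either a basic event name (str) or a tuple
    (kind, children) where kind is "AND" or "OR".
    """
    if isinstance(node, str):
        return [frozenset([node])]
    kind, children = node
    child_sets = [cut_sets(child) for child in children]
    if kind == "OR":  # any one child's cut set causes the output event
        return [cs for sets in child_sets for cs in sets]
    if kind == "AND":  # one cut set from every child is needed
        return [frozenset().union(*combo) for combo in product(*child_sets)]
    raise ValueError(f"unknown gate kind: {kind}")


def minimal_cut_sets(node):
    """Drop any cut set that contains a smaller cut set, then rank by size."""
    sets = set(cut_sets(node))
    minimal = [s for s in sets if not any(other < s for other in sets)]
    return sorted(minimal, key=lambda s: (len(s), sorted(s)))


# Hypothetical tree for an "engine overheats" top event.
tree = ("OR", [
    "coolant low",
    ("AND", ["thermostat 1 fails", "thermostat 2 fails"]),
    ("AND", ["coolant low", "fan fails"]),
])

mcs = minimal_cut_sets(tree)
single_point = [s for s in mcs if len(s) == 1]
dual_point = [s for s in mcs if len(s) == 2]
```

Here "coolant low" is a single-point failure, the two thermostats form a dual-point failure, and the combination of "coolant low" and "fan fails" is discarded because it already contains a single-point failure.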

Slide 641. FTA Analysis: Quantitative Analysis

Quantitative analysis determines the probability and frequency of all combinations of lower level events that lead to the top level event, for ranking purposes.

Usually represented in terms of unreliability

Mathematical model (algorithms, Markov)

Calculates probability/frequency of top level event, given probability of lower level failure modes leading to the critical failure (i.e., summing probability of minimal cut sets together)

Requires knowing failure rates, down to the lowest level events that lead up to the top level event

Requires component history and lengthy analysis

Result is ranking of failure modes by contribution to top level event
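
The calculation described above can be sketched as follows. The failure probabilities and event names are hypothetical, the basic events are assumed to be independent, and summing the cut set probabilities is an approximation that holds only when those probabilities are small.

```python
from math import prod

# Hypothetical probabilities of each basic event over the mission time.
basic_q = {
    "pump fails": 0.005,
    "thermostat 1 fails": 0.05,
    "thermostat 2 fails": 0.05,
}

# Hypothetical minimal cut sets for the top level event.
mcs = [
    ("pump fails",),
    ("thermostat 1 fails", "thermostat 2 fails"),
]


def cut_set_probability(cut_set, q):
    """All events in a cut set must occur: multiply their probabilities."""
    return prod(q[event] for event in cut_set)


contributions = {cs: cut_set_probability(cs, basic_q) for cs in mcs}

# Approximate top level event probability: sum of the cut set probabilities.
top_q = sum(contributions.values())

# Rank cut sets by their contribution to the top level event.
ranking = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
for cut_set, q in ranking:
    print(f"{' AND '.join(cut_set)}: {q:.4f} ({q / top_q:.1%} of top event)")
```

With these assumed values the single pump failure contributes more to the top level event than the paired thermostat failures, which is the kind of ranking the quantitative analysis is meant to produce.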

Slide 642. Mitigating Fault Risk through Design

Fault Tree Analyses impact design through a risk mitigation process. By identifying the most probable and critical paths to failure, design and maintenance strategies are devised to meet Reliability requirements effectively.

AND Gate Math: Redundant Thermostat in Model 2

Where Q0(t) is the probability that the overall top event occurs at time t.

Q0(t) = Pr(F(t) ∩ G(t))

= qF(t) times qG(t)

= 0.6 times 0.6

Q0(t) = 0.36

Reliability = 1 minus Q0(t)

R = 1 minus 0.36

R = 0.64 or 64%
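
The AND gate arithmetic can be checked with a short Python sketch. The function name and_gate is hypothetical, the inputs are assumed to be independent, and the third thermostat is an added assumption used only to show the effect of further redundancy.

```python
def and_gate(*q):
    """Probability that all independent input events occur."""
    result = 1.0
    for q_i in q:
        result *= q_i
    return result


# Two redundant thermostats, each with failure probability 0.6.
q_top = and_gate(0.6, 0.6)
reliability = 1 - q_top

# Hypothetical third redundant thermostat with the same failure probability.
q_three = and_gate(0.6, 0.6, 0.6)
reliability_three = 1 - q_three

print(round(q_top, 2), round(reliability, 2))
print(round(q_three, 3), round(reliability_three, 3))
```

Each additional redundant input multiplies the top event probability by that input's failure probability, so reliability increases as redundancy is added.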

OR Gate Math: Engine Overheats Model 2

Where Q0(t) is the probability that the overall top event occurs at time t.

Q0(t) = Pr(A(t) ∪ B(t))

= Pr(A(t)) + Pr(B(t)) minus Pr(A(t)) times Pr(B(t))

= qA(t) + qB(t) minus qA(t) times qB(t)

= (0.0675 + 0.005) minus (0.0675 times 0.005)

= 0.0725 minus 0.0003375

Q0(t) = 0.0721625

Reliability = 1 minus Q0(t)

= 1 minus 0.0721625

= 0.9278375 or 92.8% rounded
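
The OR gate arithmetic can be checked the same way. In this sketch the function name or_gate is hypothetical and the inputs are assumed to be independent. It uses the complement form (one minus the probability that no input event occurs), which gives the same result as the two-event formula and extends to any number of inputs.

```python
def or_gate(*q):
    """Probability that at least one independent input event occurs."""
    none_occur = 1.0
    for q_i in q:
        none_occur *= (1.0 - q_i)
    return 1.0 - none_occur


# Input probabilities 0.0675 and 0.005.
q_a, q_b = 0.0675, 0.005
q_top = or_gate(q_a, q_b)
reliability = 1 - q_top

# Same result from the two-event formula: qA + qB minus qA times qB.
q_check = q_a + q_b - q_a * q_b

print(round(q_top, 7), round(reliability, 7))
```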

Note: Changes in design, including changes to Reliability or product structure, must go back through design engineers and applicable RAM-C and RCM Supportability analyses. Updates are then made to the Logistics Product Database. These updates are coordinated through IPTs and are consolidated under the Maintenance Task Analysis to include changes to CAGE codes, part numbers, MTBF, replacement rates, schedules, tools, and task procedures that result from FMECA/FTA recommendations.

Content

Slide 643. Mitigating Fault Risk through Design, Continued

Content

Slide 644. Concorde Disaster – Paris: Tuesday, 25 July, 2000

On Tuesday, July 25, 2000, a Concorde crashed shortly after take-off from Paris. All 109 people on board perished, along with four people on the ground, for a total of 113 fatalities.

This slide and the following one present the Fault Tree Analysis conducted during the aircraft mishap investigation to determine the chain of events leading to the catastrophic event.

Select the links:

Concorde Air Crash Investigation - Part 3 (10:00) http://www.youtube.com/watch?v=zHY2PyEwGtg&feature=fvst

Concorde Air Crash Investigation - Part 4 (10:06) http://www.youtube.com/watch?v=Zd0pN0izgF4&feature=fvwrel

Content

Slide 645. Concorde Disaster: FTA Continued

Report Findings – FMECA and FTA

Content

Slide 646. Topic 6: Report Findings: FMECA and FTA

Content

Slide 647. Report Findings: FMECA & FTA

Results are summarized in a formal report and disseminated to the IPTs, per contractual requirements. These reports can be preliminary, update, or final reports, and are often synchronized with design reviews to determine whether the design has been improved enough to reduce or eliminate significant or catastrophic events.

Content

Slide 648. Report & Implement Findings

Recall the FMECA/FTA process chart. During the Report Findings phase, analysis results are reviewed and approved by the IPT, and applicable data elements are entered into the Logistics Product Database for use in subsequent Supportability analyses, such as Reliability & Maintainability (R&M), previous FTAs, RCM Analysis, and Maintenance Task Analysis (MTA).

SAE GEIA-STD-0007

Content

Slide 649. FMECA Report

The results of FMEA and Criticality Analyses are presented in interim and final reports. Report contents include:

Level of analyses

Results summary of Reliability and safety critical components

System definition

Data sources and analysis techniques

Resultant analysis data

Worksheets for each failure mode:

Identification number

Function

Failure modes and causes

Mission phase and operational mode

Failure effects and their probability

Failure detection method (e.g., audible warning signs, automatic sensing devices)

Compensating provisions

Actions by operator to mitigate impact of failure

Design provisions such as redundant or back-up systems

Severity classification

Ground rules, analysis assumptions, and block diagrams

Indenture level

Ranking of failure modes by severity and probability of effects

Category I and II failures, highlighted

Recommended design changes to eliminate or mitigate consequences of failure, and a review of the effectiveness of these actions

Single point failures

Failures requiring corrective design/mitigating action

Failures not mitigated by design
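
As an illustration of how worksheet data supports the ranking of failure modes, the following Python sketch stores a few worksheet rows and sorts them by severity and probability of effects. The WorksheetRow class, its field names, the numeric encoding of severity (1 for Category I through 4 for Category IV), and all sample values are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class WorksheetRow:
    ident: str                # identification number
    function: str
    failure_mode: str
    detection_method: str
    compensating_provision: str
    severity_category: int    # assumed encoding: 1 = Category I, most severe
    effect_probability: float


# Hypothetical worksheet rows for a vehicle cooling system.
rows = [
    WorksheetRow("1.1", "Regulate coolant flow", "Thermostat stuck closed",
                 "Temperature gauge", "Redundant thermostat", 2, 0.02),
    WorksheetRow("1.2", "Circulate coolant", "Water pump seizes",
                 "Temperature gauge", "None", 2, 0.05),
    WorksheetRow("1.3", "Indicate temperature", "Gauge reads low",
                 "Periodic inspection", "Warning light", 3, 0.10),
    WorksheetRow("1.4", "Contain coolant", "Hose bursts",
                 "Visible steam", "None", 1, 0.01),
]

# Rank by severity first, then by probability of effects (highest first).
ranked = sorted(rows, key=lambda r: (r.severity_category, -r.effect_probability))

# Category I and II failures are highlighted in the report.
highlighted = [r.ident for r in ranked if r.severity_category <= 2]

for r in ranked:
    flag = "*" if r.ident in highlighted else " "
    print(f"{flag} {r.ident} Cat {r.severity_category} "
          f"p={r.effect_probability:.2f} {r.failure_mode}")
```

Sorting on severity before probability keeps the Category I and II failures at the top of the list regardless of how likely they are, which matches the emphasis the report places on those categories.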

Interim reports guide design maturation by highlighting:

Category I and II failure modes: ranking failures according to severity of failure on equipment operation and personnel safety

Unresolved single-point failures—highlighting areas needing corrective action

Visibility of system interface features and problems

Location of performance monitoring and fault sensing test equipment or test points

Comparison of alternative designs

Content

Slide 650. FTA Report

The FTA report includes:

Executive summary

Scope of analysis (what is and is not analyzed)

System description (brief)

Description/severity bounding of top level event

Analysis boundaries (e.g., physical, operational, human, interfaces)

The analysis

Method of analysis

Software

Fault tree diagram

Data sources

Common causes

Sensitivity tests, if applicable

Cut sets

Path sets, if applicable

Trade studies, if applicable

Findings

Top level event probability

System vulnerability

Primary contributors

Possible actions to mitigate risk

Troubleshooting guidance

Conclusions and Recommendations

Risk comparisons

Additional analyses required, including methods

Content

Slide 651. Report Coordination: IPT Communication Paths

FMECA/FTA results are routed through the appropriate Integrated Product Team (IPT), which is responsible for approving actions to resolve any issues identified. The specific IPT accountable for addressing identified problems depends on the recommendation. For example:

Design Interface impacts are reported to:

Test & Evaluation IPT

Product Support Management IPT

Systems Engineering IPT

Maintenance Planning & Management impacts are reported to:

Product Support Management IPT

Systems Engineering IPT

Exercise

Content

Slide 652. Topic 7: Exercise

Content

Slide 653. Exercise Overview

Summary

Content

Slide 654. Topic 8: Summary

Content

Slide 655. Takeaways

Content

Slide 656. Summary

Congratulations! You have completed Lesson 6 on Failure Mode Effects and Criticality Analysis (FMECA) and Fault Tree Analysis (FTA).
