Page 1: [IEEE 2009 Annual Reliability and Maintainability Symposium (RAMS) - Fort Worth, TX, USA (2009.01.26-2009.01.29)] 2009 Annual Reliability and Maintainability Symposium - Reliability

1-4244-2509-9/09/$20.00 ©2009 IEEE

Reliability Analysis Techniques: How They Relate To Aircraft Certification

Mark S. Saglimbene, Director Reliability, Maintainability and Safety Engr., The Omnicon Group, Inc.,

Key Words: R&M in Product Design, Reliability, System Safety

SUMMARY & CONCLUSIONS

Classic reliability analysis techniques, namely Reliability Prediction, Fault Tree Analysis (FTA) and Failure Mode and Effects Analysis (FMEA), are the framework for the aircraft certification process. These techniques have been applied to certification since the 1990s, with the advent of the Society of Automotive Engineers' Aerospace Recommended Practice 4761 (SAE ARP 4761). Today, SAE ARP 4761 is the de facto standard used for aircraft certification. It draws heavily on reliability techniques rooted in the military programs of the 20th century.

1 INTRODUCTION

Before examining the current aircraft certification process it is important to review each of these analysis techniques.

2 RELIABILITY PREDICTION AS THE BACKBONE OF RELIABILITY ANALYSIS

Reliability prediction has been used as a reliability engineering tool for over 50 years. Although reliability prediction is only one element of a well-structured reliability program, it is the backbone of the complementary analyses described here. To be effective, however, it must be complemented by the other elements of that program.

2.1 History of Reliability Prediction

MIL-HDBK-217 is widely recognized in military and commercial industries and is probably the most internationally accepted empirical reliability prediction method. The latest version is MIL-HDBK-217F, which was released in 1991 and revised twice: Notice 1 in 1992 and Notice 2 in 1995.

The MIL-HDBK-217 predictive method consists of two parts: the parts count method and the part stress method [1]. The parts count method assumes typical operating conditions for part complexity, ambient temperature, electrical stresses, operation mode and environment.

The part stress method requires the specific part's complexity, application stresses, environmental factors, etc. to determine the part's failure rate.

The MIL-HDBK-217 methodology attempts to calculate the failure rate during the constant failure rate portion of a component's life cycle. It does not deal with early failures or end-of-life wear-out failures. Figure 1 represents the classic “Bathtub Curve” used to diagram the constant failure rate period in the life of an electronic component.
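As a rough illustration of the parts count idea, the sketch below sums quantity times generic failure rate times quality factor over a parts list, then converts the resulting constant failure rate into an MTBF and a mission reliability. The parts list, all numeric values, and the function names are hypothetical placeholders; none of the numbers are taken from the MIL-HDBK-217F tables.

```python
import math

# Hypothetical parts list: (name, quantity, generic failure rate in failures
# per 10^6 hours, quality factor). Values are illustrative assumptions only.
PARTS = [
    ("ceramic capacitor", 40, 0.002, 1.0),
    ("film resistor", 60, 0.001, 1.0),
    ("microcircuit", 5, 0.050, 2.0),
]


def parts_count_failure_rate(parts):
    """Sum quantity * generic rate * quality factor (failures per 10^6 h)."""
    return sum(qty * lam_g * pi_q for _, qty, lam_g, pi_q in parts)


def mtbf_hours(lam_per_million_hours):
    """MTBF for a constant failure rate expressed per 10^6 hours."""
    return 1e6 / lam_per_million_hours


def reliability(lam_per_million_hours, t_hours):
    """Probability of surviving t_hours under a constant failure rate."""
    return math.exp(-lam_per_million_hours * 1e-6 * t_hours)


if __name__ == "__main__":
    lam = parts_count_failure_rate(PARTS)
    print(f"failure rate: {lam:.3f} per 10^6 h")
    print(f"MTBF: {mtbf_hours(lam):.0f} h")
    print(f"R(1000 h): {reliability(lam, 1000.0):.6f}")
```

The exponential reliability expression only applies to the flat portion of the bathtub curve, which is why early-life and wear-out failures fall outside this kind of calculation.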

2.2 Discussion of Empirical Methods

Although empirical prediction standards have been used for many years, it is always vital to understand the limitations of the information obtained using these values. The advantages and disadvantages of empirical methods have been frequently debated over the years. A brief summary from the publications in industry, military and academia is presented below.

2.3 Advantages of empirical methods

1. Easy to use, given the availability of component models.

2. Relatively good performance as indicators of inherent reliability.

3. Provide an approximation of field failure rates.

2.4 Disadvantages of empirical methods

1. A large part of the data used by the traditional models is obsolete.

2. Failure of the components is not always a result of component-intrinsic mechanisms but can be caused by the system design.

3. The reliability prediction models are based on industry-average values of failure rate, which are neither vendor-specific nor device-specific.

4. Good-quality field and manufacturing data, which are needed to define the adjustment factors such as the Pi factors in MIL-HDBK-217, are difficult to collect.

3 FMEA (FAILURE MODE AND EFFECTS ANALYSIS)

Failure Mode and Effects Analysis (FMEA) is a systematic analysis approach that identifies potential failure modes in a system. It also identifies critical or significant design or process characteristics that require special controls to prevent or detect failure modes. FMEA is a tool used to prevent problems from occurring.

3.1 History of FMEA

The use of FMEA is not a recent method of analysis; the technique has been in existence for many years. Before any documented format was developed, engineers would try to anticipate what could go wrong with a design or process before it was developed or tested. Since this method applied trial and error techniques, each failure brought a new opportunity to perfect the design. However, this required starting from the beginning time and time again until the failure was eliminated. This technique was both costly and time consuming.

Figure 1. Bathtub Curve

FMEAs were formally introduced in the late 1940s with the release of MIL-P-1629, the military procedure that later became MIL-STD-1629. Used for aerospace and rocket development, the FMEA and the more detailed Failure Mode, Effects and Criticality Analysis (FMECA) were helpful in avoiding preventable failures.

The primary push for failure prevention came during the 1960s while developing the technology for placing a man on the moon. The automotive industry also used FMEAs effectively for production improvement as well as design improvement.

3.2 FMEA Development

FMEAs are developed in two distinct phases:

• The first phase is to postulate each failure mode based on the functional requirements and then determine the appropriate effects. If the severity of the effect is critical, actions are considered to change the design and eliminate the failure mode if possible, or to protect the end user from the effect.

• The second phase adds causes and probability of occurrence to each failure mode. This is the detailed development section of the FMEA process. In a piece part analysis each component will be listed with its appropriate failure mode and failure rate.

3.3 Benefits of FMEA

• Improves the quality, reliability, and safety of products and processes

• Improves company image and competitiveness

• Increases customer satisfaction

• Reduces product development timing and cost

• Documents and tracks action taken to reduce risk

3.4 Applications for FMEA

• Process - analyze manufacturing and assembly processes.

• Design - analyze products before they are released for production.

• Concept - analyze systems or subsystems in the early design concept stages.

• Equipment - analyze machinery and equipment design before they are purchased.

3.5 FMEA in Aerospace and Defense

FMEA continues to be an integral part of the development of aircraft, missile systems, radar, communications, electronics and other key technologies. Constant innovations in this analysis technique continue to increase its effectiveness.

4 FAULT TREE ANALYSIS (FTA)

Fault tree analysis (FTA) is a failure analysis technique in which an undesired system event is analyzed using Boolean logic to combine a series of lower-level events. This analysis method is primarily used to determine the probability of a safety hazard. This process is considered a “Top Down” approach as compared to FMEA which is typically a “Bottom Up” approach.

4.1 History of FTA

Fault Tree Analysis attempts to model and analyze failure processes of engineering and biological systems. FTA is basically composed of logic diagrams that display the state of the system and is constructed using graphical design techniques. Engineers were responsible for the development of Fault Tree Analysis because its development requires people with deep understanding of the system architecture as opposed to a mathematician or analyst.

Some people define FTA as another part or technique of reliability analysis. Although both model the same major aspect, they have arisen from two different perspectives. Reliability was basically developed by mathematicians, while FTA, as stated above, was developed by engineers. FTA was initially developed for projects that cannot tolerate any error.

Bell Telephone Laboratories started the development of FTA during the early 1960s for the U.S. Air Force. Later, U.S. nuclear power plants and the Boeing Company used the technique extensively. FTA is used in safety engineering as well as all major fields of engineering.

4.2 Why Fault Tree Analysis?

Since no system functions perfectly, dealing with a subsystem failure is a necessity, and any working system eventually will have a fault in some place. However, the probability for a complete or partial success is greater than the probability of a complete failure or partial failure. Because assembling a complete system level FTA can be a lengthy and expensive task, the preferred method is to consider subsystems. In this way dealing with subsystems can assure less chance for error and overall fewer system analysis hours. Using computer modeling tools, the subsystems integrate to form a well analyzed total system.

4.3 Methodology

In Fault Tree Analysis, an undesired system effect is taken as the top event of a logic tree. There is only one top event, and all elemental events must branch down from it. When fault trees are labeled with actual failure probabilities, computer programs can calculate top event probabilities.
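A minimal sketch of how a program might compute a top event probability from a labeled tree is shown here. It assumes that all elemental events are statistically independent and that no event appears in more than one branch; real FTA tools handle repeated events through cut set methods. The class names, event names, and probabilities are hypothetical.

```python
from dataclasses import dataclass, field
from math import prod


@dataclass
class Event:
    """Elemental (basic) event with a given probability of occurrence."""
    name: str
    probability: float


@dataclass
class Gate:
    """Logic gate combining lower-level events; kind is "AND" or "OR"."""
    name: str
    kind: str
    inputs: list = field(default_factory=list)


def probability(node):
    """Recursively evaluate a node, assuming independent inputs."""
    if isinstance(node, Event):
        return node.probability
    probs = [probability(child) for child in node.inputs]
    if node.kind == "AND":
        # All inputs must occur.
        return prod(probs)
    if node.kind == "OR":
        # At least one input occurs: complement of "none occur".
        return 1.0 - prod(1.0 - p for p in probs)
    raise ValueError(f"unknown gate kind: {node.kind}")


# Hypothetical tree: the function is lost if both channels fail,
# or if a shared power source fails.
top = Gate("Loss of function", "OR", [
    Gate("Both channels fail", "AND", [
        Event("Channel A fails", 1e-3),
        Event("Channel B fails", 1e-3),
    ]),
    Event("Common power loss", 1e-6),
])

if __name__ == "__main__":
    print(f"top event probability: {probability(top):.6e}")
```

In this made-up tree the single-point power failure contributes as much to the top event as the two redundant channels together, which is the kind of insight the calculation is meant to expose.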

4.4 The Fault Tree Diagram

The FTA is usually written using conventional logic gate symbols. The route through a tree between an event and an initiator in the tree is called a Cut Set. The shortest credible way through the tree from fault to initiating event is called a Minimal Cut Set.
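The sketch below enumerates cut sets for a small tree, using the common textbook definition: a cut set is a set of elemental events whose joint occurrence causes the top event, and it is minimal if no proper subset is also a cut set. The tuple-based tree encoding, the function names, and the event labels are assumptions made for this example only.

```python
from itertools import product


def cut_sets(node):
    """Return the minimal cut sets of a tree.

    A node is either a string (elemental event) or a tuple
    (kind, children) where kind is "AND" or "OR".
    """
    if isinstance(node, str):
        return [frozenset([node])]
    kind, children = node
    child_sets = [cut_sets(child) for child in children]
    if kind == "OR":
        # Any cut set of any input is a cut set of the gate.
        combined = [cs for sets in child_sets for cs in sets]
    elif kind == "AND":
        # One cut set from every input must occur together.
        combined = [frozenset().union(*combo) for combo in product(*child_sets)]
    else:
        raise ValueError(f"unknown gate kind: {kind}")
    return minimize(combined)


def minimize(sets):
    """Drop duplicates and any set that strictly contains another set."""
    unique = set(sets)
    minimal = [s for s in unique if not any(other < s for other in unique)]
    return sorted(minimal, key=lambda s: (len(s), sorted(s)))


# Hypothetical tree with a redundant branch: {A, B, C} is not minimal
# because {A, B} alone already causes the top event.
tree = ("OR", [("AND", ["A", "B"]), ("AND", ["A", "B", "C"]), "D"])

if __name__ == "__main__":
    for cs in cut_sets(tree):
        print(sorted(cs))
```

Single-event minimal cut sets such as `D` in this example are single points of failure, which is why they receive the most attention in a safety assessment.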

Many different approaches can be used to model an FTA, but the most common can be summarized in a few steps. A single fault tree is used to analyze one and only one top event. FTA involves five steps:

1. Defining the undesired event

2. Obtaining an understanding of the system

3. Constructing the fault tree

4. Evaluating the fault tree

5. Controlling the hazards

Figure 2. FTA Example from ARP 4761

4.5 Definition of the undesired event

For aircraft certification the Functional Hazard Analysis defines the hazards to be examined via FTA. An engineer with extensive and comprehensive knowledge of the design of the system is the best person to define the undesired events. Undesired events are then used to define the various top events that make up the FTA, one top event for each FTA; no two events will be used to make one FTA.

4.6 Obtain an understanding of the system

Once the undesired event is selected, all causes with a probability of contributing to the undesired event are studied and analyzed. Getting exact numbers for the probabilities leading to the event is usually unlikely because of time and cost constraints. However, selecting elemental events from the FMEA makes this practical. Computer software is used to integrate FMEA and FTA, leading to less costly system analysis.

Proper interface with system designers who have full knowledge of the system is key to ensuring that no cause that could affect the undesired event is overlooked. For the selected event, all causes are then numbered and sequenced in order of occurrence and are used in the next step, which is constructing the fault tree.

4.7 Construction of the fault tree

At the outset, the undesired event must be selected and the system must be analyzed to identify all the contributing causes and, if possible, their probabilities. Once this is accomplished the fault tree can be constructed. The fault tree is based on “AND” and “OR” gates, which define the major characteristics of the top event.

4.8 Evaluate the fault tree

After the fault tree has been assembled for a specific undesired event, it is evaluated, compared to system requirements and analyzed for any possible system improvement.

4.9 Controlling the hazards

After identifying the hazards, all possible methods are explored to decrease the probability of occurrence. While this step is very specific and differs largely from one system to another, it is an integral step in the process.

5 RELIABILITY ANALYSES AND ARP 4761

ARP 4761 “Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment” utilizes each of the above analysis techniques to determine compliance with related Federal Aviation Regulations (FARs).

Although the ARP 4761 methodology defines the System Safety Assessment (SSA) as the primary certification document, the primary analyses used to perform this assessment are Reliability Prediction, FMEA and FTA, which have endured and matured over most of the last half century.

6 TYING IT ALL TOGETHER

The interrelation is as follows: Reliability Prediction values are used in developing quantitative FMEAs. Each failure mode in the FMEA is related to component parts and their respective failure rates, modified by several factors including the failure mode distribution, which allocates the total failure rate of a component or function to each of its failure modes.

These failure modes are in turn used to provide the “elemental events” for the Fault Tree Analysis. FTAs are calculated for each critical hazard identified.
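The allocation step can be sketched as follows: a predicted component failure rate is split across its failure modes by a failure mode distribution, and each resulting rate is converted into a probability for use as an elemental event. The component, its failure rate, the mode fractions, and the one-hour exposure time are all hypothetical values chosen for illustration.

```python
import math


def mode_failure_rates(total_rate, mode_distribution):
    """Allocate a total failure rate to failure modes by their fractions."""
    total_fraction = sum(mode_distribution.values())
    if not math.isclose(total_fraction, 1.0, rel_tol=0.0, abs_tol=1e-9):
        raise ValueError("failure mode fractions must sum to 1")
    return {mode: total_rate * fraction
            for mode, fraction in mode_distribution.items()}


def event_probability(rate_per_hour, exposure_hours):
    """Probability of at least one failure, assuming a constant rate."""
    # expm1 keeps precision when rate * time is very small.
    return -math.expm1(-rate_per_hour * exposure_hours)


# Hypothetical valve: predicted rate in failures per hour and an assumed
# failure mode distribution.
VALVE_RATE = 2.0e-6
VALVE_MODES = {"fails closed": 0.6, "fails open": 0.3, "external leak": 0.1}

rates = mode_failure_rates(VALVE_RATE, VALVE_MODES)

if __name__ == "__main__":
    for mode, rate in rates.items():
        p = event_probability(rate, exposure_hours=1.0)
        print(f"{mode}: rate {rate:.2e}/h, probability per hour {p:.2e}")
```

Only the modes that actually contribute to a given hazard would be carried into its fault tree, so the elemental event rate is normally smaller than the full component failure rate.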

Ultimately for aircraft certification, i.e. FAR 25.1309, Fault Tree Analysis results are used to show compliance with the appropriate requirements.

The functional hazard assessment (FHA) analyses the potential consequences on safety resulting from the loss or degradation of system functions. Using service experience, engineering and operational judgment, the severity of each hazard effect is determined qualitatively and is placed in a class. Safety objectives determine the maximum tolerable probability of occurrence of a hazard, in order to achieve a tolerable risk level.

Figure 3. Quantitative Hazard Requirements (Re: ARP 4761)
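Since the figure itself is not reproduced here, the sketch below assumes the quantitative objectives commonly cited for transport category aircraft: on the order of 1e-5 per flight hour for Major, 1e-7 for Hazardous, and 1e-9 for Catastrophic failure conditions. These values, the dictionary, and the function name are assumptions for illustration and should be checked against the applicable regulation and advisory material for the aircraft category in question.

```python
# Assumed maximum probability per flight hour for each severity class.
# Illustrative values only; confirm against the governing requirements.
MAX_PROBABILITY_PER_FLIGHT_HOUR = {
    "major": 1e-5,
    "hazardous": 1e-7,
    "catastrophic": 1e-9,
}


def meets_objective(severity, top_event_probability):
    """Compare a fault tree top event probability with its safety objective."""
    key = severity.lower()
    if key not in MAX_PROBABILITY_PER_FLIGHT_HOUR:
        raise ValueError(f"no quantitative objective defined for: {severity}")
    return top_event_probability <= MAX_PROBABILITY_PER_FLIGHT_HOUR[key]


if __name__ == "__main__":
    # Hypothetical top event results from two fault trees.
    print(meets_objective("Catastrophic", 5e-10))
    print(meets_objective("Hazardous", 3e-6))
```

A result that fails the check points back to the design: either the architecture needs more independence or redundancy, or the contributing elemental events need lower failure rates.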

7 THE PROCESS WORKS - RECENT CERTIFICATION EXPERIENCE

Recently, we were given a certification effort that included a brand new aircraft design in the new Very Light Jet (VLJ) aircraft type. We would oversee system certification of the whole aircraft. This was exciting because never before had I been involved in ALL the systems on one aircraft. It was a contemporary design and the schedule was very aggressive. The initial effort was to prepare Preliminary System Safety Analyses (PSSAs) for each of the critical systems. As work progressed it was obvious that this aircraft design presented new and unique challenges.

As defined in ARP 4761, PSSA includes qualitative analyses (FHA, FMEA, and FTA) meant to ensure that the design is robust enough, so that under critical failure scenarios, there is sufficient inherent redundancy to enable the continued safe operation of the aircraft.

The effort was proceeding on schedule until the braking system analysis turned up a potential deficiency. The braking system is a typical light aircraft braking system with two independent hydraulically actuated disc brakes on each of the two main landing gear wheels. Although there is independence with each side isolated from the other, this aircraft required differential braking for steering. The aircraft did not employ a nose wheel steering system. The ground steering function would be performed using differential braking and a free castering nose wheel.

The deleterious result of this unique design (for a jet aircraft) was first exposed during the formulation of the functional hazard analysis where hazards relating to loss of braking were combined with hazards related to loss of directional control. These hazards are then analyzed using FMEA and FTA.

The braking design was adequate for braking but certainly not robust enough when the additional function of directional control was added. This included the ground steering function while taxiing, in the initial part of the take-off roll, and in the latter part of the landing roll.

The additional hazards that were postulated uncovered a potentially catastrophic loss of directional control. This means that the failure of one wheel brake could cause the loss of directional control, and at high speed this could lead to departure from the runway and catastrophic loss of the aircraft.

Our recommendation to mitigate this severity was to employ an independent means of directional control. At high speed this requirement is covered by the rudder. However the rudder loses control authority at lower speeds. These lower speeds are still high enough to cause catastrophic loss of the aircraft if the aircraft were to depart the runway.

The proposal was to include rudder and nose wheel steering as mitigating functions for the catastrophic loss of directional control.

With a nose wheel steering system, loss of one side of braking, although contributing to excessively long landing distance, would not necessarily lead to loss of directional control. This is because any yawing moment introduced by off-center braking force could be countered by the rudder at high speed or by nose wheel steering at lower speeds. Failure of nose wheel steering could in turn be mitigated by differential braking, so that neither system failure alone would lead to a catastrophic event. Ultimately a design change was instituted to include nose wheel steering in the design.
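The effect of the design change on the low-speed case can be sketched with simple gate arithmetic. The probabilities below are invented placeholders, not values from the actual certification program, and the events are assumed to be independent; the point is only to show how adding an independent steering function turns a single failure into a dual failure.

```python
from math import prod


def p_or(*probs):
    """Probability that at least one independent event occurs."""
    return 1.0 - prod(1.0 - p for p in probs)


def p_and(*probs):
    """Probability that all independent events occur."""
    return prod(probs)


# Hypothetical per-flight-hour failure probabilities (assumptions only).
P_LEFT_BRAKE = 1e-5
P_RIGHT_BRAKE = 1e-5
P_NOSE_WHEEL_STEERING = 1e-4

# Loss of braking on either side produces an uncommanded yawing moment.
asymmetric_braking = p_or(P_LEFT_BRAKE, P_RIGHT_BRAKE)

# Original design at low speed: differential braking is the only steering
# function, so a single brake failure leads directly to the top event.
original_design = asymmetric_braking

# Revised design at low speed: the top event also requires loss of the
# independent nose wheel steering function.
revised_design = p_and(asymmetric_braking, P_NOSE_WHEEL_STEERING)

if __name__ == "__main__":
    print(f"original design: {original_design:.3e}")
    print(f"revised design:  {revised_design:.3e}")
```

With these placeholder numbers the top event probability drops by roughly the failure probability of the added steering function, which is the quantitative counterpart of the qualitative argument above.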

REFERENCES

1. SAE ARP 4761, “Guidelines and Methods for Conducting the Safety Assessment Process on Civil Airborne Systems and Equipment,” December 1996.

2. MIL-HDBK-217F, “Reliability Prediction of Electronic Equipment,” 1991; Notice 1, 1992; Notice 2, 1995.

3. MIL-STD-1629A, “Procedures for Performing a Failure Mode, Effects and Criticality Analysis,” 24 November 1980; Notice 1, 7 June 1983; Notice 2, 28 November 1984.

BIOGRAPHY

Mark Saglimbene
The Omnicon Group, Inc.
40 Arkay Drive
Hauppauge, NY 11788
631-436-7918 x306

Email: [email protected]

Mark Saglimbene has over twenty-five years of experience in reliability, maintainability, and safety (RM&S) for electronic and electro-mechanical systems such as avionics, computer network systems, and aerospace systems. He has performed RM&S analyses on complex military systems as well as flight critical commercial aircraft systems leading to government certification. He has a B.S. in Electrical Engineering from Polytechnic Institute of New York (currently Polytechnic Institute of New York University) and is an Instrument Rated Private Pilot.