

American Institute of Aeronautics and Astronautics


Reliability-Based Modeling & Analysis of Fault-Tolerant Flight Control Systems

N. Eva Wu* and Oguz A. Aydin† Binghamton University, Binghamton, NY 13902-6000

This paper gives a brief tutorial on reliability modeling of safety critical systems, and discusses the control of such systems for enhancement of their safety and design lives. Fault tolerant control is tackled as a supervisory control problem of a finite state stochastic failure process. This view allows us to quantify, based on a prescribed reliability requirement, the desired level of redundancy, the quality of redundancy management, and the effectiveness of maintenance policy. The new development with respect to our previous effort is the extension from Markov to non-Markov reliability models. Using a pitch axis control example, the benefit of using aerodynamically redundant surfaces is assessed. In addition, the effect of hardware aging, the effect of the risk in redundancy management, and the effect of frequency of maintenance are also examined.

Nomenclature

AGAN = as good as new
$c_{u,\phi}(t_c)$ = coverage
$C(X_k, v_k)$ = instantaneous cost associated with entering state $X_k$ under supervisory control $v_k$
CFR = constant failure rate
$F(\hat{\phi}, t_c \mid \phi)$ = conditional cdf of the unbiased estimate of $\phi$ at critical clearance time $t_c$
$e$ = event
$E$ = countable event set
$E[\cdot]$ = expectation
$F$ = absorbing (failure) state
$F(t)$ = lifetime distribution of a unit
$\bar{F}(t)$ = reliability of a unit
$F_0(t)$ = initial failure distribution
$F_\beta(t)$ = Weibull distribution CDF with shape parameter $\beta$
$G_e(t)$ = event life distribution
$H(t)$ = holding (sojourn) time distribution function at state $O$; t-step transition matrix in a Markov process
$H(s,t)$ = state transition matrix (function) from state at time $s$ to state at time $t$
$\bar{H}(t)$ = holding (sojourn) time distribution function at state $\bar{O}$
$I_j$ = indicator function for subsystem $S_j$
IFR = increasing failure rate
$J_u(\phi)$ = achieved performance with control law $u$ at $\phi$
$J_{th}(\phi)$ = minimum required control performance with control law $u$ at $\phi$
$k$ = time index

* Professor, Department of Electrical and Computer Engineering, Binghamton University, Binghamton, NY 13902.
† Ph.D. candidate, Department of Electrical and Computer Engineering, Binghamton University, Binghamton, NY 13902.

AIAA Guidance, Navigation, and Control Conference and Exhibit, 15 - 18 August 2005, San Francisco, California

AIAA 2005-6429

Copyright © 2005 by the American Institute of Aeronautics and Astronautics, Inc. All rights reserved.


$M$ = transition matrix
$O$ = operating state
$\bar{O}$ = intermediate non-absorbing state
$X$ = countable state space
$X_0$, $x_0$ = initial state
$P(t)$ = transition function matrix
$p(x,x',e')$ = state transition probability from state $x$ to state $x'$ upon occurrence of event $e'$
$p_{i,j}(s,t)$ = state transition function from state $i$ at time $s$ to state $j$ at time $t$
$p_0(x)$ = initial probability mass function
$Q(t)$ = transition rate matrix
$q_{i,j}$ = rate of state transition from $i$ to $j$
$r(t)$ = conditional failure rate
$S_j$ = subsystem $j$
$T$ = mission time, time between maintenance
$t_c$ = critical clearance time
$u$ = control law
$u^*$ = optimal control law
$v_k$ = supervisory control
$V_\pi(x_0)$ = total expected cost under policy $\pi$ with initial state $x_0$
$\alpha$ = discount factor for total expected cost
$\beta$ = shape parameter in a Weibull distribution
$\gamma$ = uniform transition rate in a uniformization process
$\Gamma(x)$ = set of feasible events
$\theta$ = characteristic parameter in a Weibull distribution
$\lambda$ = rate parameter for exponential distribution
$\lambda_a$ = failure rate for subsystems in block-a
$\lambda_b$ = failure rate for subsystems in block-b
$\pi_F(t)$ = probability at failure state $F$
$\pi_j(t)$ = state probability for state $j$
$\pi^*$ = optimal policy that minimizes the total expected cost
$\phi$ = real-valued vector in fault parameter set $\Omega$
$\hat{\phi}$ = unbiased estimate of $\phi$
$\psi(\cdot)$ = structure function
$\psi_p(\cdot)$ = structure function for independent parallel subsystems
$\psi_s(\cdot)$ = structure function for independent series subsystems
$\Omega$ = fault parameter set associated with a parameterized model

I. Introduction

Safety-critical systems in this paper refer to systems whose failure may result in loss of human lives, such as manned aerospace vehicles. This paper specifically considers modeling and analysis of flight control systems for the purpose of investigating their reliability and finding cost-effective ways to enhance it. Fault tolerant control is tackled as a supervisory control problem of finite state stochastic failure processes. Under this formalization, a process is said to have entered an absorbing state if a failure has occurred at the system level. Otherwise it is in a non-absorbing state. Supervisory control aims at controlling the rates of state transitions in such a way that entering an absorbing (system failure) state is maximally avoided.

Publications reporting new developments in fault-tolerant control methods have flourished following the overview paper by Patton on the 1997 situation in the field.1 Typically faults are modeled as unknown additive exogenous signals, and fault-tolerant control aims at alleviating the effect of the signals.2,3 This treatment of faults avoids the burden of determining the root causes. But most work along this direction has been oblivious to the effect of faults that impair a system's inherent ability to allow an effective performance restoration, with few exceptions.4-6 An alternative treatment of faults involves more detailed fault modeling and diagnosis.7-10 Fault-tolerant control then amounts to reacting to specific conditions in such a way that their adverse effects are compensated. Applications of fault-tolerant control to flight control systems have been most fruitful due to their safety critical nature.11 Few publications, however, specifically address the issue of modeling the transient process in accommodating faults,12,13 and the issue of reliability.14,15

To clarify between faults and failures, the following definitions are introduced.16 A fault is an unpermitted deviation of at least one characteristic property or variable of the system. A failure is a permanent interruption of a (sub)system's ability to perform a required function under specified operating conditions. In this paper, fault-tolerance is defined as the ability of a flight control system to prolong the time to system failure despite the failures of some of its subsystems.

Highly reliable systems make use of redundancy to achieve fault-tolerance, due to limited reliability of subsystems.17 Utilization of analytic redundancy18 provided by static and dynamic relations among system variables can further reduce the probability of exhaustion of hardware in a cost-effective manner. Fault-tolerance can be achieved through control reconfiguration that capitalizes on redundant sensing and actuating functionalities. Viewed as a discrete state system, control reconfiguration introduces an intermediate non-absorbing state $\bar{O}$ in the transition process from operating state $O$ to the absorbing (failure) state $F$, as shown in Figure 1. Let $\pi_F(t)$ denote the probability at state $F$. Then

$$\pi_F^{(b)}(t) = \int_0^t \bar{H}(t-\tau)\,dH(\tau) \le \max_{0 \le \tau \le t} \bar{H}(t-\tau)\,H(t) \le H(t) = \pi_F^{(a)}(t), \quad \forall t \tag{1}$$

evidences the improved reliability with control reconfiguration, where $H$ and $\bar{H}$ are the holding time distribution functions at states $O$ and $\bar{O}$, respectively. In addition, with adequate redundant control authority and well-designed reconfiguration strategies, $\pi_F^{(b)}(t) \ll \pi_F^{(a)}(t)$ can be achieved for all $t \le T$, where $T$ is any conceivable single mission time of a vehicle.
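As a numerical illustration of the inequality in (1), the following minimal sketch evaluates the convolution integral for the reconfigurable case and compares it with the non-reconfigurable case. It assumes exponential holding times at both states; the function names and the rates `lam_o` and `lam_bar` are hypothetical and chosen only for illustration.

```python
import math

def pi_a(t, lam_o):
    """Failure probability without reconfiguration: H(t) for an
    (assumed) exponential holding time at the operating state O."""
    return 1.0 - math.exp(-lam_o * t)

def pi_b(t, lam_o, lam_bar, n=2000):
    """Failure probability with reconfiguration: the convolution
    integral of Hbar(t - tau) dH(tau), by the trapezoidal rule."""
    if t == 0.0:
        return 0.0
    h = t / n
    total = 0.0
    for i in range(n + 1):
        tau = i * h
        h_density = lam_o * math.exp(-lam_o * tau)      # dH/dtau at tau
        hbar = 1.0 - math.exp(-lam_bar * (t - tau))     # Hbar(t - tau)
        weight = 0.5 if i in (0, n) else 1.0            # trapezoid end weights
        total += weight * h_density * hbar
    return total * h

# Hypothetical rates (per hour) and mission time (hours).
print(pi_a(100.0, 1e-3), pi_b(100.0, 1e-3, 5e-3))
```

Under these assumed rates the failure probability with the intermediate state is several times smaller than without it, which is the qualitative content of (1).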

This paper establishes a reliability model of a flight control system, based on which the benefit of control reconfiguration is assessed. Our interest also lies in quantifying the effect of aging of vehicle subsystems, the effect of vehicle maintenance policy, and the effect of decision risks that accompany any reactive control reconfiguration. Aging of hardware is captured in the subsystem failure time distributions that use increasing failure rates (IFR), as opposed to the conventional models that use constant failure rates (CFR). Since maintenance generally does not restore the vehicle condition to as good as new (AGAN), maintenance policies affect the state of the vehicle reliability even at the onset of a mission. This generally results in $\pi_F(t) \ge \pi_F^{AGAN}(t)$. A potentially dramatic effect is due to the decision risks associated with control reconfiguration, which, in the event of bypassing state $\bar{O}$, as indicated by the dashed transition in Figure 1 (b), can worsen the vehicle reliability. The paper will state the conditions under which control reconfiguration becomes a liability.

Figure 1. High-level state transition (a) without control reconfiguration, and (b) with control reconfiguration

The paper is organized as follows. Section II provides the background on stochastic discrete state modeling of failure processes. Section III establishes the reliability model of a pitch axis control system, where all subsystem times to failure are exponentially distributed. Section IV considers the case where the distributions of times between failures are no longer exponential. The section also presents the numerical results of our investigation on the effects of management of redundancy, frequency of maintenance, and aging of subsystems.

II. Background

This section gives a highly condensed presentation of discrete state stochastic processes, which is drawn from chapters 6 and 7 of the book by Cassandras and Lafortune,19 and of the notions of age and reliability of systems, which follows Barlow and Proschan.20

A. Discrete State Stochastic Processes

A stochastic timed automaton is a timed automaton specified by a six-tuple $(E, X, \Gamma(x), p(x,x',e'), p_0(x), G)$, where $E$ is a countable event set, $X$ is a countable state space, $\Gamma(x)$ is a set of feasible events defined for all $x \in X$, with $\Gamma(x) \subseteq E$, $p(x,x',e')$ is a state transition probability defined for all $x, x' \in X$ and $e' \in E$, such that $p(x,x',e') = 0$ for all $e' \notin \Gamma(x)$, $p_0(x)$ is the probability mass function $\Pr[X_0 = x]$, $x \in X$, of the initial state $X_0$, and $G = \{G_e : e \in E\}$ is a stochastic clock structure. In a Poisson clock structure, the arrival process of every $e \in E$ is a Poisson process that has independent stationary increments without simultaneous occurrences: $\Pr[n \text{ occurrences up to } t] = (\lambda t)^n e^{-\lambda t}/n!$. A Poisson process has the properties that the interevent time distribution $G_e$ is exponential, and that the residual life distribution is also $G_e$.
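The two properties of the Poisson clock structure stated above can be checked numerically. The sketch below is illustrative only; the function names and the rate and time values are assumptions, not part of the paper.

```python
import math

def poisson_pmf(n, lam, t):
    """Pr[n occurrences up to t] for a Poisson process of rate lam."""
    return (lam * t) ** n * math.exp(-lam * t) / math.factorial(n)

def exponential_cdf(t, lam):
    """Interevent (event life) distribution G_e(t) = 1 - exp(-lam * t)."""
    return 1.0 - math.exp(-lam * t)

def residual_life_cdf(t, age, lam):
    """Pr[life <= age + t | life > age] for an exponential event life."""
    survived = 1.0 - exponential_cdf(age, lam)
    return (exponential_cdf(age + t, lam) - exponential_cdf(age, lam)) / survived

# Hypothetical rate and times: the residual life distribution coincides
# with the original event life distribution, whatever the age.
print(exponential_cdf(3.0, 0.5), residual_life_cdf(3.0, 10.0, 0.5))
```

The equality of the two printed values is the memoryless property that makes the Markov models of Section III tractable.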

A Markov chain is a stochastic process $X(t)$ with state space $X$, generated by a stochastic timed automaton equipped with a Poisson clock structure, where $P[X(t_{k+1}) \le x_{k+1} \mid X(t_k) = x_k, \ldots, X(t_0) = x_0] = P[X(t_{k+1}) \le x_{k+1} \mid X(t_k) = x_k]$. The process is said to be memoryless in that all past state information is irrelevant, and that how long the process has been in the current state is irrelevant. Define $p_{x,x'}$ as the (total) probability of transition from $x$ to $x'$ regardless of the event that results in the transition,

$$p_{x,x'} = \sum_{e' \in \Gamma(x)} p(x,x',e') \times p(e',x), \tag{2}$$

where $p(e',x)$ is the probability that the next event to occur is $e'$ given that the current state is $x$, and can be calculated by

$$p(e',x) = \frac{\lambda_{e'}}{\Lambda(x)}, \quad \Lambda(x) = \sum_{e \in \Gamma(x)} \lambda_e, \tag{3}$$

under the assumption of the exponential event life distribution

$$G_e(t) = 1 - e^{-\lambda_e t}, \quad \forall e \in \Gamma(x). \tag{4}$$

More generally, the transition probability is written as $p_{x,x'}(k) \equiv \Pr[X_{k+1} = x' \mid X_k = x]$, which may depend on time. Therefore, a discrete-time Markov process is fully specified by the three-tuple $(X, p_{x,x'}(k), p_0(x))$. The set of the discrete states in $X$ can always be mapped onto the set of positive integers $\{1, 2, \ldots\}$. The transition probabilities originating from state $i$ satisfy $\sum_{\text{all } j} p_{i,j}(k) = 1$, $\forall i$. Define the n-step transition matrix $H(k,k+n) \equiv [p_{i,j}(k,k+n)]$. The Chapman-Kolmogorov equation

$$H(k,k+n) = H(k,u)\,H(u,k+n), \quad k < u \le k+n, \tag{5}$$

can be derived using the Markovian properties. The process is said to be homogeneous if $H(k,k+n)$ can be expressed as $H(n)$. In particular, $H(1) \equiv P = [p_{i,j}]$ is called a transition probability matrix. Define the state probability for state $j$ as $\pi_j(k) \equiv \Pr[X_k = j]$, and $\pi(k) = [\pi_0(k), \pi_1(k), \ldots]$. Then $\pi(k) = \pi(0)P^k$.

Similarly, a continuous time Markov process is fully specified by a state space $X$, a set of transition functions

$$p_{x,x'}(s,t) \equiv \Pr[X(t) = x' \mid X(s) = x], \quad s \le t, \tag{6}$$


and an initial pmf $p_0(x)$. The evolution of the process can be described by the following (forward) Chapman-Kolmogorov equation of the transition function matrix $H(s,t) = [p_{x,x'}(s,t)]$, $x, x' = 1, 2, \ldots$,

$$\frac{\partial H(s,t)}{\partial t} = H(s,t)\,Q(t), \quad H(s,s) = I, \quad Q(t) \equiv \lim_{\Delta t \to 0} \frac{H(t,t+\Delta t) - I}{\Delta t}, \quad s \le t. \tag{7}$$

In a homogeneous Markov process, all transition functions are independent of the time instants $s$, $t$, and dependent only on $(t-s)$. Therefore, $Q(t) = Q = [q_{i,j}]$ = constant, with $q_{i,i} = -\sum_{j \ne i} q_{i,j}$. Let $P(\tau) = H(t,t+\tau)$. The Chapman-Kolmogorov equation and its solution are

$$\dot{P}(\tau) = P(\tau)\,Q, \quad P(0) = I, \quad \text{and} \quad P(\tau) = e^{Q\tau}, \tag{8}$$

from which the state probability

$$\pi(t) = \pi(0)\,P(t) \tag{9}$$

can be obtained. Differentiating (9) yields

$$\dot{\pi}_j(t) = q_{j,j}\,\pi_j(t) + \sum_{i \ne j} q_{i,j}\,\pi_i(t), \quad j = 1, 2, \ldots, m. \tag{10}$$

Equation (10) can be interpreted as a probability flow balance equation, where $\pi_j(t)$ is the level of a probability fluid in node $j$, the first term on the right side of Equation (10) is the total flow out of node $j$ with flow rate $q_{j,j}$, and the second term is the total flow into node $j$ with flow rate $q_{i,j}$ from node $i$.
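The probability flow balance can be integrated numerically to obtain the state probabilities of a homogeneous continuous-time Markov process. The following is a minimal sketch using a classical fourth-order Runge-Kutta step in place of the matrix exponential; the three-state chain (operating, intermediate, failed) and its rates are hypothetical values chosen for illustration.

```python
import math

def flow_balance(pi, Q):
    """Right-hand side of the flow balance: d(pi_j)/dt = sum_i pi_i * q[i][j]."""
    m = len(pi)
    return [sum(pi[i] * Q[i][j] for i in range(m)) for j in range(m)]

def state_probabilities(pi0, Q, t, steps=1000):
    """Integrate d(pi)/dt = pi Q from 0 to t with classical RK4."""
    h = t / steps
    pi = list(pi0)
    for _ in range(steps):
        k1 = flow_balance(pi, Q)
        k2 = flow_balance([p + 0.5 * h * k for p, k in zip(pi, k1)], Q)
        k3 = flow_balance([p + 0.5 * h * k for p, k in zip(pi, k2)], Q)
        k4 = flow_balance([p + h * k for p, k in zip(pi, k3)], Q)
        pi = [p + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
              for p, a, b, c, d in zip(pi, k1, k2, k3, k4)]
    return pi

# Hypothetical chain O -> O_bar -> F with rates A and B (per hour).
A, B = 1e-3, 5e-3
Q = [[-A, A, 0.0],
     [0.0, -B, B],
     [0.0, 0.0, 0.0]]   # F is absorbing: no outflow
pi_t = state_probabilities([1.0, 0.0, 0.0], Q, t=100.0)
print(pi_t)             # [Pr(O), Pr(O_bar), Pr(F)] at t = 100 hours
```

Each row of `Q` sums to zero, so the total probability is conserved along the integration, and the last entry of `pi_t` is the failure probability at time t.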

B. Theoretical failure distributions (as opposed to empirical distributions)

The reliability of a unit (or component, or subsystem, or system) corresponding to a mission time $t$ is $\bar{F}(t) = 1 - F(t)$, where $F$ is the lifetime distribution of the unit. The corresponding conditional reliability and conditional failure probability of the unit of age $T$ are

$$\bar{F}(t \mid T) = \frac{\bar{F}(T+t)}{\bar{F}(T)}, \quad F(t \mid T) = 1 - \bar{F}(t \mid T), \quad \bar{F}(T) > 0. \tag{11}$$

The conditional failure rate at time $\tau$ is

$$r(\tau) = \lim_{t \to 0} \frac{1}{t}\,\frac{F(\tau+t) - F(\tau)}{\bar{F}(\tau)} = \frac{f(\tau)}{\bar{F}(\tau)}, \tag{12}$$

where $f$ is the density of $F$. Therefore, reliability in terms of failure rate is given by

$$\bar{F}(t) = \exp\left(-\int_0^t r(\tau)\,d\tau\right). \tag{13}$$
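The relation between reliability and failure rate can be evaluated numerically for any integrable rate function. The sketch below uses a midpoint rule; the function name and the two example rate functions (one constant, one linearly increasing) are illustrative assumptions.

```python
import math

def reliability_from_rate(rate, t, n=10000):
    """Reliability exp(-integral_0^t r(tau) d tau), with the integral
    approximated by the midpoint rule on n subintervals."""
    h = t / n
    integral = sum(rate((i + 0.5) * h) for i in range(n)) * h
    return math.exp(-integral)

# Constant failure rate (hypothetical 1e-3 per hour): exponential reliability.
cfr = reliability_from_rate(lambda tau: 1e-3, 100.0)

# Linearly increasing failure rate r(tau) = 2 tau / theta^2 (hypothetical
# theta = 2000): reliability exp(-(t / theta)^2).
theta = 2000.0
ifr = reliability_from_rate(lambda tau: 2.0 * tau / theta ** 2, 1000.0)
print(cfr, ifr)
```

The midpoint rule is exact for constant and linear rates, so both values agree with their closed forms up to rounding.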

Figure 2 Effect of aging in hardware to unit reliability (left panel: Weibull distribution, β = 3, θ = 2000; right panel: exponential distribution, λ = 1/θ; horizontal axis: number of missions flown, 0 to 20000; vertical axis: end of mission failure probability; curves for mission times of 1, 2, 5, and 10 units)


A unit that does not age, in the sense that its reliability over an additional period of duration $t$ is the same regardless of its present age $T$, can be shown to obey an exponential distribution where $r$ = constant (constant failure rate or CFR). According to (11) and (12), a unit that ages has a decreasing conditional reliability, and an increasing failure rate or IFR, respectively. The Weibull distribution

$$F_\beta(t) = 1 - e^{-(t/\theta)^\beta}, \quad t \ge 0, \quad \theta, \beta > 0, \tag{14}$$

is an IFR example when $\beta > 1$, with rate given by

$$r(t) = \frac{\beta}{\theta}\left(\frac{t}{\theta}\right)^{\beta - 1}, \quad \beta > 1. \tag{15}$$

Figure 2 describes the lifetime of a flight control system hardware unit that is inspected after every mission flown. The data points on the graphs are the conditional reliability at the end of missions (right before the inspection). When the hardware unit is found to be healthy during the inspection, it continues to the next mission without being replaced. It can be seen that after a certain number of missions, the deterioration of reliability due to aging hardware is apparent. Therefore, the aging effect on a system can be detrimental when the maintenance policy in place is not sufficiently aggressive.
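The kind of computation behind Figure 2 can be sketched as follows: the conditional reliability (11) of a Weibull unit (14) at the end of a mission, given that the unit survived all earlier missions. The function names are hypothetical, and the exponent is evaluated directly to avoid dividing two very small reliabilities at large ages.

```python
import math

def conditional_reliability(t, age, theta, beta):
    """Conditional reliability Fbar(age + t) / Fbar(age) of a Weibull unit,
    written as a single exponential to avoid underflow at large ages."""
    return math.exp((age / theta) ** beta - ((age + t) / theta) ** beta)

def end_of_mission_reliability(missions_flown, mission_time, theta, beta):
    """Reliability over the next mission for a unit that has already
    survived missions_flown missions without being replaced."""
    age = missions_flown * mission_time
    return conditional_reliability(mission_time, age, theta, beta)

# Parameters as in the Figure 2 panel titles: beta = 3, theta = 2000;
# beta = 1 recovers the exponential case with lambda = 1 / theta.
for n in (0, 10000, 20000):
    print(n,
          end_of_mission_reliability(n, 1.0, 2000.0, 3.0),
          end_of_mission_reliability(n, 1.0, 2000.0, 1.0))
```

With β = 1 the printed value does not change with the number of missions flown, whereas with β = 3 it decreases, which is the aging effect discussed above.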

III. CFR Model and Supervisory Control of a Pitch Axis Control System

A. Markov model of PACS

This section describes a high-level reliability model of a pitch axis control system (PACS). The way the model is established is similar to that described for an AFTI-F16 flight control system.21

The bottom block diagram in Figure 3 shows the dependency of functional modules in a pitch axis control system. The first four blocks are a computer power supply block, an I/O control module block, a pilot command sensor block, and an aircraft state sensor block. These are followed by a pitch effector block. The upper block diagram of Figure 3 shows the functional dependencies of subsystems in the pitch axis control system. It reflects the available redundant control authorities in the system and the extent to which such redundancy is utilized for subsystem failure recovery. Therefore our reliability analysis is focused on the upper block only. Each effector channel in this block contains an actuator-surface subsystem, which is preceded by a group of three active identical computer/effector (C/E) interface subsystems. The functional dependency of the fault tolerant flight control system altogether is described by a two-layer parallel-to-series interconnection scheme.

The reliability indicator used in the following discussion is the probability of loss of control. It estimates the system compliance with the applicable safety-of-flight criterion and provides an indication of the impact of added or reduced hardware redundancy as well as the flight control system reconfiguration capability. Each small box in Figure 3 represents a subsystem. The symbols $\lambda_a$, $\lambda_b$ shown in the small boxes are the subsystem failure rates in terms of number of failures per hour. Under the assumptions of low subsystem failure rates, short mission time, and all components being as good as new at the beginning of each mission, constant failure rates (exponential distribution) are appropriate. The safety requirement for the inner layer parallel configuration is 1-out-of-3 (fail-operational/fail-operational/loss-of-control). The safety requirement for the outer layer parallel configuration in the pitch effector channels is 1-out-of-2 (fail-operational/loss-of-control).

Figure 3 Functional dependence for a pitch axis control system


The redundancy architecture shown in Figure 3 does not truly reflect how effector channel hardware is configured. It must be understood as an effective redundancy configuration which assumes that any anomaly in an effector channel serious enough to warrant a control adaptation or reconfiguration action can be accommodated promptly and successfully. In reality, however, due to uncertainties in the model of the system to be controlled, uncertainties in the models of signals exerted on the system, and the limited processing capability, considerable risks exist in making a decision on the corrective action. These decision risks must be taken into consideration in reliability assessment. The risks encountered may include overly slow or severe transients, false alarm, missed detection, false identification, false reconfiguration, and lack or exhaustion of redundancy. The notion of coverage is used to account for such risks. It is defined as the probability that the system is recovered and operative given that a particular subsystem failure has occurred. It represents an attempt to separate the handling of failures from the occurrence of failures.

The detailed discussion on obtaining the Markov model of the pitch axis control system can be found in Ref. 15, where coverage of failures is the focus of the study. Let $\phi$ be a real-valued vector in the fault parameter set $\Omega$ associated with a parameterized model of, or a symbolic parameter indicating some discrete failure event in, a controlled system subject to failures. Coverage is defined as15

$$c_{u,\phi}(t_c) = \int_{\{\hat{\phi} \in \Omega \,\mid\, J_u(\phi) \ge J_{th}(\phi)\}} dF(\hat{\phi}, t_c \mid \phi), \tag{16}$$

where $t_c$ is the critical clearance time representing the maximum period allowed between the occurrence of a subsystem failure and the establishment of a post failure equilibrium which includes the departing trajectory from a pre-failure equilibrium in its region of attraction, $F(\hat{\phi}, t_c \mid \phi)$ is the conditional distribution of the unbiased estimate $\hat{\phi}$ of parameter $\phi$, whose density is integrated in (16), and $J_u(\phi)$ and $J_{th}(\phi)$ are the achieved performance and the minimum required control performance with control law $u$ at $\phi$.
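One way to read the coverage definition is as the probability that the fault estimate available by the critical clearance time leads to acceptable post-failure performance. The Monte Carlo sketch below illustrates this reading only. Everything specific in it is an assumption made for illustration and is not taken from the paper or from Ref. 15: a scalar fault parameter, a Gaussian unbiased estimate with standard deviation `sigma` at the clearance time, and a performance measure that degrades linearly with the relative estimation error.

```python
import math
import random

def estimate_coverage(phi, sigma, j_threshold, samples=20000, seed=0):
    """Fraction of sampled estimates phi_hat for which the achieved
    performance meets the threshold (hypothetical performance model)."""
    rng = random.Random(seed)
    recovered = 0
    for _ in range(samples):
        phi_hat = rng.gauss(phi, sigma)                  # unbiased estimate (assumed Gaussian)
        performance = 1.0 - abs(phi_hat - phi) / phi     # hypothetical J_u
        if performance >= j_threshold:                   # J_u >= J_th
            recovered += 1
    return recovered / samples

# Hypothetical values: 50% actuator effectiveness, 10% relative estimate spread.
coverage = estimate_coverage(phi=0.5, sigma=0.05, j_threshold=0.8)
print(coverage)
```

In this toy setting a tighter estimate (smaller `sigma`) or a looser performance requirement (smaller `j_threshold`) raises the coverage, which matches the role coverage plays as a measure of redundancy management quality.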

B. Supervisory Control

Assume now that the continuous-time Markov model established for the pitch axis control system has been converted into a discrete-time process through uniformization. This is accomplished by choosing a uniform transition rate $\gamma \ge \sum_{j \ne i} q_{i,j}$ for all states $i$, and then turning the transition rates $q_{i,j}$, $q_{i,i}$ into transition probabilities $q_{i,j}/\gamma$, $1 + q_{i,i}/\gamma$, respectively, for all $i$, $j$, and $i \ne j$. For the converted discrete-time process, consider the following total expected cost19

$$V_\pi(x_0) = E_\pi\left[\sum_{k=0}^{\infty} \alpha^k\, C(X_k, v_k)\right], \quad 0 < \alpha < 1, \tag{17}$$

where $k$ is the time index, $x_0$ is the initial state, $\alpha$ is the discount factor, and $C(X_k, v_k)$ is the instantaneous cost associated with entering random state $X_k \in X$ using supervisory control $v_k$. Assign the cost associated with a supervisory control action by

$$C(x', v_k) = 1 - c_{u,\phi}(t_k), \quad v_k = u(t_k). \tag{18}$$

Then it can be shown that the greedy policy $\pi^*$ that minimizes $V_\pi(x')$ in (17) for a controllable transition from $x$ to $x'$ is given by a control law satisfying

$$c_{u^*,\phi}(t_d) = \max_{u \in U} c_{u,\phi}(t_k), \quad t_d \le t_c. \tag{19}$$

In addition, the optimal policy $\pi^*$ maximizes the overall system reliability. The reader is referred to Ref. 15 for numerical results of a two-control law reconfigurable control setting ($U = \{u_1, u_2\}$) for the PACS.
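The two mechanical steps of this subsection, uniformization of the rate matrix and the greedy choice of the control law with the largest coverage, can be sketched as below. The rate matrix, the coverage values, and the names `uniformize` and `greedy_control_law` are hypothetical; the numerical results of the paper are in Ref. 15 and are not reproduced here.

```python
def uniformize(Q, gamma=None):
    """Convert a rate matrix Q into a transition probability matrix with
    entries q_ij / gamma off the diagonal and 1 + q_ii / gamma on it."""
    m = len(Q)
    exit_rates = [-Q[i][i] for i in range(m)]
    if gamma is None:
        gamma = max(exit_rates)            # smallest admissible uniform rate
    if gamma <= 0.0 or gamma < max(exit_rates):
        raise ValueError("gamma must be positive and at least every exit rate")
    P = [[Q[i][j] / gamma + (1.0 if i == j else 0.0) for j in range(m)]
         for i in range(m)]
    return P, gamma

def greedy_control_law(coverage_by_law):
    """Pick the control law with the largest coverage, that is, the
    smallest instantaneous cost 1 - c."""
    return max(coverage_by_law, key=coverage_by_law.get)

# Hypothetical three-state chain and hypothetical coverage values.
Q_EXAMPLE = [[-1e-3, 1e-3, 0.0],
             [0.0, -5e-3, 5e-3],
             [0.0, 0.0, 0.0]]
P_EXAMPLE, GAMMA = uniformize(Q_EXAMPLE)
BEST_LAW = greedy_control_law({"u1": 0.95, "u2": 0.99})
print(P_EXAMPLE, GAMMA, BEST_LAW)
```

Choosing `gamma` equal to the largest exit rate keeps every diagonal entry nonnegative; a larger `gamma` is also admissible and only adds self-loops to the discrete-time chain.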


IV. Reliability of Pitch Axis Control System with Non-Exponential Subsystem Lifetimes

The assumption that time to failure of every subsystem is exponential is relaxed in this section to reflect the aging of hardware units. In particular, the subsystem representing the aggregated actuator and control surface is assumed to have the Weibull distribution. As a result, the convenience of the memoryless property of a Markov process is lost. Instead, the cut set method20 will be applied to the reliability evaluation of the PACS.

A. Reliability as the Expectation of System’s Structure Function

There are eight independent subsystems named $S_1, S_2, \ldots, S_8$ in the PACS as shown in Figure 4. Let $I_1(t), I_2(t), \ldots, I_8(t)$ be their respective indicators, i.e., $I_j = 1$ if $S_j$ functions at $t$; otherwise, it is zero. Let $\psi(I_1(t), I_2(t), \ldots, I_8(t))$ be the structure function of the PACS: $\psi = 1$ if the PACS functions, and $\psi = 0$ if the PACS does not function. The PACS reliability can now be obtained from

$$\bar{F}_{sys}(t) = E[\psi(I(t))] = \Pr[\psi(I(t)) = 1]. \tag{20}$$

With $t$ suppressed, let $\psi_s$ and $\psi_p$ denote the structure functions of independent series subsystems and independent parallel subsystems, respectively. Then

$$\psi_s(I_1, I_2, \ldots, I_n) = \min\{I_1, I_2, \ldots, I_n\} = \prod_{j=1}^{n} I_j \quad \text{and} \quad \psi_p(I_1, I_2, \ldots, I_n) = \max\{I_1, I_2, \ldots, I_n\} = 1 - \prod_{j=1}^{n} (1 - I_j). \tag{21}$$

The structure function for the PACS in Figure 4 can now be expressed as

$$\psi(I) = \psi_p\big(\psi_s(\psi_p(I_1, I_2, I_3), I_4),\ \psi_s(\psi_p(I_5, I_6, I_7), I_8)\big). \tag{22}$$

Using either the minimal path sets or the minimal cut sets can usually simplify the expansion of (22) into a sum of mutually exclusive sets. A path set has a series structure function. It is minimal if $\psi_s = 1$ and if the removal of any element from it leads to $\psi_s = 0$. Similarly, a cut set has a parallel structure function. It is minimal if $\psi_p = 0$ and if any proper subset of it leads to $\psi_p = 1$. The PACS structure function is given by either the max over the series structure functions of all minimal path sets, or the min over the parallel structure functions of all minimal cut sets. All the numerical results that follow are carried out using the cut set method. The major challenge is the need to deal with large expressions, especially when the non-ideal coverage values are included.
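For a system of this size the expectation of the structure function can also be computed by brute-force enumeration of all subsystem states, which is a useful cross-check on a path set or cut set expansion. The sketch below implements the series, parallel, and PACS structure functions and sums over the 2^8 states. It assumes perfect coverage and independent subsystems; the function names and the numerical reliabilities in the example are hypothetical.

```python
from itertools import product

def psi_series(*indicators):
    """Series structure function: works only if every subsystem works."""
    return min(indicators)

def psi_parallel(*indicators):
    """Parallel structure function: works if at least one subsystem works."""
    return max(indicators)

def psi_pacs(I):
    """PACS structure function; I is a tuple (I1, ..., I8) of 0/1 values."""
    channel_1 = psi_series(psi_parallel(I[0], I[1], I[2]), I[3])
    channel_2 = psi_series(psi_parallel(I[4], I[5], I[6]), I[7])
    return psi_parallel(channel_1, channel_2)

def system_reliability(p):
    """E[psi(I)] by enumeration, where p[j] = Pr[I_j = 1] and the
    subsystems are independent."""
    total = 0.0
    for state in product((0, 1), repeat=len(p)):
        prob = 1.0
        for works, pj in zip(state, p):
            prob *= pj if works else 1.0 - pj
        total += prob * psi_pacs(state)
    return total

# Hypothetical subsystem reliabilities at some fixed time t: 0.9 for the
# triplicated subsystems, 0.8 for the subsystem in series with each triplet.
print(system_reliability([0.9, 0.9, 0.9, 0.8, 0.9, 0.9, 0.9, 0.8]))
```

To obtain the reliability as a function of time, each `p[j]` would be replaced by the reliability of subsystem j at time t, for instance an exponential or a Weibull reliability; non-ideal coverage is not modeled in this sketch.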

B. Numerical results

The parameters in the following computations are $\lambda = 10^{-5}$ /hour, $\theta = 2 \times 10^{5}$ hours, and $\beta = 2$. The values of coverage are indicated in the figures.

Figure 4 PACS with non-exponential subsystem time to failure


Figure 5 Effect of redundancy and coverage provided by control reconfiguration (horizontal axis: time in hours, 0 to 10 × 10^4; vertical axis: failure probabilities of PACS without maintenance, 0 to 1; curves: ci = 0 with redundancy, ci = 1 without redundancy, ci = 0.95 with redundancy, ci = 1 with redundancy)

Figure 5 shows the effect of redundancy on the PACS failure probability as a function of time when no maintenance is in place. It can be seen that the redundant pitch axis control system with perfect coverage results in the highest system reliability. As the value of coverage decreases, the system failure probability increases in such a way that, as long as coverage is imperfect, there is always a region at the beginning of the system’s life over which the redundant system performs worse than the non-redundant system (only one kind of control surface). When the coverage value is sufficiently low, the redundant system provides no advantage at any time. Therefore, redundancy is beneficial only if it can be managed with a high rate of success.

Figure 6 shows the effect of time between maintenance on the steady state system (PACS) failure probability for a regularly maintained system (maintained every $T$ hours; subsystems do not fail or age during maintenance). The failed subsystems are replaced by new subsystems during maintenance, whereas the working subsystems are not replaced, though they age and become more and more prone to failure. Consider the subsystem (actuator-surface) with Weibull time to failure distribution. After the kth maintenance (and before the (k+1)th), the time to failure distribution under the maintenance policy is given by

$$F_k(t) = F_{k-1}(T)\,F_0(t) + [1 - F_{k-1}(T)]\,\frac{F_{k-1}(T+t) - F_{k-1}(T)}{1 - F_{k-1}(T)} = F_{k-1}(T+t) - F_{k-1}(T)\,[1 - F_0(t)], \quad 0 \le t < T, \quad F_0(t) = F_\beta(t), \quad k = 1, 2, \ldots, m. \tag{23}$$

On the other hand, the subsystem with exponential time to failure distribution is always renewed at the maintenance. Again, the effect of coverage is evident.
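The maintenance recursion can be explored numerically with a short memoized function. The sketch below assumes the recursion has the form F_k(t) = F_{k-1}(T + t) - F_{k-1}(T) [1 - F_0(t)], which is a reading of the flattened equation (23) in this transcript rather than a confirmed statement of the authors' formula; the names `make_maintained_cdf`, `F0`, and the parameter values are illustrative.

```python
import math
from functools import lru_cache

def make_maintained_cdf(F0, T):
    """Return F(k, t): time to failure distribution after the kth
    maintenance, under the assumed recursion with F(0, t) = F0(t)."""
    @lru_cache(maxsize=None)
    def F(k, t):
        if k == 0:
            return F0(t)
        # A unit that failed in the previous interval is replaced by a new
        # one; a surviving unit keeps aging.
        return F(k - 1, T + t) - F(k - 1, T) * (1.0 - F0(t))
    return F

# Weibull unit with the parameter values quoted in the numerical results
# (theta = 2e5 hours, beta = 2) and a hypothetical maintenance interval.
theta, beta, T = 2e5, 2.0, 5e4
F_weibull = make_maintained_cdf(
    lambda t: 1.0 - math.exp(-((t / theta) ** beta)), T)
for k in range(4):
    print(k, F_weibull(k, T))   # failure probability over the next interval
```

For an exponential `F0` the recursion returns `F0` for every k, consistent with the remark that the exponential subsystem is effectively renewed at each maintenance; for the Weibull unit the failure probability over an interval grows after the first maintenance because surviving units are older.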


Figure 6 Effect of time between maintenance (horizontal axis: T, time between maintenance in hours, 0 to 5 × 10^4; vertical axis: failure probability at steady state; components replaced at maintenance only when found failed; curves: perfect coverage; cInterface = 0.99, cActuator-surface = 0.95; cInterface = 0.9, cActuator-surface = 0.85)

Since the system is non-repairable during its mission, the maintenance policy is neither age replacement (replace upon failure or at age $T$), nor block replacement (replace upon failure or at times $T, 2T, 3T, \ldots$). The two curves in Figure 7 represent, respectively, replacement of only the failed subsystems at the maintenance times ($kT$), and replacement of all subsystems at the maintenance times ($kT$). The latter policy (maintain to AGAN) allows a much longer time between maintenance before the system deteriorates to the same level of failure probability as under the former. For the parameter values used ($\lambda$, $\beta$, $\theta$), it is found that in order to maintain the system failure probability at $10^{-7}$ in the long run, subsystems need to be replaced every 5575 hours in operation.

Figure 7 Effect of maintenance policy (horizontal axis: T, time between maintenance in hours, 0 to 5 × 10^4; vertical axis: failure probability of PACS, 0 to 0.4; curves: failed components replaced at maintenance; all components as good as new after maintenance)


Acknowledgments

The authors of this paper thank the National Aeronautics and Space Administration and the United States Air Force Research Laboratory for their support of this work under contracts NCC-1-02009 and F30602-02-C-0225.

References

1 Patton, R. J., “Fault-tolerant control: the 1997 situation,” Proc. Safeprocess, 1997.
2 Kabore, R., and Wang, H., “Design of fault diagnosis filters and fault-tolerant control for a class of nonlinear systems,” IEEE Transactions on Automatic Control, vol. 46, 2001, pp. 1805-1810.
3 Zhou, K., and Zhang, R., “A new controller architecture for high performance, robust, and fault-tolerant control,” IEEE Transactions on Automatic Control, vol. 46, 2001, pp. 1613-1618.
4 Blanke, M., Frei, C., Kraus, F., Patton, R. J., and Staroswiecki, M., “Fault-tolerant control systems,” in Control of Complex Systems, K. Astrom, M. Blanke, A. Isidori, W. Shaufelberger, and R. Sanz (Eds.), Springer-Verlag, 2001, Chapter 8.
5 Wu, N. E., Zhou, K., and Salomon, G., “Reconfigurability in linear time-invariant systems,” Automatica, vol. 36, 2000, pp. 1767-1771.
6 Bodson, M., and Peterson, J., “Fast control allocation using spherical coordinates,” IEEE Conference on Decision and Control, 1999.
7 Frank, P. M., Editor, Advances in Control, Springer-Verlag, 1999.
8 Chen, J., and Patton, R., Robust Model-Based Fault Diagnosis for Dynamic Systems, Kluwer, 1998.
9 Gertler, J., Fault Detection and Diagnosis in Engineering Systems, Marcel Dekker, 1998.
10 Mangoubi, R., Robust Estimation and Failure Detection: A Concise Treatment, Springer-Verlag, 1998.
11 Banda, S. S., Editor, Special issue on reconfigurable flight control, International Journal of Robust and Nonlinear Control, vol. 9, 1999, pp. 999-1115.
12 Mahmoud, M., Jiang, J., and Zhang, Y., “Effects of fault detection and isolation to the stability of fault tolerant control systems,” Proc. American Control Conference, 2001.
13 Zhang, X., Polycarpou, M., and Parisini, T., “Integrated design of fault diagnosis and accommodation schemes for a class of nonlinear systems,” Proc. of the 40th IEEE Conference on Decision and Control, 2001.
14 Walker, B., “Fault tolerant control system reliability and performance prediction using semi-Markov models,” Proc. Safeprocess, 1997.
15 Wu, N. Eva, “Coverage in fault tolerant control,” Automatica, vol. 40, 2004, pp. 537-548.
16 Isermann, R., and Balle, P., “Trends in the application of model-based fault detection and diagnosis of technical processes,” Control Engineering Practice, vol. 5, 1997, pp. 709-719.
17 Butler, R. W., “The SURE approach to reliability analysis,” IEEE Trans. Reliability, vol. 41, 1992, pp. 210-218.
18 Chow, E. Y., and Willsky, A. S., “Analytical redundancy and the design of robust detection systems,” IEEE Transactions on Automatic Control, vol. 29, 1984, pp. 603-614.
19 Cassandras, C. G., and Lafortune, S., Introduction to Discrete Event Systems, Kluwer Academic Publishers, 1999.
20 Barlow, R. E., and Proschan, F., Statistical Theory of Reliability and Life Testing: Probability Models, Holt, Rinehart, and Winston, 1975.
21 Wu, N. E., “Reliability analysis of AFTI-F16 SRFCS using Assist and Sure,” Proc. American Control Conference, 2002.