
IEEE TRANSACTIONS ON RELIABILITY, VOL. 62, NO. 2, JUNE 2013 379

Evaluation and Comparison of Mixed Effects Model Based Prognosis for Hard Failure

Junbo Son, Qiang Zhou, Shiyu Zhou, Xiaofeng Mao, and Mutasim Salman, Senior Member, IEEE

Abstract—Failure prognosis plays an important role in effective condition-based maintenance. In this paper, we evaluate and compare the hard failure prediction accuracy of three types of prognostic methods that are based on mixed effect models: the degradation-signal based prognostic model with deterministic threshold (DSPM), with random threshold (RDSPM), and the joint prognostic model (JPM). In this work, the failure prediction performance is measured by the mean squared prediction error, and the power of prediction. We have analyzed characteristics of the three methods, and provided insights to the comparison results through both analytical study and extensive simulation. In addition, a case study using real data has been conducted to illustrate the comparison results as well.

Index Terms—Degradation signal, hard failure prognosis, mixed effects models, performance comparison.

ACRONYMS

RUL Remaining useful life

DSPM Degradation signal-based prognostic model with deterministic threshold

RDSPM Degradation signal-based prognostic model with random threshold

JPM Joint prognostic model

FP, FN False positive, false negative

MSE Mean squared error

AMSE Approximated mean squared error

NOTATION

True failure time for the $i$th unit

Failure time

Manuscript received July 30, 2012; revised November 28, 2012, December 16, 2012, January 28, 2013, and February 05, 2013; accepted February 05, 2013. Date of publication April 29, 2013; date of current version May 29, 2013. This work was supported by General Motors and the National Science Foundation under Grant 1161077. Associate Editor: E. Zio.

J. Son and S. Zhou are with the Department of Industrial and Systems Engineering, University of Wisconsin-Madison, WI, USA (e-mail: [email protected]; [email protected]).

Q. Zhou is with the Department of Systems Engineering and Engineering Management, City University of Hong Kong, Kowloon, Hong Kong (e-mail: [email protected]).

X. Mao and M. Salman are with the General Motors Research and Development (e-mail: [email protected]; [email protected]).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TR.2013.2259205

Mean value of

Censored time for the $i$th unit

Estimated failure time for the $i$th unit

Acceptable error range

Predicted remaining useful life at prediction time instant by using model A

Degradation signal at time instant for the $i$th unit

Random measurement noise

Vector of fixed-effect parameters

Vector of random-effects parameters for the $i$th unit

True degradation signal at time instant for the $i$th unit

Any general multivariate parametric distribution

Deterministic degradation signal threshold

Random degradation signal threshold

PDF of the random degradation signal threshold

CDF of failure time

Time instant when prediction is made

CDF of failure time for the $i$th unit obtained by using model A

Estimated CDF of failure time for the $i$th unit at prediction time instant by using model A

PDF of failure time for the $i$th unit obtained by using model A

Estimated PDF of failure time for the $i$th unit at prediction time instant by using model A

Survival function for the $i$th unit obtained by using model A

Estimated survival function for the $i$th unit at prediction time instant by using model A

Estimated mean RUL for the $i$th unit by using model A

Estimated median RUL for the $i$th unit by using model A

Vector of time-dependent regression functions

0018-9529/$31.00 © 2013 IEEE


Fig. 1. Two major types of prognosis methods. (a) Prognosis based on time-to-failure data; (b) prognosis based on degradation signals.

Vector of random coefficients for the mixed effects model

Hazard function for the $i$th unit

Baseline hazard function

Vector of time-invariant covariates for the $i$th unit

Vector of coefficients for time-invariant covariates

Vector of coefficients for the degradation signal

Expectation under the distribution of estimated remaining useful life

Expectation under the distribution of true remaining useful life

Prior distribution of random variable

Vector of population mean of random vector

Variance-covariance matrix of random vector

Variance of measurement noise

Sample size of historical database

Number of replications for Monte Carlo simulation

I. INTRODUCTION

Prediction for the remaining useful life (RUL) of an in-service unit plays a critical role in engineering practice. A significant amount of work exists in this field. Two types of data are often used in prognosis: (i) time-to-failure data as described in [1]–[6] and the references therein, and (ii) degradation signals as described in [7]–[13] and the references therein. Time-to-failure data are used to estimate the probability distribution of failure time, as shown in Fig. 1(a), and then used to predict the RUL. The prognosis based on time-to-failure data is more statistically rigorous when a large amount of time-to-failure data can be collected. However, the time-to-failure distribution obtained is for the entire population, rather than an individual unit. Thus, time-to-failure data are more suitable for system design optimization in terms of reliability performance, rather than prognosis on individual units.

The degradation signal is a signal selected as the indicator of the system degradation. In degradation signal based prognosis, a mathematical model is established to describe how the signal evolves, and then the RUL is determined based on the forecast of the degradation signal. This method was originally proposed for highly reliable units, because it is hard to get sufficient time-to-failure data for such units for traditional reliability analysis [7]. A popular degradation signal based method, denoted as the DSPM method, is to use a mixed effects model to describe the degradation path, and use a Bayesian updating approach to obtain the posterior distribution of the degradation signal of a specific unit, based on a prior distribution and the observed degradation path of that specific unit. The RUL is then calculated based on the posterior distribution. In this way, the prognosis for an individual unit can be achieved [8], [14]. In determining the RUL, the assumption of soft failure is required, as shown in Fig. 1(b). Soft failure refers to a failure defined as the degradation signal reaching a pre-specified threshold. However, in many situations, the soft failure assumption may not be valid. For example, an automotive battery fails when it cannot crank the engine and start it. It is known that a vital health indicator of the battery is its internal resistance, which is often used as the degradation signal to predict the RUL of the battery. However, the internal resistance level leading to cranking failure is not “crisp”. In other words, it is very difficult, and sometimes unreasonable, to define a constant threshold for it [15]–[17], [40]. In such cases, the failure is referred to as “hard failure”, meaning the component keeps working until it breaks down.

Despite its importance in practice, much less work exists for hard failure prognosis than for soft failure. One class of methods is to extend the conventional DSPM by using a random threshold [15]–[17], [40]. These methods were initially designed for prognosis on the population level. However, the method in [40] can be used for individual unit prognosis through a straightforward extension. We denote this method as RDSPM. Another recent method for hard failure prognosis is the joint prognostic model (JPM) based method [18]–[21]. The joint prognostic framework is composed of two stages: (i) the offline modeling stage using the historical data from a large number of units, and (ii) the online prognosis and model updating stage using the newly collected signal from an in-service unit. In the JPM, a probabilistic hazard structure that depends on the degradation signal is used to describe the failure mechanism. No threshold is assumed in this method.

Due to the popularity of DSPM, and the relatively limited work on hard failure prognosis, people tend to adopt the DSPM in practice even when the soft failure assumption does not exactly hold. Thus, it is highly desirable to evaluate under what conditions this practice is reasonable, and under what conditions exact hard failure models should be used. However, to the best of our knowledge, a comprehensive evaluation and comparison of the prognosis performance for hard failures is not available in the current literature. The goal of this paper is to provide such systematic evaluation and comparison for the three methods, DSPM, RDSPM, and JPM.

Because prognosis is relatively under-developed compared to diagnosis, more of the focus in prognosis has been on developing new modeling schemes and methods [22], rather than on performance evaluation and comparison. A few performance analysis papers for prognostic methods are available in the survival analysis field in biostatistics [23]–[27]. In those works, the most common ways to evaluate the prognostic model are the receiver operating characteristic (ROC) curve, and the corresponding area under the curve (AUC), yet their focus is not evaluating the RUL prediction accuracy. Rather, those works focus on determining the degradation signals’ significance [23], [24], [27]. The comparison studies are conducted using various types of different signals and their combinations, not among different models. Our work is different because our primary interest is not comparing different signals but evaluating and comparing different prognostic models in terms of their prediction accuracy. In the engineering field, several quantitative metrics have been proposed to evaluate the performance of prognostic algorithms [22]. To make the comparison relevant to engineering practice, appropriate performance metrics should be used. In this work, we select the mean squared error (MSE) of the predicted RUL and the false positive and false negative rate (FP/FN) as the prediction metrics. The definition of MSE is straightforward as given in (1), and it is one of the most widely used performance metrics in engineering [28].

(1)

The definitions of FP/FN rate and a relevant term, Power, are given in (2).

(2)

These probabilities should be evaluated under the true RUL distribution. Fig. 2 illustrates this definition graphically. We use the FP/FN rate as a performance metric because it is highly relevant to engineering practice. In reality, it is meaningless to predict the occurrence of the failure at an exact time point, because the probability of failure at any exact time point is always zero. Thus, the prediction of failure is often reported as a time interval in practice. The definition of FP/FN rate captures the characteristics of such interval prediction, and evaluates the probability of the correctness of the prediction. As shown in Fig. 2, the FN rate is actually the misdetection rate, while the FP rate is the false alarm rate. These two probabilities are critical to customer satisfaction and the maintenance cost. The MSE and FP/FN rate have very different characteristics. The MSE purely depends on the point estimate of the RUL, while the power depends on the shape of the true RUL distribution. In this paper, both metrics will be used in the performance evaluation and comparison.

Fig. 2. Illustrative definition of FP/FN.

There are other performance metrics recently developed for prognostic models such as the $\alpha$-$\lambda$ performance index. The $\alpha$-$\lambda$ index is defined as a binary metric that evaluates whether the prediction accuracy at a specific time instant falls within a specified $\alpha$-bound [22]. This metric has been used for visual evaluation of the prediction accuracy. Despite its intuitive interpretation, it is not suitable for evaluating multiple units. For example, we need to show multiple $\alpha$-$\lambda$ plots to illustrate the prediction accuracy for different units [39]. The prediction power used in this paper is in fact conceptually similar to the $\alpha$-$\lambda$ index, but can be applied more easily to multiple units. Besides the $\alpha$-$\lambda$ index, other recently proposed performance metrics are the RUL precision index, the RUL accuracy-precision index, and the RUL on-line steadiness index, etc. [35]. The first two can be obtained from the RUL confidence intervals computed at the prediction time instant, and the last one can be obtained from the standard error or the variance of the RUL estimate. In this study, the standard deviation and the confidence intervals of the RUL estimates will be provided. Thus, these metrics can be easily obtained as well.

The rest of this paper is organized as follows. Section II addresses the detailed modeling procedure and structure of the three models, DSPM, RDSPM, and JPM. An analytical study is conducted in Section III to obtain useful insights, and an extensive simulation comparison with more complicated models follows in Section IV. Section V discusses how the methods compare under failure models other than hard failure. A case study using real data further illustrates the comparison results in Section VI. Finally, Section VII summarizes and concludes the paper.

II. REVIEW OF DSPM, RDSPM, AND JPM

A. Degradation Signal Based Prognostic Model With Fixed Threshold (DSPM)

The modeling and parameter estimation for the DSPM have been well-studied [7]. In this method, the degradation signal path for the $i$th unit is often expressed as a mixed effect model

(3)

where the vector of fixed-effect parameters is assumed to be the same for all units, and the vector of random-effects parameters varies from unit to unit. In this way, the model incorporates both the common characteristics among the population, and the individual characteristics for each unit. The random-effects vector is assumed to follow a multivariate normal distribution with unknown parameters to be estimated; however, any general multivariate parametric distribution can be used as well. The measurement error is often assumed to follow a normal distribution with mean zero, and it can also be assumed to be generated by other stochastic processes such as Brownian motion [8]. The distribution of the failure time is

(4)

In most cases, the distribution function of the failure time does not have a closed form due to the complexity of the degradation model and the random-effects distribution. However, under a simple degradation model and a relatively simple parametric probability density function for the random-effect parameters, a closed form can be obtained. The mean RUL can be geometrically described as the area under the survival curve. The survival function (also called the reliability function), defined as one minus the cumulative distribution function of the failure time, can be easily obtained from that cumulative distribution. By integrating the survival function with respect to time, the mean RUL from the DSPM can be estimated. The detailed mathematics is shown in Section III for a linear degradation model.
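As a minimal numerical sketch of the area-under-the-survival-curve idea (not the authors' implementation), the following Python snippet integrates a conditional survival curve on a time grid. The function name `mean_rul_from_cdf`, the grid, and the exponential test distribution are illustrative assumptions that do not appear in the paper.

```python
import numpy as np

def mean_rul_from_cdf(times, cdf, t_star=0.0):
    """Approximate the mean residual life at t_star as the area under
    the survival curve, conditional on survival up to t_star."""
    times = np.asarray(times, dtype=float)
    survival = 1.0 - np.asarray(cdf, dtype=float)   # S(t) = 1 - F(t)
    keep = times >= t_star                          # discard the past
    t, s = times[keep], survival[keep]
    conditional = s / s[0]                          # S(t) / S(t_star)
    # Trapezoidal rule written out explicitly
    area = np.sum(0.5 * (conditional[1:] + conditional[:-1]) * np.diff(t))
    return float(area)

# Hypothetical check case: exponential failure time with rate 0.1 (mean life 10)
rate = 0.1
grid = np.linspace(0.0, 200.0, 20001)
exp_cdf = 1.0 - np.exp(-rate * grid)
mean_life = mean_rul_from_cdf(grid, exp_cdf)              # prediction at time 0
mean_rul_50 = mean_rul_from_cdf(grid, exp_cdf, t_star=50.0)
```

Because the exponential distribution is memoryless, both values should be close to 10, which makes it a convenient sanity check before plugging in a model-based CDF.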

B. Degradation Signal Based Prognostic Model With Random Threshold (RDSPM)

In this method, it is assumed that the threshold is a random variable following a certain distribution. Here, the failure threshold is assumed to follow a Gamma distribution, and is estimated from historical data. Under this assumption, we can formulate the failure time, often referred to as the first hitting time, as

(5)

For the RUL prediction at the prediction time instant, we need to truncate the distribution at that instant. This truncation can be done by conditioning on the unit having survived up to the prediction time, and adjusting the domain of integration accordingly. Further extending (5), the CDF of the RDSPM for the $i$th unit at the prediction time can be expressed as (6)

(6)

Because the survival function is one minus the CDF, the mean RUL at the prediction time instant can be obtained as

(7)

Also, the median residual life can be obtained as the time at which this survival function equals 0.5.
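The first hitting time under a random threshold can be illustrated with a small Monte Carlo sketch. It assumes, purely for illustration, a noise-free linear degradation path with a known slope and a Gamma-distributed threshold; the function names and all numerical values below are hypothetical and are not taken from the paper.

```python
import numpy as np

def simulate_hitting_times(beta0, slope, shape, scale, n_draws, seed=0):
    """Draw Gamma thresholds and return the times at which the linear
    path beta0 + slope * t first reaches each threshold."""
    rng = np.random.default_rng(seed)
    thresholds = rng.gamma(shape, scale, size=n_draws)
    times = (thresholds - beta0) / slope
    # A threshold below the initial level means failure at time zero
    return np.maximum(times, 0.0)

def rul_summary(hitting_times, t_star):
    """Mean and median residual life among draws that survive past t_star
    (the truncation step); assumes at least one draw survives."""
    surviving = hitting_times[hitting_times > t_star]
    residual = surviving - t_star
    return float(residual.mean()), float(np.median(residual))

# Hypothetical setting: mean threshold 25 * 0.4 = 10, slope 0.1, so mean life near 100
hitting = simulate_hitting_times(beta0=0.0, slope=0.1, shape=25.0,
                                 scale=0.4, n_draws=20000, seed=1)
mean_rul, median_rul = rul_summary(hitting, t_star=50.0)
```

Sampling sidesteps the extra layer of integration over the threshold, at the cost of Monte Carlo error that shrinks with the number of draws.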

C. Joint Prognostic Model (JPM)

The JPM estimates the RUL distribution for individual units using both the time-to-failure data and the degradation signals. It consists of two parts. The first part is a mixed-effect degradation signal model,

(8)

With this degradation path model, we can establish the model for time-to-failure data as the second part,

(9)

where the hazard function is mathematically defined as the limiting conditional probability of failure per unit time, given survival up to the current time. This model can be viewed as an extension of the popular Cox proportional hazard model [32] that is widely used in reliability analysis [6]. Note that the baseline hazard function and the two coefficient vectors are the same for different units from the same population. They represent population characteristics, while the individual-specific information is contained in the time-invariant covariates and the degradation signal of each unit. In this way, the degradation signals and the time-to-failure data are integrated. The detailed parameter estimation procedures are given in the Appendix. Given the hazard function, the system survival function can be obtained as

(10)

Once we have the survival function, the mean RUL estimation procedure is the same as that of the DSPM. An illustration is given in the following section, with a linear degradation path.
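The standard relation between a hazard function and its survival function, S(t) = exp(-∫ h(u) du), can be sketched numerically as follows. This is an illustrative helper rather than the paper's procedure; the function name and the linearly increasing hazard used as an example are assumptions.

```python
import numpy as np

def survival_from_hazard(times, hazard):
    """Survival curve S(t) = exp(-cumulative hazard), with the cumulative
    hazard accumulated by the trapezoidal rule on the given grid."""
    times = np.asarray(times, dtype=float)
    hazard = np.asarray(hazard, dtype=float)
    increments = 0.5 * (hazard[1:] + hazard[:-1]) * np.diff(times)
    cumulative = np.concatenate(([0.0], np.cumsum(increments)))
    return np.exp(-cumulative)

# Hypothetical hazard that grows linearly in time: h(t) = 0.002 * t
grid = np.linspace(0.0, 50.0, 5001)
linear_hazard = 0.002 * grid
surv = survival_from_hazard(grid, linear_hazard)
```

For this hazard the cumulative hazard is 0.001 t², so the output can be compared directly with exp(-0.001 t²); a degradation-dependent hazard would be passed in the same way once evaluated on the grid.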

III. PERFORMANCE EVALUATION AND COMPARISON UNDER LINEAR DEGRADATION MODELS

In this section, we shall investigate the failure prediction accuracy of the DSPM and the JPM. Compared with the DSPM, the RDSPM has one more layer of integration with respect to the threshold, making it analytically intractable. Hence, we leave the detailed discussion of the RDSPM to the numerical study section.

As a first step, we analyze and compare JPM and DSPM to see how both models behave with respect to different parameter levels. In this study, the degradation signal is assumed to be a mixed effects model with linear signal propagation, as shown in (11).

(11)

The random coefficient varies from unit to unit, and it is assumed to follow a normal distribution; the measurement noise follows a zero-mean normal distribution. This assumption is common in the literature [7], [8]. The linear degradation signal is simple, yet it is still meaningful for both research and real-life applications. It has been used in many applications [8], [30]; furthermore, the linear degradation signal model can handle the exponential degradation signal with a logarithm transformation [8].

With this linear degradation model, the hazard function of the system is assumed to be

(12)

In this model, the baseline hazard is assumed to be linear with a single baseline coefficient. Under (11) and (12), we can derive the survival function at the prediction time instant as

(13)

and the probability density function of the residual life is

(14)

where . The derivation of (14) is straightforward, and is omitted here. Assume we have an RUL prediction for the $i$th unit at the prediction time. Then, for the $i$th unit, the prediction power and the MSE of the prediction can be obtained based on (1) and (2) as

(15)

and

(16)

Note that, because of the observation noise, the RUL prediction is a random variable. In (15), the inner probability is taken under the true distribution of the RUL, while the outer expectation is taken under the distribution of the predicted RUL. In (16), we need to take the expectation over both the true RUL and the predicted RUL. A simple derivation shows that the MSE can be decomposed into two terms. Note that the second term is determined solely by the characteristics of the system, and not by the accuracy of prediction. Thus, we may drop the second term, and simplify (16) to the approximated MSE (AMSE) expressed below.

(17)
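The decomposition described above matches the usual bias-variance split. The notation here is assumed, since the paper's symbols are lost in this transcript: write $T$ for the true RUL, $\hat{T}$ for the predicted RUL, and $\mu_T = E[T]$, and assume that, for a given unit, the prediction noise is independent of $T$. A sketch consistent with the text, though not necessarily the exact form of the original equations, is:

```latex
\begin{align*}
E\big[(T-\hat{T})^2\big]
  &= E\big[(\mu_T-\hat{T})^2\big]
     + 2\,E\big[(T-\mu_T)\big]\,E\big[(\mu_T-\hat{T})\big]
     + E\big[(T-\mu_T)^2\big] \\
  &= \underbrace{E\big[(\hat{T}-\mu_T)^2\big]}_{\text{depends on the prediction}}
     \;+\; \underbrace{\operatorname{Var}(T)}_{\text{depends only on the system}} .
\end{align*}
```

The cross term vanishes because $E[T-\mu_T]=0$. Dropping $\operatorname{Var}(T)$ leaves the first term, which plays the role of the AMSE; when $\hat{T}$ is treated as fixed it reduces to $(\hat{T}-\mu_T)^2$.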

In our study, we also noticed that the impact of the randomness of the predicted RUL is not significant when the observation noise is moderate. Thus, the expectation over the predicted RUL can be further dropped for both AMSE in (17) and power in (15) by treating the predicted RUL as a fixed value. By dropping this expectation, the number of iterations needed to compute AMSE and power is reduced significantly. In the following studies, AMSE and the power without the expectation over the predicted RUL are used to save computation time. Equations (15) and (17) without this expectation give the prediction power and AMSE for a specific unit. To obtain the overall performance, we need to take the expectation over the entire population

(18)

(19)

Note that the outer expectation is taken over the distribution of the random coefficient parameter.
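The per-unit metrics and their population average can be sketched with sample-based estimates. The snippet assumes that the prediction counts as correct when the true RUL falls within the acceptable error range on either side of the point prediction, and that the AMSE is the squared gap between the prediction and the mean true RUL. Both are readings of the text above, not the paper's exact equations, and all names and numbers are hypothetical.

```python
import numpy as np

def prediction_power(true_rul_samples, predicted_rul, delta):
    """Fraction of true-RUL draws inside [predicted - delta, predicted + delta]."""
    samples = np.asarray(true_rul_samples, dtype=float)
    inside = np.abs(samples - predicted_rul) <= delta
    return float(inside.mean())

def amse(true_rul_samples, predicted_rul):
    """Squared gap between a fixed prediction and the mean of the true RUL."""
    return float((predicted_rul - np.mean(true_rul_samples)) ** 2)

def population_average(per_unit_values):
    """Average a per-unit metric over units drawn from the population."""
    return float(np.mean(per_unit_values))

# Hypothetical draws of the true RUL for one unit, and a point prediction of 100
draws = [90.0, 95.0, 100.0, 105.0, 110.0, 130.0]
unit_power = prediction_power(draws, predicted_rul=100.0, delta=10.0)
unit_amse = amse(draws, predicted_rul=100.0)
overall_power = population_average([unit_power, 0.5])  # second unit is made up
```

In a full study the list passed to `population_average` would hold one value per simulated unit, with each unit's random coefficient drawn from the population distribution.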

A. Residual Life Prediction for the DSPM

Given observations of the degradation path up to the prediction time, the mean RUL prediction at that time based on the degradation model is given as

(20)

for , and the subscript represents the DSPM.

To predict the mean RUL, we need to predict the future degradation signal path after the prediction time. The estimation procedure for the parameters of the mixed effects model in (11) has been well-studied [20], [21], [30]. In this section, we assume that a large amount of historical data is available, and we can get accurate estimates of the population parameters in this model, including the fixed parameter, the distribution of the random coefficient with its own mean and variance parameters, and the noise variance. The estimation errors in those terms are hence ignored here [33]. Thus, to predict the degradation signal path, we only need an estimate of the random coefficient value for the specific $i$th unit, and to determine the threshold value.

When the DSPM needs to be applied without a pre-specified threshold, the threshold value is often derived from the median life of the population [15]. In this study, we will follow the same way. The estimation of the value of is more involved. Unlike the estimation procedure for , the needs to be estimated by the restricted maximum likelihood (REML) approach [30]. After obtaining all the population-wise estimates, we need to update the parameters. The Bayesian updating method has been a popular method to estimate the value of the random coefficient given the observations of degradation and its population distribution as prior. Using the Bayesian method, we first obtain the posterior distribution of the random coefficient as

(21)

where the likelihood term is the density function of the observations given the random coefficient, and is given as a normal distribution. After multiple steps of algebra as in [8], we can find the posterior distribution of the random coefficient as a normal distribution, where

(22)

with . Using the posterior distribution of the random coefficient, we can get the predicted cumulative RUL distribution for unit $i$ as

(23)

where , and is the cumulative standard normal distribution. With (23), we can obtain the predicted probability density function of the RUL and the predicted survival function for unit $i$ as below; the derivations are shown in the Appendix.

(24)

(25)

Given the predicted survival function, the expected mean RUL and the expected median RUL of the unit can be obtained. In practice, both of them have been used as the prediction of RUL for the unit [2], [31]. Note that for a given unit with observations of its degradation signal until the prediction time, both estimates are fixed numbers. However, due to the measurement noise, even for the same unit, the observed degradation signals will be different in different replications. Thus, the two estimates are in fact random variables. If the measurement noise is small, as in most practical cases, the variation will be small. Using the RUL estimate, the performance metrics can be computed.
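The Bayesian updating step can be sketched with the standard conjugate result for a normal prior and normal noise. The snippet assumes a linear path of the form signal = beta0 + b * t + noise with a known intercept `beta0`, a normal prior on the slope `b`, and independent zero-mean normal noise. This form, the function name, and the prior standard deviation of 0.01 are assumptions made for illustration; only the prior mean of 0.1 and the noise standard deviation of 0.001 echo values mentioned in this paper.

```python
import numpy as np

def update_slope_posterior(times, signals, beta0, prior_mean, prior_var, noise_var):
    """Posterior mean and variance of the random slope b under a normal prior
    and normal noise (standard conjugate normal-normal update)."""
    times = np.asarray(times, dtype=float)
    signals = np.asarray(signals, dtype=float)
    # Precisions add: prior precision plus the information in the observations
    precision = 1.0 / prior_var + np.sum(times ** 2) / noise_var
    post_var = 1.0 / precision
    post_mean = post_var * (prior_mean / prior_var
                            + np.sum(times * (signals - beta0)) / noise_var)
    return float(post_mean), float(post_var)

# Hypothetical unit whose true slope (0.105) differs from the population mean (0.1)
obs_times = np.arange(1, 21)
obs_signal = 0.0 + 0.105 * obs_times      # noise-free observations for reproducibility
post_mean, post_var = update_slope_posterior(
    obs_times, obs_signal, beta0=0.0,
    prior_mean=0.1, prior_var=0.01 ** 2, noise_var=0.001 ** 2)
```

With small measurement noise the observations dominate the prior, so the posterior mean moves almost all the way to the unit's own slope and the posterior variance shrinks well below the prior variance.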

B. Residual Life Prediction for the JPM

Given observations of the degradation path, the predicted hazard function under the JPM at the prediction time instant is given as

(26)

where the expectation is taken under the posterior distribution of the random coefficient, which can be obtained using the same Bayesian updating procedure as that described in Section III-A. Because the posterior distribution of the random coefficient is a normal distribution, the exponential term in the hazard function follows a lognormal distribution. Based on this result, the expected value of the hazard rate function can be re-formulated as

(27)

Using this predicted hazard rate, we can get the survival function as

(28)

where the subscript indicates that the predicted survival function is under the JPM.

Because the density function of the RUL is the product of the hazard function and the survival function, we can get the prediction of the density function under the JPM framework as

(29)

Given the predicted survival function, the expected mean RUL and the expected median RUL of the unit can be obtained in the same way as for the DSPM.
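The lognormal step used for the expected hazard can be checked numerically. The sketch assumes a Cox-type hazard whose degradation-dependent factor is exp(gamma * b * t), with b normally distributed under the posterior; the coefficient name `gamma` and the numerical values are hypothetical. The closed form is the standard mean of a lognormal variable, and Gauss-Hermite quadrature serves only as a cross-check.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def expected_exp_closed_form(gamma, t, post_mean, post_var):
    """E[exp(gamma * b * t)] for b ~ Normal(post_mean, post_var),
    using the mean of a lognormal random variable."""
    return float(np.exp(gamma * t * post_mean
                        + 0.5 * (gamma * t) ** 2 * post_var))

def expected_exp_quadrature(gamma, t, post_mean, post_var, order=40):
    """Same expectation by Gauss-Hermite quadrature, as a cross-check."""
    nodes, weights = hermegauss(order)   # weight function exp(-x**2 / 2)
    b_values = post_mean + np.sqrt(post_var) * nodes
    integrand = np.exp(gamma * t * b_values)
    return float(np.sum(weights * integrand) / np.sqrt(2.0 * np.pi))

# Hypothetical values for the signal coefficient, time, and slope posterior
gamma, t = 0.5, 30.0
post_mean, post_var = 0.1, 1e-4
closed = expected_exp_closed_form(gamma, t, post_mean, post_var)
check = expected_exp_quadrature(gamma, t, post_mean, post_var)
```

Multiplying this factor by the baseline hazard gives the expected hazard at each time, which can then be integrated to obtain the predicted survival function.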

C. Comparison and Analysis of Prediction Accuracy

In this section, we will evaluate and compare the prediction accuracy of the DSPM and the JPM. As mentioned in the previous section, both the expected mean RUL and the expected median RUL could be used as point estimates of the RUL. Here we will use the median RUL from each model as the expected RUL prediction for its simplicity. In this study, we set the baseline hazard coefficient as , the acceptable error range as 10, and the standard deviation of the measurement error as 0.001. is assumed to follow , and the fixed parameter is . From the prediction method introduced above, we can see that the parameters which affect the performance of each prognostic model are the time instant of the prediction, the variance of the random parameter, and the measurement noise level. The time of prediction determines how much information the model can get from observations. The variance of the random parameter indicates how each individual differs from others. The measurement noise level influences the accuracy of the Bayesian update algorithm. We will investigate these parameters and see how the JPM and DSPM perform. It is worth mentioning that we have conducted many case studies under various parameter settings besides this specific setting. In all those cases, we found the conclusions to be consistent. Thus, this parameter set is used as an illustrative example.

1) Influence of Prediction Time: Table I shows the performance evaluation results for these two expected RUL prediction methods. The power and the AMSE are computed and averaged using (18) and (19), and the median RUL estimates from the two models are used as the expected RUL predictions for the DSPM and JPM, respectively.

TABLE I. COMPARISON UNDER THE SIMPLE MODEL STRUCTURE WITH DIFFERENT LEVELS OF PREDICTION TIME

TABLE II. COMPARISON OF PERFORMANCE FOR A SPECIFIC UNIT

From Table I, we can find that the prediction accuracy gets better at the later stages of the unit’s life, which is not surprising because a later prediction means more observations on the degradation path, and hence more information. We can also see that the performances of JPM and DSPM for this simple model are comparable, although JPM is slightly better. According to Table I, the variance of the prediction from DSPM is larger than that of the JPM, which means the performance of the DSPM varies more among different units. On the other hand, the performance of the JPM is more robust.

2) Influence of the Variance of Random Parameter:

Larger values of the variance mean that the random parameter has a larger variation. In this section, all the comparisons are made under the condition of . Table II shows three different cases, and the corresponding performances of the two methods. The mean value of the random parameter is set to 0.1. Both 0.095 and 0.105 are within the range.

As we can see from Table II, when the parameter of the specific in-service unit deviates from its mean value, the AMSE of DSPM gets significantly larger. On the other hand, the JPM is quite stable in all three cases in terms of the AMSE. This result indicates that the variance of the random parameter has a negative relationship with the performance of the prognostic models, especially for the DSPM. Table III further confirms this observation.

Focusing on the last two rows of Table III, we can see the obvious deteriorating trend of the DSPM performance. The AMSE increases remarkably as the variance in increases. For the JPM, the performance is relatively stable. This phenomenon can be explained by the definition of . The is estimated by ; thus the DSPM can estimate the RUL accurately when the th in-service unit behaves similarly to the population average.

3) Influence of the Measurement Noise : As mentioned before, in practice, the measurement error of the sensory device is reasonably small. Normally, it has no significant impact on the prediction accuracy. However, to make the comparison more complete, it is necessary to take a closer look at the measurement noise and its impact on both models. The updating algorithm depends on the observations of the degradation signal path; therefore, if the measurements are corrupted or not accurate, the overall prediction performance will be affected. Table IV summarizes the comparison results against different levels of measurement noise. Note that unreasonably large measurement error is not realistic. Both models perform worse as the measurement noise gets larger. Interestingly, the increase in AMSE for the JPM is larger than that for the DSPM. This result suggests that the JPM is more sensitive to the measurement noise. Please note that the AMSE and power reported in Table IV have been obtained without omitting in (15) and (17) because cannot be ignored for the analysis of the measurement noise level.
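To illustrate why the prediction time and the measurement noise both matter for the updating step, the following minimal sketch applies a conjugate normal update to a single random slope in a linear signal model. It is an illustration under stated assumptions, not the updating algorithm used in this study: the function names, the known zero intercept, the prior standard deviation of 0.005, and the observation times are hypothetical choices; only the mean of 0.1, the unit value 0.105, and the noise levels 0.001 and 0.1 echo settings mentioned in the text.

```python
import random


def update_slope(times, signals, intercept, prior_mean, prior_var, noise_sd):
    """Conjugate normal update for the slope b in y = intercept + b*t + noise.

    Assumes b ~ N(prior_mean, prior_var) and independent N(0, noise_sd^2) noise.
    Returns the posterior mean and posterior variance of b.
    """
    noise_var = noise_sd ** 2
    precision = 1.0 / prior_var + sum(t * t for t in times) / noise_var
    weighted = prior_mean / prior_var + sum(
        t * (y - intercept) for t, y in zip(times, signals)
    ) / noise_var
    return weighted / precision, 1.0 / precision


def simulate_signals(times, intercept, slope, noise_sd, rng):
    """Noisy observations of a linear degradation path (hypothetical helper)."""
    return [intercept + slope * t + rng.gauss(0.0, noise_sd) for t in times]


if __name__ == "__main__":
    rng = random.Random(0)
    times = [1.0, 2.0, 3.0, 4.0, 5.0]   # hypothetical observation times
    true_slope = 0.105                   # unit deviates from the mean of 0.1
    for noise_sd in (0.001, 0.1):
        y = simulate_signals(times, 0.0, true_slope, noise_sd, rng)
        mean, var = update_slope(times, y, 0.0, 0.1, 0.005 ** 2, noise_sd)
        print(f"noise_sd={noise_sd}: posterior mean={mean:.5f}, sd={var ** 0.5:.5f}")
```

With the small noise level the posterior mean moves almost entirely to the unit's own slope, whereas with the large noise level it stays close to the population mean, which is the qualitative effect discussed above. Adding later observation times increases the posterior precision in the same way.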


TABLE III
COMPARISON UNDER THE LINEAR MODEL STRUCTURE WITH DIFFERENT LEVELS OF

TABLE IV
COMPARISON UNDER THE SIMPLE MODEL STRUCTURE WITH DIFFERENT LEVEL OF

IV. NUMERICAL STUDIES FOR QUADRATIC DEGRADATION MODELS

If the degradation signals and the hazard function do not have linear forms, it is difficult to obtain a closed-form estimation of the survival function. In this section, we conduct a series of simulation studies to evaluate and compare the prediction accuracy using a more general quadratic degradation model structure.

A. Model Structure and Simulation Procedure

In this study, we assumed that the true degradation signal path is modeled as a mixed-effects model given as

(30)

where is assumed to follow a multivariate normal distribution . Besides the degradation signal, the hazard function is assumed in a more complex form as

(31)

where is the Weibull baseline hazard function, and is a time invariant covariate for the th unit which is assumed to be either 0 or 1, indicating two different groups of units. In reality, the parametric form of the true degradation signal model is usually unknown. Thus, we use a quadratic polynomial form to model the degradation path instead, as

(32)

and we assume that the hazard function is in the same form as the true model, as

(33)

In this simulation study, we set the true parameters as in Table V. Note that some parameters have been assigned small values to ensure that units will survive for a reasonably long time. For simplicity, we shall indicate all the population and individual parameters for the th unit as a vector .

Please note that we have assumed a quadratic signal path model, while the true path model is not of that form. As we can see from Table V, we have set the value to zero. This setting indicates that the true failure time is not affected by , and is mainly dependent on the degradation signal with a strong positive relationship.

In the simulation study, we have conducted the comparisons of DSPM, RDSPM, and JPM methods. All the true parameters are unknown to the three models, so they have to be estimated. This condition makes the numerical study more realistic.

The simulation consists of two parts. The first part is to generate a historical sample of units with degradation signals and failure times based on the assumed true model, and then the population parameters in (32) and (33) are estimated based on the historical data. The key steps of the simulation are summarized in Tables VI and VII. Based on the simulated historical data, we can estimate the population parameters. Also, step [g] in Table VII is omitted because we have decided to omit in (15) and (17), except for the analysis of measurement noise level.

TABLE V
TRUE PARAMETERS FOR THE SIMULATION STUDY

TABLE VI
SIMULATION ALGORITHM FOR GENERATING HISTORICAL DATA (PART 1)

In this study, we use the two-stage method for estimating the parameters for DSPM adopted in [7], [8], and the population parameter estimation technique for the JPM that can be found in [21]. For the RDSPM, we have an additional estimation step for the threshold distribution, which is assumed to be the Gamma distribution. Using historical failure records, we can fit the data to the Gamma distribution [33]. In the second part of the simulation, we first generate multiple instances of degradation signals and failure times according to the true model for unit . After that, we make the RUL prediction using the available degradation signals up to time . Then the performance metrics are calculated. The detailed simulation procedure is shown in Table VII. The subscript represents the unit for the prediction.
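The data-generation step described above can be sketched as follows. This is a hedged illustration, not the procedure of Tables VI and VII: it uses the quadratic working form of the signal path as a stand-in for the true path in (30), assumes independent normal random effects, and assumes a hazard given by a Weibull baseline multiplied by exp(gamma*w + beta*signal). All parameter values, the grid step, the censoring horizon, and the function names are hypothetical.

```python
import math
import random


def signal_path(t, b):
    """Quadratic working form of the degradation path (an assumption)."""
    return b[0] + b[1] * t + b[2] * t * t


def hazard(t, b, w, lam, k, gamma, beta):
    """Assumed hazard: Weibull baseline times exp(gamma*w + beta*signal)."""
    baseline = lam * k * t ** (k - 1.0)
    return baseline * math.exp(gamma * w + beta * signal_path(t, b))


def sample_failure_time(b, w, rng, lam, k, gamma, beta, dt=0.05, t_max=300.0):
    """Inverse-transform sampling: first grid time T with H(T) >= -ln(U)."""
    target = -math.log(1.0 - rng.random())
    cumulative, t = 0.0, 0.0
    while t < t_max:
        # Midpoint rule for the cumulative hazard over [t, t + dt].
        cumulative += hazard(t + 0.5 * dt, b, w, lam, k, gamma, beta) * dt
        t += dt
        if cumulative >= target:
            return t, True
    return t_max, False  # no failure before t_max: treated as censored


def simulate_unit(rng, mean_b, sd_b, noise_sd, lam, k, gamma, beta):
    """One historical unit: covariate, event time, and noisy signal readings."""
    b = [rng.gauss(m, s) for m, s in zip(mean_b, sd_b)]
    w = rng.randint(0, 1)  # group indicator, 0 or 1
    event_time, failed = sample_failure_time(b, w, rng, lam, k, gamma, beta)
    # Unit-spaced observation times strictly before the event time.
    times = [float(j) for j in range(int(math.ceil(event_time)))]
    signals = [signal_path(t, b) + rng.gauss(0.0, noise_sd) for t in times]
    return {"w": w, "event_time": event_time, "failed": failed,
            "times": times, "signals": signals}


if __name__ == "__main__":
    rng = random.Random(1)
    settings = dict(mean_b=(1.0, 0.05, 0.001), sd_b=(0.1, 0.005, 0.0001),
                    noise_sd=0.01, lam=1e-4, k=1.5, gamma=0.0, beta=0.5)
    for unit in (simulate_unit(rng, **settings) for _ in range(3)):
        print(unit["w"], round(unit["event_time"], 2), len(unit["signals"]))
```

A historical dataset would be a list of such units, from which the population parameters are then estimated. Sampling the failure time through the cumulative hazard keeps the dependence on the signal explicit, at the cost of a grid approximation controlled by `dt`.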

B. Comparison Results

1) Performance Evaluation and Comparison With Large Sample Size: We used , and . In statistical inference, a sample of size 500 is often considered large, and the variation due to the sample uncertainty will be small. Thus, in this simulation, we run part 1 of the simulation only once. Then the estimated DSPM, RDSPM, and JPM based on the historical data are used to evaluate the prediction accuracy.

Table VIII shows the performances at different time points of prediction. At , the models have no information about the particular unit . Therefore, their performance reflects only the population behavior extracted from the historical data. Once individual specific information has been obtained, the performance gets better. It is interesting to note that at the early prediction stage, such as , or 20, the mean prediction power of the JPM is worse than that of DSPM, but it gets significantly better at the later stage. This result is because the JPM has an exponential structure in the hazard function, and hence it is more vulnerable to the accuracy of the parameter estimates, which depends largely on the amount of observation data. We also note that, although the RDSPM considers the randomness of the threshold, it performs similarly to the DSPM, and worse than the JPM.

We also investigated the influence due to the variance of the random coefficient and measurement noise. To analyze the impact of the variance of random coefficient , we have increased it by multiplying it by a scaling constant . The results are obtained by changing from 1 to 3 where . For the measurement noise, we chose three different levels as well: 0.001, 0.01, and 0.1. Note that the measurement error of 0.01 is the same as that in Table VIII. The prediction time is fixed at 40. Tables IX and X summarize the results, respectively.

Based on Table IX, we can see that the variance affects the overall performance, and the DSPM and RDSPM are more vulnerable to large variance. The AMSE from those two models consistently increases as the variance increases.

According to Table X, the performance of all three models gets worse as measurement noise increases. However, after a certain level of measurement noise, the AMSE of the JPM gets significantly worse. The measurement noise is highly related to the accuracy of the estimated parameters. If the observed signals were corrupted or too noisy, it could be problematic for all three models, especially for the JPM due to its complex structure. This problem is a potential weakness of the JPM. Please note that those AMSE and power values reported in Table X were obtained including in (15) and (17) because we are considering the influence of measurement noise.

2) Impact of Sample Size of the Historical Dataset: In this section, we investigate the impact of the sample size of the historical dataset on the prediction accuracy. Table XI shows the performance with , and Table XII shows the performance at for different sample sizes.

Because the purpose is to investigate the impact of sampling uncertainty, we need to repeat both Part 1 and Part 2 of the simulation algorithm. In other words, in the RUL prediction for each individual unit, we regenerate the historical database, and re-estimate the DSPM, RDSPM, and JPM model parameters. From these results, we can see that the JPM is more sensitive to the sample size. This result is not surprising because more parameters need to be estimated in the JPM. However, its performance is still better than the DSPM at the later stage of the prediction in terms of both the prediction power and the AMSE.

In Table XII, an obvious trend can be found. As the sample size increases, the power increases. The AMSE tends to decrease because of the smaller estimation error, and the 95% confidence interval for the expected mean RUL gets narrower due to the reduced uncertainty in the parameter estimation. Notice that, even in the extreme case , the JPM is still better than the DSPM and the RDSPM in terms of both power and AMSE.

TABLE VII
SIMULATION PROCEDURE FOR RUL PREDICTION AND PERFORMANCE EVALUATION (PART 2)

TABLE VIII
COMPARISON RESULTS

V. DISCUSSION

So far, our study has focused on the performance comparison under the hard failure case. As we can see from those results, the JPM performs well compared to the DSPM and the RDSPM. To make the comparison more comprehensive, we will extend the scope of the study in this section to obtain more insights.

A. Comparison Under the Soft Failure Case

Soft failure is a common type of failure in some applications. Most times, the threshold defining the soft failure is set by experts or reliability engineers or both. The soft failure has a very different nature from the hard failure due to the way the failure is defined. In this section, we have repeated the comparison study, and analyzed two different cases: (i) soft failure with fixed threshold, and (ii) soft failure with random threshold. Case (ii) can be viewed as a generalization of (i).

TABLE IX
IMPACT OF VARIANCE OF RANDOM COEFFICIENT

TABLE X
IMPACT OF MEASUREMENT NOISE

In this particular setting, we assumed that the random threshold in case (ii) follows a Gamma distribution. The DSPM, and the RDSPM are the methods specifically designed to handle case (i), and case (ii), respectively. Thus they are expected to perform well. The true degradation signal path is still assumed as that in (30), and unknown to all models. Table XIII summarizes the results. It can be seen that the DSPM and the RDSPM indeed outperform JPM significantly.
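As a small illustration of the random-threshold case, the sketch below draws Gamma-distributed thresholds and recovers the shape and scale by the method of moments. The method of moments is used here only because it needs nothing beyond the standard library; it is a stand-in, not the fitting procedure used in this study, and the shape of 25, the scale of 0.4, and the function name are hypothetical.

```python
import random
import statistics


def fit_gamma_moments(samples):
    """Method-of-moments estimates (shape, scale) for a Gamma distribution.

    Uses mean = shape * scale and variance = shape * scale^2.
    """
    mean = statistics.fmean(samples)
    var = statistics.variance(samples)
    return mean * mean / var, var / mean


if __name__ == "__main__":
    rng = random.Random(42)
    # Hypothetical signal levels observed at failure for historical units.
    thresholds = [rng.gammavariate(25.0, 0.4) for _ in range(500)]
    shape, scale = fit_gamma_moments(thresholds)
    print(f"estimated shape={shape:.2f}, scale={scale:.3f}")
```

In case (i) the threshold is a single fixed value, which corresponds to the limit in which the fitted Gamma distribution has vanishing variance.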

B. Hard Failure Affected by Factors Other Than Degradation Signal

It is generally known that the hard failure is more difficult to predict than the soft failure. Because soft failure depends solely on the degradation signal and its threshold, the prediction will be good as long as we have an accurate signal path model. However, it is hard to establish an accurate prognostic model for hard failure. The hard failure not only depends on the degradation signal, but possibly on some other factors as well. The flexibility of the JPM framework allows those factors to be easily incorporated. Table XIV illustrates the idea.

The left columns are the results from a numerical study using simulated data with a factor other than the degradation signal affecting the failure time. Specifically, the left four columns are the results from the simulation study with , which is defined in (31). The right columns are the case where the hard failure time is determined by setting . It is evident that JPM has a clear advantage over the other two models which solely depend on degradation signal.

C. Key Findings

Based on our extensive experiments, and the comparisons conducted in Sections III, IV, and V, we can summarize some key findings.

1) The threshold-based methods (the DSPM, and the RDSPM) are easy to implement, and quite effective for soft failure cases. The idea of these two models is straightforward. They keep track of the degradation signal, and predict the time when the signal first hits the given threshold. For hard failure cases, these methods lack the ability of modeling the random correspondence between the failure and the degradation signal. Particularly, they are not preferred when the system failure is also impacted by factors other than the degradation signal.

2) The JPM relates the degradation signal to the failure through a probabilistic hazard structure, allowing it to handle hard failures. However, the degradation signal lies within an exponential part of the hazard function, making the JPM more sensitive to the accuracy of parameter estimates. Therefore, if the prediction is made at the early stage when observation data points are scarce, or if the signal noise level is high, the performance of the JPM is unreliable compared to the other two methods.

3) The JPM contains a baseline hazard and possibly other covariates which are not directly related to the degradation signal. This condition makes the JPM more flexible, but at the price of estimating more parameters. Thus, if the sample size of the historical data is small, the performance of JPM is unreliable. To make the JPM perform well, it is essential to have a reasonably large amount of historical data.

TABLE XI
PERFORMANCE COMPARISON WITH SMALL HISTORICAL SAMPLE SIZE

TABLE XII
COMPARISON BETWEEN DIFFERENT SAMPLE SIZE AT
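The threshold-based idea in the first finding, predicting when the signal first hits a given threshold, can be sketched for a quadratic signal path as below. This is a minimal illustration assuming the path coefficients have already been estimated for the unit; the function name and the example numbers are hypothetical, and the sketch ignores the parameter uncertainty that the DSPM and the RDSPM account for.

```python
import math


def first_hitting_time(b0, b1, b2, threshold, t_now=0.0):
    """Earliest t >= t_now with b0 + b1*t + b2*t^2 >= threshold, else None."""
    if b0 + b1 * t_now + b2 * t_now ** 2 >= threshold:
        return t_now  # already at or above the threshold
    c = b0 - threshold
    if b2 == 0.0:
        # Linear path: it reaches the threshold only if it is increasing.
        return -c / b1 if b1 > 0.0 else None
    disc = b1 * b1 - 4.0 * b2 * c
    if disc < 0.0:
        return None  # the path never reaches the threshold
    root = math.sqrt(disc)
    candidates = sorted([(-b1 - root) / (2.0 * b2), (-b1 + root) / (2.0 * b2)])
    for t in candidates:
        if t >= t_now:
            return t  # first crossing after the prediction time
    return None


if __name__ == "__main__":
    t_now = 40.0
    hit = first_hitting_time(1.0, 0.05, 0.001, threshold=12.0, t_now=t_now)
    if hit is not None:
        print(f"predicted failure time {hit:.2f}, RUL {hit - t_now:.2f}")
```

Because the prediction is a deterministic function of the path and the threshold, any factor that shifts the failure time without changing the signal is invisible to it, which is the limitation noted in the first finding.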

VI. CASE STUDY BASED ON REAL DATA

In this section, we used real data to evaluate both methods. The data contain the fail-to-crank times of 14 automotive batteries. The degradation signal is the internal resistance, which has been commonly used in battery life prediction [21]. These data were collected under the accelerated testing condition. The 14 batteries were made by two different manufacturers: eight were made by A, and the rest were made by B. A time-independent covariate was added to the model to account for their potential difference. The degradation signal paths of the 14 batteries and the histogram of the resistances at the failure time are shown in Fig. 3, which clearly shows that the failure does not depend on a fixed threshold.

Fig. 3. Resistance propagation of 14 batteries, and the histogram of resistance at failure.

We used the same model structure shown in (32) and (33) to predict the failure time. For the DSPM, we used the mean resistance value at the week before the failure as the threshold. A quadratic form of the mixed effects model was selected to model the resistance signal propagation. Fig. 4 is the graphical illustration of the model fit. More rigorous goodness-of-fit measures are provided in Table XV, with the statistical significance of each estimated parameter. For the goodness-of-fit measure, the R-square values specifically designed for mixed effects models were used [41], [42].

TABLE XIII
COMPARISON RESULTS UNDER DIFFERENT FAILURE TIME GENERATING MECHANISM

Under the soft failure scenario, the RDSPM is identical to the DSPM

TABLE XIV
COMPARISON RESULTS WITH OTHER FACTORS AFFECTING THE FAILURE TIME

Due to the limited sample size, we adopted a leave-one-out cross-validation approach to evaluate the performance. The prediction power is defined here as the (number of correct predictions)/14, and the MSE is evaluated over 14 predictions using (1). “Correct predictions” means that the true RUL is located within the acceptable error range of the predicted mean RUL. The acceptable error range was set to 2 weeks. The earliest failure occurred at the 6th week; thus we used prediction times from 0 to 5. The final results are presented in Table XVI with their corresponding standard errors.

We can see that the JPM performs better than the DSPM and RDSPM at the later stage of the prediction. The MSE of the JPM decreases remarkably as prediction time increases. However, the MSE of the DSPM and RDSPM has an increasing trend. This trend is an indication that there might be other factors affecting the actual failure time which cannot be explained by the degradation signal. In real applications, it is almost impossible to identify all of the factors affecting the hard failure time. As we investigated in the discussion section, all of them, DSPM, RDSPM, and JPM, have their own advantages and limitations. However, as we can see in this particular case study with real data, the JPM performs better for the hard failure prognosis with current settings.
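The leave-one-out evaluation and the two metrics used in this case study can be sketched as follows. This is a hedged illustration: the failure times are made-up numbers rather than the battery data, the MSE is taken as the plain mean of squared differences (the exact form of (1) is assumed), and the population-mean predictor is a placeholder standing in for a fitted DSPM, RDSPM, or JPM. All function names are hypothetical.

```python
def prediction_power(true_rul, predicted_rul, tolerance):
    """Fraction of units whose true RUL lies within tolerance of the prediction."""
    hits = sum(abs(t - p) <= tolerance for t, p in zip(true_rul, predicted_rul))
    return hits / len(true_rul)


def mean_squared_error(true_rul, predicted_rul):
    """Mean of squared differences between true and predicted RUL."""
    return sum((t - p) ** 2 for t, p in zip(true_rul, predicted_rul)) / len(true_rul)


def mean_failure_time_predictor(training_failure_times, t_pred):
    """Placeholder model: population mean failure time minus the prediction time."""
    return sum(training_failure_times) / len(training_failure_times) - t_pred


def leave_one_out(failure_times, t_pred, predictor):
    """Hold out each unit in turn, fit on the rest, and predict its RUL."""
    true_rul, predicted_rul = [], []
    for i, held_out in enumerate(failure_times):
        training = failure_times[:i] + failure_times[i + 1:]
        true_rul.append(held_out - t_pred)
        predicted_rul.append(predictor(training, t_pred))
    return true_rul, predicted_rul


if __name__ == "__main__":
    failure_weeks = [6.0, 7.5, 8.0, 9.0, 9.5, 10.0, 11.0]  # made-up values
    for t_pred in range(0, 6):
        truth, pred = leave_one_out(failure_weeks, float(t_pred),
                                    mean_failure_time_predictor)
        print(t_pred, round(prediction_power(truth, pred, 2.0), 3),
              round(mean_squared_error(truth, pred), 3))
```

The placeholder predictor ignores the degradation signal, so its accuracy does not improve with the prediction time; replacing it with a model that updates on the observed signal up to the prediction time is what distinguishes the methods compared here.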


Fig. 4. Illustration of model fit using mixed effects model.

TABLE XV
GOODNESS-OF-FIT MEASURES AND SIGNIFICANCE TEST FOR ESTIMATED PARAMETERS

TABLE XVI
CASE STUDY COMPARISON RESULT

VII. CONCLUSION

In this paper, we evaluated and compared the prognosis accuracy in terms of both the prediction power and the AMSE for models under two major categories: the threshold-based methods (the DSPM, and the RDSPM), and the JPM. The DSPM is a popular prognosis method in the field of reliability engineering when there is a well-defined threshold for the degradation signal. The RDSPM is an extended form of the DSPM which accommodates the random threshold, allowing it to handle hard failures. In JPM, the predicted degradation signal is related to the system failure probability through a hazard function. Under the hard failure case, the comprehensive studies show that the JPM usually outperforms the threshold-based methods (DSPM and RDSPM) when we have a relatively large set of historical data, and the prediction is made at the later stage. Due to the fast development of sensing and information technology, it has become easier to collect large historical datasets. Thus, we envision that the JPM will find more applications in the future. On the other hand, when the historical data are scarce, or there is a given failure threshold for the degradation signal, the DSPM and the RDSPM still possess an advantage.

In summary, the threshold-based methods are suggested for soft failure prognosis. They are specifically designed and motivated by the soft failure case; thus they outperform the JPM under the soft failure scenario. However, for the hard failure prognosis, the DSPM and the RDSPM lack the flexibility of modeling the random correspondence between the failure and the degradation signal. Particularly, the threshold-based methods are not preferred when the system failure is also impacted by factors other than the degradation signal. Because of the flexible model structure of the JPM, it can handle the hard failure case more appropriately. However, the data availability plays an important role in the prognosis performance for JPM. Thus, in general, threshold-based methods are preferred at the early stage of prognosis even for the hard failure prediction. However, at the later stage of prognosis, the JPM is preferred.

We have also identified potential directions to extend the current work. In this work, we only compare two categories of prognosis methods that are based on the mixed effects degradation model. Some other interesting methods are not included to limit the scope of this paper. For example, the stochastic filtering-based method is a very important class of methods for prognosis. In those methods, the degradation states are assumed unobservable. Filtering techniques such as Kalman filtering and particle filtering are used to estimate the states, and then the RUL distribution is estimated based on the state estimates [35]–[38]. Although it is often not easy to physically interpret the unobservable states and assign a linking function between the states and the observation, the filtering based methods are quite effective and computationally efficient in many scenarios [14]. It would be interesting to rigorously compare and evaluate these filtering based methods. Another future direction is to determine the prediction interval which is related to the choice of acceptable error range . In practice, people tend to use the center-based interval prediction, using the mean or median value and the , or the equal-tail-probability prediction interval (e.g., confidence interval). For the online prognosis of individual units, those interval estimates, often times, do not yield the maximum prediction power. We will investigate along this line, and hope to report the results in the future.

APPENDIX

Parameter Estimation of the JPM: Based on the conditional statistical independence assumption, the observed data likelihood for the th unit in the database can be written as

(34)

where represents the survival model, corresponds to the degradation model, and is a multivariate normal distribution with parameters. Also, is the event time (the unit either died at time or was censored at time without knowing its actual time of death), and is an event indicator which takes a value of either 0, or 1 to indicate the unit has been censored, or has died, respectively. Here, is the set of all the parameters. The three components in (34) are defined as

(35)

(36)

(37)

where

, , and . Also, denotes the baseline hazard rate, and is the vector of association parameters linking with the hazard function. In the same manner, is the parameter linking the degradation signal with the hazard rate.

To obtain the parameter estimates using the likelihood function introduced in (34), should be maximized. A more detailed estimation procedure can be found in [20].

Derivation of the Predicted p.d.f of : The cumulative failure distribution can be defined as below.

(38)

(39)

where , is the pre-specified threshold value for the degradation signal, and indicates the cumulative standard normal distribution function. The failure distribution should be truncated because the time cannot be smaller than the prediction time instant .

(40)

(41)

where, .

The failure p.d.f. is the first derivative of the cumulative failure distribution.

(42)

where

.
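The truncation step and the median RUL used as a point estimate can be illustrated numerically. The sketch below is not a reproduction of (38) to (42): it assumes a linear signal path with a known intercept and a normally distributed slope, writes the failure CDF with the standard normal CDF, conditions on survival past the prediction time by the usual left-truncation formula, and finds the median by bisection. The function names and the numerical values are hypothetical.

```python
import math


def std_normal_cdf(z):
    """Cumulative standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def failure_cdf(t, intercept, slope_mean, slope_sd, threshold):
    """P(intercept + b*t >= threshold) with b ~ N(slope_mean, slope_sd^2)."""
    if t <= 0.0:
        return 0.0
    z = (slope_mean * t - (threshold - intercept)) / (slope_sd * t)
    return std_normal_cdf(z)


def truncated_cdf(t, t_star, **model):
    """Failure CDF conditional on no failure up to the prediction time t_star."""
    if t <= t_star:
        return 0.0
    f_star = failure_cdf(t_star, **model)
    if f_star >= 1.0:
        return 1.0
    return (failure_cdf(t, **model) - f_star) / (1.0 - f_star)


def median_rul(t_star, tol=1e-6, **model):
    """Median remaining useful life at t_star, found by bisection."""
    lo, hi = t_star, max(2.0 * t_star, 1.0)
    while truncated_cdf(hi, t_star, **model) < 0.5:
        hi *= 2.0
        if hi > 1e9:
            return None  # the truncated CDF never reaches 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if truncated_cdf(mid, t_star, **model) < 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi) - t_star


if __name__ == "__main__":
    model = dict(intercept=0.0, slope_mean=0.1, slope_sd=0.005, threshold=10.0)
    for t_star in (0.0, 40.0, 80.0):
        print(t_star, median_rul(t_star, **model))
```

Differentiating the truncated CDF with respect to time gives the corresponding p.d.f.; a numerical difference quotient of `truncated_cdf` is enough to check a closed-form derivative.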

ACKNOWLEDGMENT

The authors would like to thank the editor and the referees for their valuable comments and suggestions.

REFERENCES

[1] W. Q. Meeker and L. A. Escobar, Statistical Methods for Reliability Data. Hoboken, NJ, USA: Wiley, 1998.

[2] J. D. Kalbfleisch and R. L. Prentice, The Statistical Analysis of Failure Time Data. Hoboken, NJ, USA: Wiley, 2002.

[3] J. F. Lawless, Statistical Models and Methods for Lifetime Data. Hoboken, NJ, USA: Wiley, 2003.

[4] M. Rausand and A. Høyland, System Reliability Theory: Models, Statistical Methods, and Applications. Hoboken, NJ, USA: Wiley, 2004.

[5] A. K. S. Jardine, D. Lin, and D. Banjevic, “A review on machinery diagnostics and prognostics implementing condition-based maintenance,” Mechan. Syst. Signal Process., vol. 20, no. 7, pp. 1483–1510, 2006.

[6] Y. Yuan, S. Zhou, C. Sievenpiper, and K. Mannar, “Event log modeling and analysis for system failure prediction,” IIE Trans., vol. 43, no. 9, pp. 647–660, 2011.

[7] C. J. Lu and W. Q. Meeker, “Using degradation measures to estimate a time-to-failure distribution,” Technometrics, vol. 35, no. 2, pp. 161–174, 1993.

[8] N. Gebraeel, M. Lawley, R. Li, and J. Ryan, “Residual-life distributions from component degradation signals: A Bayesian approach,” IIE Trans., vol. 37, no. 6, pp. 543–557, 2005.

[9] N. Gebraeel, “Sensory-updated residual life distributions for components with exponential degradation patterns,” IEEE Trans. Autom. Sci. Eng., vol. 3, no. 4, pp. 382–393, 2006.

[10] K. Goebel and P. Bonissone, “Prognostic information fusion for constant load systems,” presented at the 8th Int. Conf. Inf. Fusion, 2005.


[11] K. Kaiser and N. Gebraeel, “Predictive maintenance management using sensor-based degradation models,” IEEE Trans. Syst., Man, Cybern., Part A: Syst. Humans, vol. 39, no. 4, pp. 840–849, 2009.

[12] S. Chakraborty, N. Gebraeel, M. Lawley, and H. Wan, “Residual-life estimation for components with non-symmetric priors,” IIE Trans., vol. 41, no. 4, pp. 372–387, 2009.

[13] A. Elwany and N. Gebraeel, “Real-time estimation of mean remaining life using sensor-based degradation models,” J. Manufacturing Sci. Eng., vol. 131, no. 5, p. 051005, 2009.

[14] X. Si, W. Wang, C. Hu, and D. Zhou, “Remaining useful life estimation – A review on the statistical data driven approaches,” Eur. J. Operational Res., vol. 213, no. 1, pp. 1–14, 2011.

[15] P. Wang and D. W. Coit, “Reliability and degradation modeling with random or uncertain failure threshold,” in Proc. Rel. Maintainability Symp., 2007, pp. 392–397.

[16] I. T. Yu and C. Fuh, “Estimation of time to hard failure distributions using a three-stage method,” IEEE Trans. Rel., vol. 59, no. 2, pp. 405–412, Jun. 2010.

[17] J. Sun, L. Li, and L. Xi, “Modified two-stage degradation model for dynamic maintenance threshold calculation considering uncertainty,” IEEE Trans. Autom. Sci. Eng., vol. 9, no. 1, pp. 209–212, 2012.

[18] D. R. Cox, “Some remarks on failure-times, surrogate markers, degradation, wear, and the quality of life,” Lifetime Data Anal., vol. 5, no. 4, pp. 307–314, 1999.

[19] M. Yu, N. J. Law, and J. M. G. Taylor, “Joint longitudinal-survival-curve models and their application to prostate cancer,” Statistica Sinica, vol. 14, pp. 835–862, 2004.

[20] A. A. Tsiatis and M. Davidian, “Joint modeling of longitudinal and time-to-event data: An overview,” Statistica Sinica, vol. 14, pp. 809–834, 2004.

[21] Q. Zhou, J. Son, S. Zhou, X. Mao, and M. Salman, “Remaining useful life prediction of individual units subject to hard failure,” IIE Trans., under revision.

[22] A. Saxena, J. Celaya, E. Balaban, K. Goebel, B. Saha, S. Saha, and M. Schwabacher, “Metrics for evaluating performance of prognostic techniques,” in Proc. Int. Conf. Prognostics Health Manag., 2008, pp. 1–17.

[23] Y. Zheng, T. Cai, and Z. Feng, “Application of the time-dependent ROC curves for prognostic accuracy with multiple biomarkers,” Biometrics, vol. 62, no. 1, pp. 279–287, 2006.

[24] Y. Zheng and P. Heagerty, “Prospective accuracy for longitudinal markers,” Biometrics, vol. 63, no. 2, pp. 332–341, 2007.

[25] R. Schoop, E. Graf, and M. Schumacher, “Quantifying the predictive performance of prognostic models for censored survival data with time-dependent covariates,” Biometrics, vol. 64, no. 2, pp. 603–610, 2008.

[26] R. Schoop, M. Schumacher, and E. Graf, “Measures of prediction error for survival data with longitudinal covariates,” Biomed. J., vol. 53, no. 2, pp. 275–293, 2011.

[27] D. Rizopoulos, “Dynamic predictions and prospective accuracy in joint models for longitudinal and time-to-event data,” Biometrics, vol. 67, no. 3, pp. 819–829, 2011.

[28] R. J. Hyndman and A. B. Koehler, “Another look at measures of forecast accuracy,” Int. J. Forecasting, vol. 22, no. 4, pp. 679–688, 2006.

[29] A. H. Christer and W. Wang, “A model of condition monitoring of a production plant,” Int. J. Production Res., vol. 30, no. 9, pp. 2199–2211, 1992.

[30] N. M. Laird and J. H. Ware, “Random-effects models for longitudinal data,” Biometrics, vol. 38, no. 4, pp. 963–974, 1982.

[31] J. P. Klein and M. L. Moeschberger, Survival Analysis: Techniques for Censored and Truncated Data. Berlin, Germany: Springer-Verlag, 2003.

[32] D. R. Cox, “Regression models and life-tables,” J. Roy. Stat. Soc., vol. 34, no. 2, pp. 187–220, 1972.

[33] H. Liao, W. Zhao, and H. Guo, “Predicting remaining useful life of an individual unit using proportional hazards model and logistic regression model,” in Proc. Rel. Maintainability Symp., 2006, pp. 127–132.

[34] G. J. Vachtsevanos, Intelligent Fault Diagnosis and Prognosis for Engineering Systems. Hoboken, NJ, USA: Wiley, 2006.

[35] M. Orchard, F. Tobar, and G. Vachtsevanos, “Outer feedback correction loops in particle filtering-based prognostic algorithms: Statistical performance comparison,” Studies Inf. Contr., vol. 18, no. 4, pp. 295–304, 2009.

[36] M. Orchard and G. Vachtsevanos, “A particle filtering approach for on-line fault diagnosis and failure prognosis,” Trans. Inst. Measur. Contr., vol. 31, no. 3–4, pp. 221–246, 2009.

[37] M. Orchard, L. Tang, B. Saha, K. Goebel, and G. Vachtsevanos, “Risk-sensitive particle-filtering-based prognosis framework for estimation of remaining useful life in energy storage devices,” Studies Inf. Contr., vol. 19, no. 3, pp. 209–218, 2011.

[38] C. Chen, G. Vachtsevanos, and M. Orchard, “Machine remaining useful life prediction: An integrated adaptive neuro-fuzzy and high-order particle filtering approach,” Mechan. Syst. Signal Process., vol. 28, pp. 597–607, 2012.

[39] A. Saxena, J. Celaya, I. Roychoudhury, S. Saha, B. Saha, and K. Goebel, “Designing data-driven battery prognostic approaches for variable loading profiles: Some lessons learned,” presented at the Eur. Conf. Prognost. Health Manag. Soc., 2012.

[40] B. Nystad, G. Gola, and J. Hulsund, “Lifetime models for remaining useful life estimation with randomly distributed failure thresholds,” presented at the Eur. Conf. Prognost. Health Manag. Soc., 2012.

[41] L. Magee, “R2 measures based on Wald and likelihood ratio joint significance tests,” Amer. Statist., vol. 44, pp. 250–253, 1990.

[42] M. Gurka, “Selecting the best linear mixed model under REML,” Amer. Statist., vol. 60, no. 1, pp. 19–26, 2006.

Junbo Son received a B.S. in Industrial Systems and Information Engineering(2010) from the Korea University, South Korea. He is currently pursuing thePh.D. degree in Industrial and Systems Engineering at the University of Wis-consin-Madison, U.S.A.

Qiang Zhou is an assistant professor at the Department of Systems Engineering and Engineering Management, City University of Hong Kong. He received a B.S. in Automotive Engineering (2005), and an M.S. in Mechanical Engineering (2007) from Tsinghua University, China; and an M.S. in Statistics (2010), and a Ph.D. in Industrial Engineering (2011) from the University of Wisconsin-Madison, U.S.A. His research interests include statistical modeling and analysis of complex engineering systems. He is a member of INFORMS.

Shiyu Zhou is a Professor in the Department of Industrial and Systems Engineering at the University of Wisconsin-Madison. He received his B.S. and M.S. in Mechanical Engineering from the University of Science and Technology of China in 1993 and 1996, respectively; and his M.S. in Industrial Engineering and Ph.D. in Mechanical Engineering from the University of Michigan in 2000. His research interests include in-process quality and productivity improvement methodologies that integrate statistics, system and control theory, and engineering knowledge. He is a recipient of a CAREER Award from the National Science Foundation and the Best Application Paper Award from IIE Transactions. He is a member of IIE, INFORMS, ASME, and SME.

Xiaofeng Mao received the B.E. degree from Peking University, Beijing, China, in 2001; and the M.S. and Ph.D. degrees in mechanical engineering from The Pennsylvania State University, University Park, in 2007 and 2010, respectively. From 2011 to 2012, he was a researcher at General Motors Research and Development, working on diagnosis and prognosis of Li-ion and lead-acid batteries and propulsion motors. In 2012, he joined the Chassis Controls group at General Motors. His research interests include nonlinear and robust control, hybrid electric vehicles, and chassis control.

Mutasim Salman is a Lab Group Manager and a Technical Fellow in the Electrical, Controls and Integration Lab of the GM Research and Development Center. He received his bachelor’s degree in Electrical Engineering from the University of Texas at Austin, and his M.S. and Ph.D. in Electrical Engineering, with specialization in Systems and Control, from the University of Illinois at Urbana-Champaign. He also has an Executive MBA. He joined the GM R&D staff in 1984. He is responsible for the development and validation of algorithms for state of health monitoring, diagnosis, prognosis, and fault tolerant control of vehicle critical systems. He has extensive experience in hybrid vehicle modeling, control, and energy management strategies. He has received several GM awards, holds 37 patents, and has coauthored more than 60 refereed technical publications and a book.