tutorial session 4: return on investment · discrete event simulation (des) • dynamic simulation...

19
1 University of Maryland Copyright © 2015 CALCE 1 calce 1 Prognostics and Health Management Consortium Tutorial Session 4: Return on Investment (Determining the Cost and Value of PHM) Peter Sandborn (301) 405-3167 [email protected] Prognostics TM PHM Conference 2015 University of Maryland Copyright © 2015 CALCE 2 calce 2 Prognostics and Health Management Consortium Prognostics and Health Management • Components • Interconnects • Boards • LRUs • Wiring • Systems Prognostics Sensors Analysis Methods Predicted remaining life Time Acceleration Factor (AF) Failure of prognostic sensors Failure of actual circuit Predicted remaining life Time Acceleration Factor (AF) Failure of prognostic sensors Failure of actual circuit Time Hazard Rate Infant Mortality Useful life Wear-out Prognostic cell Actual circuitry Time Hazard Rate Infant Mortality Useful life Wear-out Prognostic cell Actual circuitry Prognostics Analysis Accumulating damage Determining remaining life Data collection and reduction Deriving Value Decision making This Tutorial

Upload: others

Post on 24-Aug-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

1

University of MarylandCopyright © 2015 CALCE

1calceTM

1Prognostics and Health Management Consortium

Tutorial Session 4:Return on Investment

(Determining the Cost and Value of PHM)

Peter Sandborn(301) 405-3167

[email protected]

PrognosticsTM

PHM Conference 2015

University of MarylandCopyright © 2015 CALCE

2calceTM

2Prognostics and Health Management Consortium

Prognostics and Health Management

• Components• Interconnects• Boards• LRUs• Wiring• Systems

Prognostics Sensors

Analysis Methods

Predicted remaining life

Time

Acc

eler

atio

n F

acto

r (A

F)

Failure of prognostic sensors

Failure of actual circuit

Predicted remaining life

Time

Acc

eler

atio

n F

acto

r (A

F)

Failure of prognostic sensors

Failure of actual circuit

Time

Haz

ard

Rat

e

Infant Mortality

Useful life Wear-out

Prognostic cell

Actual circuitry

Time

Haz

ard

Rat

e

Infant Mortality

Useful life Wear-out

Prognostic cell

Actual circuitry

PrognosticsAnalysis

Accumulating damage

Determining remaining life

Data collection

and reduction

Deriving Value

Decision making

This Tutorial

Page 2: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

2

University of MarylandCopyright © 2015 CALCE

3calceTM

3Prognostics and Health Management Consortium

Cost and Return on Investment (ROI) Analysis

Peter Sandborn(301) 405-3167

[email protected]

PrognosticsTM

PHM Conference 2015

University of MarylandCopyright © 2015 CALCE

4calceTM

4Prognostics and Health Management Consortium

The Fundamental Maintenance/Value Tradeoff

Goal (depending on application) is either to:

Perform maintenance so that the remaining useful life (RUL) and the remaining useful performance (RUP) are minimized (use up as much of the life as possible), while simultaneously avoiding failures (unscheduled maintenance)

or

Find the optimum mix of scheduled and unscheduled maintenance that minimizes the life-cycle cost

Note, we are not advocating the avoidance of all failures here, this is NOT a “safety” argument, it is an economic argument.

Page 3: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

3

University of MarylandCopyright © 2015 CALCE

5calceTM

5Prognostics and Health Management Consortium

Deriving Value from PHM at the System and Enterprise Levels

– System-level PHM value means taking action based on prognostics to manage one specific instance of a system, e.g., one truck or one turbine. The actions tend to be “real-time” and consist of:

• Modify how you sustain the system (e.g., call ahead to arrange for a maintenance action)

• Modify the mission (e.g., reduce speed, take a different route)

• Modify the system (e.g., adaptive re-configuration)

– Enterprise-level PHM value means taking action based on prognostics to manage an enterprise, e.g., a whole fleet of trucks or a farm of turbines. The actions are longer-term strategic planning things (usually not real-time):

• Optimizing the logistics

• Management via availability and other outcome-based contracts

The optimum for an individual system instance ≠ the optimum for the system instance within a population if the population is managed via an outcome-based contract

University of MarylandCopyright © 2015 CALCE

6calceTM

6Prognostics and Health Management Consortium

Cost of PHM Implementation• Development cost

– Hardware and software design, development, testing and qualification– Integration costs

• Additional costs associated with product manufacturing– Recurring cost per product for additional hardware, additional processing, additional

recurring functional testing– Installation costs

• Cost of creating and maintaining the infrastructure to make effective use of the PHM data

– Cost of data archiving– Cost of maintaining the PHM structures (logistics footprint)– Cost of training personnel– Cost of creating and maintaining documentation– Cost of changing the logistics/maintenance culture

• Cost of performing the necessary analysis to make it work– Cost of data collection– Cost of data analysis– Cost of false positives

• Financial costs (cost of money)– $1 today (to implement PHM) costs more than $1 to repair tomorrow

Page 4: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

4

University of MarylandCopyright © 2015 CALCE

7calceTM

7Prognostics and Health Management Consortium

Potential Cost Avoidance (Return)

Associated with PHM

• Failures avoided– Minimizing the cost of unscheduled maintenance– Increasing availability– Reducing risk of loss of system– Increased human safety

• Minimizing loss of remaining life– Minimizing the amount of remaining life thrown away by

scheduled maintenance actions

• Logistics (reduction in logistics footprint)– Better spares management (quantity, refreshment, locations)

o Lead time reductiono Better use of (control over) inventory

– Minimization of investment in external test equip– Optimization of resource usage

• Repair– Better diagnosis and fault isolation (decreased inspection time,

decreased trouble shooting time)– Reduction in collateral damage during repair– Reduction in post-repair testing

• Reduction in redundancy (long term)– Can redundancy be decreased for selected sub-systems?

• Reduced waste stream– Less to end-of-life (dispose of) – disposal avoidance– Reduction in take-back cost

• Reduction in no-fault-founds• Eases design and qualification

of future systems• Reduced liability• Warranty claim verification

(resulting in warranty reserve fund reduction)

Problem: Future cost avoidance is a hard sell

University of MarylandCopyright © 2015 CALCE

8calceTM

8Prognostics and Health Management Consortium

PHM Cost ModelDiscrete-event simulation that follows a population of

sockets through their lifetime from first LRU installation to retirement of the socket.– “Discrete-event simulator” refers to the simulation of a

timeline, where specific events are added to the timeline and the resulting event order and timing can be used to analyze throughput, cost, availability, etc.

– “Socket” refers to one instance of an installation location for an LRU.

– “Population” means that the simulator is stochastic (governed by the laws of probability) so that a statistically significant number of non-identical fielded systems can be assessed and the results are distributions rather than single values.

Page 5: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

5

University of MarylandCopyright © 2015 CALCE

9calceTM

9Prognostics and Health Management Consortium

Discrete Event Simulation (DES)

• Dynamic simulation (models changes over time)

• State variables change only at a discrete set of points in time (i.e., at “events”)

• Event = something that happens to the system at an instant in time that may change the state of the system (by definition, nothing relevant to the model changes between events)

• Discrete (successive changes are separated by finite amounts of time)

• Timeline = the sequence of events and their calendar times

• Path = a particular sequence of events

University of MarylandCopyright © 2015 CALCE

10calceTM

10Prognostics and Health Management Consortium

Following Sockets vs. LRUs

The discrete-event simulation follows a population of sockets through their lifetime (socket = the installation location of an LRU); issues with modeling sockets:– Easy to calculate socket cost and availability

– Implicit assumption of a stable population of LRUs

– Not-good-as-new repair – easy to model if you assume the same LRU comes back to the socket after repair

Alternatively, a simulation could follow LRUs; issues with following LRUs:– Repaired LRUs don’t necessarily go back into the same socket, so

you must model the LRU supply chain

– How are socket failures accounted for?

– Difficult to calculate socket cost and availability

Page 6: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

6

University of MarylandCopyright © 2015 CALCE

11calceTM

11Prognostics and Health Management Consortium

Discrete Event Simulation for PHM

Time

Repeat the process for many socketsGenerate a histogram of the computed quantities

1) Add time zero implementation costs:• Base LRU recurring cost• PHM LRU recurring cost• LRU/socket non-recurring costs• System recurring cost

For one socket:

2) Predict failure date of LRU in the socket

3) Determine PHM predicted removal date

• Incorporates prognostic distance or safety margin

4) Maintain system on either the actual failure date or the PHM predicted removal date (whichever comes first)

• LRU replacement/repair cost• Costs associated with operational

profile

5) Start over at step 2) with a new or repaired LRU in the socket and continue process from the socket maintenance date until the end of the field support life

Infrastructure cost6) Compute/accumulate:

• Life-cycle cost/socket• Cost/operating hour • Availability• Failures avoided• Number of LRUs/socket

Every value used is “sampled” from a probability distribution that represents the input parameter

University of MarylandCopyright © 2015 CALCE

12calceTM

12Prognostics and Health Management Consortium

Discrete Event Simulation – Data-Driven

• 1,000 sockets simulated

• Small steps in the graph correspond to annual accumulation of infrastructure costs

• Big jumps in cost correspond to replacement of the LRU (average of 5 LRUs used per socket over the support life)

Unscheduled maintenance events (PHM missed these)

Scheduled maintenance events (PHM caught these)

Page 7: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

7

University of MarylandCopyright © 2015 CALCE

13calceTM

13Prognostics and Health Management Consortium

Very Simple Baseline Data Assumptions for the Example Cases

• 0 investment cost• 0 infrastructure cost• Spares assumed to be available and purchased as needed

Variable in the model Value used for example analysis

Production cost (per unit) $10,000 Time to failure 5000 operational hours most likely value (symmetric triangular

distribution with variable d istribution width) Operational hours per year 2500

Sustainment life 25 years Unscheduled Scheduled

Value of each hour out of service $10,000 $500 Time to repair 6 hours 4 hours

Time to replace 1 hour 0.7 hours Cost of repair (materials cost) $500 $350 Fraction of repairs requiring

replacement of the LRU (as opposed to repair o f the LRU)

1.0 0.7

Various values and distributions(TTF)

University of MarylandCopyright © 2015 CALCE

14calceTM

14Prognostics and Health Management Consortium

PHM Approaches• Reliability-Based – Historical reliability predictions used to produce

failure distributions– Average part, average environmental stress

• Model-Based (Physics of Failure) – Observing the environmental stress history that a system has been subjected to and deciding based on an understanding of a nominal system, that the system may be unhealthy (indirect method) – degradation inferred from environmental stress

– Average part, actual environmental stress

• Data-Driven – Directly observing the system and deciding that it looks unhealthy (e.g., precursor to failure, LRU-dependent canaries, anomaly detection) – measured properties are correlated to actual failure

– Actual part, actual environmental stress

• Fusion – Combines data-driven and physics of failure approaches together to enable calculation of remaining useful life (RUL)

Page 8: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

8

University of MarylandCopyright © 2015 CALCE

15calceTM

15Prognostics and Health Management Consortium

Background – Time to Failure (TTF) Distributions

• Doesn’t have to be “time”, could be thermal cycles, miles, takeoffs and landings, etc.

• Triangular distributions shown for simplicity

Pro

babi

lity

LRU Time-to-Failure (TTF)

Nom

inal

LR

U

The LRU’s TTF distribution represents variations in manufacturing and materials

LRU Instance

Actual TTF (sample)Time to Failure (t), hours

Fra

ctio

n of

par

ts f

aili

ng, f

(t)

0 100 200 300 400 500 600 700 800 9000

0.30

0.20

0.10

If the test was run until all instances of the product failed, then the area under the curve is 1 and we have a PDF of failures

University of MarylandCopyright © 2015 CALCE

16calceTM

16Prognostics and Health Management Consortium

100000

150000

200000

250000

300000

0 2000 4000 6000 8000 10000 12000

Fixed Scheduled Maintenance Interval (operational hours)

Eff

ec

tiv

e L

ife

Cy

cle

Co

st

(pe

r s

oc

ke

t)

1000 hours

2000 hours

4000 hours

6000 hours

8000 hours

10000 hours

Fixed Interval Scheduled Maintenance

TTF Distribution is

narrow

TTF distribution is wide

(10,000 sockets followed)

Width of the time to failure distribution

~Unscheduled maintenance solution

Single Socket

Page 9: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

9

University of MarylandCopyright © 2015 CALCE

17calceTM

17Prognostics and Health Management Consortium

Data-Driven Methodologies(Precursor to Failure, Health Monitoring, LRU Dependent Canary)

• Example: A canary or other monitored structure is manufactured with the LRUs, i.e., it is coupled to a particular LRU’s manufacturing or material variations

• Prognostic Distance = length of time (in operational hours) before system failure that the prognostic structures are designed to indicate failure

Pro

babi

lity

LRU Time-to-Failure (TTF)P

roba

bili

tyPHM structure Time-to-Failure

Prognostic Distance

Sam

pled

TT

F

fore

cast

and

m

aint

enan

ce in

terv

al

for

the

sam

ple

Nom

inal

LR

U

The LRU’s TTF distribution represents variations in manufacturing and materials

LRU Instance

Actual TTF (sample)

Indicated Replacement Time

LRU Instance Actual TTF

(TTF distribution of the monitored

structure)

University of MarylandCopyright © 2015 CALCE

18calceTM

18Prognostics and Health Management Consortium

Model-Based Methodologies (LRU Independent, Life Consumption Monitoring (LCM), LRU Independent Canary)

• The PHM structure (or sensors) are manufactured independent of the LRUs, i.e., it is not coupled to a particular LRU’s manufacturing or material variations

• Safety Margin (Designed Prognostic Distance) = length of time (in operational hours) before failure of the nominal LRU that the PHM approach/structure is design to indicate failure.

Pro

babi

lity

LRU Time-to-Failure (TTF)

LRU Instance

Actual TTF (sample)

Pro

babi

lity

PHM structure Time-to-Failure

Safety Margin

Sam

pled

TT

F

fore

cast

and

m

aint

enan

ce in

terv

al

for

the

sam

ple

Nom

inal

LR

U

Nom

inal

LR

U

Indicated Replacement Time

The LRU’s TTF distribution represents variations in manufacturing and materials

LRU Instance Actual TTF

Page 10: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

10

University of MarylandCopyright © 2015 CALCE

19calceTM

19Prognostics and Health Management Consortium

Comments

• The fundamental difference between the two models:• Data-Driven = the TTF distribution associated with the PHM structure

(or sensor/canary) is unique to each LRU instance• Model-Based = the TTF distribution associated with the PHM

structure (or sensor/canary) is tied to the nominal LRU and knows nothing about manufacturing/material variations between LRU instances

• Notes:• Failure does not have to be characterized by time – it could be cycles,

etc.• Triangular distributions are only used for simplicity

University of MarylandCopyright © 2015 CALCE

20calceTM

20Prognostics and Health Management Consortium

Data-Driven vs. Model-Based Methods(Varying TTF Distribution Width)

(10,000 sockets followed)

Single Socket

Data-DrivenModel-Based

100000

110000

120000

130000

140000

150000

160000

170000

180000

190000

200000

0 500 1000 1500

Safety Margin (operating hours)

Eff

ec

tiv

e L

ife

Cy

cle

Co

st

(pe

r s

oc

ke

t)

1000 hr TTF width, 1000 hr LCM width

2000 hr TTF width, 1000 hr LCM width

4000 hr TTF width, 1000 hr LCM width

100000

110000

120000

130000

140000

150000

160000

170000

180000

190000

200000

0 500 1000 1500

Prognostic Distance (operating hours)

Eff

ec

tiv

e L

ife

Cy

cle

Co

st

(pe

r s

oc

ke

t)

1000 hr TTF width, 1000 hr HM width

2000 hr TTF width, 1000 hr HM width

4000 hr TTF width, 1000 hr HM width

Pro

babi

lity

Pro

babi

lity

LRU Time-to-Failure (TTF)

TTF WidthVariations in TTF distribution width

DD width

DD width

DD width

MB width

MB width

MB width

Page 11: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

11

University of MarylandCopyright © 2015 CALCE

21calceTM

21Prognostics and Health Management Consortium

(10,000 sockets followed)

Single Socket

Data-DrivenModel-Based

100000

110000

120000

130000

140000

150000

160000

170000

180000

190000

200000

0 500 1000 1500

Safety Margin (operating hours)

Eff

ec

tiv

e L

ife

Cy

cle

Co

st

(pe

r s

oc

ke

t)

2000 hr TTF width, 1000 hr LCM width

2000 hr TTF width, 2000 hr LCM width

2000 hr TTF width, 4000 hr LCM width

100000

110000

120000

130000

140000

150000

160000

170000

180000

190000

200000

0 500 1000 1500

Prognostic Distance (operating hours)

Eff

ec

tiv

e L

ife

Cy

cle

Co

st

(pe

r s

oc

ke

t)

2000 hr TTF width, 1000 hr HM width

2000 hr TTF width, 2000 hr HM width

2000 hr TTF width, 4000 hr HM width

PHM WidthVariations in PHM distribution width P

roba

bili

tyP

roba

bili

ty

PHM Approach Time-to-Failure

Data-Driven vs. Model-Based Methods(Varying PHM Distribution Width)

MB width

MB width

MB width

DD width

DD width

DD width

University of MarylandCopyright © 2015 CALCE

22calceTM

22Prognostics and Health Management Consortium

Data-Driven vs. Model-Based Method Observations

Single Socket

1) The model-based approach is highly dependent on the LRU’s TTF distribution

2) Data-driven methods are approximately independent of the LRU’s TTF distribution

3) All things equal,* optimum prognostic distances for data-driven methods are always smaller than optimum safety margins for model-based methods,** and therefore,

4) All things equal,* data-driven PHM methods will always result in lower life cycle cost solutions that model-based methods**

*All things equal = same LRUs, same shape and size distribution associated with the PHM approach

**Assumes that you have a choice, i.e., that there is a data-driven method that is applicable – there may not be

Page 12: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

12

University of MarylandCopyright © 2015 CALCE

23calceTM

23Prognostics and Health Management Consortium

Multiple Socket Systems

Coincident time = time interval within which different sockets should be treated by the same maintenance action.

If │Timecurrent maintenance action – Timerequired maintenance action on LRU i │ < Timecoincident

then LRU i is addressed at the current maintenance action

Coincident time = 0 means that each socket is treated independentlyCoincident time = infinite means that any time any LRU in the

system demands to be fixed, all sockets are fixed no matter what life expectancy they have

Next predicted maintenance action for socket i (from PHM approach)

Current maintenance action for the system

University of MarylandCopyright © 2015 CALCE

24calceTM

24Prognostics and Health Management Consortium

LRU2 Timeline

CumulativeTimeline L

RU

1 and LR

U2

Replaced

< Coincident time

LRU instance-specific TTF samples

LR

U1 and L

RU

2 R

eplaced

< Coincident time

Multi-Unit Timelines

LRU1 Timeline

> Coincident time

LR

U2 R

eplaced

Etc…end of support

Two different LRUs that are part of the same system

Page 13: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

13

University of MarylandCopyright © 2015 CALCE

25calceTM

25Prognostics and Health Management Consortium

500000

550000

600000

650000

700000

750000

800000

850000

900000

1 10 100 1000 10000 100000

Coincident Time (operational hours)

Mea

n C

ost

(p

er

sys

tem

)

Many-LRU System

(Dissimilar LRUs)

6 different LRUs

LRU #1, γ* = 19900 hr (data-driven)LRU #2, γ* = 9900 hr (unscheduled maintenance)LRU #3, TTF** = 10,000 hr (fixed interval sched maintenance)LRU #4, TTF** = 20,000 hr (data-driven)LRU #5, TTF** = 20,000 hr (data-driven)LRU #6, TTF** = 5,000 hr (fixed interval sched maintenance)

*Weibull distributions**Symmetric triangular distributions

2 LRU System

6 LRU System

Minimum lifecycle cost

Multiple Sockets

Co

st p

er

Sys

tem

All LRUs maintained separately

All LRUs maintained at the same time

University of MarylandCopyright © 2015 CALCE

26calceTM

26Prognostics and Health Management Consortium

Evaluating the Return on Investment (ROI) Associated with PHM

What is ROI?

Investment

Investment-ReturnROI

Why evaluate the ROI?– To build a business case for implementation– To perform cost/benefit analysis on different prognostic approaches– Evaluate when PHM may not be warranted

Interpreting ROI:0 = breakeven (no cost impact)

> 0 there is a direct cost benefit

< 0 there is no direct cost benefit

Investment

Investment-AvoidanceCost

Page 14: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

14

University of MarylandCopyright © 2015 CALCE

27calceTM

27Prognostics and Health Management Consortium

Formulating an ROI for PHM

• ROI relative to unscheduled maintenance gives

Cus = total life cycle cost using unscheduled maintenanceCPHM = total life cycle cost using the selected PHM approachIPHM = PHM investment cost

PHM

PHMus

I

CC ROI

• Investment cost

INFRECNREPHM CCCI

CNRE = PHM non-recurring costsCREC = PHM recurring costsCINF = PHM infrastructure costs

University of MarylandCopyright © 2015 CALCE

28calceTM

28Prognostics and Health Management Consortium

ROI for PHM (continued)

• Not so fast! Is IPHM complete? Are there other investment costs too?

• Example: Employing PHM will result in as many or maybe more maintenance events as unscheduled maintenance. If PHM results in the need for more spare replacement units, is the cost of these units an investment cost?

• The costs of: false alarm resolution, procurement of a different quantities of LRUs, and variations in maintenance costs are not included in the investment cost because they are the result of the investment and are reflected in CPHM

• CPHM must also include the cost of money differences associated with purchasing LRUs at differently timed maintenance events

This cost is a result of the investment, not part of the investment

Page 15: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

15

University of MarylandCopyright © 2015 CALCE

29calceTM

29Prognostics and Health Management Consortium

Sandel ST3400 TAWS/RMI Electronic Display Unit

Example: PHM Return on Investment

• 502 Aircraft in fleet (Southwest Airlines)

• 2 sockets per aircraft• Support life: 20 years• Negligible false alarms assumed• 7% discount rate

Feldman, Jazouli, and Sandborn, “A Methodology for Determining the Return on Investment Associated with Prognostics and Health Management,” IEEE Trans. On Reliability, June 2009.

University of MarylandCopyright © 2015 CALCE

30calceTM

30Prognostics and Health Management Consortium

Precursor to Failure

$76,000

$78,000

$80,000

$82,000

$84,000

$86,000

$88,000

0 200 400 600 800 1000 1200

Prognostic Distance (hours)

Lif

eCyc

le C

ost

per

So

cket

Prognostic Distance

• For a data-driven PHM approach, an analysis is performed to find the prognostic distance that yields the lowest cost

• The prognostic distance that produces the lowest costs is a function of the inputs and is application-specific

• For the data-driven approach, used here, the prognostic distance was chosen as 475 hours

All failures avoided, but lots of remaining life thrown away

Few failures avoided, approaching unscheduled maintenance

Data-Driven

Page 16: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

16

University of MarylandCopyright © 2015 CALCE

31calceTM

31Prognostics and Health Management Consortium

Time (years)

Cum

ulat

ive

Cos

t per

Soc

ket

Data-Driven PHM Approach

Unscheduled Maintenance

Cumulative Life-Cycle Costs

University of MarylandCopyright © 2015 CALCE

32calceTM

32Prognostics and Health Management Consortium

-5.0

0.0

5.0

10.0

15.0

20.0

25.0

30.0

35.0

0 500 1000 1500 2000 2500 3000 3500 4000 4500

Annual Infrastructure Cost per Socket ($)

Ret

urn

On

In

vest

men

t

ROI Analysis: Data-DrivenThe evaluation of ROI (relative to unscheduled maintenance) as a function of various implementation costs

Breakeven Point

> 0, Cost benefit

< 0, No cost benefit

Mea

n R

etu

rn o

n In

vest

men

t

Page 17: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

17

University of MarylandCopyright © 2015 CALCE

33calceTM

33Prognostics and Health Management Consortium

ROI as a Function of Time

First maintenance event

Small ROI increase (at 3rd maintenance event), which coincides with inventory replenishment. PHM saves money on maintenance events, BUT buys more expensive spares (that include PHM recurring costs)

PHM missed this failure, therefore, it was resolved as unscheduled maintenance event

Annual inventory cost is greater for PHM, since the spares are more expensive

ROI of data-driven PHM relative to unscheduled maintenance

University of MarylandCopyright © 2015 CALCE

34calceTM

34Prognostics and Health Management Consortium

ROI for the Application of PHM to Wind Turbines (EADS)

TRIADE includes temperature, pressure, vibration, strain and acoustic sensors that can be used to monitor the health of turbine blades in a wind turbine

Time history of 1000 wind turbines:

Fixed interval maintenance

Using TRIADE (data-driven PHM)

Haddad, Sandborn, Jazouli, Foucher, and Rouet, “Guaranteeing High Availability of Wind Turbines,” Proc. ESREL, 2011.

Page 18: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

18

University of MarylandCopyright © 2015 CALCE

35calceTM

35Prognostics and Health Management Consortium

Other Things to Consider …

• Redundancy• Not “as good as new” repair• Socket failures• Multiple failure mechanisms• Simple canaries are modeled as LRU independent fuses,

but may actually be mixtures of fuses and LRU-independent methods

• Second order uncertainty (uncertainty about uncertainty) may be a very real thing for this analysis

• Determining the right shape and size of distributions associated with various PHM approaches

University of MarylandCopyright © 2015 CALCE

36calceTM

36Prognostics and Health Management Consortium

Making Business Cases

• Cost avoidance is a hard sell

• Program/project management hears that the “sky is falling” every day from engineers

• Resource allocation is based on:1) The occurrence of serious

events2) Projected future impact (cost,

availability)

Page 19: Tutorial Session 4: Return on Investment · Discrete Event Simulation (DES) • Dynamic simulation (models changes over time) • State variables change only at a discrete set of

19

University of MarylandCopyright © 2015 CALCE

37calceTM

37Prognostics and Health Management Consortium

ResourcesGeneral Cost/Maintenance Model Description:

P.A. Sandborn and C. Wilkinson, “A Maintenance Planning and Business Case Development Model for the Application of Prognostics and Health Management (PHM) to Electronic Systems,” Microelectronics Reliability, Vol. 47, No. 12, pp. 1889-1901, December 2007.

http://www.enme.umd.edu/ESCML/Papers/MR_8772-SandbornComplete.pdf

Return on Investment Modeling:

K. Feldman, T. Jazouli, and P. Sandborn, “A Methodology for Determining the Return on Investment Associated with Prognostics and Health Management,” IEEE Trans. on Reliability, pp. 305-316, June 2009.

http://www.enme.umd.edu/ESCML/Papers/Feldman_et_al_IEEE_Trans_Rel.pdf

PHM Applied to Electronics:

M. Pecht editor, Prognostics and Health Management of Electronics, ed. M. Pecht, John Wiley & Sons, Inc., Hoboken, NJ, 2008.