1 using excel to implement software reliability models norman f. schneidewind naval postgraduate...

1

Using Excel to Implement Software Reliability Models

• Norman F. Schneidewind

• Naval Postgraduate School

• 2822 Racoon Trail,

• Pebble Beach, California, 93953, USA

• Voice: (831) 656-2719

• Fax: (831) 372-0445

• [email protected]

2

Outline• Introduction• Characteristics of Excel Implementation• Combined Software Reliability Tools – Excel Approach• Structure of Combined Approach• Notation for Prediction Worksheet• Equations for Prediction and Comparison Worksheets• Example Prediction Worksheet• Analysis of Prediction Worksheet• Notation for Actual – Prediction Comparisons Worksheet• Example Actual – Prediction Comparisons Worksheet• Analysis of Comparison Worksheet• Cumulative Failure Prediction Plots• Validation of Failure Count Predictions• Time to Failure Plot• Validation of Time to Failure Predictions• Conclusions• Excel Demo

3

Introduction

• CASRE and SMERFS, hereafter referred to as SRT (software reliability tools), were developed prior to the availability of mature spreadsheet programs.– Programs like Excel were not an option, but things

have changed.

• In Excel, the user can create equations, do data and statistical analysis, make plots, an do programming, using Visual Basic.

• In SRT, the programming of the models has been done for the user, but the functionality is fixed until the next revision.

4

Characteristics of Excel Implementation #1

• Advantages:– Almost all practitioners have Excel. A minority of

practitioners have SRT.– Easier for practitioners to use than SRT.– Typically, failure data is provided by practitioners in

Excel.– Improve technology transfer:

• Predictions can be made by the researcher in the spreadsheet and returned to the practitioner in the same spreadsheet.

– Formatted Excel data can be imported into Word and PowerPoint for creating reports and presentations.

5

Characteristics of Excel Implementation # 2

• Advantages:– User has more control over formatting of data,

prediction results, and plots.

– A large set of built-in mathematical and statistical functions are available for reliability analysis.

• SRT limited to functions like Chi-square.

– User can construct his own reliability equations.

• SRT equations are fixed, based on the models implemented.

– More flexibility in changing term in equations.

• Change cell values; copy and paste equations.

6


• Disadvantages:– Column and cell orientation of spreadsheets is

cumbersome.• It is not a natural mathematical format.• Need to repeat parameter entries for

iterations of equations.• Variable names are not case sensitive.• Variable names cannot be the same as

column or cell names.– Thus, some variables must renamed to avoid

naming conflicts.

7


• Disadvantages:– Mathematical library is not as extensive as

Fortran and C++ libraries used in SRT.– Does not have sophisticated model evaluation

criteria of SRT.• However, error analysis between actuals and

predictions (i.e., validation) can be done in Excel.

8

Combined Software Reliability Tools – Excel Approach

• Best approach may be to combine SRT with Excel.

• SRT provides model parameter estimation.

– Beyond the capabilities of Excel unless programmed in

Visual Basic.

– Copy and paste parameters from SRT into spreadsheet.

• Excel extends capabilities of SRT by allowing user

provided equations, statistical analysis, and plots.

9

Structure of Combined Approach

• Worksheets:– Definitions:

• Notation• Equations

– Predictions• Analysis

– Actual – Prediction Comparisons• Analysis• Plots• Validation

• Examples of this approach follow.

10

Notation for Prediction Worksheet

s starting interval for using observed failure data in parameter estimation

alpha: failure rate at the beginning of interval s

beta: negative of derivative of failure rate divided by failure rate t interval when time to next failure prediction made Xs-1 observed failure count in the range [1,s-1]. Xs,t observed failure count in the range [s,t] Xt observed failure count in the range [1,t] Ft given number of failures to occur after interval t TF(t) time to next failure(s) predicted at time t r(t) remaining failures predicted at time t T test or operational time D(T) cumulative number of failures detected at time T D(TL) cumulative number of failures detected over life of software TL

11

Equations for Prediction and Comparison Worksheets

Time to Next Failure(s) Predicted at Time t

(t) = [(log[ /( ( + ))]) / ] (t s+1)T X FF s,t t

Remaining Failures Predicted at Time t:

r(t) = (/) – Xs,t

Cumulative Number of Failures Detected at Time T:

D(T) = (α/β)[1 – exp (-β ((T –s + 1)))] + Xs-1

Cumulative Number of Failures Detected Over Life of Software TL:

D(TL) = / + Xs-1 References: [1, 2, 3].

12

Example Prediction Worksheet

Project s t Xs-1 Xst Xt Ft TF(t) r(t) T D(T) D (TL)

SatellitesProject 1 1 20.950000 0.15854 26 0 130 130 1 3.96 2.14 27 130.32 132.14

Project 2 12 3.447300 0.044796 23 39 32 71 5 2.63 44.96 23 71.00 115.9612 3.447300 0.044796 23 39 32 71 7 3.78 24 72.9712 3.447300 0.044796 23 39 32 71 9 4.99 25 74.8512 3.447300 0.044796 23 39 32 71 13 7.62 26 76.6512 3.447300 0.044796 23 39 32 71 14 8.33 27 78.3712 3.447300 0.044796 23 39 32 71 15 9.06 28 80.0212 3.447300 0.044796 23 39 32 71 17 10.60 29 81.6012 3.447300 0.044796 23 39 32 71 20 13.14 30 83.1012 3.447300 0.044796 23 39 32 71 31 84.5412 3.447300 0.044796 23 39 32 71 32 85.9212 3.447300 0.044796 23 39 32 71 33 87.2312 3.447300 0.044796 23 39 32 71 34 88.4912 3.447300 0.044796 23 39 32 71 35 89.69

13

Analysis of Prediction Worksheet # 1• s, , and obtained from SMERFS. • One interval = one week of calendar time.

• Project 1:– Optimal s = 1 for both failure count and time to failure

predictions.– t=26: interval when time to next failure prediction made

This is also the last interval of observed failure data. – X26 = 130: observed failure count in the range [1,26].– F1 = 1: given number of failures to occur after interval 26.– TF(26) = 3.96 intervals: time to next failure predicted at

time 26 intervals.

14

Analysis of Prediction Worksheet #2

• Project 1:– r(26) = 2.14: remaining failures predicted at time 26

intervals.

– T = 27 intervals: test time.

– D(27) = 130.32: cumulative number of failures detected at time 27 intervals.

– D() = 132.14: cumulative number of failures detected over life of software (conservatively, infinity).

• r(26) = D() - X26 = 132.14 – 130 = 2.14 remaining

failures, as in the above.

15


• Project 2:– Total range of 35 weeks divided into Parameter

Estimation Range = 1, 23 weeks and Prediction Range = 24, 35 weeks for the purpose of model validation.

• Model fit using historical data does not demonstrate validity!

– Estimate model parameters in range 1, 23 weeks.• Accuracy of future predictions demonstrates

validity.– Predict in range 24, 35 weeks and compare with

actuals.– Optimal s = 12 for both failure count and time to failure

predictions.

16


• Project 2:– t=23: interval when time to next failure prediction made

– X11 = 39: observed failure count in the range [1,11].

– X12,23 = 32: observed failure count in the range

[12,23].

– X23 = 71: observed failure count in the range [1,23].

– F1 = 5, …, 20: given number of failures to occur after

interval 23.

– TF(23) = 2.63, …, 13.14 intervals: time to next failures

predicted at time 23 intervals.

17


• Project 2:– r(23) = 44.96: remaining failures predicted at time 23

intervals.

– T = 23, …, 35 intervals: test time.

– D(23, …, 35) = 71.00, …, 89.69 cumulative number of failures detected at time 23, …, 35 intervals.

– D() = 115.96: cumulative number of failures detected over life of software (conservatively, infinity).

• r(23) = D() - X23 = 115.96 –71 = 44.96 remaining

failures, as in the above.

18

Notation for Actual – Prediction Comparisons Worksheet

• Parameter Estimation Range = 1, 23 weeks; Prediction Range = 24, 35 weeks; s = 12 weeks.

• D(T) Actual = Actual Cumulative Count, from Interval 1, in Prediction Range

• D(T) Pred = Predicted Cumulative Count, from Interval1, in Prediction Range

• Interval Actual = Difference in D(T) Actual• Interval Pred = Difference in D(T) Pred• Int Act Cum = Interval Actual Cumulative Count, from Interval 24, in

Prediction Range• Int Pred Cum = Interval Predicted Cumulative Count, from Interval 24, in

Prediction Range• TF(t) Actual = Actual Time to Next Given Number of Failures in the Int Act

Cum column• TF(t) Pred = Predicted Time to Next Given Number of Failures in the Int

Act Cum column

19

Example Actual – Prediction Comparisons Worksheet

Interval Failure Count D(T) Actual D(T) Pred Interval Actual Interval Pred Int Act Cum Int Pred CumTF(t) Actual TF(t) Pred

23 0 71 71.0024 0 71 72.97 0 1.97 0 1.9725 5 76 74.85 5 1.88 5 3.85 2 2.6326 0 76 76.65 0 1.80 5 5.65 2 2.6327 2 78 78.37 2 1.72 7 7.37 4 3.7828 2 80 80.02 2 1.65 9 9.02 5 4.9929 4 84 81.60 4 1.57 13 10.60 6 7.6230 0 84 83.10 0 1.51 13 12.10 6 7.6231 1 85 84.54 1 1.44 14 13.54 8 8.3332 1 86 85.92 1 1.38 15 14.92 9 9.0633 2 88 87.23 2 1.32 17 16.23 10 10.6034 3 91 88.49 3 1.26 20 17.49 11 13.1435 0 91 89.69 0 1.20 20 18.69 11 13.14

20

Analysis of Comparison Worksheet # 1

• Project 2– D(T) Actual is compared with D(T) Prediction.

• Failure counts are accumulated from Interval1in the parameter estimation range, but are compared in the prediction range.

– Interval Actual is compared with Interval Prediction.• Interval failure counts are compared in the

prediction range.– Int Act Cum is compared with Int Pred Cum.

• Interval failure counts are accumulated from Interval 24 in the prediction range and compared in the prediction range.

21


• Project 2• Make plots in prediction range:

– Actual and Predicted Cumulative Failures in Range 1, 35 Weeks.

– Actual and Predicted Cumulative Failures in Range 24,35 Weeks.

– Validation of Failure Count Predictions.• Residuals: (Predicted – Actual) versus week.

– Residuals do not show bias (i.e., trend in either positive or negative direction).

– Average Residual = -0.55 failures indicates optimistic prediction on average.

22

Cumulative Failures in Range 1, 35 Weeks:Parameter Estimation Range plus Prediction Range

Cumulative Failures in Range 1, 35 Weeks (Project 2)

70

75

80

85

90

95

22 24 26 28 30 32 34 36Week

Actual

Predicted

23

Cumulative Failures in Range 24,35 Weeks:Prediction Range

Cumulative Failures in Range 24, 35 Weeeks (Project 2)

0

5

10

15

20

25

22 24 26 28 30 32 34 36

Week

Actual

Predicted

24

Validation of Failure Count Predictions

Cumulative Failure Residuals: Predicted-Actual (Project 2)

-3.00

-2.00

-1.00

0.00

1.00

2.00

3.00

22 24 26 28 30 32 34 36

Week

Fai

lure

s

Average Residual = -0.55 failures

25


• Project 2• Make plot in prediction range:

– Actual and Predicted Time to Next Failures versus given number of failures.

– Validation of Time to Failure Predictions.• Residuals: (Predicted – Actual) versus given number

of failures.– Residuals show bias starting at 15 failures (week

32) as it becomes difficult to predict further out into the future.

– Average Residual = 0.87 weeks indicates optimistic prediction on average.

26

Time to Given Number of Failures

Time to Failure(s) (Project 2)

0

5

10

15

0 5 10 15 20 25Failures

Wee

ks

Actual

Predicted

27

Validation of Time to Failure Predictions

Time to Failures Residuals: Predicted-Actual (Project 2)

-1.00

0.00

1.00

2.00

3.00

0 5 10 15 20 25

Given Number of Failures

Wee

ks

Average Residual = 0.87 weeks

28

Conclusions

• Spreadsheet technology can effectively support software reliability modeling and prediction.

• Advantages relative to SRT are:– Easier transfer of technology to practitioners.– More user control of program’s operation.– Many built-in mathematical and statistical functions.

• Disadvantages relative to SRT are: – Cell format is not conducive to mathematical modeling.– No built-in model evaluation criteria.

• SRT and Excel can be combined to advantage:– SRT for reliability model parameter estimation.– Excel for reliability prediction.

29

References

• [1] Norman F. Schneidewind, "Reliability Modeling for Safety Critical Software", IEEE Transactions on Reliability, Vol. 46, No.1, March 1997, pp.88-98.

• [2] Norman F. Schneidewind, "Software Reliability Model with Optimal Selection of Failure Data", IEEE Transactions on Software Engineering, Vol. 19, No. 11, November 1993, pp. 1095-1104.

• [3] Norman F. Schneidewind and T. W. Keller, "Application of Reliability Models to the Space Shuttle", IEEE Software, Vol. 9, No. 4, July 1992 pp. 28-33.

1 using excel to implement software reliability models norman f. schneidewind naval postgraduate...

Documents

capabilities of excel

formatted excel data

srt software reliability

capabilities of srt

prediction worksheetequations

prediction results

failure data

iterations of equations