Satellite Applications Motivated by the Development of a Silver-Zinc Battery
Battery Performance Analysis Using Linear Regression
By: Leslie Gillespie-Marthaler
EMSE 271 December 18, 2009
SAMPLE REPORT
Introduction: Satellite manufacturers recently proposed replacing battery technology with a silver-zinc technology. Since satellite applications require reliable and long-lasting battery technology, the manufacturing association requested an analysis of the following:
1. Develop a model for linear regression based on battery performance data, using the Log of (Cycles to Failure); the model should be based on the best predictors available to characterize the behavior of the battery throughout its lifecycle;
2. Perform diagnostic analysis of the fitted model; and
3. Forecast the Cycles to Failure with a 95% confidence interval, using the model for the following independent variables: X1 = 1.5, X2 = 4.5, X3 = 50, X4 = 25, X5 = 2.
The table below provides the original battery performance data provided by the manufacturing association.
The Dependent Variable is:
- Cycles to Failure is the dependent variable (Y)
- The Log of (Cycles to Failure) is represented as Log(Y)
The Independent Variables are:
- Charge Rate (X1)
- Discharge Rate (X2)
- Depth of Discharge (X3)
- Temperature (X4)
- End of Charge (X5)
Table 1: Original Performance Data
Columns: Y = Cycles to Failure; Log(Y) = Log Cycles to Failure; X1 = Charge Rate (Amps); X2 = Discharge Rate (Amps); X3 = Depth of Discharge (% of rated ampere-hours); X4 = Temperature (Celsius); X5 = End of Charge (Volts)

Data  Y        Log(Y)  X1     X2     X3       X4      X5
1     101.000  2.004   0.375  3.130  60.000   40.000  2.000
2     141.000  2.149   1.000  3.130  76.800   30.000  1.990
3     96.000   1.982   1.000  3.130  60.000   20.000  2.000
4     125.000  2.097   1.000  3.130  60.000   20.000  1.980
5     43.000   1.633   1.625  3.130  43.200   10.000  2.010
6     16.000   1.204   1.625  3.130  60.000   20.000  2.000
7     188.000  2.274   1.625  3.130  60.000   20.000  2.020
8     10.000   1.000   0.375  5.000  76.800   10.000  2.010
9     3.000    0.477   1.000  5.000  43.200   10.000  1.990
10    386.000  2.587   1.000  5.000  43.200   30.000  2.010
11    45.000   1.653   1.000  5.000  100.000  20.000  2.000
12    2.000    0.301   1.625  5.000  76.800   10.000  1.990
13    76.000   1.881   0.375  1.250  76.800   10.000  2.010
14    78.000   1.892   1.000  1.250  43.200   10.000  1.990
15    160.000  2.204   1.000  1.250  76.800   30.000  2.000
16    3.000    0.477   1.000  1.250  60.000   0.000   2.000
17    216.000  2.334   1.625  1.250  43.200   30.000  1.990
18    73.000   1.863   1.625  1.250  60.000   20.000  2.000
19    314.000  2.497   0.375  3.130  76.800   30.000  1.990
20    170.000  2.230   0.375  3.130  60.000   20.000  2.000
When initially analyzing the performance data, the following observations were made concerning the Dependent Variable (Y) and its relationship with the Independent Variables (X1-5):
- There is large variability in the original cycles to failure (Y) data. In the histogram of the dependent variable (Y), most observations are concentrated at the low end, with a long tail toward high values (a right-skewed distribution). This could be problematic in conducting the regression analysis.
- The probability plot for this data reports a standard deviation (104.7) that is nearly as large as the mean (112.3).
These observations are displayed in the histogram and probability plot generated by Minitab below:
Figure 1: Histogram of Cycles to Failure (Y), with fitted normal curve (x-axis: Cycles to Failure; y-axis: Frequency; Mean 112.3, StDev 104.7, N 20)
Figure 2: Probability Plot of Cycles to Failure (Y), Normal - 95% CI (x-axis: Cycles to Failure; y-axis: Percent; Mean 112.3, StDev 104.7, N 20, AD 0.668, P-Value 0.069)
We would prefer a more normalized distribution for the dependent variable. When comparing the original dependent variable (Y) to the Log (Y), we do see some improvement in the distribution, indicating increased normality. The following observations were made when analyzing Log (Y):
- The standard deviation for Log cycles to failure is much smaller (0.6875 versus 104.7), but the P-value of the normality test has decreased (0.007 versus 0.069).
- In general, we would prefer to have a larger P-value in order to indicate greater normality of the distribution.
- At this point, it is difficult to discern the greater normality expressed by the Log (Y).
- For the purposes of this project (and to meet the client’s request), we will choose (Log cycles to failure) as the dependent variable for the regression model. Choosing the Log(Y) allows for clear interpretation in that constant changes to Log(Y) translate to constant percentage changes in Y.
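As a minimal sketch of the transformation discussed above, the following Python snippet applies a base-10 logarithm to the Cycles to Failure values from Table 1 and recomputes the summary statistics quoted on the Minitab plots. The variable and function names are illustrative and not part of the original analysis, which was carried out in Minitab and Excel.

```python
import math
import statistics

# Cycles to Failure (Y) for the 20 observations in Table 1.
cycles = [101, 141, 96, 125, 43, 16, 188, 10, 3, 386,
          45, 2, 76, 78, 160, 3, 216, 73, 314, 170]

# Base-10 logarithm, matching the Log(Y) column of Table 1.
log_cycles = [math.log10(y) for y in cycles]


def summarize(values):
    """Return (mean, sample standard deviation) of a list of numbers."""
    return statistics.mean(values), statistics.stdev(values)


mean_y, sd_y = summarize(cycles)
mean_log, sd_log = summarize(log_cycles)

print(f"Y:      mean = {mean_y:.1f}, stdev = {sd_y:.1f}")
print(f"Log(Y): mean = {mean_log:.3f}, stdev = {sd_log:.4f}")

# A constant step in Log(Y) is a constant multiplicative step in Y:
# adding 0.1 to Log(Y) multiplies Y by 10**0.1 (about 1.26) at any level.
step_factor = 10 ** 0.1
```

The last line illustrates the interpretation given above: because the model is fitted on Log(Y), a fixed change in the fitted value corresponds to a fixed percentage change in Cycles to Failure.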
These observations are displayed in the histogram and probability plot generated by Minitab below:
Figure 3: Histogram of Log Cycles to Failure (Log(Y)), with fitted normal curve (x-axis: Log Cycles to Failure; y-axis: Frequency; Mean 1.737, StDev 0.6875, N 20)
Figure 4: Probability Plot of Log Cycles to Failure (Log(Y))
Correlation Analysis: In order to determine the best predictors for the regression model, we completed a correlation analysis of the dependent variable Log(Y) and the independent variables (X1-5). The figure below displays the correlation strengths between the dependent and independent variables.
Figure 5: Correlation between Log(Y) and X1-5
Rows and columns: Log(Y) = Log Cycles to Failure; X1 = Charge Rate; X2 = Discharge Rate; X3 = Depth of Discharge; X4 = Temp; X5 = End of Charge

        Log(Y)        X1        X2        X3        X4        X5
Log(Y)  1
X1      -0.175377126  1
X2      -0.291453599  -0.08686  1
X3      -0.068901748  -0.31402  0.191942  1
X4      0.718930287   -0.13537  -0.00283  0.066934  1
X5      0.101140168   0.007163  0.064439  0.019973  -0.11434  1
The threshold chosen to indicate significant correlation is an absolute value of 0.19. The highlighted values in the original figure represent significant correlation. Based on these findings, we should keep the following independent variables as best predictors for the regression model: (X2) Discharge Rate, (X3) Depth of Discharge, and (X4) Temperature.
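The correlation coefficients in Figure 5 can be checked with a short, self-contained Pearson correlation routine. The sketch below recomputes only the Log(Y) correlations with Discharge Rate (X2) and Temperature (X4) from the Table 1 data; the helper name `pearson` is an illustrative choice, and the original figure was produced in Excel.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Data from Table 1, in table order.
cycles = [101, 141, 96, 125, 43, 16, 188, 10, 3, 386,
          45, 2, 76, 78, 160, 3, 216, 73, 314, 170]
discharge_rate = [3.13] * 7 + [5.0] * 5 + [1.25] * 6 + [3.13] * 2   # X2
temperature = [40, 30, 20, 20, 10, 20, 20, 10, 10, 30,
               20, 10, 10, 10, 30, 0, 30, 20, 30, 20]                # X4
log_y = [math.log10(y) for y in cycles]

r_x2 = pearson(log_y, discharge_rate)
r_x4 = pearson(log_y, temperature)

print(f"corr(Log(Y), X2) = {r_x2:.4f}")
print(f"corr(Log(Y), X4) = {r_x4:.4f}")
```

Temperature stands out as the only predictor with a strong linear relationship to Log(Y), which agrees with its dominant role in the regression results that follow.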
Initial Regression Analysis: Based on this decision, we then move forward with regression analysis using the informed outcome from the correlation analysis. The results of the initial regression analysis are displayed below.
[Figure 4 plot: Probability Plot of Log Cycles to Failure, Normal - 95% CI (x-axis: Log Cycles to Failure; y-axis: Percent; Mean 1.737, StDev 0.6875, N 20, AD 1.046, P-Value 0.007)]
Figure 6: Initial Regression Analysis for Log(Y) and X2, X3, X4

Regression Statistics
Multiple R          0.778
R Square            0.605
Adjusted R Square   0.530
Standard Error      0.471
Observations        20

ANOVA
            df  SS     MS     F      Significance F (P-value)
Regression  3   5.429  1.810  8.154  0.002
Residual    16  3.551  0.222
Total       19  8.980

F-value is moderately high
P-value is moderately low

Log(Y)                   Coefficients  Standard Error  t Stat  P-value  Lower 95%  Upper 95%  VIF (from Minitab)
Intercept                1.352         0.510           2.651   0.017    0.271      2.434
(X2) Discharge rate      -0.134        0.077           -1.730  0.103    -0.298     0.030      1.039
(X3) Depth of discharge  -0.003        0.007           -0.399  0.695    -0.018     0.012      1.043
(X4) Temperature         0.050         0.011           4.584   0.000    0.027      0.073      1.005

The P-value for depth of discharge is high, which indicates that we may want to discard.
The P-value for discharge rate is also high, which indicates that we may want to discard.
The regression equation is:
Log Cycles to Failure = 1.35 - 0.134 Discharge Rate - 0.00285 Depth of Discharge + 0.0497 Temperature
The observations resulting from the initial regression results above are as follows:
- The variance inflation factor (VIF) values obtained from Minitab for each independent variable are all close to 1, so there is little to no collinearity among the independent variables and the coefficient estimates are considered stable.
- The R-Squared = 60.5%, which is moderately high. Ultimately, we would like a higher R-Squared value, indicating increased “goodness of fit” for the model.
- The Durbin-Watson statistic = 2.02425, indicating little to no autocorrelation among observations.
- The F-value (8.154) is moderately high, but not very high. Ultimately, we would prefer a higher F-value.
- The overall P-value (Significance F = 0.002) is low, but not extremely low. Ultimately, we would prefer a lower P-value that is closer to zero.
- When looking at the individual P-values for the independent variables, we see that X2 (0.103) and X3 (0.695) have high P-values. In particular, the P-value for X3 is very high. This indicates that we may want to consider discarding X3 from the model.
- The residual analysis appears to support the assumption of normality for residuals.
- The normal probability plot of the residuals shows some deviation from normality. However, the deviations do not invalidate the assumption of normality for the residuals.
- We do see a high P-value for the residuals probability plot (0.243), which indicates goodness of fit for the normality test.
- There is 1 influential observation (outlier) identified within the probability plot for the residuals. This observation may require review or possible removal.
- There is no apparent heteroscedasticity in the plot of the residuals versus fitted values for Log(Y). So, there is evidence to support constant variance in residuals.
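The coefficient estimates, R-Squared, and Durbin-Watson statistic discussed above can be approximately reproduced from the Table 1 data. The following is a minimal sketch, assuming ordinary least squares solved through the normal equations with plain Gaussian elimination; the helper names (`solve`, `fit_ols`) are illustrative, and the original results came from Excel and Minitab, so small rounding differences are to be expected.

```python
import math

# Data from Table 1 (20 observations, in table order).
cycles = [101, 141, 96, 125, 43, 16, 188, 10, 3, 386,
          45, 2, 76, 78, 160, 3, 216, 73, 314, 170]
discharge_rate = [3.13] * 7 + [5.0] * 5 + [1.25] * 6 + [3.13] * 2          # X2
depth_of_discharge = [60, 76.8, 60, 60, 43.2, 60, 60, 76.8, 43.2, 43.2,
                      100, 76.8, 76.8, 43.2, 76.8, 60, 43.2, 60, 76.8, 60]  # X3
temperature = [40, 30, 20, 20, 10, 20, 20, 10, 10, 30,
               20, 10, 10, 10, 30, 0, 30, 20, 30, 20]                        # X4
log_y = [math.log10(y) for y in cycles]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(predictors, y):
    """Fit y = b0 + b1*x1 + ... by least squares; return (coefficients, residuals)."""
    rows = [[1.0] + list(values) for values in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return beta, residuals


beta, residuals = fit_ols([discharge_rate, depth_of_discharge, temperature], log_y)

mean_log_y = sum(log_y) / len(log_y)
ss_total = sum((v - mean_log_y) ** 2 for v in log_y)
ss_error = sum(e ** 2 for e in residuals)
r_squared = 1.0 - ss_error / ss_total

# Durbin-Watson: squared successive residual differences over squared residuals.
durbin_watson = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals))) / ss_error

print("coefficients (intercept, X2, X3, X4):", [round(b, 4) for b in beta])
print(f"R-Squared = {r_squared:.3f}, Durbin-Watson = {durbin_watson:.3f}")
```

A Durbin-Watson value near 2 indicates little first-order autocorrelation in the residuals; values toward 0 or 4 would suggest positive or negative autocorrelation, respectively. Note that the statistic depends on the order of the observations, which is taken here to be the order of Table 1.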
The following figures support the observations listed above:
Figure 7: Residual Plots for Log (Y) (four-panel plot of residuals for Log Cycles to Failure: Normal Probability Plot, Percent versus Residual; Versus Fits, Residual versus Fitted Value; Histogram, Frequency versus Residual; Versus Order, Residual versus Observation Order)
Figure 8: Probability Plot of Residuals (RESI5), Normal - 95% CI (y-axis: Percent; Mean -4.99600E-16, StDev 0.471, N 20, AD 0.654, P-Value 0.243)
Figure 9: Plot of Residuals versus Fitted Values for Log(Y) (RESI5 versus FITS5)
Diagnostic Analysis: Analysis of the initial regression model indicates that the model described in the following regression equation is within reason:
Log Cycles to Failure = 1.35 - 0.134 Discharge Rate - 0.00285 Depth of Discharge + 0.0497 Temperature
The analysis of the residuals versus fitted values indicates that the majority of the values fall within expected thresholds. Only observation 1 looks somewhat suspicious, but this is not enough to warrant invalidation of the model.
To determine where we may want to focus in order to improve the existing model, we look at the interaction between the independent variables when plotted against the dependent variable. The graph in Figure 10 suggests an interaction between temperature and discharge rate, and therefore that an additional independent variable may be needed to better express the relationship between the dependent and independent variables.
Figure 10: Plot of Log(Y) Versus Independent Variables (scatterplot of Log Cycles to Failure versus X variables; plotted variables: Temperature, Discharge Rate, Depth of Discharge)
Based on the interaction effect between Temperature and Discharge Rate, we decided to add another independent variable (Xnew): Temp*Discharge Rate.
The following table displays the new independent variable along with the other remaining independent variables that comprise the regression model. We will now refer to the following analysis as an adjusted model based on the addition of the new independent variable.
Table 2: Adjusted Model Variables
Columns: Log(Y) = Log Cycles to Failure; X2 = Discharge Rate (Amps); X3 = Depth of Discharge (% of rated ampere-hours); X4 = Temperature (Celsius); X(new) = New I.V. Temp*Discharge Rate

Log(Y)  X2     X3       X4      X(new)
2.004   3.130  60.000   40.000  125.200
2.149   3.130  76.800   30.000  93.900
1.982   3.130  60.000   20.000  62.600
2.097   3.130  60.000   20.000  62.600
1.633   3.130  43.200   10.000  31.300
1.204   3.130  60.000   20.000  62.600
2.274   3.130  60.000   20.000  62.600
1.000   5.000  76.800   10.000  50.000
0.477   5.000  43.200   10.000  50.000
2.587   5.000  43.200   30.000  150.000
1.653   5.000  100.000  20.000  100.000
0.301   5.000  76.800   10.000  50.000
1.881   1.250  76.800   10.000  12.500
1.892   1.250  43.200   10.000  12.500
2.204   1.250  76.800   30.000  37.500
0.477   1.250  60.000   0.000   0.000
2.334   1.250  43.200   30.000  37.500
1.863   1.250  60.000   20.000  25.000
2.497   3.130  76.800   30.000  93.900
2.230   3.130  60.000   20.000  62.600
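The new column in Table 2 is simply the row-by-row product of Temperature and Discharge Rate. The short sketch below rebuilds it from the Table 1 values; the variable and function names are illustrative only. It also shows what the interaction term means in a fitted model: the effect of one more degree of temperature on Log(Y) is no longer a single number but depends on the discharge rate.

```python
# Discharge Rate (X2) and Temperature (X4) from Table 1, in table order.
discharge_rate = [3.13] * 7 + [5.0] * 5 + [1.25] * 6 + [3.13] * 2
temperature = [40, 30, 20, 20, 10, 20, 20, 10, 10, 30,
               20, 10, 10, 10, 30, 0, 30, 20, 30, 20]

# Interaction term X(new) = Temperature * Discharge Rate for each observation.
temp_x_discharge = [t * d for t, d in zip(temperature, discharge_rate)]


def temperature_slope(b_temperature, b_interaction, discharge):
    """Change in Log(Y) per degree Celsius at a given discharge rate,
    for a model containing both a temperature term and the interaction term."""
    return b_temperature + b_interaction * discharge


for row in temp_x_discharge[:3]:
    print(f"{row:.3f}")
```

For example, with hypothetical coefficients of 0.02 on Temperature and 0.01 on the interaction term, `temperature_slope(0.02, 0.01, 5.0)` gives 0.07, while at a discharge rate of 1.25 it gives 0.0325.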
Adjusted Regression Analysis: With the addition of the new independent variable, we now analyze the regression results to determine whether or not the added variable improves the model fit.
The figure below provides the results from the adjusted regression analysis:
Figure 11: Adjusted Regression Analysis for Log(Y) and X2, X3, X4, and Xnew1

Regression Statistics
Multiple R          0.868
R Square            0.754 (original 60.5, adjusted 75.4)
Adjusted R Square   0.683
Standard Error      0.396
Observations        19

Adjusted R2 is higher than original
Durbin-Watson statistic = 1.91397
P-Value is lower

The regression equation is
Log Cycles to Failure = 1.77 - 0.305 Discharge Rate - 0.00224 Depth of Discharge + 0.0213 Temperature + 0.0104 Temp*Discharge Rate

ANOVA
            df  SS     MS     F       Significance F (P-value)
Regression  4   6.710  1.677  10.701  0.00035
Residual    14  2.195  0.157
Total       18  8.905

                         Coefficients  Standard Error  t Stat  P-value  Lower 95%  Upper 95%
Intercept                1.766         0.509           3.467   0.004    0.674      2.858
(X2) Discharge rate      -0.346        0.130           -2.655  0.019    -0.626     -0.067
(X3) Depth of discharge  -0.003        0.006           -0.577  0.573    -0.016     0.009
(X4) Temperature         0.026         0.021           1.268   0.226    -0.018     0.071
Temp*DischRate           0.013         0.007           1.928   0.074    -0.001     0.028

P-values for X3 and X4 are still high
The adjusted regression equation is:
Log Cycles to Failure = 1.77 - 0.305 Discharge Rate - 0.00224 Depth of Discharge + 0.0213 Temperature + 0.0104 Temp*Discharge Rate
The observations resulting from the adjusted regression results above are as follows:
- The R-Squared value has increased from 60.5% to 75.4%, indicating better fit.
- The Adjusted R-Squared value has increased from 53% to 68.3%, indicating better fit.
- The P-Value has decreased from .002 to .00035, indicating better fit.
- The F-Value has increased from 8.154 to 10.701, indicating better fit.
- The Durbin-Watson statistic = 1.91397, which is still close enough to 2 to indicate little to no autocorrelation.
- When looking at the individual P-values for the independent variables, X3 and X4 have high P-values. In particular, the P-value for X3 is very high. This indicates that we may want to consider discarding X3 from the model.
- The residual analysis appears to support the assumption of normality for residuals.
- The normal probability plot of the residuals shows some deviation from normality. However, the deviations do not invalidate the assumption of normality for the residuals.
- We still see a high P-value, although lower than that for the original model, for the residuals probability plot (0.136), which indicates goodness of fit for the normality test.
- There is 1 influential observation (outlier) identified within the probability plot for the residuals.
- There is no apparent heteroscedasticity in the plot of the residuals versus fitted values for Log(Y).
The following figures support the observations listed above:
Figure 12: Adjusted Residual Plots for Log (Y) (four-panel plot of residuals for Log Cycles to Failure: Normal Probability Plot, Percent versus Residual; Versus Fits, Residual versus Fitted Value; Histogram, Frequency versus Residual; Versus Order, Residual versus Observation Order)
Figure 13: Adjusted Probability Plot of Residuals (RESI6), Normal - 95% CI (y-axis: Percent; Mean -9.10383E-16, StDev 0.471, N 20, AD 0.813, P-Value 0.136)
Figure 14: Adjusted Plot of Residuals versus Fitted Values for Log(Y) (scatterplot of RESI6 versus FITS6)
Overall, the addition of the new independent variable (Xnew1) Temp*Discharge Rate results in a better model fit.
Test for Adding a Second New Independent Variable: We then ask the question, would adding another independent variable result in additional improvement to the regression model? We can test this by adding another new independent variable and then comparing the regression results to the previous results. In particular, we will be looking at the R-Squared and Adjusted R-Squared values to determine whether or not an additional variable improves the model.
Thus, we test the addition of (Xnew2) Discharge Rate *Depth of Discharge. The addition of this new independent variable is displayed in the table below:
Table 3: Test for Addition of Second New Independent Variable
Columns: Log(Y) = Log Cycles to Failure; X2 = Discharge Rate (Amps); X3 = Depth of Discharge (% of rated ampere-hours); X4 = Temperature (Celsius); X(new1) = New I.V. Temp*Discharge Rate; X(new2) = New I.V. Discharge Rate*Depth of Discharge

Log(Y)  X2     X3       X4      X(new1)  X(new2)
2.004   3.130  60.000   40.000  125.200  187.8
2.149   3.130  76.800   30.000  93.900   240.384
1.982   3.130  60.000   20.000  62.600   187.8
2.097   3.130  60.000   20.000  62.600   187.8
1.633   3.130  43.200   10.000  31.300   135.216
1.204   3.130  60.000   20.000  62.600   187.8
2.274   3.130  60.000   20.000  62.600   187.8
1.000   5.000  76.800   10.000  50.000   384
0.477   5.000  43.200   10.000  50.000   216
2.587   5.000  43.200   30.000  150.000  216
1.653   5.000  100.000  20.000  100.000  500
0.301   5.000  76.800   10.000  50.000   384
1.881   1.250  76.800   10.000  12.500   96
1.892   1.250  43.200   10.000  12.500   54
2.204   1.250  76.800   30.000  37.500   96
0.477   1.250  60.000   0.000   0.000    75
2.334   1.250  43.200   30.000  37.500   54
1.863   1.250  60.000   20.000  25.000   75
2.497   3.130  76.800   30.000  93.900   240.384
2.230   3.130  60.000   20.000  62.600   187.8
The results from the regression analysis with the second new independent variable are displayed in the following figure.
Figure 15: Adjusted Regression Analysis for Log(Y) and X2, X3, X4, Xnew1 and Xnew2

Regression Statistics
Multiple R          0.804
R Square            0.646
Adjusted R Square   0.520
Standard Error      0.477
Observations        20

ANOVA
            df  SS     MS     F      Significance F
Regression  5   5.801  1.160  5.109  0.007
Residual    14  3.179  0.227
Total       19  8.980

           Coefficients  Standard Error  t Stat  P-value  Lower 95%  Upper 95%
Intercept  1.928         1.215           1.586   0.135    -0.678     4.534
X2         -0.350        0.340           -1.029  0.321    -1.078     0.379
X3         -0.005        0.018           -0.257  0.801    -0.044     0.035
X4         0.021         0.025           0.847   0.411    -0.032     0.075
X(new1)    0.011         0.008           1.280   0.221    -0.007     0.028
X(new2)    0.001         0.005           0.147   0.885    -0.009     0.010

Adding another independent variable does not improve the model fit.
The R2 value decreases and the adjusted R2 also decreases, indicating that we should not add another variable.
Further supporting this decision, the F-value has decreased and the P-value has increased.
We will not add the second new independent variable.
Very quickly, we can see that adding a second new independent variable does not improve the regression model. The following observations substantiate this conclusion:
- Both the R-Squared and Adjusted R-Squared values decrease, indicating that we should not add another variable.
- Further supporting this decision, the F-value has decreased and the P-value has increased.
We will not add the second independent variable.
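The Adjusted R-Squared values used in these comparisons follow from R-Squared, the number of observations, and the number of predictors. The sketch below applies the standard formula to the figures reported above; the function name is illustrative, and the inputs are the rounded values from Figures 6, 11, and 15 (including the 19 observations reported in Figure 11), so results agree only to about three decimals.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R-Squared: penalizes R-Squared for the number of predictors."""
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


# (label, R-Squared, observations, predictors) as reported in the figures.
models = [
    ("Figure 6: X2, X3, X4", 0.605, 20, 3),
    ("Figure 11: X2, X3, X4, Xnew1", 0.754, 19, 4),
    ("Figure 15: X2, X3, X4, Xnew1, Xnew2", 0.646, 20, 5),
]

for label, r2, n, p in models:
    print(f"{label}: adjusted R-Squared = {adjusted_r_squared(r2, n, p):.3f}")
```

Because the penalty grows with the number of predictors, Adjusted R-Squared can fall when a variable with little explanatory power is added, which is the behavior relied on in the test above.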
Test for Removing an Independent Variable: We now consider the effect that removing an independent variable would have on the regression model. In general, we want a model that captures the total variance with the fewest independent variables. Would removal of an existing variable improve the regression model?
Specifically, we look at removing X3 (Depth of Discharge) because it originally had a very high P-value when compared to the other independent variables. The table below displays the dependent and independent variables when X3 is removed.
Table 4: Test for Removal of Independent Variable X3
Columns: Log(Y) = Log Cycles to Failure; X2 = Discharge Rate (Amps); X4 = Temperature (Celsius); X(new1) = New I.V. Temp*Discharge Rate

Log(Y)  X2     X4      X(new1)
2.004   3.130  40.000  125.200
2.149   3.130  30.000  93.900
1.982   3.130  20.000  62.600
2.097   3.130  20.000  62.600
1.633   3.130  10.000  31.300
1.204   3.130  20.000  62.600
2.274   3.130  20.000  62.600
1.000   5.000  10.000  50.000
0.477   5.000  10.000  50.000
2.587   5.000  30.000  150.000
1.653   5.000  20.000  100.000
0.301   5.000  10.000  50.000
1.881   1.250  10.000  12.500
1.892   1.250  10.000  12.500
2.204   1.250  30.000  37.500
0.477   1.250  0.000   0.000
2.334   1.250  30.000  37.500
1.863   1.250  20.000  25.000
2.497   3.130  30.000  93.900
2.230   3.130  20.000  62.600
The results from the regression analysis with X3 removed are displayed in the following figure.
Figure 16: Adjusted Regression Analysis for Log(Y) and X2, X4, Xnew1 (X3 Removed)

Regression Statistics
Multiple R          0.802
R Square            0.643
Adjusted R Square   0.576
Standard Error      0.448
Observations        20

ANOVA
            df  SS     MS     F      Significance F
Regression  3   5.774  1.925  9.607  0.001
Residual    16  3.205  0.200
Total       19  8.980

           Coefficients  Standard Error  t Stat  P-value  Lower 95%  Upper 95%
Intercept  1.659         0.459           3.616   0.002    0.686      2.631
X2         -0.312        0.145           -2.162  0.046    -0.619     -0.006
X4         0.021         0.023           0.883   0.390    -0.029     0.070
X(new1)    0.011         0.008           1.379   0.187    -0.006     0.027

Removing X3 does not improve the model fit.
The R2 value decreases and the adjusted R2 also decreases, indicating that we should not remove the variable.
Further supporting this decision, the F-value has decreased and the P-value has increased.
We will not remove X3.
Again, we can quickly determine that removing X3 does not result in a better model fit. The following observations substantiate this conclusion:
- The R2 value decreases and the adjusted R2 also decreases, indicating that we should not remove the variable.
- Further supporting this decision, the F-value has decreased and the P-value has increased.
Best Regressions Model Fit: We can also test for improved fit in the adjusted model by comparing it to the original.
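Test for model improvement:
R2f = 0.754 (adjusted model)
R2r = 0.605 (original model)
dff = 14 (error degrees of freedom, adjusted model)
dfr = 16 (error degrees of freedom, original model)
F = [(R2f - R2r) / (dfr - dff)] / [(1 - R2f) / dff]
F = 4.239837
F(.05,2,14) = 3.74
F > F(.05,2,14)
Since F exceeds the critical value, the improvement of the adjusted model over the original is significant at the 5% level.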
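The test statistic above can be reproduced directly from the quoted R-Squared values and error degrees of freedom. The sketch below is illustrative only: the function names are not from the original analysis, and the critical value and tail probability use a closed form of the F distribution that holds only when the numerator degrees of freedom equal 2, as they do here (dfr - dff = 2).

```python
def partial_f(r2_full, r2_reduced, df_full, df_reduced):
    """F statistic comparing a full model against a reduced model,
    using each model's R-Squared and error degrees of freedom."""
    numerator = (r2_full - r2_reduced) / (df_reduced - df_full)
    denominator = (1.0 - r2_full) / df_full
    return numerator / denominator


def f_upper_tail_df1_2(x, df2):
    """P(F > x) for an F distribution with 2 numerator degrees of freedom.
    Closed form valid only for numerator df = 2."""
    return (1.0 + 2.0 * x / df2) ** (-df2 / 2.0)


def f_critical_df1_2(alpha, df2):
    """Upper-alpha critical value of F(2, df2), obtained by inverting the tail formula."""
    return (df2 / 2.0) * (alpha ** (-2.0 / df2) - 1.0)


f_stat = partial_f(r2_full=0.754, r2_reduced=0.605, df_full=14, df_reduced=16)
f_crit = f_critical_df1_2(alpha=0.05, df2=14)
p_value = f_upper_tail_df1_2(f_stat, df2=14)

print(f"F = {f_stat:.6f}")
print(f"F(.05, 2, 14) = {f_crit:.3f}")
print(f"upper-tail probability of F = {p_value:.4f}")
```

For other numerator degrees of freedom, a statistical table or a statistics library would be needed in place of the two closed-form helpers.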
At this point, we can say with confidence that the best fit model is the one represented by the following equation:
Log Cycles to Failure = 1.77 - 0.305 Discharge Rate - 0.00224 Depth of Discharge + 0.0213 Temperature + 0.0104 Temp*Discharge Rate
With Dependent Variable Log(Y)
And Independent Variables:
- Discharge Rate (X2)
- Depth of Discharge (X3)
- Temperature (X4)
- Temperature*Discharge Rate (X(new1))
Forecasting Dependent Variable Values: Provided values for each of the independent variables are below:
Charge Rate (X1) = 1.5
Discharge Rate (X2) = 4.5
Depth of Discharge (X3) = 50
Temperature (X4) = 25
End of Charge (X5) = 2
Temperature*Discharge Rate (Xnew1) = 25*4.5 = 112.5
XT0 = (1, 4.5, 50, 25, 112.5)
bhatT = (1.77, -.305, -.002, +.021, +.010)
yhat = 1.988
s = .396
Adjusted Model:
Log Cycles to Failure = 1.77 - 0.305 Discharge Rate - 0.00224 Depth of Discharge + 0.0213 Temperature + 0.0104 Temp*Discharge Rate
Log Cycles to Failure = 1.988
Cycles to Failure = 10^1.988
Cycles to Failure = 97.27
In addition, we can look at what the original model would have forecast and compare this with the output from the adjusted model.
Original Model:
Log Cycles to Failure = - 27.7 - 0.199 Charge Rate - 0.142 Discharge Rate - 0.00483 Depth of Discharge + 0.0503 Temperature + 14.7 End of Charge
Log Cycles to Failure = 1.7785
Cycles to Failure = 10^1.779
Cycles to Failure = 60.12
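The adjusted-model forecast can be checked by evaluating the regression equation at the requested settings and converting back from the log scale. This is a minimal sketch using the rounded coefficients printed in the equation; the dictionary keys and function name are illustrative.

```python
# Coefficients of the adjusted regression equation, as printed in the report.
coefficients = {
    "intercept": 1.77,
    "discharge_rate": -0.305,
    "depth_of_discharge": -0.00224,
    "temperature": 0.0213,
    "temp_x_discharge": 0.0104,
}


def forecast_log_cycles(discharge_rate, depth_of_discharge, temperature):
    """Predicted Log(Y) from the adjusted model, including the interaction term."""
    return (coefficients["intercept"]
            + coefficients["discharge_rate"] * discharge_rate
            + coefficients["depth_of_discharge"] * depth_of_discharge
            + coefficients["temperature"] * temperature
            + coefficients["temp_x_discharge"] * temperature * discharge_rate)


# Requested settings: X2 = 4.5, X3 = 50, X4 = 25 (X1 and X5 are not in this model).
log_hat = forecast_log_cycles(discharge_rate=4.5, depth_of_discharge=50, temperature=25)
cycles_hat = 10 ** log_hat  # undo the base-10 log transform

print(f"Log Cycles to Failure = {log_hat:.3f}")
print(f"Cycles to Failure     = {cycles_hat:.2f}")
```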
We must now determine the 95% prediction interval and look at the estimated values from the adjusted model.

XTX
20.000    60.670    1256.800   390.000    1182.300
60.670    222.547   3892.784   1182.300   4213.599
1256.800  3892.784  83520.640  24704.000  75887.200
390.000   1182.300  24704.000  9500.000   28215.000
1182.300  4213.599  75887.200  28215.000  97632.950

(XTX)-1
1.655   -0.249  -0.012  -0.042  0.012
-0.249  0.107   -0.001  0.013   -0.005
-0.012  -0.001  0.000   0.000   0.000
-0.042  0.013   0.000   0.003   -0.001
0.012   -0.005  0.000   -0.001  0.000
XT0*(XTX)-1*X0 = 0.243711
sigma hat y = s*sqrt(1 + XT0*(XTX)-1*X0) = 0.441626
95% Prediction Interval:
XT0bhat = 1.988
df error = 14
t14, .975 = 2.144787
logUB = 2.93609
logLB = 1.04991
UB = 862.9785
LB = 11.22018
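The interval above follows the usual prediction-interval recipe: the point forecast plus or minus the t critical value times the standard error of prediction. The sketch below recomputes it from the rounded inputs quoted in the report, taking the t value as given; variable names are illustrative. Because the inputs are rounded, the bounds agree with the reported ones only approximately.

```python
import math

y_hat = 1.988          # point forecast of Log(Y), XT0 * bhat
s = 0.396              # standard error of the regression
leverage = 0.243711    # XT0 * (XTX)^-1 * X0
t_crit = 2.144787      # t(14, 0.975), taken from the report

# Standard error for predicting a single new observation.
se_prediction = s * math.sqrt(1.0 + leverage)

half_width = t_crit * se_prediction
log_lower = y_hat - half_width
log_upper = y_hat + half_width

# Convert the bounds from the log scale back to Cycles to Failure.
lower = 10 ** log_lower
upper = 10 ** log_upper

print(f"standard error of prediction = {se_prediction:.6f}")
print(f"Log(Y) interval: ({log_lower:.3f}, {log_upper:.3f})")
print(f"Cycles interval: ({lower:.1f}, {upper:.1f})")
```

Note that the interval is symmetric on the log scale but strongly asymmetric once converted back to cycles, which is why the upper bound lies so much further from the forecast than the lower bound.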
Log Cycles to Failure (1.988) is within the 95% prediction interval. Cycles to Failure (97.27) is within the 95% prediction interval.
Conclusions and Recommendation: The prediction interval for cycles to failure is very wide. This is consistent with actual battery performance in space environments, which shows large deviations due to changes in temperature. Using the adjusted model, the predicted number of cycles to failure is 97.27. This is higher than the 60.12 cycles to failure predicted by the original model. Assuming that the adjusted model is a better fit and therefore provides a better estimate for the dependent variable, it might be said that the original model could have resulted in higher costs or lower sales based on the assumption that the battery would fail in fewer cycles. The adjusted model indicates that the battery lifecycle is longer than would otherwise be expected. It can be assumed that satellite purchasers would show preference for battery technology that provides a longer life and fewer replacements or upgrades. Therefore, the adjusted model is the recommended model for the manufacturing association.