10-1 complete business statistics by amir d. aczel & jayavel sounderpandian 6 th edition (sie)
TRANSCRIPT
10-1
COMPLETE BUSINESS STATISTICS
by
AMIR D. ACZEL
&
JAYAVEL SOUNDERPANDIAN
6th edition (SIE)
10-2
Chapter 10
Simple Linear Regression and Correlation
10-3
Simple Linear Regression and Correlation (Chapter 10)
• Using Statistics
• The Simple Linear Regression Model
• Estimation: The Method of Least Squares
• Error Variance and the Standard Errors of Regression Estimators
• Correlation
• Hypothesis Tests about the Regression Relationship
• How Good is the Regression?
• Analysis of Variance Table and an F Test of the Regression Model
• Residual Analysis and Checking for Model Inadequacies
• Use of the Regression Model for Prediction
• The Solver Method for Regression
10-4
LEARNING OBJECTIVES
After studying this chapter, you should be able to:
• Determine whether a regression experiment would be useful in a given instance
• Formulate a regression model
• Compute a regression equation
• Compute the covariance and the correlation coefficient of two random variables
• Compute confidence intervals for regression coefficients
• Compute a prediction interval for the dependent variable
10-5
LEARNING OBJECTIVES (continued)
After studying this chapter, you should be able to:
• Test hypotheses about regression coefficients
• Conduct an ANOVA experiment using regression results
• Analyze residuals to check whether the assumptions about the regression model are valid
• Solve regression problems using spreadsheet templates
• Apply the covariance concept to linear composites of random variables
• Use the LINEST function to carry out a regression
10-6
10-1 Using Statistics
• Regression refers to the statistical technique of modeling the relationship between variables.
• In simple linear regression, we model the relationship between two variables.
• One of the variables, denoted by Y, is called the dependent variable, and the other, denoted by X, is called the independent variable.
• The model we will use to depict the relationship between X and Y is a straight-line relationship.
• A graphical sketch of the pairs (X, Y) is called a scatter plot.
10-7
This scatterplot locates pairs of observations of advertising expenditures on the x-axis and sales on the y-axis. We notice that:
Larger (smaller) values of sales tend to be associated with larger (smaller) values of advertising.
[Figure: Scatterplot of Advertising Expenditures (X) and Sales (Y). Horizontal axis: Advertising, 0 to 50. Vertical axis: Sales, 0 to 140.]
The scatter of points tends to be distributed around a positively sloped straight line.
The pairs of values of advertising expenditures and sales are not located exactly on a straight line.
The scatter plot reveals a more or less strong tendency rather than a precise linear relationship.
The line represents the nature of the relationship on average.
10-1 Using Statistics
10-8
Examples of Other Scatterplots
[Figure: Several panels, each plotting Y against X.]
10-9
Model Building
The inexact nature of the relationship between advertising and sales suggests that a statistical model might be useful in analyzing the relationship.
A statistical model separates the systematic component of a relationship from the random component.
[Diagram: Data → Statistical model → Systematic component + Random errors]
In ANOVA, the systematic component is the variation of means between samples or treatments (SSTR) and the random component is the unexplained variation (SSE).
In regression, the systematic component is the overall linear relationship, and the random component is the variation around the line.
10-10
10-2 The Simple Linear Regression Model
The population simple linear regression model:
$Y = \beta_0 + \beta_1 X + \varepsilon$
Here $\beta_0 + \beta_1 X$ is the nonrandom or systematic component and $\varepsilon$ is the random component, where
• Y is the dependent variable, the variable we wish to explain or predict;
• X is the independent variable, also called the predictor variable;
• $\varepsilon$ is the error term, the only random component in the model and thus the only source of randomness in Y;
• $\beta_0$ is the intercept of the systematic component of the regression relationship;
• $\beta_1$ is the slope of the systematic component.
The conditional mean of Y: $E[Y \mid X] = \beta_0 + \beta_1 X$
10-11
Picturing the Simple Linear Regression Model
The simple linear regression model gives an exact linear relationship between the expected or average value of Y, the dependent variable, and X, the independent or predictor variable: $E[Y_i] = \beta_0 + \beta_1 X_i$
Actual observed values of Y differ from the expected value by an unexplained or random error:
$Y_i = E[Y_i] + \varepsilon_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
[Figure: Regression Plot. The line $E[Y] = \beta_0 + \beta_1 X$ with intercept $\beta_0$ and slope $\beta_1$ (the rise per unit increase in X); the error $\varepsilon_i$ is the vertical distance between the observed $Y_i$ and the line at $X_i$.]
10-12
Assumptions of the Simple Linear Regression Model
• The relationship between X and Y is a straight-line relationship.
• The values of the independent variable X are assumed fixed (not random); the only randomness in the values of Y comes from the error term $\varepsilon_i$.
• The errors $\varepsilon_i$ are normally distributed with mean 0 and variance $\sigma^2$. The errors are uncorrelated (not related) in successive observations. That is: $\varepsilon \sim N(0, \sigma^2)$
[Figure: Identical normal distributions of errors, all centered on the regression line $E[Y] = \beta_0 + \beta_1 X$.]
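The model assumptions can be illustrated with a small simulation. The sketch below is not from the text: the parameter values, the seed, and the function name `simulate_y` are assumptions chosen only for illustration. It holds X fixed, draws normal errors with mean 0, and checks that the simulated errors behave as the assumptions say.

```python
import random

def simulate_y(beta0, beta1, sigma, xs, seed=0):
    """Generate Y_i = beta0 + beta1 * X_i + eps_i, with eps_i ~ N(0, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [beta0 + beta1 * x + rng.gauss(0.0, sigma) for x in xs]

# Illustrative (hypothetical) population parameters.
beta0, beta1, sigma = 5.0, 2.0, 1.0
xs = [i / 100 for i in range(1000)]  # X values are fixed, not random
ys = simulate_y(beta0, beta1, sigma, xs)

# The errors are the deviations of Y from the systematic component.
errors = [y - (beta0 + beta1 * x) for x, y in zip(xs, ys)]
mean_error = sum(errors) / len(errors)
var_error = sum((e - mean_error) ** 2 for e in errors) / (len(errors) - 1)

print(f"mean of errors     = {mean_error:.3f} (assumed population value: 0)")
print(f"variance of errors = {var_error:.3f} (assumed population value: {sigma**2})")
```

With 1,000 points the sample mean and variance of the errors should land close to 0 and $\sigma^2$, which is all the sketch is meant to show.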
10-13
10-3 Estimation: The Method of Least Squares
Estimation of a simple linear regression relationship involves finding estimated or predicted values of the intercept and slope of the linear regression line.
The estimated regression equation: $Y = b_0 + b_1 X + e$
where $b_0$ estimates the intercept of the population regression line, $\beta_0$; $b_1$ estimates the slope of the population regression line, $\beta_1$; and $e$ stands for the observed errors, that is, the residuals from fitting the estimated regression line $b_0 + b_1 X$ to a set of n points.
The estimated regression line: $\hat{Y} = b_0 + b_1 X$
where $\hat{Y}$ (Y-hat) is the value of Y lying on the fitted regression line for a given value of X.
10-14
Fitting a Regression Line
[Figure: Four panels plotting Y against X: the data; three errors from a fitted line; three errors from the least squares regression line; errors from the least squares regression line are minimized.]
10-15
Errors in Regression
[Figure: The fitted regression line $\hat{Y} = b_0 + b_1 X$, the observed data point $(X_i, Y_i)$, and $\hat{Y}_i$, the predicted value of Y for $X_i$.]
Error: $e_i = Y_i - \hat{Y}_i$
10-16
Least Squares Regression
The sum of squared errors in regression is:
$SSE = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
The least squares regression line is that which minimizes the SSE with respect to the estimates $b_0$ and $b_1$.
The normal equations:
$\sum_{i=1}^{n} y_i = n b_0 + b_1 \sum_{i=1}^{n} x_i$
$\sum_{i=1}^{n} x_i y_i = b_0 \sum_{i=1}^{n} x_i + b_1 \sum_{i=1}^{n} x_i^2$
[Figure: SSE plotted against $b_0$ and $b_1$. At the least squares $b_0$ and the least squares $b_1$, SSE is minimized with respect to $b_0$ and $b_1$.]
10-17
Sums of Squares, Cross Products, and Least Squares Estimators
Sums of Squares and Cross Products:
$SS_X = \sum (x - \bar{x})^2 = \sum x^2 - \frac{(\sum x)^2}{n}$
$SS_Y = \sum (y - \bar{y})^2 = \sum y^2 - \frac{(\sum y)^2}{n}$
$SS_{XY} = \sum (x - \bar{x})(y - \bar{y}) = \sum xy - \frac{(\sum x)(\sum y)}{n}$
Least squares regression estimators:
$b_1 = \frac{SS_{XY}}{SS_X}$
$b_0 = \bar{y} - b_1 \bar{x}$
10-18
Example 10-1

Miles    Dollars    Miles^2        Miles*Dollars
1211     1802       1466521        2182222
1345     2405       1809025        3234725
1422     2005       2022084        2851110
1687     2511       2845969        4236057
1849     2332       3418801        4311868
2026     2305       4104676        4669930
2133     3016       4549689        6433128
2253     3385       5076009        7626405
2400     3090       5760000        7416000
2468     3694       6091024        9116792
2699     3371       7284601        9098329
2806     3998       7873636        11218388
3082     3555       9498724        10956510
3209     4692       10297681       15056628
3466     4244       12013156       14709704
3643     5298       13271449       19300614
3852     4801       14837904       18493452
4033     5147       16265089       20757851
4267     5738       18207289       24484046
4498     6420       20232004       28877160
4533     6059       20548089       27465447
4804     6426       23078416       30870504
5090     6321       25908100       32173890
5233     7026       27384289       36767058
5439     6964       29582721       37877196
Totals:
79,448   106,605    293,426,946    390,185,014

$SS_X = \sum x^2 - \frac{(\sum x)^2}{n} = 293,426,946 - \frac{(79,448)^2}{25} = 40,947,557.84$
$SS_{XY} = \sum xy - \frac{(\sum x)(\sum y)}{n} = 390,185,014 - \frac{(79,448)(106,605)}{25} = 51,402,852.4$
$b_1 = \frac{SS_{XY}}{SS_X} = \frac{51,402,852.4}{40,947,557.84} = 1.255333776 \approx 1.26$
$b_0 = \bar{y} - b_1 \bar{x} = \frac{106,605}{25} - (1.255333776)\left(\frac{79,448}{25}\right) = 274.85$
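The arithmetic in Example 10-1 can be checked from the column totals alone. The following is a minimal sketch in Python; the variable names are illustrative and not taken from the text.

```python
# Column totals from the Example 10-1 table (n = 25 observations).
n = 25
sum_x = 79_448         # sum of Miles
sum_y = 106_605        # sum of Dollars
sum_x2 = 293_426_946   # sum of Miles^2
sum_xy = 390_185_014   # sum of Miles*Dollars

# Sums of squares and cross products (computational formulas).
ss_x = sum_x2 - sum_x ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n

# Least squares estimators of the slope and intercept.
b1 = ss_xy / ss_x
b0 = sum_y / n - b1 * (sum_x / n)

print(f"SS_X  = {ss_x:,.2f}")
print(f"SS_XY = {ss_xy:,.2f}")
print(f"b1    = {b1:.9f}")
print(f"b0    = {b0:.2f}")
```

Working from totals rather than the raw rows mirrors the hand calculation on the slide, so each printed value can be compared line by line with the example.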
10-19
[Figure: Template (partial output) that can be used to carry out a Simple Regression]
10-20
[Figure: Template (continued) that can be used to carry out a Simple Regression]
10-21
[Figure: Template (continued) that can be used to carry out a Simple Regression]
Residual Analysis. The plot shows the absence of a relationship between the residuals and the X-values (miles).
10-22
[Figure: Template (continued) that can be used to carry out a Simple Regression]
Note: The normal probability plot is approximately linear. This would indicate that the normality assumption for the errors has not been violated.
10-23
Total Variance and Error Variance
[Figure: Two panels plotting Y against X: what you see when looking at the total variation of Y, and what you see when looking along the regression line at the error variance of Y.]
10-24
10-4 Error Variance and the Standard Errors of Regression Estimators
Degrees of Freedom in Regression:
df = (n - 2) (n total observations less one degree of freedom for each parameter estimated, $b_0$ and $b_1$)
Square and sum all regression errors to find SSE:
$SSE = \sum (Y - \hat{Y})^2 = SS_Y - \frac{(SS_{XY})^2}{SS_X} = SS_Y - b_1 SS_{XY}$
An unbiased estimator of $\sigma^2$, denoted by $s^2$:
$MSE = \frac{SSE}{n - 2}$
Example 10-1:
$SSE = SS_Y - b_1 SS_{XY} = 66,855,898 - (1.255333776)(51,402,852.4) = 2,328,161.2$
$MSE = \frac{SSE}{n - 2} = \frac{2,328,161.2}{23} = 101,224.4$
$s = \sqrt{MSE} = \sqrt{101,224.4} = 318.158$
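As a check on the Example 10-1 figures, the error variance can be recomputed from the quantities already given. This is a sketch only; the variable names are illustrative.

```python
import math

# Quantities from Example 10-1.
n = 25
ss_y = 66_855_898
ss_xy = 51_402_852.4
b1 = 1.255333776

# SSE by the shortcut formula SSE = SS_Y - b1 * SS_XY.
sse = ss_y - b1 * ss_xy

# MSE is the unbiased estimator of the error variance (n - 2 degrees of freedom).
mse = sse / (n - 2)
s = math.sqrt(mse)

print(f"SSE = {sse:,.1f}")
print(f"MSE = {mse:,.1f}")
print(f"s   = {s:.3f}")
```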
10-25
Standard Errors of Estimates in Regression
The standard error of $b_0$ (intercept):
$s(b_0) = s \sqrt{\frac{\sum x^2}{n \, SS_X}}$, where $s = \sqrt{MSE}$
The standard error of $b_1$ (slope):
$s(b_1) = \frac{s}{\sqrt{SS_X}}$
Example 10-1:
$s(b_0) = s \sqrt{\frac{\sum x^2}{n \, SS_X}} = 318.158 \sqrt{\frac{293,426,946}{(25)(40,947,557.84)}} = 170.338$
$s(b_1) = \frac{s}{\sqrt{SS_X}} = \frac{318.158}{\sqrt{40,947,557.84}} = 0.04972$
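The two standard errors follow directly from s, the sum of squared X values, and $SS_X$. A minimal sketch, with illustrative variable names and the Example 10-1 inputs:

```python
import math

# Inputs from Example 10-1.
n = 25
s = 318.158            # square root of MSE
sum_x2 = 293_426_946   # sum of squared X values
ss_x = 40_947_557.84

# Standard error of the intercept estimate b0.
s_b0 = s * math.sqrt(sum_x2 / (n * ss_x))

# Standard error of the slope estimate b1.
s_b1 = s / math.sqrt(ss_x)

print(f"s(b0) = {s_b0:.3f}")
print(f"s(b1) = {s_b1:.5f}")
```

Small differences in the last printed digit relative to the slide come from rounding s before it is used.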
10-26
Confidence Intervals for the Regression Parameters
A $(1 - \alpha)$ 100% confidence interval for $\beta_0$:
$b_0 \pm t_{\alpha/2,(n-2)} \, s(b_0)$
A $(1 - \alpha)$ 100% confidence interval for $\beta_1$:
$b_1 \pm t_{\alpha/2,(n-2)} \, s(b_1)$
Example 10-1, 95% Confidence Intervals:
$b_0 \pm t_{0.025,(25-2)} \, s(b_0) = 274.85 \pm (2.069)(170.338) = 274.85 \pm 352.43 = [-77.58, 627.28]$
$b_1 \pm t_{0.025,(25-2)} \, s(b_1) = 1.25533 \pm (2.069)(0.04972) = 1.25533 \pm 0.10287 = [1.15246, 1.35820]$
[Figure: The regression slope drawn as the height of a triangle of length 1. Least-squares point estimate: $b_1 = 1.25533$. Upper 95% bound on slope: 1.35820. Lower 95% bound: 1.15246. A slope of 0 is not a possible value of the regression slope at 95%.]
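The interval arithmetic above can be wrapped in a small helper. This is a sketch under the assumption that the critical value 2.069 is read from a t table, as in the example; the function name `confidence_interval` is illustrative.

```python
def confidence_interval(estimate, std_error, t_crit):
    """Return (lower, upper) for estimate +/- t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width

# Critical value t(0.025, 23) as quoted in Example 10-1.
t_crit = 2.069

ci_b0 = confidence_interval(274.85, 170.338, t_crit)
ci_b1 = confidence_interval(1.25533, 0.04972, t_crit)

print(f"95% CI for beta0: [{ci_b0[0]:.2f}, {ci_b0[1]:.2f}]")
print(f"95% CI for beta1: [{ci_b1[0]:.5f}, {ci_b1[1]:.5f}]")
```

The interval for the slope excludes 0, while the interval for the intercept does not.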
10-27
[Figure: Template (partial output) that can be used to obtain Confidence Intervals for $\beta_0$ and $\beta_1$]
10-28
10-5 Correlation
The correlation between two random variables, X and Y, is a measure of the degree of linear association between the two variables.
The population correlation, denoted by $\rho$, can take on any value from -1 to 1.
• $\rho = -1$ indicates a perfect negative linear relationship
• $-1 < \rho < 0$ indicates a negative linear relationship
• $\rho = 0$ indicates no linear relationship
• $0 < \rho < 1$ indicates a positive linear relationship
• $\rho = 1$ indicates a perfect positive linear relationship
The absolute value of $\rho$ indicates the strength or exactness of the relationship.
10-29
Illustrations of Correlation
[Figure: Six scatterplots of Y against X, labeled $\rho = 0$, $\rho = -0.8$, $\rho = 0.8$, $\rho = 0$, $\rho = -1$, and $\rho = 1$.]
10-30
Covariance and Correlation
The covariance of two random variables X and Y:
$Cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$
where $\mu_X$ and $\mu_Y$ are the population means of X and Y, respectively.
The population correlation coefficient:
$\rho = \frac{Cov(X, Y)}{\sigma_X \sigma_Y}$
The sample correlation coefficient*:
$r = \frac{SS_{XY}}{\sqrt{SS_X \, SS_Y}}$
Example 10-1:
$r = \frac{SS_{XY}}{\sqrt{SS_X \, SS_Y}} = \frac{51,402,852.4}{\sqrt{(40,947,557.84)(66,855,898)}} = \frac{51,402,852.4}{52,321,943.29} = 0.9824$
*Note: If $\rho < 0$, $b_1 < 0$. If $\rho = 0$, $b_1 = 0$. If $\rho > 0$, $b_1 > 0$.
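The sample correlation in Example 10-1 uses only the three sums of squares. A short sketch (variable names are illustrative):

```python
import math

# Sums of squares and cross products from Example 10-1.
ss_x = 40_947_557.84
ss_y = 66_855_898
ss_xy = 51_402_852.4

# Sample correlation coefficient r = SS_XY / sqrt(SS_X * SS_Y).
r = ss_xy / math.sqrt(ss_x * ss_y)

print(f"r = {r:.4f}")
```

Because r and $b_1 = SS_{XY} / SS_X$ share the numerator $SS_{XY}$ and have positive denominators, they always have the same sign.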
10-31
Hypothesis Tests for the Correlation Coefficient
$H_0: \rho = 0$ (No linear relationship)
$H_1: \rho \neq 0$ (Some linear relationship)
Test statistic: $t_{(n-2)} = \frac{r}{\sqrt{\frac{1 - r^2}{n - 2}}}$
Example 10-1:
$t_{(n-2)} = \frac{r}{\sqrt{\frac{1 - r^2}{n - 2}}} = \frac{0.9824}{\sqrt{\frac{1 - 0.9651}{25 - 2}}} = \frac{0.9824}{0.0389} = 25.25$
$t_{0.005} = 2.807 < 25.25$
$H_0$ rejected at 1% level.
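The test statistic can be computed without intermediate rounding by starting from the sums of squares. This is a sketch: the function name `correlation_t_statistic` is illustrative, and the critical value 2.807 is the table value quoted in the example.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic with n - 2 degrees of freedom for H0: rho = 0."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))

# Example 10-1: compute r from the sums of squares, then the statistic.
n = 25
r = 51_402_852.4 / math.sqrt(40_947_557.84 * 66_855_898)
t_stat = correlation_t_statistic(r, n)

t_crit = 2.807  # t(0.005, 23), two-tailed test at the 1% level
reject_h0 = abs(t_stat) > t_crit

print(f"t = {t_stat:.2f}, reject H0 at 1% level: {reject_h0}")
```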
10-32
10-6 Hypothesis Tests about the Regression Relationship
[Figure: Three plots of Y against X: Constant Y, Unsystematic Variation, Nonlinear Relationship.]
A hypothesis test for the existence of a linear relationship between X and Y:
$H_0: \beta_1 = 0$
$H_1: \beta_1 \neq 0$
Test statistic for the existence of a linear relationship between X and Y:
$t_{(n-2)} = \frac{b_1}{s(b_1)}$
where $b_1$ is the least-squares estimate of the regression slope and $s(b_1)$ is the standard error of $b_1$.
When the null hypothesis is true, the statistic has a t distribution with n - 2 degrees of freedom.
10-33
Hypothesis Tests for the Regression Slope
Example 10-1:
$H_0: \beta_1 = 0$
$H_1: \beta_1 \neq 0$
$t_{(n-2)} = \frac{b_1}{s(b_1)} = \frac{1.25533}{0.04972} = 25.25$
$t_{(0.005, 23)} = 2.807 < 25.25$
$H_0$ is rejected at the 1% level and we may conclude that there is a relationship between charges and miles traveled.
Example 10-4:
$H_0: \beta_1 = 1$
$H_1: \beta_1 \neq 1$
$t_{(n-2)} = \frac{b_1 - 1}{s(b_1)} = \frac{1.24 - 1}{0.21} = 1.14$
$t_{(0.05, 58)} = 1.671 > 1.14$
$H_0$ is not rejected at the 10% level. We may not conclude that the beta coefficient is different from 1.
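Both slope tests use the same statistic; only the hypothesized value of $\beta_1$ changes. A minimal sketch follows. The function name and its `beta1_null` parameter are illustrative, and the critical values are the table values quoted in the examples.

```python
def slope_t_statistic(b1, s_b1, beta1_null=0.0):
    """t statistic with n - 2 degrees of freedom for H0: beta1 = beta1_null."""
    return (b1 - beta1_null) / s_b1

# Example 10-1: H0: beta1 = 0, critical value t(0.005, 23) = 2.807.
t_10_1 = slope_t_statistic(1.25533, 0.04972)
reject_10_1 = abs(t_10_1) > 2.807

# Example 10-4: H0: beta1 = 1, critical value t(0.05, 58) = 1.671.
t_10_4 = slope_t_statistic(1.24, 0.21, beta1_null=1.0)
reject_10_4 = abs(t_10_4) > 1.671

print(f"Example 10-1: t = {t_10_1:.2f}, reject H0: {reject_10_1}")
print(f"Example 10-4: t = {t_10_4:.2f}, reject H0: {reject_10_4}")
```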
10-34
10-7 How Good is the Regression?
The coefficient of determination, $r^2$, is a descriptive measure of the strength of the regression relationship, a measure of how well the regression line fits the data.
[Figure: A data point, the fitted value $\hat{y}$ on the regression line, and the mean $\bar{y}$, showing the total deviation, the explained deviation, and the unexplained deviation.]
Total Deviation = Unexplained Deviation (Error) + Explained Deviation (Regression)
$(y - \bar{y}) = (y - \hat{y}) + (\hat{y} - \bar{y})$
$\sum (y - \bar{y})^2 = \sum (y - \hat{y})^2 + \sum (\hat{y} - \bar{y})^2$
SST = SSE + SSR
$r^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST}$
$r^2$ is the percentage of total variation explained by the regression.
10-35
The Coefficient of Determination
[Figure: Three panels plotting Y against X. $r^2 = 0$: SST consists entirely of SSE. $r^2 = 0.50$: SST splits into SSE and SSR. $r^2 = 0.90$: SST splits into a small SSE and a large SSR.]
Example 10-1:
$r^2 = \frac{SSR}{SST} = \frac{64,527,736.8}{66,855,898} = 0.96518$
[Figure: Scatterplot of Dollars (vertical axis, 2000 to 7000) against Miles (horizontal axis, 1000 to 5500).]
10-36
10-8 Analysis-of-Variance Table and an F Test of the Regression Model

Source of     Sum of      Degrees of    Mean
Variation     Squares     Freedom       Square    F Ratio
Regression    SSR         (1)           MSR       MSR/MSE
Error         SSE         (n-2)         MSE
Total         SST         (n-1)         MST

Example 10-1

Source of     Sum of        Degrees of    Mean
Variation     Squares       Freedom       Square        F Ratio    p Value
Regression    64527736.8    1             64527736.8    637.47     0.000
Error         2328161.2     23            101224.4
Total         66855898.0    24
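The entries of the ANOVA table for Example 10-1 can be rebuilt from SSR and SSE. This is a sketch with illustrative variable names; it also recomputes $r^2$ from the same sums.

```python
# Sums of squares from the Example 10-1 ANOVA table.
n = 25
ssr = 64_527_736.8   # regression sum of squares
sse = 2_328_161.2    # error sum of squares
sst = ssr + sse      # total sum of squares

# Mean squares: divide each sum of squares by its degrees of freedom.
msr = ssr / 1
mse = sse / (n - 2)

# F ratio for the regression model, and the coefficient of determination.
f_ratio = msr / mse
r_squared = ssr / sst

print(f"SST = {sst:,.1f}")
print(f"MSE = {mse:,.1f}")
print(f"F   = {f_ratio:.2f}")
print(f"r^2 = {r_squared:.5f}")
```

In simple linear regression the F ratio equals the square of the t statistic for the slope, which is why 637.47 is approximately $25.25^2$.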
10-37
[Figure: Template (partial output) that displays Analysis of Variance and an F Test of the Regression Model]
10-38
10-9 Residual Analysis and Checking for Model Inadequacies
[Figure: Four residual plots, each centered on 0.]
• Residuals against x or $\hat{y}$. Homoscedasticity: Residuals appear completely random. No indication of model inadequacy.
• Residuals against x or $\hat{y}$. Heteroscedasticity: Variance of residuals increases when x changes.
• Residuals against time. Residuals exhibit a linear trend with time.
• Residuals against x or $\hat{y}$. Curved pattern in residuals resulting from underlying nonlinear relationship.
10-39
Normal Probability Plot of the Residuals
[Figure: Flatter than Normal]
10-40
Normal Probability Plot of the Residuals
[Figure: More Peaked than Normal]
10-41
Normal Probability Plot of the Residuals
[Figure: Positively Skewed]
10-42
Normal Probability Plot of the Residuals
[Figure: Negatively Skewed]
10-43
10-10 Use of the Regression Model for Prediction
• Point Prediction
  A single-valued estimate of Y for a given value of X, obtained by inserting the value of X in the estimated regression equation.
• Prediction Interval
  For a value of Y given a value of X:
    Variation in regression line estimate
    Variation of points around regression line
  For an average value of Y given a value of X:
    Variation in regression line estimate
10-44
Errors in Predicting E[Y|X]
[Figure: Two panels plotting Y against X. 1) Uncertainty about the slope of the regression line: the regression line with an upper limit and a lower limit on slope. 2) Uncertainty about the intercept of the regression line: the regression line with an upper limit and a lower limit on intercept.]
10-45
Prediction Interval for E[Y|X]
[Figure: The regression line of Y against X with the prediction band for E[Y|X].]
• The prediction band for E[Y|X] is narrowest at the mean value of X.
• The prediction band widens as the distance from the mean of X increases.
• Predictions become very unreliable when we extrapolate beyond the range of the sample itself.
10-46
Additional Error in Predicting Individual Value of Y
3) Variation around the regression line
[Figure: The regression line of Y against X with the prediction band for E[Y|X] and the prediction band for Y.]
10-47
Prediction Interval for a Value of Y
A $(1 - \alpha)$ 100% prediction interval for Y:
$\hat{y} \pm t_{\alpha/2} \, s \sqrt{1 + \frac{1}{n} + \frac{(x - \bar{x})^2}{SS_X}}$
Example 10-1 (X = 4,000):
$\{274.85 + (1.2553)(4,000)\} \pm (2.069)(318.16) \sqrt{1 + \frac{1}{25} + \frac{(4,000 - 3,177.92)^2}{40,947,557.84}} = 5296.05 \pm 676.62 = [4619.43, 5972.67]$
10-48
Prediction Interval for the Average Value of Y
A $(1 - \alpha)$ 100% prediction interval for E[Y|X]:
$\hat{y} \pm t_{\alpha/2} \, s \sqrt{\frac{1}{n} + \frac{(x - \bar{x})^2}{SS_X}}$
Example 10-1 (X = 4,000):
$\{274.85 + (1.2553)(4,000)\} \pm (2.069)(318.16) \sqrt{\frac{1}{25} + \frac{(4,000 - 3,177.92)^2}{40,947,557.84}} = 5296.05 \pm 156.48 = [5139.57, 5452.53]$
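The two intervals at X = 4,000 differ only by the extra 1 under the square root. The sketch below recomputes both half-widths from the Example 10-1 values; the function name `interval_half_widths` is illustrative, and the critical value 2.069 is the table value used in the example.

```python
import math

# Values from Example 10-1.
n = 25
x_bar = 79_448 / n        # mean of Miles, 3177.92
ss_x = 40_947_557.84
b0, b1 = 274.85, 1.2553   # rounded estimates as used on the slide
s = 318.16
t_crit = 2.069            # t(0.025, 23)

def interval_half_widths(x):
    """Half-widths of the intervals for an individual Y and for E[Y|X] at x."""
    leverage = 1 / n + (x - x_bar) ** 2 / ss_x
    half_individual = t_crit * s * math.sqrt(1 + leverage)
    half_mean = t_crit * s * math.sqrt(leverage)
    return half_individual, half_mean

x = 4000
y_hat = b0 + b1 * x
half_individual, half_mean = interval_half_widths(x)

print(f"point prediction: {y_hat:.2f}")
print(f"interval for Y:      [{y_hat - half_individual:.2f}, {y_hat + half_individual:.2f}]")
print(f"interval for E[Y|X]: [{y_hat - half_mean:.2f}, {y_hat + half_mean:.2f}]")
```

Calling `interval_half_widths(x_bar)` gives the narrowest intervals, since the $(x - \bar{x})^2$ term vanishes at the mean of X.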
10-49
[Figure: Template Output with Prediction Intervals]
10-50
10-11 The Solver Method for Regression
The Solver macro available in Excel can also be used to conduct a simple linear regression. See the text for instructions.
10-51
10-12 Linear Composites of Dependent Random Variables
• The Case of Independent Random Variables:
  For independent random variables X1, X2, …, Xn, the expected value of the sum is given by:
  E(X1 + X2 + … + Xn) = E(X1) + E(X2) + … + E(Xn)
  For independent random variables X1, X2, …, Xn, the variance of the sum is given by:
  V(X1 + X2 + … + Xn) = V(X1) + V(X2) + … + V(Xn)
10-52
10-12 Linear Composites of Dependent Random Variables
• The Case of Independent Random Variables with Weights:
  For independent random variables X1, X2, …, Xn, with respective weights $\alpha_1, \alpha_2, \ldots, \alpha_n$, the expected value of the sum is given by:
  $E(\alpha_1 X_1 + \alpha_2 X_2 + \ldots + \alpha_n X_n) = \alpha_1 E(X_1) + \alpha_2 E(X_2) + \ldots + \alpha_n E(X_n)$
  For independent random variables X1, X2, …, Xn, with respective weights $\alpha_1, \alpha_2, \ldots, \alpha_n$, the variance of the sum is given by:
  $V(\alpha_1 X_1 + \alpha_2 X_2 + \ldots + \alpha_n X_n) = \alpha_1^2 V(X_1) + \alpha_2^2 V(X_2) + \ldots + \alpha_n^2 V(X_n)$
10-53
Covariance of two random variables X1 and X2
• The covariance between two random variables X1 and X2 is given by:
  Cov(X1, X2) = E{[X1 – E(X1)] [X2 – E(X2)]}
• A simpler measure of covariance is given by:
  $Cov(X_1, X_2) = \rho \, SD(X_1) \, SD(X_2)$, where $\rho$ is the correlation between X1 and X2.
10-54
10-12 Linear Composites of Dependent Random Variables
• The Case of Dependent Random Variables with Weights:
  For dependent random variables X1, X2, …, Xn, with respective weights $\alpha_1, \alpha_2, \ldots, \alpha_n$, the variance of the sum is given by:
  $V(\alpha_1 X_1 + \alpha_2 X_2 + \ldots + \alpha_n X_n) = \alpha_1^2 V(X_1) + \alpha_2^2 V(X_2) + \ldots + \alpha_n^2 V(X_n) + 2 \alpha_1 \alpha_2 Cov(X_1, X_2) + \ldots + 2 \alpha_{n-1} \alpha_n Cov(X_{n-1}, X_n)$
  The covariance terms run over every pair of distinct variables $X_i, X_j$ with $i < j$.
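The variance of a weighted sum can be computed from a covariance matrix, which covers the independent and dependent cases with one formula. The sketch below is illustrative: the function name `composite_variance` and the numerical inputs (standard deviations 2 and 3, correlation 0.5, equal weights) are hypothetical and not taken from the text.

```python
def composite_variance(weights, cov):
    """Variance of sum_i weights[i] * X_i, given the covariance matrix cov.

    The double sum over i and j equals the sum of weight^2 * variance terms
    plus twice the weighted covariance for every pair i < j.
    """
    n = len(weights)
    return sum(weights[i] * weights[j] * cov[i][j]
               for i in range(n) for j in range(n))

# Hypothetical inputs: SD(X1) = 2, SD(X2) = 3, correlation rho = 0.5.
sd1, sd2, rho = 2.0, 3.0, 0.5
cov12 = rho * sd1 * sd2  # Cov(X1, X2) = rho * SD(X1) * SD(X2)

weights = [0.5, 0.5]
dependent_cov = [[sd1 ** 2, cov12], [cov12, sd2 ** 2]]
independent_cov = [[sd1 ** 2, 0.0], [0.0, sd2 ** 2]]

v_dependent = composite_variance(weights, dependent_cov)
v_independent = composite_variance(weights, independent_cov)

print(f"variance, dependent case:   {v_dependent}")
print(f"variance, independent case: {v_independent}")
```

With zero covariances the function reduces to the independent-case formula, so the gap between the two results is exactly the covariance contribution $2 \alpha_1 \alpha_2 Cov(X_1, X_2)$.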