multiple regression (reduced set with minitab examples)
![Page 1: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/1.jpg)
1 Slide
Multiple Regression (Reduced Set with MiniTab Examples)
Chapter 15
BA 303
![Page 2: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/2.jpg)
2 Slide
MULTIPLE REGRESSION
![Page 3: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/3.jpg)
3 Slide
Estimated Multiple Regression Equation
A simple random sample is used to compute sample statistics b0, b1, b2, . . . , bp that are used as the point estimators of the parameters β0, β1, β2, . . . , βp.
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_p x_p$
![Page 4: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/4.jpg)
4 Slide
Least Squares Method
Least Squares Criterion
$\min \sum (y_i - \hat{y}_i)^2$
![Page 5: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/5.jpg)
5 Slide
Multiple Regression Model
Programmer Salary Survey
A software firm collected data for a sample of 20 computer programmers. A suggestion was made that regression analysis could be used to determine whether salary was related to the years of experience and the score on the firm's programmer aptitude test.
The years of experience, score on the aptitude test, and corresponding annual salary ($1000s) for a sample of 20 programmers are shown on the next slide.
![Page 6: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/6.jpg)
6 Slide
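Multiple Regression Model

| Exper. (Yrs.) | Test Score | Salary ($000s) |
| --- | --- | --- |
| 4 | 78 | 24.0 |
| 7 | 100 | 43.0 |
| 1 | 86 | 23.7 |
| 5 | 82 | 34.3 |
| 8 | 86 | 35.8 |
| 10 | 84 | 38.0 |
| 0 | 75 | 22.2 |
| 1 | 80 | 23.1 |
| 6 | 83 | 30.0 |
| 6 | 91 | 33.0 |
| 9 | 88 | 38.0 |
| 2 | 73 | 26.6 |
| 10 | 75 | 36.2 |
| 5 | 81 | 31.6 |
| 6 | 74 | 29.0 |
| 8 | 87 | 34.0 |
| 4 | 79 | 30.1 |
| 6 | 94 | 33.9 |
| 3 | 70 | 28.2 |
| 3 | 89 | 30.0 |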
![Page 7: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/7.jpg)
7 Slide
Multiple Regression Model
Suppose we believe that salary (y) is related to the years of experience (x1) and the score on the programmer aptitude test (x2) by the following regression model:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
where
y = annual salary ($000)
x1 = years of experience
x2 = score on programmer aptitude test
![Page 8: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/8.jpg)
8 Slide
Solving for the Estimates of β0, β1, β2
Salary = 3.174 + 1.4039 YearsExp + 0.25089 ApScore
Note: Predicted salary will be in thousands of dollars.
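The estimates above come from MiniTab. As a rough cross-check, the least squares criterion can be sketched in plain Python by solving the normal equations for the programmer data. The function name `fit_least_squares` and the variable names are illustrative assumptions, not part of the slides, and a statistics package would normally do this step.

```python
# Data transcribed from the programmer salary table (20 programmers).
experience = [4, 7, 1, 5, 8, 10, 0, 1, 6, 6, 9, 2, 10, 5, 6, 8, 4, 6, 3, 3]
test_score = [78, 100, 86, 82, 86, 84, 75, 80, 83, 91,
              88, 73, 75, 81, 74, 87, 79, 94, 70, 89]
salary = [24.0, 43.0, 23.7, 34.3, 35.8, 38.0, 22.2, 23.1, 30.0, 33.0,
          38.0, 26.6, 36.2, 31.6, 29.0, 34.0, 30.1, 33.9, 28.2, 30.0]


def fit_least_squares(columns, y):
    """Return [b0, b1, ..., bp] minimizing sum((y_i - yhat_i)^2).

    Solves the normal equations (X'X) b = X'y by Gauss-Jordan elimination.
    """
    # Design matrix rows: a leading 1 for the intercept, then the predictors.
    rows = [[1.0] + [float(col[i]) for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    aug = [
        [sum(r[a] * r[b] for r in rows) for b in range(k)]
        + [sum(r[a] * yi for r, yi in zip(rows, y))]
        for a in range(k)
    ]
    for i in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(i, k), key=lambda r: abs(aug[r][i]))
        aug[i], aug[pivot] = aug[pivot], aug[i]
        # Eliminate column i from every other row.
        for r in range(k):
            if r != i:
                factor = aug[r][i] / aug[i][i]
                aug[r] = [vr - factor * vi for vr, vi in zip(aug[r], aug[i])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


b0, b1, b2 = fit_least_squares([experience, test_score], salary)
print(f"Salary = {b0:.3f} + {b1:.4f} YearsExp + {b2:.5f} ApScore")
```

The printed coefficients should agree with the MiniTab equation up to rounding.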
![Page 9: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/9.jpg)
9 Slide
MULTIPLE COEFFICIENT OF DETERMINATION
![Page 10: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/10.jpg)
10 Slide
Multiple Coefficient of Determination
Relationship Among SST, SSR, SSE
SST = SSR + SSE
$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$
where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error
![Page 11: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/11.jpg)
11 Slide
SSR, SSE, and SST
[Figure labels: SSR, SSE, SST]
![Page 12: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/12.jpg)
12 Slide
Multiple Coefficient of Determination
$R^2 = SSR/SST$
![Page 13: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/13.jpg)
13 Slide
Adjusted Multiple Coefficient of Determination
$R_a^2 = 1 - (1 - R^2)\,\dfrac{n - 1}{n - p - 1}$
where n is the number of observations and p is the number of independent variables in the regression equation.
![Page 14: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/14.jpg)
14 Slide
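R² and adjusted R²
$R^2 = \dfrac{SSR}{SST} = \dfrac{500.33}{599.79} = 0.834$
$R_a^2 = 1 - (1 - R^2)\,\dfrac{n - 1}{n - p - 1} = 1 - (1 - 0.834)\,\dfrac{20 - 1}{20 - 2 - 1} = 0.81447$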
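The arithmetic on this slide can be reproduced with a short sketch. The function names are illustrative assumptions. The inputs (SSR = 500.33, SST = 599.79, n = 20, p = 2) are the values used on the slide, and R² is rounded to three decimals before adjusting, as the slide does.

```python
def r_squared(ssr, sst):
    """Multiple coefficient of determination: R^2 = SSR / SST."""
    return ssr / sst


def adjusted_r_squared(r2, n, p):
    """Adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - p - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)


r2 = r_squared(500.33, 599.79)
# The slide rounds R^2 to 0.834 before plugging it into the adjustment.
r2_adj = adjusted_r_squared(round(r2, 3), n=20, p=2)
print(round(r2, 3), round(r2_adj, 5))  # prints 0.834 0.81447
```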
![Page 15: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/15.jpg)
15 Slide
TESTING FOR SIGNIFICANCE
![Page 16: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/16.jpg)
16 Slide
Testing for Significance: F Test
The F test is referred to as the test for overall significance.
The F test is used to determine whether a significant relationship exists between the dependent variable and the set of all the independent variables.
![Page 17: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/17.jpg)
17 Slide
Testing for Significance: F Test
Hypotheses
H0: β1 = β2 = . . . = βp = 0
Ha: One or more of the parameters is not equal to zero.
Test Statistic
F = MSR/MSE
Rejection Rule
Reject H0 if p-value < α or if F > Fα, where Fα is based on an F distribution with p d.f. in the numerator and n - p - 1 d.f. in the denominator.
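A minimal sketch of the F statistic for the programmer example, assuming the usual definitions MSR = SSR/p and MSE = SSE/(n - p - 1) and using the sums of squares from the R² slide (SSR = 500.33, SST = 599.79). The critical value 3.59 is taken from a standard F table for α = 0.05 with 2 and 17 d.f. and is hard-coded here as an assumption. The function name is illustrative.

```python
def f_statistic(ssr, sst, n, p):
    """F = MSR / MSE, with MSR = SSR/p and MSE = SSE/(n - p - 1)."""
    sse = sst - ssr              # SST = SSR + SSE
    msr = ssr / p                # mean square due to regression
    mse = sse / (n - p - 1)      # mean square due to error
    return msr / mse


F = f_statistic(ssr=500.33, sst=599.79, n=20, p=2)

# Assumption: tabled F value for alpha = 0.05 with 2 and 17 d.f.
F_CRITICAL = 3.59
reject_h0 = F > F_CRITICAL
print(f"F = {F:.2f}, reject H0: {reject_h0}")
```

Comparing F with a tabled critical value mirrors the second form of the rejection rule. The p-value form would need an F distribution routine from a statistics library.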
![Page 18: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/18.jpg)
18 Slide
F Test for Overall Significance
With α = 0.05, is the regression significant overall?
![Page 19: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/19.jpg)
19 Slide
Testing for Significance: t Test
The t test is used to determine whether each of the individual independent variables is significant.
A separate t test is conducted for each of the independent variables in the model.
We refer to each of these t tests as a test for individual significance.
![Page 20: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/20.jpg)
20 Slide
Testing for Significance: t Test
Hypotheses
$H_0: \beta_i = 0$
$H_a: \beta_i \neq 0$
Test Statistic
$t = \dfrac{b_i}{s_{b_i}}$
Rejection Rule
Reject H0 if p-value < α or if $t < -t_{\alpha/2}$ or $t > t_{\alpha/2}$, where $t_{\alpha/2}$ is based on a t distribution with n - p - 1 degrees of freedom.
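The rejection rule can be sketched as a small helper. The function names are illustrative assumptions, and the standard error below is a hypothetical placeholder rather than a value from the MiniTab output, which is not reproduced in this transcript. The critical value 2.110 is the tabled two-tailed t value for α = 0.05 with 17 d.f.

```python
def t_statistic(b_i, s_b_i):
    """t = b_i / s_{b_i}: coefficient estimate over its standard error."""
    return b_i / s_b_i


def is_significant(t, t_critical):
    """Two-tailed rule: reject H0 if t < -t_crit or t > t_crit."""
    return abs(t) > t_critical


# Tabled t value for alpha/2 = 0.025 with n - p - 1 = 17 d.f.
T_CRITICAL = 2.110

b1 = 1.4039   # coefficient for years of experience (from the slides)
s_b1 = 0.2    # HYPOTHETICAL standard error, for illustration only
t1 = t_statistic(b1, s_b1)
print(f"t = {t1:.2f}, significant: {is_significant(t1, T_CRITICAL)}")
```

In practice the standard errors are read from the regression output, one per coefficient.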
![Page 21: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/21.jpg)
21 Slide
t Test for Significance of Individual Parameters
With α = 0.05, which parameters are significant?
![Page 22: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/22.jpg)
22 Slide
MULTICOLLINEARITY
![Page 23: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/23.jpg)
23 Slide
Multicollinearity
The term multicollinearity refers to the correlation among the independent variables.
When the independent variables are highly correlated, it is not possible to determine the separate effect of any particular independent variable on the dependent variable.
Every attempt should be made to avoid including independent variables that are highly correlated.
![Page 24: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/24.jpg)
24 Slide
Multicollinearity
The Variance Inflation Factor (VIF) measures how much the variance of the coefficient for an independent variable is inflated by one or more of the other independent variables.
This inflation of the variance means that the independent variable is highly correlated with at least one other independent variable.
• VIF around 1 = no multicollinearity (good)
• VIF much greater than 1 = multicollinearity (bad)
• "Much greater" is subjective!
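The slides do not give a formula for VIF. A common definition, assumed here, is VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing x_j on the other independent variables. With only two predictors, R_j² reduces to their squared correlation, which the sketch below computes for experience and test score. Function names are illustrative.

```python
# Predictor columns transcribed from the programmer salary table.
experience = [4, 7, 1, 5, 8, 10, 0, 1, 6, 6, 9, 2, 10, 5, 6, 8, 4, 6, 3, 3]
test_score = [78, 100, 86, 82, 86, 84, 75, 80, 83, 91,
              88, 73, 75, 81, 74, 87, 79, 94, 70, 89]


def squared_correlation(x, z):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    sxz = sum((a - mean_x) * (b - mean_z) for a, b in zip(x, z))
    sxx = sum((a - mean_x) ** 2 for a in x)
    szz = sum((b - mean_z) ** 2 for b in z)
    return sxz ** 2 / (sxx * szz)


def vif_two_predictors(x, z):
    """VIF = 1 / (1 - R^2); with two predictors R^2 is their squared correlation."""
    return 1.0 / (1.0 - squared_correlation(x, z))


vif = vif_two_predictors(experience, test_score)
print(f"VIF = {vif:.3f}")
```

With two predictors both variables share the same VIF. With more predictors, each R_j² requires its own auxiliary regression.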
![Page 25: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/25.jpg)
25 Slide
Multicollinearity
VIF values are not available in Excel. In MiniTab:
![Page 26: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/26.jpg)
26 Slide
ESTIMATION AND PREDICTION
![Page 27: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/27.jpg)
27 Slide
Using the Estimated Regression Equation for Estimation and Prediction
The procedures for estimating the mean value of y and predicting an individual value of y in multiple regression are similar to those in simple regression.
We substitute the given values of x1, x2, . . . , xp into the estimated regression equation and use the corresponding value of ŷ as the point estimate.
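As a small illustration of the substitution step, the sketch below plugs values into the estimated equation from the programmer example. The function name and the chosen inputs (4 years of experience, aptitude score of 78) are assumptions made for the example.

```python
def predict_salary(years_exp, aptitude_score):
    """Point estimate of salary ($000s) from the estimated regression equation."""
    return 3.174 + 1.4039 * years_exp + 0.25089 * aptitude_score


# Illustrative inputs: 4 years of experience, aptitude score of 78.
estimate = predict_salary(4, 78)
print(f"Predicted salary: ${estimate:.3f} thousand")
```

The same number serves as the point estimate for both the mean salary and an individual salary at those x values. Only the interval widths (CI versus PI) differ.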
![Page 28: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/28.jpg)
28 Slide
PI and CI Using MiniTab
![Page 29: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/29.jpg)
29 Slide
CATEGORICAL VARIABLES
![Page 30: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/30.jpg)
30 Slide
Categorical Independent Variables
In many situations we must work with categorical independent variables such as gender (male, female), method of payment (cash, check, credit card), etc.
For example, x2 might represent gender where x2 = 0 indicates male and x2 = 1 indicates female.
In this case, x2 is called a dummy or indicator variable.
![Page 31: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/31.jpg)
31 Slide
Categorical Independent Variables
Programmer Salary Survey
As an extension of the problem involving the computer programmer salary survey, suppose that management also believes that the annual salary is related to whether the individual has a graduate degree in computer science or information systems.
The years of experience, the score on the programmer aptitude test, whether the individual has a relevant graduate degree, and the annual salary ($000) for each of the sampled 20 programmers are shown on the next slide.
![Page 32: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/32.jpg)
32 Slide
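Categorical Independent Variables

| Exper. (Yrs.) | Test Score | Degr. | Salary ($000s) |
| --- | --- | --- | --- |
| 4 | 78 | No | 24.0 |
| 7 | 100 | Yes | 43.0 |
| 1 | 86 | No | 23.7 |
| 5 | 82 | Yes | 34.3 |
| 8 | 86 | Yes | 35.8 |
| 10 | 84 | Yes | 38.0 |
| 0 | 75 | No | 22.2 |
| 1 | 80 | No | 23.1 |
| 6 | 83 | No | 30.0 |
| 6 | 91 | Yes | 33.0 |
| 9 | 88 | Yes | 38.0 |
| 2 | 73 | No | 26.6 |
| 10 | 75 | Yes | 36.2 |
| 5 | 81 | No | 31.6 |
| 6 | 74 | No | 29.0 |
| 8 | 87 | Yes | 34.0 |
| 4 | 79 | No | 30.1 |
| 6 | 94 | Yes | 33.9 |
| 3 | 70 | No | 28.2 |
| 3 | 89 | No | 30.0 |

If grad degree, Degr = 1. If no grad degree, Degr = 0.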
![Page 33: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/33.jpg)
33 Slide
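Categorical Independent Variables

| Exper. (Yrs.) | Test Score | Degr. | Salary ($000s) |
| --- | --- | --- | --- |
| 4 | 78 | 0 | 24.0 |
| 7 | 100 | 1 | 43.0 |
| 1 | 86 | 0 | 23.7 |
| 5 | 82 | 1 | 34.3 |
| 8 | 86 | 1 | 35.8 |
| 10 | 84 | 1 | 38.0 |
| 0 | 75 | 0 | 22.2 |
| 1 | 80 | 0 | 23.1 |
| 6 | 83 | 0 | 30.0 |
| 6 | 91 | 1 | 33.0 |
| 9 | 88 | 1 | 38.0 |
| 2 | 73 | 0 | 26.6 |
| 10 | 75 | 1 | 36.2 |
| 5 | 81 | 0 | 31.6 |
| 6 | 74 | 0 | 29.0 |
| 8 | 87 | 1 | 34.0 |
| 4 | 79 | 0 | 30.1 |
| 6 | 94 | 1 | 33.9 |
| 3 | 70 | 0 | 28.2 |
| 3 | 89 | 0 | 30.0 |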
![Page 34: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/34.jpg)
34 Slide
Estimated Regression Equation
$\hat{y} = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3$
where:
ŷ = annual salary ($1000)
x1 = years of experience
x2 = score on programmer aptitude test
x3 = 0 if individual does not have a graduate degree, 1 if individual does have a graduate degree
x3 is a dummy variable.
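One way to read the dummy variable is to write out the expected salary for each value of x3. The sketch below uses the population parameters β0, . . . , β3, which are assumed here to parallel the estimates b0, . . . , b3 above.

```latex
\begin{align*}
E(y \mid x_3 = 0) &= \beta_0 + \beta_1 x_1 + \beta_2 x_2 \\
E(y \mid x_3 = 1) &= (\beta_0 + \beta_3) + \beta_1 x_1 + \beta_2 x_2
\end{align*}
```

Under this reading, β3 is the difference in expected salary between programmers with and without a graduate degree, holding experience and test score fixed. The dummy variable shifts the intercept and leaves the slopes unchanged.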
![Page 35: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/35.jpg)
35 Slide
Categorical Independent Variables
![Page 36: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/36.jpg)
36 Slide
Categorical Independent Variables
![Page 37: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/37.jpg)
37 Slide
Categorical Independent Variables
![Page 38: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/38.jpg)
38 Slide
More Complex Categorical Variables
If a categorical variable has k levels, k - 1 dummy variables are required, with each dummy variable being coded as 0 or 1.
For example, a variable with levels A, B, and C could be represented by x1 and x2 values of (0, 0) for A, (1, 0) for B, and (0,1) for C.
Care must be taken in defining and interpreting the dummy variables.
![Page 39: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/39.jpg)
39 Slide
More Complex Categorical Variables
For example, a variable indicating level of education could be represented by x1 and x2 values as follows:

| Highest Degree | x1 | x2 |
| --- | --- | --- |
| Bachelor's | 0 | 0 |
| Master's | 1 | 0 |
| Ph.D. | 0 | 1 |
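The coding in the table can be sketched as a small helper that turns a k-level category into k - 1 dummy variables, treating the first level as the baseline. The names `LEVELS` and `encode_dummies` are illustrative assumptions.

```python
# Levels in table order; the first level (Bachelor's) is the baseline (0, 0).
LEVELS = ["Bachelor's", "Master's", "Ph.D."]


def encode_dummies(value, levels=LEVELS):
    """Return k - 1 dummy variables (0 or 1) for a categorical value with k levels."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    # One dummy per non-baseline level; at most one of them equals 1.
    return tuple(1 if value == level else 0 for level in levels[1:])


for degree in LEVELS:
    print(degree, encode_dummies(degree))
```

Using k - 1 dummies rather than k keeps the dummy columns from summing to the intercept column, so the baseline level is absorbed into b0.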
![Page 40: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/40.jpg)
40 Slide
AND RESIDUALS
![Page 41: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/41.jpg)
41 Slide
Assumptions About the Error Term
The error ε is a random variable with mean of zero.
The variance of ε, denoted by σ², is the same for all values of the independent variables.
The values of ε are independent.
The error ε is a normally distributed random variable reflecting the deviation between the y value and the expected value of y given by β0 + β1x1 + β2x2 + . . . + βpxp.
![Page 42: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/42.jpg)
42 Slide
Standardized Residual Plot Against ŷ
Standardized residuals are frequently used in residual plots for purposes of:
• Identifying outliers (typically, standardized residuals < -2 or > +2)
• Providing insight about the assumption that the error term has a normal distribution
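The outlier rule of thumb can be sketched as a simple filter. The function name and the residual values below are hypothetical, chosen only to show the rule. Actual standardized residuals would come from the regression output.

```python
def flag_outliers(standardized_residuals, threshold=2.0):
    """Return indices of observations with standardized residual < -2 or > +2."""
    return [i for i, r in enumerate(standardized_residuals) if abs(r) > threshold]


# HYPOTHETICAL standardized residuals, for illustration only.
example = [0.4, -1.1, 2.3, -0.2, -2.6]
print(flag_outliers(example))  # prints [2, 4]
```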
![Page 43: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/43.jpg)
43 Slide
Standardized Residual Plot Against ŷ
![Page 44: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/44.jpg)
44 Slide
Residuals
![Page 45: Multiple Regression (Reduced Set with MiniTab Examples)](https://reader035.vdocuments.site/reader035/viewer/2022081505/568161a3550346895dd15bc5/html5/thumbnails/45.jpg)
45 Slide