Posted 19-Dec-2015

[Slide 1]
CORRELATION AND SIMPLE LINEAR REGRESSION - Revisited
Ref: Cohen, Cohen, West, & Aiken (2003), ch. 2
[Slide 2]

Pearson Correlation

r_xy = Σ_{i=1..n} (x_i − m_x)(y_i − m_y) / [(n − 1) s_x s_y] = s_xy / (s_x s_y)

     = Σ z_x_i z_y_i / (n − 1)

     = 1 − Σ (z_x_i − z_y_i)² / [2(n − 1)]

     = 1 − Σ (dz_i)² / [2(n − 1)]

     = COVARIANCE / (SD_x SD_y)
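A quick numeric check of the equivalences above, using a small made-up sample (the data are hypothetical, chosen only for illustration):

```python
import math

# Hypothetical sample (illustration only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
n = len(x)

mx = sum(x) / n
my = sum(y) / n
sx = math.sqrt(sum((xi - mx) ** 2 for xi in x) / (n - 1))
sy = math.sqrt(sum((yi - my) ** 2 for yi in y) / (n - 1))

# 1) covariance over the product of standard deviations
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
r_cov = sxy / (sx * sy)

# 2) average cross-product of z-scores
zx = [(xi - mx) / sx for xi in x]
zy = [(yi - my) / sy for yi in y]
r_z = sum(a * b for a, b in zip(zx, zy)) / (n - 1)

# 3) one minus the sum of squared z-score differences over 2(n-1)
r_dz = 1 - sum((a - b) ** 2 for a, b in zip(zx, zy)) / (2 * (n - 1))
```

All three expressions give the same r, as the algebra on the slide promises.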
[Slide 3]

Fig. 3.6: Geometric representation of r² as the overlap of two squares. Each square has unit variance (Variance of X = 1, Variance of Y = 1); r² = percent overlap in the two squares.
a. Nonzero correlation (the squares overlap)
b. Zero correlation (the squares do not overlap)
[Slide 4]

Sums of Squares and Cross Product (Covariance): SS_x, SS_y, and S_xy
[Figure: diagram relating SS_x, SS_y, and the cross product S_xy]
[Slide 5]

Figure 3.4: Path model representation of correlation between SAT Math scores and Calculus Grades
SAT-Math → Calc Grade: .00364 (.40)
error → Calc Grade: .932 (.955)
R² = .40² = .16
[Slide 6]

Path Models
• path coefficient - standardized coefficient next to the arrow, covariance in parentheses
• error coefficient - the correlation between the errors (the discrepancies between observed and predicted Calc Grade scores) and the observed Calc Grade scores
• Predicted(Calc Grade) = .00364 SAT-Math + 2.5
• errors are sometimes called disturbances
[Slide 7]

Figure 3.2: Path model representations of correlation
[Figure: three panels (a, b, c), each showing X and Y connected by a path]
[Slide 8]

SUPPRESSED SCATTERPLOT
• NO APPARENT RELATIONSHIP
[Figure: scatterplot of Y against X with separate prediction lines for MALES and FEMALES]
[Slide 9]

IDEALIZED SCATTERPLOT
• POSITIVE CURVILINEAR RELATIONSHIP
[Figure: scatterplot of Y against X with a linear prediction line and a quadratic prediction line]
[Slide 10]
LINEAR REGRESSION- REVISITED
[Slide 11]
Single predictor linear regression
• Regression equations:
• ŷ = b_y.x x + b_0 (predicting y from x)
• x̂ = b_x.y y + b_0 (predicting x from y)
• Regression coefficients:
• b_y.x = r_xy s_y / s_x
• b_x.y = r_xy s_x / s_y
[Slide 12]

Two variable linear regression
• Path model representation: unstandardized
[Figure: x → y ← e, with coefficient b1 on the x → y path]
[Slide 13]

Linear regression
y = b1 x + b0
If the correlation coefficient has been calculated, then b1 follows from:
b1 = r_xy s_y / s_x
The intercept, b0, follows by placing the means for x and y into the equation above and solving:
b0 = ȳ − [ r_xy s_y / s_x ] x̄
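The slope and intercept formulas can be verified on a small hypothetical sample; note that the fitted line necessarily passes through (x̄, ȳ):

```python
import math

# Hypothetical sample (illustration only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
n = len(x)

mx, my = sum(x) / n, sum(y) / n
sx = math.sqrt(sum((xi - mx) ** 2 for xi in x) / (n - 1))
sy = math.sqrt(sum((yi - my) ** 2 for yi in y) / (n - 1))
r = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / ((n - 1) * sx * sy)

b1 = r * sy / sx    # slope: b1 = r_xy * s_y / s_x
b0 = my - b1 * mx   # intercept: b0 = y-bar - b1 * x-bar
```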
[Slide 14]

Linear regression
• Path model representation: standardized
[Figure: z_x → z_y ← e, with coefficient r_xy on the z_x → z_y path]
[Slide 15]

Least squares estimation
The best estimate is the one in which the sum of squared differences between each score and the estimate is the smallest among all possible linear unbiased estimates (BLUE, or best linear unbiased estimate).
[Slide 16]

Least squares estimation
• errors or disturbances: they represent, in this case, the part of the y score not predictable from x:
• e_i = y_i − ŷ_i = y_i − (b1 x_i + b0)
• The sum of squares for errors follows:
• SS_e = Σ_{i=1..n} e_i²
[Slide 17]

[Figure: scatterplot of y against x with the regression line; vertical segments labeled e mark each point's error]
SS_e = Σ e_i²
[Slide 18]

Matrix representation of least squares estimation
• We can represent the regression model in matrix form:
• y = Xb + e
[Slide 19]

Matrix representation of least squares estimation
• y = Xb + e

  [ y1 ]   [ 1  x1 ]            [ e1 ]
  [ y2 ]   [ 1  x2 ]   [ b0 ]   [ e2 ]
  [ y3 ] = [ 1  x3 ] · [ b1 ] + [ e3 ]
  [ y4 ]   [ 1  x4 ]            [ e4 ]
  [ ⋮  ]   [ 1  ⋮  ]            [ ⋮  ]
[Slide 20]

Matrix representation of least squares estimation
• y = Xb + e
• The least squares criterion is satisfied by the following matrix equation:
• b = (X′X)⁻¹ X′y
• The term X′ is called the transpose of the X matrix: the matrix turned on its side. When X′X is multiplied out, the result is a 2 × 2 matrix:

  [ n     Σx_i  ]
  [ Σx_i  Σx_i² ]
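For the single-predictor case, b = (X′X)⁻¹X′y can be carried out by hand, inverting the 2 × 2 matrix shown above explicitly (hypothetical data, pure Python):

```python
# Hypothetical sample (illustration only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
n = len(x)

# X'X = [[n, Σx], [Σx, Σx²]] and X'y = [Σy, Σxy]
sum_x = sum(x)
sum_x2 = sum(xi * xi for xi in x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))

# Invert the 2x2 X'X directly: det(X'X) = n·Σx² − (Σx)²
det = n * sum_x2 - sum_x * sum_x
b0 = (sum_x2 * sum_y - sum_x * sum_xy) / det
b1 = (n * sum_xy - sum_x * sum_y) / det
```

For these data the result matches the moment formula b1 = r_xy s_y / s_x, as it must.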
[Slide 21]

SUMS OF SQUARES
• SS_e = (n − 2) s²_e
• SS_reg = Σ (ŷ_i − ȳ)²
• SS_y = SS_reg + SS_e
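The decomposition SS_y = SS_reg + SS_e can be checked directly on a small hypothetical sample:

```python
# Hypothetical sample; slope and intercept fitted by least squares
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
n = len(x)
mx, my = sum(x) / n, sum(y) / n
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
b0 = my - b1 * mx
yhat = [b0 + b1 * xi for xi in x]

SSy = sum((yi - my) ** 2 for yi in y)                 # total sum of squares
SSreg = sum((yh - my) ** 2 for yh in yhat)            # regression sum of squares
SSe = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # error sum of squares
```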
[Slide 22]

SUMS OF SQUARES - Venn Diagram
Fig. 8.3: Venn diagram for linear regression with one predictor and one outcome measure
[Figure: overlapping circles for SS_x and SS_y; the overlap is SS_reg, and the remainder of SS_y is SS_e]
[Slide 23]

STANDARD ERROR OF ESTIMATE
s²_y = s²_ŷ + s²_e
Standardized: s²_zy = 1 = r²_y.x + s²_ez
s_e = s_y √(1 − r²_y.x), or with the degrees-of-freedom correction, s_e = √[ SS_e / (n − 2) ]
Review slide 17: this is the standard deviation of the errors shown there.
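The two versions of the standard error of estimate differ only in the denominator (n − 1 versus n − 2); a check on a hypothetical sample shows s_y²(1 − r²) equals SS_e/(n − 1) exactly:

```python
import math

# Hypothetical sample (illustration only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
n = len(x)
mx, my = sum(x) / n, sum(y) / n
sx2 = sum((xi - mx) ** 2 for xi in x) / (n - 1)
sy2 = sum((yi - my) ** 2 for yi in y) / (n - 1)
r = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / ((n - 1) * math.sqrt(sx2 * sy2))

b1 = r * math.sqrt(sy2 / sx2)
b0 = my - b1 * mx
SSe = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

se_df = math.sqrt(SSe / (n - 2))       # df-corrected: sqrt(SSe / (n - 2))
se_sd = math.sqrt(sy2 * (1 - r ** 2))  # s_y * sqrt(1 - r^2), (n - 1)-based
```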
[Slide 24]

SUMS OF SQUARES - ANOVA Table

SOURCE   df      Sum of Squares   Mean Square      F
x        1       SS_reg           SS_reg / 1       (SS_reg / 1) / (SS_e / (n − 2))
e        n − 2   SS_e             SS_e / (n − 2)
Total    n − 1   SS_y             SS_y / (n − 1)

Table 8.1: Regression table for Sums of Squares
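Filling in Table 8.1 for a small hypothetical five-point sample gives F = (SS_reg / 1) / (SS_e / (n − 2)):

```python
# Hypothetical sample (illustration only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
n = len(x)
mx, my = sum(x) / n, sum(y) / n
b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
b0 = my - b1 * mx
yhat = [b0 + b1 * xi for xi in x]

SSreg = sum((yh - my) ** 2 for yh in yhat)
SSe = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))

MSreg = SSreg / 1    # mean square for regression, df = 1
MSe = SSe / (n - 2)  # mean square for error, df = n - 2
F = MSreg / MSe
```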
[Slide 25]

Confidence Intervals Around b and Beta Weights
s_b = (s_y / s_x) √[ (1 − r²_y.x) / (n − 2) ]
(standard deviation of the sampling error of the estimate of the regression weight b)
s_β = √[ (1 − r²_y.x) / (n − 2) ]
Note: this is formally correct only for a regression equation, not for the Pearson correlation.
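A sketch of the s_b formula and the resulting interval, using hypothetical summary statistics; the critical value 3.182 is an assumed two-tailed .05 value for df = 3, taken from a t table:

```python
import math

# Hypothetical summary statistics (an n = 5 sample with r = .8 and sx = sy)
n, r = 5, 0.8
sx = sy = math.sqrt(2.5)

b1 = r * sy / sx
sb = (sy / sx) * math.sqrt((1 - r ** 2) / (n - 2))  # s_b formula from the slide

t_crit = 3.182  # assumed two-tailed .05 critical t, df = n - 2 = 3 (from a t table)
ci = (b1 - t_crit * sb, b1 + t_crit * sb)
```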
[Slide 26]

Distribution around parameter estimates: b-weight
[Figure: sampling distribution centered on b_estimate with standard deviation s_b]
b_estimate ± t s_b
[Slide 27]

Hypothesis testing for the regression weight
Null hypothesis: b_population = 0
Alternative hypothesis: b_population ≠ 0
Test statistic: t = b_sample / s_b
Student's t distribution with degrees of freedom = n − 2
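The test statistic is simply the estimate divided by its standard error; with the hypothetical numbers below, t falls short of the df = 3 critical value, so b = 0 would not be rejected:

```python
import math

# Hypothetical values: b1 and s_b from an n = 5 sample with r = .8, sx = sy
n = 5
b1 = 0.8
sb = math.sqrt(0.12)

t = b1 / sb     # t statistic, df = n - 2 = 3
t_crit = 3.182  # assumed two-tailed .05 critical value (from a t table)
reject = abs(t) > t_crit
```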
[Slide 28]

SPSS Regression Analysis output predicting Social Stress from Locus of Control in a sample of 16-year-olds.

Model Summary
Model   R       R Square   Adjusted R Square   Std. Error of the Estimate
1       .539a   .291       .268                3.121
a. Predictors: (Constant), LOCUS OF CONTROL

ANOVAb
Model 1      Sum of Squares   df   Mean Square   F        Sig.
Regression   123.867          1    123.867       12.714   .001a
Residual     302.012          31   9.742
Total        425.879          32
a. Predictors: (Constant), LOCUS OF CONTROL
b. Dependent Variable: SOCIAL STRESS

Coefficientsa
Model 1            Unstandardized B   Std. Error   Standardized Beta   t        Sig.
(Constant)         -4.836             2.645                            -1.828   .077
LOCUS OF CONTROL   .190               .053         .539                3.566    .001
a. Dependent Variable: SOCIAL STRESS

Test of b = 0 rejected at the .05 level.
[Slide 29]

Figure 3.4: Path model representation of prediction of Social Stress from Locus of Control
Locus of Control → Social Stress: b = .190 (β = .539)
error → Social Stress: s_e = 3.12 (.842)
R² = .291
√(1 − R²) = .842
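The path-diagram values follow directly from the SPSS output on slide 28; a quick arithmetic check (rounding to three decimals, as the slides do):

```python
import math

R = 0.539                       # multiple R from the Model Summary
R2 = R ** 2                     # R-square, approximately .291 after rounding
err_path = math.sqrt(1 - 0.291) # error path coefficient, sqrt(1 - R^2), ≈ .842
```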
[Slide 30]

Difference between Independent b-weights
Compare two groups' regression weights to see if they differ (e.g., boys vs. girls).
Null hypothesis: b_boys = b_girls
Test statistic: t = (b_boys − b_girls) / s_(b_boys − b_girls), where
s_(b_boys − b_girls) = √( s²_b_boys + s²_b_girls )
Student's t distribution with df = n1 + n2 − 4
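Plugging the b-weights and standard errors from the two group tables on the next slide into this formula (values taken from the slides, with group labels assumed to follow the order the slides list them):

```python
import math

# b-weights and standard errors from the two SPSS coefficient tables
b_boys, sb_boys = 0.106, 0.081    # assumed: first table, n = 22
b_girls, sb_girls = 0.281, 0.058  # assumed: second table, n = 12

se_diff = math.sqrt(sb_boys ** 2 + sb_girls ** 2)
t = (b_girls - b_boys) / se_diff  # ≈ 1.76, as computed on the slide
df = 22 + 12 - 4                  # n1 + n2 - 4 = 30
```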
[Slide 31]

Coefficientsa (boys, n = 22)
Model 1            Unstandardized B   Std. Error   Standardized Beta   t       Sig.
(Constant)         -.416              3.936                            -.106   .917
LOCUS OF CONTROL   .106               .081         .289                1.314   .205
a. Dependent Variable: SOCIAL STRESS

Coefficientsa (girls, n = 12)
Model 1            Unstandardized B   Std. Error   Standardized Beta   t        Sig.
(Constant)         -9.963             2.970                            -3.354   .007
LOCUS OF CONTROL   .281               .058         .835                4.807    .001
a. Dependent Variable: SOCIAL STRESS

t = (.281 − .106) / √(.081² + .058²) = 1.76