SIMPLE REGRESSION AND CORRELATION

Prepared by: WET SOCIETY :D

Upload: mary-grace

Post on 16-Jul-2015



DEFINITION OF TERMS

CORRELATION

The term correlation is used when:

1) Both variables are random variables,

2) The end goal is simply to find a number that expresses the relation between the variables.

REGRESSION

The term regression is used when:

1) One of the variables is a fixed variable,

2) The end goal is to use the measure of relation to predict values of the random variable based on values of the fixed variable.


CORRELATION

Correlations range from -1 (perfect negative relation) through 0 (no linear relation) to +1 (perfect positive relation).
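As a quick check of these endpoints, the sketch below computes a correlation for three small made-up data sets. The helper `pearson_r` and all sample values are illustrative assumptions, not taken from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation of two equal-length sequences (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations.
    sp_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp_xy / math.sqrt(ss_x * ss_y)

xs = [1, 2, 3, 4, 5]
falling = [10, 8, 6, 4, 2]    # y decreases exactly linearly with x
unrelated = [2, 1, 3, 1, 2]   # hypothetical values with no linear trend
rising = [3, 5, 7, 9, 11]     # y increases exactly linearly with x

print(pearson_r(xs, falling))    # -1.0
print(pearson_r(xs, unrelated))  # approximately 0
print(pearson_r(xs, rising))     # 1.0
```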

[Figures: scatter plots illustrating correlations of -1.0, 0.0, and +1.0]

CALCULATING THE COVARIANCE:

The first step in calculating a correlation coefficient is to quantify the covariance between two variables.
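A common textbook definition of the sample covariance, assumed here rather than taken from the slide, divides the summed cross-products of the deviations from each mean by N - 1:

```latex
\mathrm{cov}_{XY} = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N - 1}
```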


Alternative formula:
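One widely used computational form, assumed here, is cov = (ΣXY - (ΣX)(ΣY)/N) / (N - 1). The sketch below checks on a few made-up values that it agrees with the deviation-based definition; the function names and data are illustrative.

```python
def covariance_definitional(xs, ys):
    """Sample covariance from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def covariance_computational(xs, ys):
    """Sample covariance from raw sums: (sum(XY) - sum(X)*sum(Y)/N) / (N - 1)."""
    n = len(xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    return (sum_xy - sum(xs) * sum(ys) / n) / (n - 1)

xs = [2, 4, 6, 8]   # hypothetical scores
ys = [1, 3, 2, 6]   # hypothetical scores
print(covariance_definitional(xs, ys))
print(covariance_computational(xs, ys))
```

Both calls give the same value (14/3 for these numbers); the raw-sum version avoids computing the means first.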


THE PEARSON PRODUCT-MOMENT CORRELATION COEFFICIENT (R)

The Pearson Product-Moment Correlation Coefficient, r, is computed simply by standardizing the covariance estimate as follows:
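In the usual notation, assumed here, standardizing means dividing the covariance by the product of the two sample standard deviations, written s_X and s_Y:

```latex
r = \frac{\mathrm{cov}_{XY}}{s_X \, s_Y}
```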

This results in r values ranging from -1.0 to +1.0, as discussed earlier.


There is another way to represent this formula. It is:
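Assuming the standard form of this identity, the formula written with sums of squares and products reads:

```latex
r = \frac{SP_{XY}}{\sqrt{SS_X \, SS_Y}}
```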

where SPXY is the sum of the products of X and Y, SSX is the sum of squares for X, and SSY is the sum of squares for Y.


SUMS OF SQUARES AND SUMS OF PRODUCTS
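The commonly used computational forms of these quantities, given here as an assumption since they work directly from raw sums, are:

```latex
SS_X = \sum X^2 - \frac{(\sum X)^2}{N}, \qquad
SS_Y = \sum Y^2 - \frac{(\sum Y)^2}{N}, \qquad
SP_{XY} = \sum XY - \frac{(\sum X)(\sum Y)}{N}
```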


ADJUSTED R

EXAMPLE 1

In this class, height and ratings of physical attractiveness vary across individuals. What is the correlation between height and these ratings in our class?

Subject Height Phy
1 69 7
2 61 8
3 68 6
4 66 5
5 66 8
.... .... ....
48 71 10


We can create a scatter plot of these data by simply plotting one variable against the other:

correlation = 0.146235, or about +0.15


EXAMPLE 2

Consider the height and weight variables from our class dataset ...


Subject Height (X) Weight (Y)
1 69 108
2 61 130
3 68 135
4 66 135
5 66 120
6 63 115
7 72 150
8 62 105
9 62 115
10 67 145
11 66 132
12 63 120
Mean 65.42 125.83

Sum(X) = 785, Sum(Y) = 1510
Sum(X²) = 51473, Sum(Y²) = 192238
Sum(XY) = 99064
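The sums in the table can be turned into r through the sums of squares and products. The sketch below assumes the computational forms SSX = ΣX² - (ΣX)²/N, SSY = ΣY² - (ΣY)²/N and SPXY = ΣXY - (ΣX)(ΣY)/N; the variable names are illustrative.

```python
import math

heights = [69, 61, 68, 66, 66, 63, 72, 62, 62, 67, 66, 63]
weights = [108, 130, 135, 135, 120, 115, 150, 105, 115, 145, 132, 120]
n = len(heights)

sum_x = sum(heights)                                   # 785
sum_y = sum(weights)                                   # 1510
sum_x2 = sum(x * x for x in heights)                   # 51473
sum_y2 = sum(y * y for y in weights)                   # 192238
sum_xy = sum(x * y for x, y in zip(heights, weights))  # 99064

# Sums of squares and sum of products (assumed computational forms).
ss_x = sum_x2 - sum_x ** 2 / n
ss_y = sum_y2 - sum_y ** 2 / n
sp_xy = sum_xy - sum_x * sum_y / n

r = sp_xy / math.sqrt(ss_x * ss_y)
print(round(r, 2))  # 0.55
```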


So, based on the 12 subjects we examined, the correlation between height and weight was +0.55.


Unfortunately, the r we measure using our sample is not an unbiased estimator of the population correlation coefficient (rho).

We can correct for this using the adjusted correlation coefficient, which is computed as follows:
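One commonly cited adjustment, assumed here rather than taken from the slide, is r_adj = sqrt(1 - (1 - r²)(N - 1)/(N - 2)). A minimal sketch applying it to an r of 0.55 from 12 subjects; the function name, the clamp at zero, and the sign handling are illustrative choices.

```python
import math

def adjusted_r(r, n):
    """Adjusted correlation, assuming r_adj = sqrt(1 - (1 - r^2)(n - 1)/(n - 2))."""
    adjusted_r2 = 1 - (1 - r ** 2) * (n - 1) / (n - 2)
    # Illustrative choice: treat a negative value under the root as zero.
    magnitude = math.sqrt(max(adjusted_r2, 0.0))
    # Illustrative choice: keep the sign of the original r.
    return math.copysign(magnitude, r)

print(round(adjusted_r(0.55, 12), 2))  # 0.48
```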


THE REGRESSION LINE

The regression line represents the best prediction of the variable on the Y axis for each point along the X axis.


COMPUTING THE REGRESSION LINE

Ŷ = bX + a

where

Ŷ = the predicted value of Y
b = the slope of the line (the change in Y as a function of X)
X = the various values of X
a = the intercept of the line (the point where the line hits the Y axis)


Slope(b) = (NΣXY - (ΣX)(ΣY)) / (NΣX² - (ΣX)²)

Intercept(a) = (ΣY - b(ΣX)) / N

where x and y are the variables, and

N = Number of values or elements
X = First Score
Y = Second Score
ΣXY = Sum of the products of First and Second Scores
ΣX = Sum of First Scores
ΣY = Sum of Second Scores
ΣX² = Sum of squared First Scores
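The two formulas translate directly into a short function. This is a minimal sketch; the name `fit_line` and the test points are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) computed from the raw-sum formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope(b) = (N*sum(XY) - sum(X)*sum(Y)) / (N*sum(X^2) - sum(X)^2)
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept(a) = (sum(Y) - b*sum(X)) / N
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept

# Hypothetical points lying exactly on y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(slope, intercept)  # 2.0 1.0
```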


REGRESSION EXAMPLE

To find the simple linear regression equation for the data below, we first find the slope and the intercept and then use them to form the equation.

X Values Y Values
60 3.1
61 3.6
62 3.8
63 4
65 4.1


Step 1: Count the number of values.

N = 5

Step 2: Find XY and X² for each pair of values. See the table below.

X Value   Y Value   X*Y                 X*X
60        3.1       60 * 3.1 = 186      60 * 60 = 3600
61        3.6       61 * 3.6 = 219.6    61 * 61 = 3721
62        3.8       62 * 3.8 = 235.6    62 * 62 = 3844
63        4         63 * 4 = 252        63 * 63 = 3969
65        4.1       65 * 4.1 = 266.5    65 * 65 = 4225


Step 3: Find ΣX, ΣY, ΣXY, ΣX².

ΣX = 311
ΣY = 18.6
ΣXY = 1159.7
ΣX² = 19359


Step 4: Substitute into the slope formula given above.

Slope(b) = (NΣXY - (ΣX)(ΣY)) / (NΣX² - (ΣX)²)
= ((5)*(1159.7) - (311)*(18.6)) / ((5)*(19359) - (311)²)
= (5798.5 - 5784.6) / (96795 - 96721)
= 13.9 / 74
≈ 0.19


Step 5: Now substitute into the intercept formula given above.

Intercept(a) = (ΣY - b(ΣX)) / N
= (18.6 - 0.19(311)) / 5
= (18.6 - 59.09) / 5
= -40.49 / 5
= -8.098

Step 6: Then substitute these values into the regression equation formula.

Regression Equation(y) = a + bx
= -8.098 + 0.19x
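The worked example rounds the slope to 0.19 before computing the intercept. As a cross-check, the sketch below applies the same formulas to the five data points and also carries the unrounded slope through; the variable names are illustrative.

```python
xs = [60, 61, 62, 63, 65]
ys = [3.1, 3.6, 3.8, 4.0, 4.1]
n = len(xs)

sum_x = sum(xs)                              # 311
sum_y = sum(ys)                              # about 18.6
sum_xy = sum(x * y for x, y in zip(xs, ys))  # about 1159.7
sum_x2 = sum(x * x for x in xs)              # 19359

slope = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
# Intercept with the full-precision slope versus the slope rounded to 0.19.
intercept_unrounded = (sum_y - slope * sum_x) / n
intercept_rounded = (sum_y - round(slope, 2) * sum_x) / n

print(round(slope, 4))                # 0.1878
print(round(intercept_unrounded, 3))  # -7.964
print(round(intercept_rounded, 3))    # -8.098
```

Rounding the slope first shifts the intercept from about -7.964 to -8.098.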


Suppose we want to know the approximate y value for x = 64. Then we can substitute that value into the above equation.

Regression Equation(y) = a + bx
= -8.098 + 0.19(64)
= -8.098 + 12.16
≈ 4.06
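The same substitution can be written as a small function, using the rounded coefficients from the example as defaults. The function name `predict` is illustrative.

```python
def predict(x, intercept=-8.098, slope=0.19):
    """Predicted y from the example's regression equation y = a + bx."""
    return intercept + slope * x

# The observed x values run from 60 to 65, so 64 lies inside the data range.
print(round(predict(64), 2))  # 4.06
```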

This example will guide you to find the relationship between two variables by calculating the regression using the above steps.
