regression & correlation

Course Title: Business Statistics BBA (Hons) 2nd Semester Course Instructor: Atiq ur Rehman Shah Lecturer, Federal Urdu University of Arts, Science & Technology, Islamabad +92-345-5271959 [email protected]

Upload: atiq-ur-rehman-shah

Post on 15-Jul-2015


TRANSCRIPT

Page 1: Regression & correlation

Course Title: Business Statistics

BBA (Hons)

2nd Semester

Course Instructor: Atiq ur Rehman Shah

Lecturer, Federal Urdu University of Arts, Science & Technology, Islamabad

+92-345-5271959

[email protected]

Page 2: Regression & correlation

Correlation

• Correlation is a LINEAR association between two random variables

• Correlation is a statistical technique used to determine the degree to which two variables are related

Page 3: Regression & correlation

Scatter diagram

• Rectangular coordinate

• Two quantitative variables

• One variable is called independent (X) and the second is called dependent (Y)

Page 4: Regression & correlation
Page 5: Regression & correlation

Scatter diagram of weight and systolic blood pressure

Page 6: Regression & correlation

Scatter diagram of weight and systolic blood pressure

Page 7: Regression & correlation

Scatter plots

The pattern of data is indicative of the type of relationship between your two variables:

• positive relationship

• negative relationship

• no relationship

Page 8: Regression & correlation

Positive relationship

Page 9: Regression & correlation

Negative relationship

[Figure: scatter plot with axes labelled Reliability and Age of Car]

Page 10: Regression & correlation

No relation

Page 11: Regression & correlation

Correlation Coefficient

• The correlation coefficient (r) measures the strength and direction of relationship between two variables

Page 12: Regression & correlation

How to interpret the value of r?

• r lies between -1 and 1. Values near 0 mean no (linear) correlation and values near ±1 mean very strong correlation.

• The negative sign means that the two variables are inversely related, that is, as one variable increases the other variable decreases.
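The interpretation rules above can be illustrated with a short Python sketch. It assumes the standard deviation-from-the-mean definition of Pearson's r; the function name `pearson_r` and the three small data sets are made up for illustration and do not come from the slides.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r from deviations about the means (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

xs = [-2, -1, 0, 1, 2]                              # hypothetical x values
r_up = pearson_r(xs, [2 * x + 1 for x in xs])       # exactly linear, rising
r_down = pearson_r(xs, [-3 * x + 4 for x in xs])    # exactly linear, falling
r_curve = pearson_r(xs, [x * x for x in xs])        # related, but not linearly

print(r_up, r_down, r_curve)
```

The rising line gives r = 1, the falling line gives r = -1, and the curved (quadratic) relationship gives r = 0 even though y is fully determined by x. This is why a value near 0 is read as "no linear correlation" rather than "no relationship".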

Page 13: Regression & correlation

How to interpret the value of r?

Page 14: Regression & correlation

Pearson’s r

• r = 0.9 indicates a strong positive association (as one variable rises, so does the other)

• r = -0.9 indicates a strong negative association (as one variable rises, the other falls)

r=correlation coefficient

Page 15: Regression & correlation

Coefficient of Determination Defined

• Pearson’s r can be squared, r², to derive a coefficient of determination.

• Coefficient of determination – the portion of variability in one of the variables that can be accounted for by variability in the second variable

Page 16: Regression & correlation

• Example of depression and CGPA
– Pearson’s r shows negative correlation, r = -0.5
– r² = 0.25

• In this example we can say that 1/4, or 0.25, of the variability in CGPA scores can be accounted for by depression (the remaining 75% of the variability is due to other factors such as habits, ability, motivation, courses studied, etc.)

Page 17: Regression & correlation

Coefficient of Determination and Pearson’s r

• If r=0.5, then r²=0.25

• If r=0.7, then r²=0.49

• Thus while r=0.5 versus 0.7 might not look so different in terms of strength, r² tells us that r=0.7 accounts for about twice the variability relative to r=0.5
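The comparison above is simple arithmetic, sketched here in Python. The helper name `coefficient_of_determination` is illustrative only; the r values are the ones quoted on the slides.

```python
def coefficient_of_determination(r):
    """Square Pearson's r to get the share of variability accounted for."""
    return r ** 2

# r values quoted in the slides (the sign disappears on squaring)
for r in (-0.5, 0.5, 0.7):
    print(f"r = {r:+.1f}  ->  r^2 = {coefficient_of_determination(r):.2f}")

# How much more variability r = 0.7 accounts for compared with r = 0.5
ratio = coefficient_of_determination(0.7) / coefficient_of_determination(0.5)
print(f"ratio = {ratio:.2f}")
```

The ratio is 0.49 / 0.25 = 1.96, which is the "about twice" stated above.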

Page 18: Regression & correlation

Example

• Calculate the coefficient of correlation between the value X and Y given below:

X 78 89 97 69 59 79 68 61

Y 125 137 156 112 107 136 123 108

Page 19: Regression & correlation

            X     Y      X²      Y²       XY
            78    125    6084    15625    9750
            89    137    7921    18769    12193
            97    156    9409    24336    15132
            69    112    4761    12544    7728
            59    107    3481    11449    6313
            79    136    6241    18496    10744
            68    123    4624    15129    8364
            61    108    3721    11664    6588
Summation   600   1004   46242   128012   76812

Page 20: Regression & correlation

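$$r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{\left[n\sum X^2 - (\sum X)^2\right]\left[n\sum Y^2 - (\sum Y)^2\right]}} = \frac{8(76812) - (600)(1004)}{\sqrt{\left[8(46242) - (600)^2\right]\left[8(128012) - (1004)^2\right]}} = \frac{12096}{\sqrt{(9936)(16080)}} \approx 0.957$$

Hence the correlation coefficient between X and Y is approximately 0.96.

**(What does this value tell us?)**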
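The worked example can be checked with a few lines of Python. This is a sketch that assumes the usual computational formula for Pearson's r built from the sums of X, Y, X², Y² and XY (the columns tabulated above); the variable names are illustrative.

```python
import math

# Data from the example
X = [78, 89, 97, 69, 59, 79, 68, 61]
Y = [125, 137, 156, 112, 107, 136, 123, 108]

n = len(X)
sum_x = sum(X)                                # 600
sum_y = sum(Y)                                # 1004
sum_x2 = sum(x * x for x in X)                # 46242
sum_y2 = sum(y * y for y in Y)                # 128012
sum_xy = sum(x * y for x, y in zip(X, Y))     # 76812

# Computational formula for Pearson's r (assumed form, see lead-in)
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
r = numerator / denominator

print(round(r, 3))   # about 0.957
```

A value this close to +1 indicates a very strong positive linear association: larger X values go with larger Y values.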

Page 21: Regression & correlation

Regression

• A statistical tool that is used to investigate the dependence of one variable (dependent variable) on one or more other variables (independent variables)

• The dependent variable (Y) is the variable for which we want to make a prediction.

• The independent variable (X) is the variable on the basis of which we are making predictions.

Page 22: Regression & correlation

• The linear relationship between two variables can either be positive or negative.

• For instance, an increase in the advertising budget will bring more sales (positive), and an increase in temperature will decrease the cooling efficiency of a room AC (negative)

Page 23: Regression & correlation

Simple Linear Regression

• Positive Linear Relationship

[Figure: regression line on x and y axes, with the intercept (a) and slope (b) labelled; slope (b) is positive]

Page 24: Regression & correlation

Simple Linear Regression

• Negative Linear Relationship

[Figure: regression line on x and y axes, with the intercept (a) and slope (b) labelled; slope (b) is negative]

Page 25: Regression & correlation

Simple Linear Regression

• No Relationship

[Figure: regression line on x and y axes, with the intercept (a) and slope (b) labelled; slope (b) is 0]

Page 26: Regression & correlation

Simple Linear Regression Equation

• Hence the equation for linear regression line can be written as:

y= a + bx

Where:

y= dependent variable

x= independent variable

a= y-intercept (i.e., the value of y when x=0)

b= slope

Page 27: Regression & correlation

Least-squares estimates

• For a simple linear regression equation:

y= a + bx

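We have,

$$b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2}, \qquad a = \bar{Y} - b\bar{X}$$

Where, $\bar{X} = \frac{\sum X}{n}$ and $\bar{Y} = \frac{\sum Y}{n}$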
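A minimal Python sketch of the least-squares estimates follows. It assumes the usual computational formulas, b = (nΣxy − ΣxΣy) / (nΣx² − (Σx)²) and a = ȳ − b·x̄, which is the form used in the worked example on the following pages. The function name `least_squares_line` and the four data points are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the fitted line y = a + b*x (illustrative helper)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: computational form of the least-squares estimate
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: the fitted line passes through the point of means
    a = sum_y / n - b * sum_x / n
    return a, b

# Hypothetical points lying exactly on y = 3 + 2x
xs = [1, 2, 3, 4]
ys = [5, 7, 9, 11]
a, b = least_squares_line(xs, ys)
print(a, b)
```

Because the hypothetical points lie exactly on a line, the estimates recover its intercept (3) and slope (2); with scattered data the same formulas give the line that minimises the sum of squared vertical deviations.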

Page 28: Regression & correlation

Example

• Compute the least-squares regression equation of Y on X for the following data. What is the regression coefficient and what does it mean?

X 5 6 8 10 12 13 15 16 17

Y 16 19 23 28 36 41 44 45 50

Page 29: Regression & correlation

            X     Y     XY     X²
            5     16    80     25
            6     19    114    36
            8     23    184    64
            10    28    280    100
            12    36    432    144
            13    41    533    169
            15    44    660    225
            16    45    720    256
            17    50    850    289
Summation   102   302   3853   1308

Page 30: Regression & correlation

Now $\bar{X} = 102/9 = 11.33$

And $\bar{Y} = 302/9 = 33.56$

$$b = \frac{9(3853) - (102)(302)}{9(1308) - (102)^2} = \frac{3873}{1368}$$

So b = 2.831

Page 31: Regression & correlation

And

$$a = \bar{Y} - b\bar{X} = 33.56 - (2.831)(11.33) \approx 1.47$$

(using unrounded values of the means and of b)

Hence the desired estimated regression line of Y on X is

y= 1.47 + 2.831x

** The estimated regression coefficient is b=2.831, which means that the value of y increases by 2.831 units for a unit increase in x.
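The arithmetic of this example can be re-run from the column totals in the table. The sketch below is Python and assumes the computational least-squares formulas used in the example; the `predict` helper and the choice of x = 10 are illustrative additions, not part of the slides.

```python
# Column totals from the worked example (n = 9 observations)
n = 9
sum_x, sum_y = 102, 302
sum_xy, sum_x2 = 3853, 1308

# Slope and intercept from the computational least-squares formulas
b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)   # 3873 / 1368
a = sum_y / n - b * sum_x / n

def predict(x):
    """Estimated y for a given x on the fitted line (illustrative helper)."""
    return a + b * x

print(round(b, 3), round(a, 2))   # 2.831 1.47
print(round(predict(10), 2))      # estimate at x = 10, inside the observed range
```

At x = 10 the fitted line gives about 29.78, close to the observed value of 28. Predictions are best kept within the range of X values that was actually observed (5 to 17 here).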