
Page 1: Basics of Regression analysis


Basics of Regression Analysis

Presented By Mahak Vijay


Page 2: Basics of Regression analysis

• What is Regression Analysis?
• Population Regression Line
• Why do we use Regression Analysis?
• What are the types of Regression?
• Simple Linear Regression Model
• Least Square Estimation for Parameters
• Least Square for Linear Regression
• References

Outline

Page 3: Basics of Regression analysis

Regression analysis is a form of predictive modelling technique that investigates the relationship between a dependent (target) variable and one or more independent (predictor) variables.

This technique is used for forecasting, time-series modelling, and finding the causal-effect relationship between variables.

For example, the relationship between rash driving and the number of road accidents caused by a driver is best studied through regression.

What is Regression Analysis?

Page 4: Basics of Regression analysis

[Figure: population regression line in the x–y plane. The actual observations lie around the line, the estimated values lie on the line, and the vertical gaps between them are the errors. Horizontal axis: independent variable; vertical axis: dependent variable.]

Population Regression Line

Page 5: Basics of Regression analysis

[Figure: estimated grades plotted against study time, with the population regression line through the points.]

Population regression function:  Estimated Grades = α + β · (Study Time)

where y = Estimated Grades, x = Study Time, α = intercept, and β = slope.

Example
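As a concrete illustration of the population regression function above, here is a minimal MATLAB sketch; the intercept and slope values are assumed for illustration only, since the slide does not state them.

% Study-time example; alpha and beta are assumed values, not from the slide.
alpha = 40;              % assumed intercept: grade with zero study time
beta  = 5;               % assumed slope: grade points gained per hour of study
studyTime = 0:0.5:10;    % hours of study
estimatedGrades = alpha + beta * studyTime;   % population regression function
plot(studyTime, estimatedGrades, '-');
xlabel('Study Time (hours)');
ylabel('Estimated Grades');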

Page 6: Basics of Regression analysis


Typically, a regression analysis is used for these purposes:

(1) Prediction of the target variable (forecasting).

(2) Modelling the relationships between the dependent variable and the explanatory variable.

(3) Testing of hypotheses.

Benefits

1. It indicates the strength of the impact of multiple independent variables on a dependent variable.

2. It indicates the significant relationships between the dependent variable and the independent variables.

These benefits help market researchers, data analysts, and data scientists evaluate and select the best set of variables for building predictive models.

Why do we use Regression Analysis?

Page 7: Basics of Regression analysis


Types of regression analysis:

Regression analysis is generally classified into two kinds: simple and multiple.

Simple Regression:

It involves only two variables: a dependent variable and an explanatory (independent) variable.

A regression analysis may involve a linear model or a nonlinear model. The term linear can be interpreted in two different ways:
1. Linearity in the variables
2. Linearity in the parameters

[Diagram: Regression Analysis splits into Simple (1 explanatory variable) and Multiple (2+ explanatory variables); each branch may be Linear or Non-linear.]

Types of Regression Analysis

Page 8: Basics of Regression analysis


The simple linear regression model is a model with a single regressor x that has a linear relationship with a response y.

Simple linear regression model:

y = α + βx + ɛ

where y is the response variable, x is the regressor variable, α is the intercept, β is the slope, and ɛ is the random error component.

In this technique the dependent variable is continuous and random, the independent variable(s) can be continuous or discrete but are not random, and the regression line is linear in form.

Simple Linear Regression Model
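The model above can be made concrete with a short simulation. A minimal MATLAB sketch, with assumed values for α, β, and σ (not taken from the slides):

% Generate data from y = alpha + beta*x + epsilon with assumed parameters.
alpha = 2; beta = 0.5; sigma = 1;   % assumed intercept, slope, error std. dev.
n = 50;
x = linspace(0, 10, n)';            % regressor values (fixed, non-random)
epsilon = sigma * randn(n, 1);      % random error, epsilon ~ N(0, sigma^2)
y = alpha + beta * x + epsilon;     % continuous, random response
scatter(x, y); hold on;
plot(x, alpha + beta * x, 'r-');    % true (population) regression line
xlabel('x (regressor)'); ylabel('y (response)');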

Page 9: Basics of Regression analysis

Some basic assumptions on the model:

Simple linear regression model:  yi = α + βxi + ɛi  for i = 1, 2, …, n

1. ɛi is a random variable with zero mean and variance σ², i.e. E(ɛi) = 0 and V(ɛi) = σ².

2. ɛi and ɛj are uncorrelated for i ≠ j, i.e. cov(ɛi, ɛj) = 0.

3. ɛi is a normally distributed random variable with mean zero and variance σ², i.e. ɛi ~ N(0, σ²).

Page 10: Basics of Regression analysis

yi = α + βxi + ɛi  for i = 1, 2, …, n

E(yi) = E(α + βxi + ɛi) = α + βxi      (since E(ɛi) = 0)

V(yi) = V(α + βxi + ɛi) = V(ɛi) = σ²

=> ɛi ~ N(0, σ²)
=> yi ~ N(α + βxi, σ²)

NOTE: The dataset should satisfy these basic assumptions.
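These distributional claims can be checked by simulation. A minimal MATLAB sketch, with assumed parameter values:

% Verify empirically that yi ~ N(alpha + beta*xi, sigma^2) when
% epsilon ~ N(0, sigma^2); all parameter values here are assumed.
alpha = 2; beta = 0.5; sigma = 1; xi = 3;
reps = 1e5;
epsilon = sigma * randn(reps, 1);            % epsilon ~ N(0, sigma^2)
yi = alpha + beta * xi + epsilon;
fprintf('mean(yi) = %.3f (expected %.3f)\n', mean(yi), alpha + beta * xi);
fprintf('var(yi)  = %.3f (expected %.3f)\n', var(yi), sigma^2);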

Page 11: Basics of Regression analysis

The parameters α and β are unknown and must be estimated using sample data: (x1, y1), (x2, y2), …, (xn, yn), where yi = α + βxi + ɛi.

[Figure: scatter of the sample observations in the x–y plane with the line y = α + βx to be fitted.]

Least Square Estimation for Parameters

Page 12: Basics of Regression analysis

The line fitted by least squares is the one that makes the sum of squares of all vertical discrepancies as small as possible.

We estimate the parameters so that the sum of squares of all the vertical differences between the observations and the fitted line is minimum:

S(α, β) = Σ ɛi² = Σ (yi − α − βxi)²

[Figure: for an observation (x1, y1), the vertical difference between y1 and the fitted value ŷ1 at x1 is the error ɛ1 = y1 − ŷ1.]

where yi = α + βxi + ɛi.
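The minimisation target can be written out directly. A minimal MATLAB sketch of the objective S(α, β), using, for illustration, the X and Y values from the worked example that appears later in the deck:

% Sum of squared vertical differences for a candidate line y = a + b*x.
x = [1; 2; 3; 4; 5];
y = [1; 2; 1.3; 3.75; 2.25];
S = @(a, b) sum((y - a - b * x).^2);   % S(alpha, beta)
S(0.785, 0.425)   % candidate near the least-squares line: S is small (~2.79)
S(0, 1)           % a poorer candidate line: S is much larger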

Page 13: Basics of Regression analysis

Minimizing the function requires calculating the first-order conditions with respect to α and β and setting them to zero:

I:  ∂S/∂α = −2 Σ (yi − α − βxi) = 0

II: ∂S/∂β = −2 Σ (yi − α − βxi) xi = 0

Solving condition I for α:

Σ yi − nα − β Σ xi = 0

α̂ = ȳ − β̂ x̄

where S(α, β) = Σ (yi − α − βxi)².

Page 14: Basics of Regression analysis

Solving condition II for β:

II: ∂S/∂β = −2 Σ (yi − α − βxi) xi = 0

Substituting α̂ = ȳ − β̂ x̄ and solving for β̂:

Σ (yi − ȳ) xi = β̂ Σ (xi − x̄) xi

β̂ = Σ (xi − x̄)(yi − ȳ) / Σ (xi − x̄)²  =  Sxy / Sxx

Proof (the centred and uncentred sums are equal because Σ (xi − x̄) = 0 and Σ (yi − ȳ) = 0):

Σ (xi − x̄)(yi − ȳ) = Σ (yi − ȳ) xi − x̄ Σ (yi − ȳ) = Σ (yi − ȳ) xi

Σ (xi − x̄)² = Σ (xi − x̄) xi − x̄ Σ (xi − x̄) = Σ (xi − x̄) xi

Hence the least-squares estimates are:

α̂ = ȳ − β̂ x̄ ;   β̂ = Sxy / Sxx
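The closed-form estimates above translate directly into code. A minimal MATLAB sketch, using the X and Y values from the worked example later in the deck:

% Least-squares estimates: beta_hat = Sxy/Sxx, alpha_hat = ybar - beta_hat*xbar.
x = [1; 2; 3; 4; 5];
y = [1; 2; 1.3; 3.75; 2.25];
xbar = mean(x);  ybar = mean(y);
Sxy = sum((x - xbar) .* (y - ybar));
Sxx = sum((x - xbar).^2);
beta_hat  = Sxy / Sxx;                 % slope estimate (0.425 for this data)
alpha_hat = ybar - beta_hat * xbar;    % intercept estimate (0.785 for this data)
fprintf('alpha_hat = %.3f, beta_hat = %.3f\n', alpha_hat, beta_hat);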

Page 15: Basics of Regression analysis


Example

[Worked numerical example applying the least-squares formulas; the calculations on this slide were not recovered.]

Page 16: Basics of Regression analysis

Calculating R² Using Regression Analysis

R-squared is a statistical measure of how close the data are to the fitted regression line (a measure of goodness of fit). It is also known as the coefficient of determination. First we calculate the distance between the actual values and the mean value, and the distance between the estimated values and the mean value; then we compare the two distances: in symbols, R² = Σ(Y′ − Ȳ)² / Σ(Y − Ȳ)².
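This calculation is easy to reproduce. A minimal MATLAB sketch using the example data from later in the deck:

% Coefficient of determination R^2 for the fitted line.
x = [1; 2; 3; 4; 5];
y = [1; 2; 1.3; 3.75; 2.25];
p = polyfit(x, y, 1);               % least-squares fit: p(1) = slope, p(2) = intercept
y_hat = polyval(p, x);              % estimated values Y'
SST = sum((y - mean(y)).^2);        % total variation of Y about its mean
SSE = sum((y - y_hat).^2);          % unexplained (error) variation
R2 = 1 - SSE / SST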

Page 17: Basics of Regression analysis


Example

Page 18: Basics of Regression analysis


Performance of Model

Page 19: Basics of Regression analysis

The standard error of the estimate is a measure of the accuracy of predictions.

Note: The regression line is the line that minimizes the sum of squared deviations of prediction (also called the sum of squares error).

The standard error of the estimate is closely related to this quantity and is defined below:

σest = sqrt( Σ (Y − Y′)² / N )

where Y = actual value, Y′ = estimated value, and N = number of observations.

Standard Error of the Estimate (root mean square error)

Page 20: Basics of Regression analysis

      X       Y       Y'      Y-Y'    (Y-Y')²
     1.00    1.00    1.210   -0.210    0.044
     2.00    2.00    1.635    0.365    0.133
     3.00    1.30    2.060   -0.760    0.578
     4.00    3.75    2.485    1.265    1.600
     5.00    2.25    2.910   -0.660    0.436
Sum  15.00   10.30   10.30    0.000    2.791

Example
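The table and the standard error of the estimate can be reproduced from the X and Y columns alone. A minimal MATLAB sketch:

% Rebuild the example table and compute the standard error of the estimate.
X = [1; 2; 3; 4; 5];
Y = [1; 2; 1.3; 3.75; 2.25];
p  = polyfit(X, Y, 1);                  % least-squares fit of Y on X
Yp = polyval(p, X);                     % Y' (estimated values)
T  = table(X, Y, Yp, Y - Yp, (Y - Yp).^2, ...
     'VariableNames', {'X', 'Y', 'Yprime', 'Err', 'SqErr'})
N = numel(Y);
sigma_est = sqrt(sum((Y - Yp).^2) / N)  % approx. 0.747 for this data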

Page 21: Basics of Regression analysis


Difference

Page 22: Basics of Regression analysis

Solve: Ax = b

The columns of A define a vector space, range(A).

Ax = x1·a1 + x2·a2 is an arbitrary vector in range(A).

If b is a vector in Rn that also lies in the column space of A, then Ax = b has a solution.

[Figure: the column vectors a1 and a2 spanning range(A), with b lying in that span.]

Least Square for Linear Regression

Page 23: Basics of Regression analysis

The columns of A define a vector space, range(A).

Ax = x1·a1 + x2·a2 is an arbitrary vector in range(A).

If b is a vector in Rn but not in the column space of A, then Ax = b has no solution.

In that case we try to find the x̂ that makes Ax̂ as close to b as possible; this is called the least-squares solution of the problem.

[Figure: b lying outside the plane range(A) spanned by a1 and a2, with the residual b − Ax̂.]

Page 24: Basics of Regression analysis

[Figure: b, its orthogonal projection Ax̂ onto range(A) spanned by a1 and a2, and the residual b − Ax̂ perpendicular to range(A).]

Ax̂ is the orthogonal projection of b onto range(A), so the residual b − Ax̂ is orthogonal to every column of A:

Aᵀ(b − Ax̂) = 0   =>   AᵀA x̂ = Aᵀb
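The normal equations give the least-squares fit directly. A minimal MATLAB sketch using the example data from this deck (in practice MATLAB's backslash operator, A\b, is the preferred way to solve this):

% Least squares via the normal equations A'*A*xhat = A'*b.
x = [1; 2; 3; 4; 5];
y = [1; 2; 1.3; 3.75; 2.25];
A = [ones(size(x)), x];          % columns: a1 = ones (intercept), a2 = x (slope)
xhat = (A' * A) \ (A' * y);      % xhat(1) = intercept, xhat(2) = slope
residual = y - A * xhat;         % b - A*xhat
A' * residual                    % approximately [0; 0]: residual orthogonal to range(A)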

Page 25: Basics of Regression analysis

 


Page 26: Basics of Regression analysis


Matlab Implementation (Linear_Regression3.m)
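The original Linear_Regression3.m file is not included in this transcript; the following is a minimal MATLAB sketch of what such a script might look like, using hypothetical data and the least-squares steps covered earlier in the deck:

% Linear regression by least squares (sketch; not the original Linear_Regression3.m).
x = (1:10)';                          % hypothetical regressor values
y = 2 + 0.5 * x + randn(10, 1);       % hypothetical responses with random noise
A = [ones(size(x)), x];               % design matrix
coef = A \ y;                         % least-squares solution: [intercept; slope]
y_hat = A * coef;                     % fitted values
R2 = 1 - sum((y - y_hat).^2) / sum((y - mean(y)).^2);   % goodness of fit
scatter(x, y); hold on;
plot(x, y_hat, 'r-');
title(sprintf('y = %.2f + %.2f x,  R^2 = %.2f', coef(1), coef(2), R2));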


Page 28: Basics of Regression analysis


[1] Sykes, Alan O. "An introduction to regression analysis." (1993).

[2] Chatterjee, Samprit, and Ali S. Hadi. Regression analysis by example. John Wiley & Sons, 2015.

[3] Draper, Norman Richard, Harry Smith, and Elizabeth Pownell. Applied regression analysis. Vol. 3. New York: Wiley, 1966.

[4] Montgomery, Douglas C., Elizabeth A. Peck, and G. Geoffrey Vining. Introduction to linear regression analysis. John Wiley & Sons, 2015.

[5] Seber, George AF, and Alan J. Lee. Linear regression analysis. Vol. 936. John Wiley & Sons, 2012.

References

Page 29: Basics of Regression analysis


THANK YOU