Linear Regression ( Cont'd )

Upload: priscilla-gaines

Post on 18-Jan-2018



Page 1: Linear Regression ( Cont'd ). Outline - Multiple Regression - Checking The Regression : Coeff. Determination Standard Error Confidence Interval Hypothesis

Linear Regression ( Cont'd )


Outline

- Multiple Regression
- Checking the Regression: Coefficient of Determination, Standard Error, Confidence Interval, Hypothesis Test (t-test, F-test)
- Classical Assumption Test: No Multicollinearity, Homoscedasticity, No Autocorrelation


Multiple Regression

Is consumption affected only by income? Other variables could also be related to consumption.

The simple regression should therefore be expanded by introducing new independent variables (I.Vs) into the model --> multiple regression


The MODEL

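$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \dots + \beta_k X_{ki} + u_i$$

i = 1, 2, 3, ..., N (observations)

Example:

$$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + u_i$$

- Y: Consumption
- X1: Income
- X2: Number of Dependants
- X3: Age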
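As a minimal sketch of how the example model could be estimated by OLS, the block below fits consumption on income, number of dependants and age. The data are synthetic and generated only for illustration: the "true" coefficients (50, 0.6, 20, 1.5), the variable ranges and the noise level are assumptions, not values from the slides. NumPy's `lstsq` is used as the least-squares solver.

```python
import numpy as np

# Synthetic data for illustration only; none of these numbers come from the slides.
rng = np.random.default_rng(0)
n = 200
income = rng.uniform(100, 1000, n)      # X1
dependants = rng.integers(0, 6, n)      # X2
age = rng.uniform(20, 65, n)            # X3
u = rng.normal(0, 10, n)                # disturbance term u_i

# Assumed "true" parameters beta_0..beta_3 used to generate Y (consumption).
true_beta = np.array([50.0, 0.6, 20.0, 1.5])
X = np.column_stack([np.ones(n), income, dependants, age])
y = X @ true_beta + u

# OLS estimates b minimise the sum of squared residuals.
b, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ b
print("estimated coefficients:", b)
```

With an intercept in the model the residuals sum to (numerically) zero, which is a quick sanity check on any OLS fit.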


Checking The Regression

1. Coefficient of Determination
2. Standard Error of Coefficient
3. Confidence Interval
4. Hypothesis Test: t-test, F-test


2. Standard Error

The principle of OLS is to minimize the error. The accuracy of each estimator is therefore measured by its standard error (S.e.). The formula for the S.e. is:

$$S_e = \sqrt{\frac{\sum (Y - \hat{Y})^2}{n-2}} = \sqrt{\frac{SST - SSR}{n-2}} = \sqrt{\frac{\sum Y^2 - b \sum XY}{n-2}} = \sqrt{MSE}$$

(in the third expression, X and Y are measured as deviations from their means)

As $u_i^2 = (Y_i - \hat{Y}_i)^2$, then

$$s = \left( \frac{\sum u_i^2}{N-2} \right)^{1/2}$$

The smallest standard error results from the smallest errors.
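The standard error calculation can be sketched for a simple regression as follows. The five data points are hypothetical (not from the slides). The slope standard error uses the standard textbook formula s.e.(b) = Se / sqrt(Σx²), with x in deviation form, which is not written out on the slide.

```python
import math

# Hypothetical data for illustration (not from the slides).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n = len(X)

x_bar = sum(X) / n
y_bar = sum(Y) / n

# Sums of squares and cross-products in deviation form.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
s_xx = sum((x - x_bar) ** 2 for x in X)
s_yy = sum((y - y_bar) ** 2 for y in Y)

b = s_xy / s_xx          # slope
a = y_bar - b * x_bar    # intercept

# Sum of squared residuals, computed two ways.
sse_direct = sum((y - (a + b * x)) ** 2 for x, y in zip(X, Y))
sse_shortcut = s_yy - b * s_xy    # SST - SSR

se_estimate = math.sqrt(sse_direct / (n - 2))   # standard error of the estimate
se_b = se_estimate / math.sqrt(s_xx)            # standard error of the slope

print(b, a, se_estimate, se_b)
```

Both routes to the sum of squared residuals agree, which illustrates why the shortcut form with deviations is usable in hand calculations.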


How small must the S.e. be to count as good? This is difficult to judge from its absolute value.

It is more useful when combined with its regression coefficient, as the ratio of the coefficient to its S.e.

This ratio is used for the t-test.


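3. Confidence Interval of βj

What is the confidence interval of a parameter? What is it for? Formula:

$$b_j \pm t_{\alpha/2}\,\mathrm{s.e.}(b_j)$$

or

$$P\left(b_j - t_{\alpha/2}\,\mathrm{s.e.}(b_j) \le \beta_j \le b_j + t_{\alpha/2}\,\mathrm{s.e.}(b_j)\right) = 1 - \alpha$$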


Example

From the regression we get b1 = 0.1022 and s.e.(b1) = 0.0092. Number of observations n = 10; number of estimated parameters k = 2. The degrees of freedom are therefore 10 – 2 = 8, and the significance level is 5%.

From the t-table, t(α/2, df) = t(0.025, 8) = 2.306.

The confidence interval for β1 is therefore 0.1022 ± 2.306 (0.0092), or (0.0810; 0.1234).

Interpretation: with 95% confidence, β1 lies in the interval between 0.0810 and 0.1234.
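The arithmetic of this example can be checked with a few lines of code. The sketch below uses only the three numbers quoted in the example; the critical value 2.306 is taken from the t-table as on the slide rather than computed.

```python
# Values quoted in the example: estimate, standard error, and t-table value.
b1 = 0.1022
se_b1 = 0.0092
t_crit = 2.306    # t(0.025, df = 8), read from the t-table

margin = t_crit * se_b1
lower, upper = b1 - margin, b1 + margin
print(f"95% CI for beta_1: ({lower:.4f}; {upper:.4f})")
# prints: 95% CI for beta_1: (0.0810; 0.1234)
```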


4. Hypothesis Test

t-test

It is an individual test for each regression coefficient.

H0: βj = 0

H1: βj ≠ 0; j = 0, 1, 2, ..., k, where βj is the coefficient being tested.

For simple regression:

(1) H0: β0 = 0, H1: β0 ≠ 0

(2) H0: β1 = 0, H1: β1 ≠ 0

The t-statistic is defined as:

$$t = \frac{b_j - \beta_j}{\mathrm{s.e.}(b_j)}$$

To test whether βj is not different from 0:

$$t = \frac{b_j}{\mathrm{s.e.}(b_j)}$$


The computed t is compared with the t-table value.

If |t| > tα/2,df, the t value falls in the rejection region.

Thus the null hypothesis (βj = 0) is rejected at confidence level (1 − α) × 100%.

In other words, βj is statistically significant.
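The decision rule can be illustrated with the numbers from the earlier confidence-interval example (b1 = 0.1022, s.e.(b1) = 0.0092, t-table value 2.306). This is a small sketch of the comparison only; the critical value is taken from the t-table, not computed.

```python
b1, se_b1 = 0.1022, 0.0092   # estimate and standard error from the earlier example
t_crit = 2.306               # t(0.025, 8) from the t-table

t_stat = (b1 - 0.0) / se_b1  # test statistic under H0: beta_1 = 0
reject_h0 = abs(t_stat) > t_crit
print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
# prints: t = 11.109, reject H0: True
```

The absolute value is used because the alternative hypothesis is two-sided.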


Hypothesis Test: F-Test

The F-test checks whether the model as a whole is statistically significant, i.e. it is a joint hypothesis test for all the slope coefficients together.

H0: β1 = β2 = β3 = ... = βk = 0

H1: at least one βj ≠ 0 (j = 1, ..., k), where k is the number of I.Vs.


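The F-test can be explained further through ANOVA.

Observation: Yi = b0 + b1 Xi + ei

Regression: Ŷi = b0 + b1 Xi

so that Yi = Ŷi + ei. Subtract Ȳ from both sides:

$$(Y_i - \bar{Y}) = (\hat{Y}_i - \bar{Y}) + e_i$$

then square both sides and sum over all observations (the cross-product term sums to zero under OLS):

$$\sum (Y_i - \bar{Y})^2 = \sum (\hat{Y}_i - \bar{Y})^2 + \sum e_i^2$$

SST = SSR + SSE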


ANOVA Table

| Source | Sum of Squares | df | Mean Square | F-stat |
|---|---|---|---|---|
| Regression | SSR | k | MSR = SSR/k | F = MSR/MSE |
| Error | SSE | n-k-1 | MSE = SSE/(n-k-1) | |
| Total | SST | n-1 | | |

Compare the F-stat with Fα(k, n-k-1) from the F-table.
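The entries of the ANOVA table can be computed directly. The sketch below uses a small hypothetical data set (not from the slides) with one independent variable, so k = 1; the critical value 10.13 is F(0.05; 1, 3) as read from a standard F-table.

```python
# Hypothetical data for illustration (not from the slides).
X = [1, 2, 3, 4, 5]
Y = [2, 4, 5, 4, 5]
n, k = len(X), 1    # k = number of independent variables

x_bar, y_bar = sum(X) / n, sum(Y) / n
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum((x - x_bar) ** 2 for x in X)
a = y_bar - b * x_bar
fitted = [a + b * x for x in X]

sst = sum((y - y_bar) ** 2 for y in Y)                # total sum of squares
ssr = sum((f - y_bar) ** 2 for f in fitted)           # regression sum of squares
sse = sum((y - f) ** 2 for y, f in zip(Y, fitted))    # error sum of squares

msr = ssr / k
mse = sse / (n - k - 1)
f_stat = msr / mse
r_squared = ssr / sst

f_crit = 10.13    # F(0.05; 1, 3), read from an F-table
print(f_stat, r_squared, f_stat > f_crit)
```

For these numbers the F-stat is below the table value, so H0 would not be rejected at the 5% level even though R² is 0.6; with only five observations the test has little power.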


Classical Assumption of OLS

The OLS estimator should be BLUE (Best Linear Unbiased Estimator).

3 main requirements:

- No multicollinearity
- No heteroskedasticity
- No autocorrelation


Multicollinearity

Multicollinearity is a linear relation between the I.Vs.

For two regressors X1 and X2: if X1 = λX2 for some constant λ, there is collinearity.

This is not the case if the relation is nonlinear, for example X1 = X2² or X1 = log X2.


Example:

$$Y_i = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + u_i$$

- Y: Consumption
- X1: Total income
- X2: Wage income
- X3: Non-wage income

There is multicollinearity here, and it can even be perfect multicollinearity.

Why?


Data with Perfect Multicollinearity

| X1 | X2 | X3 |
|---|---|---|
| 12 | 48 | 51 |
| 16 | 64 | 65 |
| 19 | 76 | 82 |
| 23 | 92 | 96 |
| 29 | 116 | 118 |

X2 = 4X1 --> a perfect multicollinearity relation.
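Why perfect multicollinearity breaks OLS can be sketched numerically. The values below are as read from this slide's table. The helper `cross_product_det` is a hypothetical name introduced for this illustration: it returns a scaled determinant of the 2x2 matrix of sums of squares and cross-products, which is zero exactly when the two regressors are perfectly linearly related.

```python
# Values as read from the slide's table (X2 = 4 * X1 in every row).
X1 = [12, 16, 19, 23, 29]
X2 = [48, 64, 76, 92, 116]
X3 = [51, 65, 82, 96, 118]

def cross_product_det(u, v):
    """Return n^2 times the determinant of the 2x2 matrix of deviation
    sums of squares and cross-products of u and v.

    Integer arithmetic keeps the result exact. A value of zero means u and v
    are perfectly linearly related, so OLS cannot separate their effects.
    """
    n = len(u)
    s_uu = n * sum(a * a for a in u) - sum(u) ** 2
    s_vv = n * sum(b * b for b in v) - sum(v) ** 2
    s_uv = n * sum(a * b for a, b in zip(u, v)) - sum(u) * sum(v)
    return s_uu * s_vv - s_uv ** 2

print("X1 vs X2:", cross_product_det(X1, X2))   # exactly zero
print("X1 vs X3:", cross_product_det(X1, X3))   # positive: highly but not perfectly related
```

A zero determinant means the normal equations have no unique solution, which is the algebraic form of "the coefficients cannot be estimated".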


Impact of Multicollinearity

- High variance (of the OLS estimates)
- Wide confidence intervals
- High R² but possibly many insignificant coefficients in the t-tests
- The sign of a coefficient can be misleading


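| Consumption (Y) | Income (X1) | Asset (X2) |
|---|---|---|
| 40 | 50 | 500 |
| 50 | 65 | 659 |
| 65 | 80 | 856 |
| 90 | 110 | 1136 |
| 85 | 100 | 1023 |
| 100 | 120 | 1234 |
| 110 | 140 | 1456 |
| 135 | 190 | 1954 |
| 140 | 210 | 2129 |
| 160 | 220 | 2267 |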


Model: Y = 12.8 – 1.414 X1 + 0.202 X2

SE: (4.696) (1.199) (0.117)

t: (2.726) (-1.179) (1.721)

R² = 0.982

- R² is very high (98.2%). What does this mean?
- The t-tests on the slope coefficients are not significant. What does this mean?
- The coefficient of X1 is negative. What does this mean?


Detecting Multicollinearity

1. Comparing R² and the t-stats

2. Using the correlation matrix of the I.Vs
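The correlation-matrix check can be sketched with the income and asset columns of the consumption table in this deck (values as read from that table). `pearson` is a small helper written for this illustration, not a library function.

```python
import math

# Regressors as read from the consumption/income/asset table in this deck.
income = [50, 65, 80, 110, 100, 120, 140, 190, 210, 220]             # X1
asset = [500, 659, 856, 1136, 1023, 1234, 1456, 1954, 2129, 2267]    # X2

def pearson(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)

regressors = {"X1": income, "X2": asset}
corr = {(p, q): pearson(u, v)
        for p, u in regressors.items()
        for q, v in regressors.items()}

# Print the correlation matrix row by row.
for p in regressors:
    print(p, [round(corr[(p, q)], 4) for q in regressors])
```

An off-diagonal entry very close to 1 signals that the two I.Vs carry almost the same information.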


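3. VIF (Variance Inflation Factor) and Tolerance Value (TOL), as reported by SPSS

$$VIF_j = \frac{1}{1 - R_j^2}, \quad j = 1, 2, \dots, k$$

where R_j² is the R² from regressing Xj on the other I.Vs.

The VIF threshold is taken here as 2 (other common rules of thumb are 5 or 10) --> collinearity is indicated when the VIF exceeds the threshold.

$$TOL_j = \frac{1}{VIF_j} = 1 - R_j^2$$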
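With only two regressors, R_j² is simply the R² of a simple regression of one I.V on the other, so VIF and TOL can be sketched in a few lines. The data are the income and asset columns as read from the table in this deck; `r_squared_simple` is a helper name introduced for this illustration.

```python
# Regressors as read from the income/asset table in this deck.
income = [50, 65, 80, 110, 100, 120, 140, 190, 210, 220]             # X1
asset = [500, 659, 856, 1136, 1023, 1234, 1456, 1954, 2129, 2267]    # X2

def r_squared_simple(target, regressor):
    """R^2 from regressing `target` on a single `regressor` with an intercept."""
    n = len(target)
    t_bar, r_bar = sum(target) / n, sum(regressor) / n
    s_tr = sum((t - t_bar) * (r - r_bar) for t, r in zip(target, regressor))
    s_tt = sum((t - t_bar) ** 2 for t in target)
    s_rr = sum((r - r_bar) ** 2 for r in regressor)
    return s_tr ** 2 / (s_tt * s_rr)

r2_1 = r_squared_simple(income, asset)   # auxiliary regression of X1 on X2
tol_1 = 1 - r2_1
vif_1 = 1 / tol_1
print(f"R_1^2 = {r2_1:.4f}, TOL_1 = {tol_1:.4f}, VIF_1 = {vif_1:.1f}")
```

A VIF far above any of the usual rule-of-thumb thresholds points to strong collinearity between the two regressors. With more than two I.Vs the auxiliary regression would need all the other regressors, which this sketch does not cover.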


Solving Multicollinearity

- Using relevant information (theory or previous research)
- Combining cross-section and time-series data
- Eliminating the affected variables
  – Commonly used.
  – Be careful --> specification bias.
- Transforming the variables: first-difference method
- Adding more sample data


Heteroskedasticity

The variance of the error is not constant. This generally occurs in cross-sectional data.

Example: consumption and income at the province level.

Violating homoskedasticity leaves the estimator unbiased, but it is no longer efficient.


[Figure: The Pattern of Heteroskedasticity. Scatter plot; vertical axis from 0 to 120, horizontal axis from 0 to 60.]


Checking for Heteroskedasticity

1. Graphical method: analyzing the pattern of the relationship between the squared residuals (ui²) and the predicted Yi.
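The quantities behind the graphical method can be sketched as follows. The data are synthetic and deliberately constructed so that the error spread grows with X (an assumption made purely for illustration). Instead of drawing the plot, the code compares the average squared residual in the lower and upper halves of the predicted values.

```python
# Synthetic data for illustration only: the error spread grows with X.
X = list(range(1, 21))
errors = [0.5 * x * (1 if x % 2 == 0 else -1) for x in X]   # |error| proportional to X
Y = [10 + 2 * x + e for x, e in zip(X, errors)]
n = len(X)

# Simple OLS fit.
x_bar, y_bar = sum(X) / n, sum(Y) / n
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum((x - x_bar) ** 2 for x in X)
a = y_bar - b * x_bar

fitted = [a + b * x for x in X]
sq_resid = [(y - f) ** 2 for y, f in zip(Y, fitted)]

# The graphical method plots these (predicted Y, squared residual) pairs.
pairs = sorted(zip(fitted, sq_resid))
half = n // 2
mean_low = sum(r for _, r in pairs[:half]) / half
mean_high = sum(r for _, r in pairs[half:]) / (n - half)
print(mean_low, mean_high)
```

A clearly larger average in the upper half corresponds to the fan-shaped pattern one would see in the plot; under homoskedasticity the two averages would be of similar size.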

[Figures: plots of ui² against predicted Yi]


Solving Heteroskedasticity

1. Transform the model into logarithmic form:

Ln Yj = β0 + β1 Ln Xj + uj


Autocorrelation

Autocorrelation is correlation between a variable and itself, at different times or across individual observations in the sample.

It generally occurs in time-series data.

E(ui uj) is no longer equal to 0 for i ≠ j.

The estimator becomes inefficient.


Autocorrelation:

[Figure: two plots of ui against time (or X), showing autocorrelation patterns in the residuals]


Detecting autocorrelation

Durbin-Watson Test

Test statistic:

$$d = \frac{\sum_{t=2}^{N} (u_t - u_{t-1})^2}{\sum_{t=1}^{N} u_t^2}$$

Compare the d-stat with the d-table values (dL and dU).
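The d statistic is straightforward to compute from a residual series. The sketch below implements the formula directly; the two residual series are hypothetical and chosen only to show the extremes.

```python
def durbin_watson(residuals):
    """Durbin-Watson d: sum of squared successive differences over sum of squares."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(u ** 2 for u in residuals)
    return numerator / denominator

# Hypothetical residual series for illustration (not from the slides).
alternating = [1, -1, 1, -1]   # sign flips every period -> d well above 2
persistent = [1, 1, 1, 1]      # no change between periods -> d = 0
print(durbin_watson(alternating), durbin_watson(persistent))
# prints: 3.0 0.0
```

Values of d near 2 suggest no first-order autocorrelation, values near 0 suggest positive autocorrelation, and values near 4 suggest negative autocorrelation.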


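Rules of the Game

| Range of d | Conclusion |
|---|---|
| 0 to dL | Positive autocorrelation |
| dL to dU | Inconclusive |
| dU to 4-dU | No autocorrelation |
| 4-dU to 4-dL | Inconclusive |
| 4-dL to 4 | Negative autocorrelation |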
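These decision regions translate into a small function. `dw_decision` is a hypothetical helper written for this illustration, and the bounds 1.08 and 1.36 are placeholder values; real dL and dU must be looked up in a Durbin-Watson table for the given sample size and number of regressors.

```python
def dw_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic d using the table bounds dL and dU."""
    if not 0 <= d <= 4:
        raise ValueError("d must lie between 0 and 4")
    if d < d_lower:
        return "positive autocorrelation"
    if d <= d_upper:
        return "inconclusive"
    if d < 4 - d_upper:
        return "no autocorrelation"
    if d <= 4 - d_lower:
        return "inconclusive"
    return "negative autocorrelation"

# Placeholder bounds for illustration; look up real values in a Durbin-Watson table.
d_l, d_u = 1.08, 1.36
for d in (0.5, 1.2, 2.0, 2.8, 3.5):
    print(d, dw_decision(d, d_l, d_u))
```

How the boundary points themselves are classified is a convention; this sketch assigns them to the inconclusive regions.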


THANK YOU