chapter 4: the normality assumption: classical normal...

Basic Econometrics

Chapter 4:

THE NORMALITY ASSUMPTION:

Classical Normal Linear Regression Model (CNLRM)


•  The CNLRM assumes that each ui is normally distributed, ui ∼ N(0, σ²), with:

Mean: E(ui) = 0 (Assumption 3)
Variance: E(ui²) = σ² (Assumption 4)
Covariance: cov(ui, uj) = E(ui uj) = 0, i ≠ j (Assumption 5)

•  Note: For two jointly normally distributed variables, zero covariance (or correlation) implies independence, so ui and uj are not only uncorrelated but also independently distributed. Therefore:

ui ∼ NID(0, σ²), where NID stands for normally and independently distributed

The normality assumption

•  Why the normality assumption?

(1)  With a few exceptions, the distribution of the sum of a large number of independent and identically distributed random variables tends to a normal distribution as the number of such variables increases indefinitely (the central limit theorem)

(2)  Even if the number of variables is not very large, or they are not strictly independent, their sum may still be normally distributed
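Point (1) can be illustrated with a small simulation. This is only a sketch: the choice of Uniform(0, 1) terms, 30 terms per sum, 20000 replications and the seed are all assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

def standardized_sum(n_terms):
    """Sum n_terms Uniform(0, 1) draws and standardize the sum."""
    total = sum(rng.random() for _ in range(n_terms))
    mean = n_terms * 0.5            # E(sum) for Uniform(0, 1) terms
    sd = (n_terms / 12.0) ** 0.5    # sd(sum), since each term has variance 1/12
    return (total - mean) / sd

draws = [standardized_sum(30) for _ in range(20000)]

# For a standard normal variable, about 95% of the mass lies within +/- 1.96.
share_within = sum(abs(z) <= 1.96 for z in draws) / len(draws)
print(statistics.mean(draws), statistics.pstdev(draws), share_within)
```

If the sums are close to normal, the printed mean should be near 0, the standard deviation near 1, and the share within ±1.96 near 0.95.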



•  Why the normality assumption? (continued)

(3)  Under the normality assumption for ui, the OLS estimators β^1 and β^2 are also normally distributed, because they are linear functions of ui

(4)  The normal distribution is a comparatively simple distribution involving only two parameters (mean and variance)



•  With the normality assumption, the OLS estimators β^1, β^2 and σ^2 have the following properties:

1. They are unbiased
2. They have minimum variance. Combining 1 and 2, they are efficient estimators
3. They are consistent: as the sample size increases indefinitely, the estimators converge to their true population values

Properties of OLS estimators under the normality assumption

4.  β^1 is normally distributed: β^1 ∼ N(β1, σ²_{β^1}), and Z = (β^1 − β1)/σ_{β^1} ∼ N(0, 1)

5.  β^2 is normally distributed: β^2 ∼ N(β2, σ²_{β^2}), and Z = (β^2 − β2)/σ_{β^2} ∼ N(0, 1)

6.  (n − 2)σ^2/σ² is distributed as χ² with (n − 2) degrees of freedom

7.  β^1 and β^2 are distributed independently of σ^2. They have minimum variance in the entire class of unbiased estimators, whether linear or not; that is, they are best unbiased estimators (BUE)

8.  If ui ∼ N(0, σ²), then Yi ∼ N[E(Yi), var(Yi)] = N(β1 + β2Xi, σ²)

Basic Econometrics Chapter 5: TWO-VARIABLE REGRESSION: Interval Estimation and Hypothesis Testing


5-2. Interval estimation: Some basic ideas

•  How “close” is, say, β^2 to β2?

Pr (β^2 − δ ≤ β2 ≤ β^2 + δ) = 1 − α (5.2.1)

•  The random interval β^2 − δ ≤ β2 ≤ β^2 + δ, if it exists, is known as a confidence interval
•  β^2 − δ is the lower confidence limit
•  β^2 + δ is the upper confidence limit

•  (1 − α) is the confidence coefficient
•  α (0 < α < 1) is the level of significance

•  Equation (5.2.1) does not mean that the probability of β2 lying between the given limits is (1 − α), but that the probability of constructing an interval that contains β2 is (1 − α)

•  (β^2 − δ, β^2 + δ) is a random interval

•  In repeated sampling, the intervals will enclose the true value of the parameter in 100(1 − α) percent of the cases

•  For a specific sample, one cannot say that the probability is (1 − α) that a given fixed interval includes the true β2

•  If the sampling or probability distributions of the estimators are known, one can make confidence interval statements like (5.2.1)
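The repeated-sampling interpretation can be checked by simulation. The sketch below assumes hypothetical "true" parameters (β1 = 20, β2 = 0.5, σ = 6), ten fixed X values, and the tabulated critical value t(0.025) = 2.306 for 8 degrees of freedom; none of these come from the slides.

```python
import random

rng = random.Random(1)  # fixed seed for repeatability
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]  # fixed regressor values (hypothetical)
BETA1, BETA2, SIGMA = 20.0, 0.5, 6.0                   # assumed "true" parameters
T_CRIT = 2.306                                         # t(0.025) with n - 2 = 8 df, from a t table

n = len(X)
x_bar = sum(X) / n
sum_x2 = sum((x - x_bar) ** 2 for x in X)

def interval_for_beta2():
    """Draw one sample and return the 95% confidence interval for beta2."""
    Y = [BETA1 + BETA2 * x + rng.gauss(0.0, SIGMA) for x in X]
    y_bar = sum(Y) / n
    b2 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum_x2
    b1 = y_bar - b2 * x_bar
    rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))
    se_b2 = (rss / (n - 2) / sum_x2) ** 0.5
    return b2 - T_CRIT * se_b2, b2 + T_CRIT * se_b2

reps = 5000
hits = 0
for _ in range(reps):
    lo, hi = interval_for_beta2()
    hits += lo <= BETA2 <= hi   # does this interval enclose the true beta2?

coverage = hits / reps
print(coverage)
```

Each simulated interval either contains β2 or it does not; it is the share of intervals containing β2 across samples that should be close to 0.95.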



5-3. Confidence Intervals for Regression Coefficients

•  Z = (β^2 − β2)/se(β^2) = (β^2 − β2)√(Σxi²)/σ ∼ N(0, 1) (5.3.1)

Since σ is unknown, we use its estimator σ^ instead, so:

•  t = (β^2 − β2)/se(β^2) = (β^2 − β2)√(Σxi²)/σ^ ∼ t(n − 2) (5.3.2)

•  => Interval for β2: Pr [−tα/2 ≤ t ≤ tα/2] = 1 − α (5.3.3)

•  Equivalently, the confidence interval for β2 is

Pr [β^2 − tα/2 se(β^2) ≤ β2 ≤ β^2 + tα/2 se(β^2)] = 1 − α (5.3.5)

•  Confidence interval for β1:

Pr [β^1 − tα/2 se(β^1) ≤ β1 ≤ β^1 + tα/2 se(β^1)] = 1 − α (5.3.7)
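A worked sketch of (5.3.5) and (5.3.7) follows. The ten observations are an assumption: they are taken to be the textbook's Table 3-2 data, and they do reproduce the fitted line Y^i = 24.4545 + 0.5091Xi. The critical value 2.306 is the tabulated t(0.025) for 8 degrees of freedom, and the expression used for se(β^1) is the standard two-variable result, not something stated on the slides.

```python
# Assumed to be the Table 3-2 data (weekly income X, weekly consumption Y).
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]
T_CRIT = 2.306  # t(0.025) with n - 2 = 8 df, from a t table

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
sum_x2 = sum((x - x_bar) ** 2 for x in X)   # sum of squared deviations of X
sum_X2 = sum(x * x for x in X)              # sum of squared raw X values

b2 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum_x2
b1 = y_bar - b2 * x_bar

rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))
sigma2_hat = rss / (n - 2)

se_b2 = (sigma2_hat / sum_x2) ** 0.5
se_b1 = (sigma2_hat * sum_X2 / (n * sum_x2)) ** 0.5

ci_b2 = (b2 - T_CRIT * se_b2, b2 + T_CRIT * se_b2)
ci_b1 = (b1 - T_CRIT * se_b1, b1 + T_CRIT * se_b1)
print(b1, b2, ci_b1, ci_b2)
```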



5-4. Confidence interval for σ²

Pr [(n − 2)σ^2/χ²_{α/2} ≤ σ² ≤ (n − 2)σ^2/χ²_{1−α/2}] = 1 − α (5.4.3)

•  The interpretation of this interval is: if we establish 100(1 − α)% confidence limits on σ² and maintain a priori that these limits will include the true σ², we shall be right in the long run 100(1 − α) percent of the time
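Interval (5.4.3) is simple arithmetic once the two χ² critical values are looked up. In the sketch below, the inputs σ^2 = 42.16 and n = 10 are hypothetical, and 17.5346 and 2.1797 are the tabulated χ² values for 8 degrees of freedom with 2.5% and 97.5% of the area in the upper tail.

```python
def variance_interval(sigma2_hat, n, chi2_upper, chi2_lower):
    """Return the confidence limits for sigma^2 as in (5.4.3).

    chi2_upper is the critical value with alpha/2 in the upper tail,
    chi2_lower the critical value with 1 - alpha/2 in the upper tail.
    """
    df = n - 2
    return df * sigma2_hat / chi2_upper, df * sigma2_hat / chi2_lower

CHI2_UPPER = 17.5346  # chi-square(0.025), 8 df, from a chi-square table
CHI2_LOWER = 2.1797   # chi-square(0.975), 8 df, from a chi-square table

lower, upper = variance_interval(42.16, 10, CHI2_UPPER, CHI2_LOWER)
print(lower, upper)
```

Because the χ² distribution is skewed, the interval is not symmetric around σ^2.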



5-5. Hypothesis testing: General comments

•  The stated hypothesis is known as the null hypothesis: H0
•  H0 is tested against an alternative hypothesis: H1

5-6. Hypothesis testing: The confidence interval approach

One-sided or one-tail test: H0: β2 ≤ β* versus H1: β2 > β*

Two-sided or two-tail test: H0: β2 = β* versus H1: β2 ≠ β*

β^2 − tα/2 se(β^2) ≤ β2 ≤ β^2 + tα/2 se(β^2)

Values of β2 lying in this interval are plausible under H0 with 100(1 − α)% confidence.

•  If the hypothesized value β* lies in this interval, we do not reject H0 (the finding is statistically insignificant)

•  If β* falls outside this interval, we reject H0 (the finding is statistically significant)

5-7. Hypothesis Testing: The test of significance approach

A test of significance is a procedure by which sample results are used to verify the truth or falsity of a null hypothesis

•  Testing the significance of regression coefficient: The t-test

Pr [β^2 − tα/2 se(β^2) ≤ β2 ≤ β^2 + tα/2 se(β^2)] = 1 − α (5.7.2)



Table 5-1: Decision rule for the t-test of significance

Type of hypothesis    H0          H1          Reject H0 if
Two-tail              β2 = β2*    β2 ≠ β2*    |t| > tα/2,df
Right-tail            β2 ≤ β2*    β2 > β2*    t > tα,df
Left-tail             β2 ≥ β2*    β2 < β2*    t < −tα,df
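The decision rule in Table 5-1 can be written as a small function. This is a sketch: the function name and arguments are hypothetical, the caller must supply the right critical value (tα/2 for a two-tail test, tα for a one-tail test), and the standard error 0.0357 and hypothesized value 0.3 in the example are assumed for illustration. The test statistic is t = (β^2 − β2*)/se(β^2).

```python
def reject_h0(t_stat, t_crit, tail):
    """Apply the Table 5-1 rule; t_crit is a positive critical value."""
    if tail == "two":
        return abs(t_stat) > t_crit
    if tail == "right":
        return t_stat > t_crit
    if tail == "left":
        return t_stat < -t_crit
    raise ValueError("tail must be 'two', 'right' or 'left'")

# Illustrative numbers: estimate 0.5091, assumed se 0.0357, H0: beta2 = 0.3.
t_stat = (0.5091 - 0.3) / 0.0357

two_tail = reject_h0(t_stat, 2.306, "two")      # t(0.025), 8 df
right_tail = reject_h0(t_stat, 1.860, "right")  # t(0.05), 8 df
left_tail = reject_h0(t_stat, 1.860, "left")
print(t_stat, two_tail, right_tail, left_tail)
```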

5-8. Hypothesis testing: Some practical aspects

1) The meaning of “accepting” or “rejecting” a hypothesis
2) The null hypothesis and the rule of thumb
3) Forming the null and alternative hypotheses
4) Choosing α, the level of significance
5) The exact level of significance: the p-value [see page 132]
6) Statistical significance versus practical significance
7) The choice between confidence-interval and test-of-significance approaches to hypothesis testing

[Warning: read pages 117-134 carefully]


5-10. Application of regression analysis: The problem of prediction

•  From the data of Table 3-2, we obtained the sample regression Y^i = 24.4545 + 0.5091Xi, where Y^i is the estimator of the true E(Yi)

•  There are two kinds of prediction:

•  Mean prediction: prediction of the conditional mean value of Y corresponding to a chosen X, say X0, that is, the point on the population regression line itself (see pages 137-138 for details)

•  Individual prediction: prediction of an individual Y value corresponding to X0 (see pages 138-139 for details)
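The two kinds of prediction share the same point forecast but have different variances. The sketch below uses the standard two-variable formulas, var(Y^0) = σ²[1/n + (X0 − X̄)²/Σxi²] for the mean prediction and σ²[1 + 1/n + (X0 − X̄)²/Σxi²] for the individual prediction. The data are assumed to be the Table 3-2 observations, and X0 = 100 and the critical value 2.306 (8 degrees of freedom) are illustrative choices.

```python
# Assumed to be the Table 3-2 data; they reproduce Y^ = 24.4545 + 0.5091 X.
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]
T_CRIT = 2.306  # t(0.025) with 8 df, from a t table
X0 = 100        # chosen X value (illustrative)

n = len(X)
x_bar, y_bar = sum(X) / n, sum(Y) / n
sum_x2 = sum((x - x_bar) ** 2 for x in X)
b2 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum_x2
b1 = y_bar - b2 * x_bar
sigma2_hat = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y)) / (n - 2)

y0_hat = b1 + b2 * X0                            # same point forecast for both cases
spread = 1 / n + (X0 - x_bar) ** 2 / sum_x2      # grows as X0 moves away from the mean of X
se_mean = (sigma2_hat * spread) ** 0.5           # standard error for mean prediction
se_indiv = (sigma2_hat * (1 + spread)) ** 0.5    # standard error for individual prediction

mean_ci = (y0_hat - T_CRIT * se_mean, y0_hat + T_CRIT * se_mean)
indiv_ci = (y0_hat - T_CRIT * se_indiv, y0_hat + T_CRIT * se_indiv)
print(y0_hat, mean_ci, indiv_ci)
```

The interval for an individual Y is always wider than the interval for the mean, because it also has to allow for the disturbance of that particular observation.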



Basic Econometrics Chapter 6

EXTENSIONS OF THE TWO-VARIABLE LINEAR

REGRESSION MODEL

6-1. Regression through the origin

•  The SRF form of the regression: Yi = β^2Xi + u^i (6.1.5)

•  Comparison of two types of regression:
* Regression through the origin
* Regression with an intercept


Comparison of the two types of regression (O = through the origin, I = with intercept; uppercase X, Y are raw values, lowercase x, y are deviations from sample means):

β^2 = ΣXiYi/ΣXi² (6.1.6) O
β^2 = Σxiyi/Σxi² (3.1.6) I
var(β^2) = σ²/ΣXi² (6.1.7) O
var(β^2) = σ²/Σxi² (3.3.1) I
σ^2 = Σu^i²/(n − 1) (6.1.8) O
σ^2 = Σu^i²/(n − 2) (3.3.5) I


r² for the regression-through-the-origin model:

Raw r² = (ΣXiYi)²/(ΣXi² ΣYi²) (6.1.9)

Note: Unless there is a very strong a priori expectation, it is advisable to stick to the conventional, intercept-present model. If the intercept turns out to be statistically equal to zero, for practical purposes we have a regression through the origin. If in fact there is an intercept in the model but we insist on fitting a regression through the origin, we would be committing a specification error
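The formulas for the two types of regression can be compared side by side on a small data set. The five observations below are made up purely for illustration; the code is a sketch of (6.1.6), (6.1.8) and (6.1.9) against their with-intercept counterparts.

```python
# Hypothetical data, for illustration only.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(X)

# Regression through the origin: raw sums of squares and cross products.
sum_XY = sum(x * y for x, y in zip(X, Y))
sum_X2 = sum(x * x for x in X)
sum_Y2 = sum(y * y for y in Y)
b2_origin = sum_XY / sum_X2
rss_origin = sum((y - b2_origin * x) ** 2 for x, y in zip(X, Y))
sigma2_origin = rss_origin / (n - 1)       # n - 1 df: only one parameter estimated
raw_r2 = sum_XY ** 2 / (sum_X2 * sum_Y2)

# Regression with intercept: sums in deviations from the sample means.
x_bar, y_bar = sum(X) / n, sum(Y) / n
sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y))
sum_x2 = sum((x - x_bar) ** 2 for x in X)
b2_intercept = sum_xy / sum_x2
b1_intercept = y_bar - b2_intercept * x_bar
rss_intercept = sum((y - b1_intercept - b2_intercept * x) ** 2 for x, y in zip(X, Y))
sigma2_intercept = rss_intercept / (n - 2)  # n - 2 df: two parameters estimated

print(b2_origin, sigma2_origin, raw_r2)
print(b1_intercept, b2_intercept, sigma2_intercept)
```

The raw r² is not directly comparable with the conventional r², since it is computed from raw rather than mean-adjusted sums.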


6-3. Functional forms of regression models

•  The log-linear model
•  Semi-log models
•  Reciprocal models

6-4. How to measure elasticity: The log-linear model

Exponential regression model:

Yi = β1 Xi^(β2) exp(ui) (6.4.1)

Taking logs to the base e of both sides:

lnYi = lnβ1 + β2lnXi + ui; setting lnβ1 = α gives

lnYi = α + β2lnXi + ui (6.4.3) (log-log, double-log, or log-linear model)

This can be estimated by OLS as Y*i = α + β2X*i + ui, where Y*i = lnYi and X*i = lnXi. β2 measures the ELASTICITY of Y with respect to X, that is, the percentage change in Y for a given (small) percentage change in X.
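The estimation step can be sketched as follows: generate data from the exponential model, take logs, and run OLS on the transformed variables. The "true" values β1 = 3 and β2 = 0.8, the disturbance standard deviation 0.05, and the helper name `ols` are all assumptions made for illustration.

```python
import math
import random

rng = random.Random(2)    # fixed seed for repeatability
BETA1, BETA2 = 3.0, 0.8   # assumed "true" parameters

X = [float(x) for x in range(1, 41)]
Y = [BETA1 * x ** BETA2 * math.exp(rng.gauss(0.0, 0.05)) for x in X]

def ols(xs, ys):
    """Two-variable OLS; returns (intercept, slope)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope

# Regress ln Y on ln X: the slope estimates the elasticity beta2,
# and the intercept estimates alpha = ln(beta1).
alpha_hat, beta2_hat = ols([math.log(x) for x in X], [math.log(y) for y in Y])
print(alpha_hat, beta2_hat, math.exp(alpha_hat))
```

Note that the model is linear in α and β2 but not in β1 itself, which is recovered only indirectly as the antilog of the estimated intercept.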


The elasticity E of a variable Y with respect to a variable X is defined as:

E = (% change in Y)/(% change in X) ≈ [(ΔY/Y) × 100] / [(ΔX/X) × 100] = (ΔY/ΔX) × (X/Y) = slope × (X/Y)

6-5. Semi-log models: Log-lin and lin-log models

How to measure the growth rate: the log-lin model

Yt = Y0 (1 + r)^t (6.5.1)
lnYt = lnY0 + t ln(1 + r) (6.5.2)
lnYt = β1 + β2t, where β1 = lnY0 and β2 = ln(1 + r); this is called the constant growth model (6.5.5)
lnYt = β1 + β2t + ut (6.5.6)

This is a semi-log model, or log-lin model. The slope coefficient measures the constant proportional or relative change in Y for a given absolute change in the value of the regressor (t):

β2 = (Relative change in regressand)/(Absolute change in regressor) (6.5.7)


Instantaneous versus compound rate of growth

•  β2 is the instantaneous rate of growth
•  antilog(β2) − 1 is the compound rate of growth
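The link between the two growth rates can be sketched with a noise-free series. The starting value Y0 = 100, the compound rate r = 5% and the 11 periods are hypothetical; since β2 = ln(1 + r), taking the antilog of the estimated slope and subtracting 1 should recover r.

```python
import math

Y0, r = 100.0, 0.05    # hypothetical starting value and compound growth rate
t = list(range(11))    # periods 0, 1, ..., 10
lnY = [math.log(Y0 * (1 + r) ** k) for k in t]

# OLS slope of ln(Y) on t.
n = len(t)
t_bar, lnY_bar = sum(t) / n, sum(lnY) / n
b2 = (sum((k - t_bar) * (v - lnY_bar) for k, v in zip(t, lnY))
      / sum((k - t_bar) ** 2 for k in t))

instantaneous_rate = b2           # growth at a point in time
compound_rate = math.exp(b2) - 1  # growth over a whole period
print(instantaneous_rate, compound_rate)
```

With real data the two rates are estimates, but the same conversion applies.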

The linear trend model

Yt = β1 + β2t + ut (6.5.9)

•  If β2 > 0, there is an upward trend in Y
•  If β2 < 0, there is a downward trend in Y

Note: (i) The r² values of models (6.5.5) and (6.5.9) cannot be compared because the regressands in the two models are different. (ii) Such models may be appropriate only if the time series is stationary.


The lin-log model:

Yi = β1 + β2lnXi + ui (6.5.11)

β2 = (Change in Y)/(Change in lnX) = (Change in Y)/(Relative change in X) ≈ ΔY/(ΔX/X) (6.5.12)

or ΔY = β2 (ΔX/X) (6.5.13)

That is, the absolute change in Y is equal to β2 times the relative change in X.

6-6. Reciprocal models

The reciprocal model: Yi = β1 + β2(1/Xi) + ui (6.5.14)

As X increases indefinitely, the term β2(1/Xi) approaches zero and Yi approaches the limiting or asymptotic value β1 (see Figure 6.5 on page 174)

An illustrative example: the Phillips curve for the United Kingdom, 1950-1966

6-7. Summary of functional forms

Table 6.5 (page 178)

Model                  Equation              Slope (= dY/dX)    Elasticity (= (dY/dX)(X/Y))
Linear                 Y = β1 + β2X          β2                 β2(X/Y) *
Log-linear (log-log)   lnY = β1 + β2lnX      β2(Y/X)            β2
Log-lin                lnY = β1 + β2X        β2(Y)              β2(X) *
Lin-log                Y = β1 + β2lnX        β2(1/X)            β2(1/Y) *
Reciprocal             Y = β1 + β2(1/X)      −β2(1/X²)          −β2(1/XY) *

* The elasticity is variable: it depends on the value taken by X or Y or both.
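The slope and elasticity formulas in the table can be checked numerically. The sketch below compares each formula with a finite-difference derivative at one point; the parameter values β1 = 2, β2 = 0.5 and the evaluation point X = 4 are arbitrary assumptions, and the disturbance term is left out.

```python
import math

B1, B2 = 2.0, 0.5   # arbitrary parameter values for the check
X0 = 4.0            # point at which slope and elasticity are evaluated

# For each model: (Y as a function of X, slope formula, elasticity formula).
models = {
    "linear":     (lambda x: B1 + B2 * x,
                   lambda x, y: B2,            lambda x, y: B2 * x / y),
    "log-log":    (lambda x: math.exp(B1 + B2 * math.log(x)),
                   lambda x, y: B2 * y / x,    lambda x, y: B2),
    "log-lin":    (lambda x: math.exp(B1 + B2 * x),
                   lambda x, y: B2 * y,        lambda x, y: B2 * x),
    "lin-log":    (lambda x: B1 + B2 * math.log(x),
                   lambda x, y: B2 / x,        lambda x, y: B2 / y),
    "reciprocal": (lambda x: B1 + B2 / x,
                   lambda x, y: -B2 / x ** 2,  lambda x, y: -B2 / (x * y)),
}

def numeric_slope(f, x, h=1e-6):
    """Central-difference approximation to dY/dX."""
    return (f(x + h) - f(x - h)) / (2 * h)

checks = {}
for name, (f, slope, elasticity) in models.items():
    y = f(X0)
    dydx = numeric_slope(f, X0)
    checks[name] = (abs(dydx - slope(X0, y)) < 1e-5,
                    abs(dydx * X0 / y - elasticity(X0, y)) < 1e-5)
print(checks)
```

Only the log-log model has an elasticity that does not depend on the point of evaluation.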

Basic Econometrics Chapter 7

MULTIPLE REGRESSION ANALYSIS:

The Problem of Estimation

7-1. The three-variable model: Notation and assumptions

•  Yi = β1 + β2X2i + β3X3i + ui (7.1.1)
•  β2 and β3 are partial regression coefficients
•  The model rests on the following assumptions:
+ Zero mean value of ui: E(ui | X2i, X3i) = 0 for all i (7.1.2)
+ No serial correlation: cov(ui, uj) = 0 for all i ≠ j (7.1.3)
+ Homoscedasticity: var(ui) = σ² (7.1.4)
+ Zero covariance between ui and each X variable: cov(ui, X2i) = cov(ui, X3i) = 0 (7.1.5)
+ No specification bias, that is, the model is correctly specified (7.1.6)
+ No exact collinearity between the X variables (7.1.7) (no multicollinearity in the case of more explanatory variables; if an exact linear relationship exists, the X variables are said to be linearly dependent)
+ The model is linear in the parameters

7-2. Interpretation of multiple regression

•  E(Yi | X2i, X3i) = β1 + β2X2i + β3X3i (7.2.1)
•  (7.2.1) gives the conditional mean or expected value of Y, conditional upon the given or fixed values of X2 and X3

7-3. The meaning of partial regression coefficients

•  Yi = β1 + β2X2i + β3X3i + … + βsXsi + ui
•  βk measures the change in the mean value of Y per unit change in Xk, holding the remaining explanatory variables constant. It gives the “direct” effect of a unit change in Xk on E(Yi), net of Xj (j ≠ k)
•  How to control for the “true” effect of a unit change in Xk on Y? (read pages 195-197)

7-8. R² and the adjusted R²

•  R² is a non-decreasing function of the number of explanatory variables: an additional X variable will not decrease R²

R² = ESS/TSS = 1 − RSS/TSS = 1 − Σu^i²/Σyi² (7.8.1)

•  Adding more irrelevant variables to the regression can therefore inflate R² misleadingly, which motivates an adjusted R² (R²bar) that takes the degrees of freedom into account:

R²bar = 1 − [Σu^i²/(n − k)] / [Σyi²/(n − 1)] (7.8.2)

or R²bar = 1 − σ^2/S²Y, where S²Y is the sample variance of Y and k is the number of parameters including the intercept term

–  Substituting (7.8.1) into (7.8.2) gives R²bar = 1 − (1 − R²)(n − 1)/(n − k) (7.8.4)
–  For k > 1, R²bar < R²; thus, as the number of X variables increases, R²bar increases less than R², and R²bar can be negative
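Equation (7.8.4) is easy to evaluate directly. The function name and the example inputs below are hypothetical; the second call shows a case where the adjusted R² turns negative.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-squared from (7.8.4); k counts all parameters incl. the intercept."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k)

# Moderate fit, 21 observations, 3 parameters.
moderate = adjusted_r2(0.5, 21, 3)

# Weak fit with few observations and many parameters: the adjustment
# outweighs the fit and the adjusted value falls below zero.
weak = adjusted_r2(0.05, 10, 4)

print(moderate, weak)
```

With k = 1 (intercept only) the adjustment factor (n − 1)/(n − k) equals 1, so the two measures coincide.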


•  Comparing two R² values: to compare them, the sample size n and the dependent variable must be the same
•  Example 7-2: Coffee demand function revisited (page 210)
•  The “game” of maximizing adjusted R²: choosing the model that gives the highest R²bar may be dangerous, for in regression analysis our objective is not to obtain a high R²bar per se but to obtain dependable estimates of the true population regression coefficients and draw statistical inferences about them
•  One should be more concerned about the logical or theoretical relevance of the explanatory variables to the dependent variable and their statistical significance
