Regression Analysis
Introduction
• Derive the α and β
• Assess the use of the T-statistic
• Discuss the importance of the Gauss-Markov assumptions
• Describe the problems associated with autocorrelation, how to measure it and possible remedies
• Introduce the problem of heteroskedasticity
Values and Fitted Values
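$$\hat{y}_i = \hat{\alpha} + \hat{\beta} x_i$$
$$e_i = y_i - \hat{y}_i$$
$$e_i = y_i - \hat{\alpha} - \hat{\beta} x_i$$
where $\hat{y}_i$ is a fitted value and $e_i$ is the error term.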
Deriving the α and β
• The aim of a least squares regression is to minimize the sum of the squared error terms (e), i.e. the squared vertical distances between the observations and the regression line.
• $\min S = \sum_{t=1}^{n} e_t^2 = \sum_{t=1}^{n} (y_t - \hat{\alpha} - \hat{\beta} x_t)^2$ with respect to $\hat{\alpha}$ and $\hat{\beta}$
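The minimization can be written out as two first-order conditions. The following is a sketch of the standard derivation, using the same symbols as the objective above; it is not taken from the slides themselves.

```latex
\begin{align*}
\frac{\partial S}{\partial \hat{\alpha}} &= -2 \sum_{t=1}^{n} (y_t - \hat{\alpha} - \hat{\beta} x_t) = 0 \\
\frac{\partial S}{\partial \hat{\beta}}  &= -2 \sum_{t=1}^{n} x_t (y_t - \hat{\alpha} - \hat{\beta} x_t) = 0
\end{align*}
% Dividing the first condition by n gives \bar{y} = \hat{\alpha} + \hat{\beta}\bar{x}.
% Substituting \hat{\alpha} = \bar{y} - \hat{\beta}\bar{x} into the second condition
% and rearranging gives
% \hat{\beta} = \sum (x_t - \bar{x})(y_t - \bar{y}) / \sum (x_t - \bar{x})^2.
```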
The Constant
$$\bar{y} = \hat{\alpha} + \hat{\beta} \bar{x}$$
$$\hat{\alpha} = \bar{y} - \hat{\beta} \bar{x}$$
The Slope Coefficient (β)
$$\hat{\beta} = \frac{\sum_{t=1}^{n} (x_t - \bar{x})(y_t - \bar{y})}{\sum_{t=1}^{n} (x_t - \bar{x})^2}$$
$$\hat{\beta} = \frac{Cov(x, y)}{Var(x)}$$
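As a minimal sketch, the constant and slope formulas can be applied directly in plain Python. The function name `ols_fit` and the data are made up for illustration only.

```python
def ols_fit(x, y):
    """Return (alpha_hat, beta_hat) for the model y = alpha + beta * x + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Slope: sum of cross-deviations divided by sum of squared x-deviations.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = s_xy / s_xx
    # Constant: the fitted line passes through the point of means.
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


# Hypothetical data, not from the slides.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

alpha_hat, beta_hat = ols_fit(x, y)
fitted = [alpha_hat + beta_hat * xi for xi in x]
residuals = [yi - fi for yi, fi in zip(y, fitted)]
print("alpha_hat =", alpha_hat, "beta_hat =", beta_hat)
```

Because the regression includes a constant, the residuals sum to zero (up to rounding), which is a quick check on any implementation.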
T-test
• When conducting a t-test, we can use either a 1 or 2 tailed test, depending on the hypothesis.
• We usually use a 2 tailed test. In this case our alternative hypothesis is that the coefficient does not equal 0. In a 1 tailed test we would stipulate whether it was greater than or less than 0.
• The critical value for a 2 tailed test at the 5% level of significance is therefore the same as the critical value for a 1 tailed test at the 2.5% level of significance.
T-test
• We can also test whether our coefficient equals 1.
$$t = \frac{\hat{\beta} - 1}{SE(\hat{\beta})}$$
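The sketch below computes the t-statistic for the slope under a chosen null value (0 or 1). It assumes the usual standard error for a two-variable regression, the square root of s² / Σ(x − x̄)² with s² = Σe² / (n − 2), which the slides do not spell out. The function name `slope_t_stat` and the data are hypothetical.

```python
import math


def slope_t_stat(x, y, beta_null=0.0):
    """Return (t, se) for H0: beta = beta_null in y = alpha + beta * x + e."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    beta_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / s_xx
    alpha_hat = y_bar - beta_hat * x_bar
    # Residual sum of squares and the estimated error variance (n - 2 d.o.f.).
    rss = sum((yi - alpha_hat - beta_hat * xi) ** 2 for xi, yi in zip(x, y))
    s2 = rss / (n - 2)
    se_beta = math.sqrt(s2 / s_xx)
    return (beta_hat - beta_null) / se_beta, se_beta


# Hypothetical data, not from the slides.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

t_zero, se = slope_t_stat(x, y, beta_null=0.0)  # H0: beta = 0
t_one, _ = slope_t_stat(x, y, beta_null=1.0)    # H0: beta = 1
print("t (beta = 0):", t_zero, " t (beta = 1):", t_one)
```

Each statistic is then compared with the critical value from the t tables with n − 2 degrees of freedom.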
Gauss-Markov Assumptions
• There are 4 assumptions relating to the error term.
• The first is that the expected value of the error term is zero
• The second is that the error terms are not correlated
• The third is that the error term has a constant variance
• The fourth is that the error term and explanatory variable are not correlated.
Gauss-Markov assumptions
• More formally we can write them as:
$$E(u_t) = 0$$
$$Cov(u_t, u_{t-1}) = 0$$
$$Var(u_t) = E(u_t^2) = \sigma^2$$
$$Cov(u_t, x_t) = 0$$
Additional Assumptions
• There are a number of additional assumptions, such as normality of the error term and n (the number of observations) exceeding k (the number of parameters).
• If the Gauss-Markov assumptions hold, we say the estimator is BLUE.
BLUE
• Best: minimum variance among linear unbiased estimators
• Linear: the estimator is a linear function of the observations on the dependent variable
• Unbiased: the estimator is accurate on average over a large number of samples
• Estimator
Consequences of BLUE
• If the estimator is not BLUE, there are serious implications for the regression; in particular, we cannot rely on the t-tests.
• In this case we need to find a remedy for the problem.
Autocorrelation
• Autocorrelation occurs when the second Gauss-Markov assumption fails.
• It is often caused by an omitted variable
• In the presence of autocorrelation the estimator is no longer Best, although it is still unbiased. Therefore the estimator is not BLUE.
Durbin-Watson Test
• This tests for 1st order autocorrelation only
• In this case the autocorrelation follows the first-order autoregressive process
$$u_t = \rho u_{t-1} + \varepsilon_t$$
Durbin-Watson Test- decision framework
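[Figure: DW decision scale running from 0 to 4, marked at dl, du, 2, 4-du and 4-dl, with a zone of indecision between dl and du and another between 4-du and 4-dl]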
DW Statistic
• The DW test statistic lies between 0 and 4. If it lies below dl, we have positive autocorrelation. If it lies between du and 4-du, we have no autocorrelation. If it lies above 4-dl, we have negative autocorrelation. Between dl and du, or between 4-du and 4-dl, the test is inconclusive.
• The dl and du values can be found in the DW d-statistic tables (at the back of most textbooks).
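The slides do not state the formula for the statistic. The sketch below assumes the usual definition, d = Σ(e_t − e_{t−1})² / Σe_t², and then applies the decision zones described above. The function names are made up, and the dl and du values are placeholders rather than figures taken from the tables.

```python
def durbin_watson(residuals):
    """DW d-statistic: squared successive differences over squared residuals."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den


def dw_decision(d, dl, du):
    """Map a DW statistic to a zone, given table values dl and du."""
    if d < dl:
        return "positive autocorrelation"
    if d > 4 - dl:
        return "negative autocorrelation"
    if du < d < 4 - du:
        return "no autocorrelation"
    return "inconclusive"  # one of the two zones of indecision


# Placeholder critical values; look up the real ones in the DW tables.
DL, DU = 1.2, 1.4

alternating = [1.0, -1.0, 1.0, -1.0]  # residuals that flip sign every period
d = durbin_watson(alternating)
print(d, dw_decision(d, DL, DU))
```

Residuals that alternate in sign push d towards 4, while residuals that move together push it towards 0.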
Lagrange Multiplier (LM) Statistic
• Tests for higher order autocorrelation
• The test involves estimating the model and obtaining the error term.
• Then run a second regression of the error term on lags of itself and the explanatory variable (the number of lags depends on the order of the autocorrelation; here it is second order):
$$u_t = \gamma_0 + \gamma_1 u_{t-1} + \gamma_2 u_{t-2} + \gamma_3 x_t + v_t$$
LM Test
• The test statistic is the number of observations multiplied by the R-squared statistic from the second regression.
• It follows a chi-squared distribution. The degrees of freedom are equal to the order of autocorrelation tested for (2 in this case).
• The null hypothesis is no autocorrelation. If the test statistic exceeds the critical value, we reject the null and conclude that there is autocorrelation.
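The steps of the LM test can be sketched in plain Python as follows. This is an illustration under stated assumptions: the data are simulated with first-order autocorrelated errors, the first `order` observations are dropped when forming the lags (some software pads with zeros instead), and the helper names `ols`, `residuals` and `lm_statistic` are made up for this example.

```python
import random


def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def residuals(X, y, b):
    """Actual minus fitted values for coefficients b."""
    return [yi - sum(bj * xj for bj, xj in zip(b, row)) for row, yi in zip(X, y)]


def lm_statistic(x, y, order=2):
    """n * R-squared from regressing residuals on their lags and x."""
    X = [[1.0, xi] for xi in x]
    u = residuals(X, y, ols(X, y))  # step 1: residuals of the original model
    rows = range(order, len(u))
    # Step 2: auxiliary regression on a constant, lagged residuals and x_t.
    Z = [[1.0] + [u[t - j] for j in range(1, order + 1)] + [x[t]] for t in rows]
    u_dep = [u[t] for t in rows]
    v = residuals(Z, u_dep, ols(Z, u_dep))
    u_bar = sum(u_dep) / len(u_dep)
    r2 = 1 - sum(e ** 2 for e in v) / sum((e - u_bar) ** 2 for e in u_dep)
    return len(u_dep) * r2


# Simulated data with autocorrelated errors (illustration only).
random.seed(0)
n = 200
x = [random.gauss(0, 1) for _ in range(n)]
errors, prev = [], 0.0
for _ in range(n):
    prev = 0.8 * prev + random.gauss(0, 1)
    errors.append(prev)
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, errors)]

lm = lm_statistic(x, y, order=2)
CHI2_5PCT_2DF = 5.991  # 5% critical value of chi-squared with 2 d.o.f.
print("LM =", round(lm, 2),
      "-> reject null" if lm > CHI2_5PCT_2DF else "-> do not reject null")
```

In practice a statistics package would normally be used for this; the point here is only to show that the statistic is n times the R-squared of the auxiliary regression.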
Remedies for Autocorrelation
• There are 2 main remedies:
• The Cochrane-Orcutt iterative process
• An unrestricted version of the above process
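The slides name the Cochrane-Orcutt process without describing it. The sketch below follows the usual textbook outline for a first-order process: estimate the model, estimate ρ from the residuals, re-estimate the model on quasi-differenced data (y_t − ρy_{t−1} on x_t − ρx_{t−1}), and repeat until ρ settles. The function names, the tolerance and the simulated data are all assumptions made for this illustration.

```python
import random


def simple_ols(x, y):
    """Return (constant, slope) for a two-variable regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    beta = (sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
            / sum((a - x_bar) ** 2 for a in x))
    return y_bar - beta * x_bar, beta


def cochrane_orcutt(x, y, tol=1e-6, max_iter=100):
    """Iterate between estimating rho and re-estimating alpha and beta."""
    alpha, beta = simple_ols(x, y)
    rho = 0.0
    for _ in range(max_iter):
        # Residuals from the original (untransformed) model.
        u = [yi - alpha - beta * xi for xi, yi in zip(x, y)]
        # Regress u_t on u_{t-1} (no constant) to estimate rho.
        new_rho = (sum(u[t] * u[t - 1] for t in range(1, len(u)))
                   / sum(u[t - 1] ** 2 for t in range(1, len(u))))
        # Quasi-difference the data and re-estimate.
        x_star = [x[t] - new_rho * x[t - 1] for t in range(1, len(x))]
        y_star = [y[t] - new_rho * y[t - 1] for t in range(1, len(y))]
        a_star, beta = simple_ols(x_star, y_star)
        alpha = a_star / (1 - new_rho)  # recover the original constant
        converged = abs(new_rho - rho) < tol
        rho = new_rho
        if converged:
            break
    return alpha, beta, rho


# Simulated data: true beta = 2, errors follow u_t = 0.7 u_{t-1} + noise.
random.seed(1)
n = 200
x = [random.gauss(0, 1) for _ in range(n)]
errors, prev = [], 0.0
for _ in range(n):
    prev = 0.7 * prev + random.gauss(0, 1)
    errors.append(prev)
y = [1.0 + 2.0 * xi + ei for xi, ei in zip(x, errors)]

alpha_hat, beta_hat, rho_hat = cochrane_orcutt(x, y)
print("alpha:", alpha_hat, "beta:", beta_hat, "rho:", rho_hat)
```

Note that one observation is lost in the quasi-differencing step, and the constant of the transformed regression has to be divided by 1 − ρ to recover the original constant.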
Heteroskedasticity
• This occurs when the variance of the error term is not constant
• Again the estimator is not BLUE: although it is still unbiased, it is no longer Best.
• It often occurs when the values of the variables vary substantially across observations, e.g. GDP in Cuba and the USA.
Conclusion
• The residual or error term is the difference between the actual value and the fitted value of the dependent variable.
• There are 4 Gauss-Markov assumptions, which must be satisfied if the estimator is to be BLUE
• Autocorrelation is a serious problem and needs to be remedied
• The DW statistic can be used to test for the presence of 1st order autocorrelation, and the LM statistic for higher order autocorrelation.