regression assumption by ammara aftab
TRANSCRIPT
THE SHINNIG STAR FOR STATISTIC…MY IDEALS…
REGRESSION seems like MSC(FINAL) OF UOK DEPENDS ON QUALIFIED ECONOMIST MR.ZOHAIB AZIZ
SIR ZOHAIB
AZIZ
By Ammara Aftab
INTRODUCTION
By Ammara Aftab
HISTORICAL ORIGIN OF THE TERM REGRESSION
By Ammara Aftab
Historical ORIGIN BY FRANCIS GALTON
GALTON’S Law of universal regression was confirmed by his friend CARL F .GAUSS
KARL PEARSONBy Ammara Aftab
Galton's universal regression law
Galton found that, although there was a tendency for tall parents to have tall children and for short parents to have short children.
KARL PEARSON: He is talking in average sense that average height (not single
height of children that may be high or low from tall fathers) of sons is less than the fathers height means tendency to mid....
similarly, average height of sons (not single height of children that may be high or low from short fathers) of short parents greater than from them means tendency to mid.
Conclusion:Tall parents have tall children but average height of their children
will be less from them similarly for short parents. That is why Karl Pearson's endorses the Galton's theory.
By Ammara Aftab
MODERN INTERPRETATION OF REGRESSION
By Ammara Aftab
By Ammara Aftab
HISTORICAL
VS
MODERN
By Ammara Aftab
Reconsider Galton’s law of universal regression. Galton was interested in finding out why there was a stability in the distribution of heightsin a population.
But in the modern view our concern is not with this explanation but rather with finding out how the average height of sons changes, given the fathers’ height. In other words, our concern is with predicting the average height of sons knowing the height of their fathers
By Ammara Aftab
EXAMPLE
By Ammara Aftab
How to use the
regression data
By Ammara Aftab
PUT THE REGRESSION DATA ON EXCEL SHEET
By Ammara Aftab
The slope indicates the steepness of a line and the intercept indicates the location where it intersects an axis.
By Ammara Aftab
GRAPHICAL REPRESENTATION
By Ammara Aftab
By Ammara Aftab
By Ammara Aftab
By Ammara Aftab
REGRESSION ASSUMPTIONS
By Ammara Aftab
WHY THE ASSUMPTION requirement is needed??
…………?????By Ammara Aftab
LOOK AT PRF: Yi=β1+(β2)Xi+µi
It shows that Yi depends on both Xi and ui . Therefore, unless we are specific about how Xi and ui are created or generated, there is no way we can make any statistical inference about the Yi and also, as we shall see, about β1 and β2. Thus, the assumptions made about the Xi variable(s) and the error term are extremely critical to the valid interpretation of the regression estimates.
By Ammara Aftab
the Gaussian or classical linear regression
ASSUMPTIONS(clrm)
By Ammara Aftab
ASSUMPTION # 01
By Ammara Aftab
LINEAR REGRESSION MODEL:The regression model is linear with respect to parameter.
1) Yi = β1 + β2X1 +µiIn this above equation the model is linear with respect to parameter.
2)Yi=B1+(B2^2)X1+µiIn this above equation the model is non linear with respect to parameter.
1
By Ammara Aftab
Linear with respect to (Parameter)
• Yi=β1+(β2)Xi+µi
• Yi=β1+(β2)Xi^2+µi
Non-linear with respect to
(PARAMETER)
Yi=β1+(β2^2)Xi+µi
ß is the parameter,If ß have power then it will be non linear with respect to parameter, if it does not have then it is in Linear form.By Ammara Aftab
Simple linear regression describes the linear relationship between a predictor variable, plotted on the x-axis, and a response variable, plotted on the y-axis
Independent Variable (X)
depe
nden
t Var
iabl
e (Y
)
1
By Ammara Aftab
ASSUMPTION # 02
By Ammara Aftab
X values are fixed in repeated sampling:
Values taken by the regressor X are considered fixed in repeated samples. More technically, X is assumed to be no stochastic .
2
By Ammara Aftab
2
By Ammara Aftab
X
Y
20$ 40$ 60$ 80$ 100$
JAPAN 18 23 47 78 89
CHINA 09 28 37 34 34
USA 16 35 48 45 67RUSSIA 17 38 50 23 69
INCOME & CONSUMPTION
Here X (INCOME)is fixed variable AN y (CONSUMPTION ) is depending on X
2
By Ammara Aftab
The X variable is measured without error
X is fixed and Y is dependent
X
Y
2
By Ammara Aftab
ASSUMPTION # 03
By Ammara Aftab
Zero mean value of disturbance ui. Given the value of X, the mean, or expected, value of the random disturbance term ui is zero. Technically, the conditional mean value of ui is zero. Symbolically, we have E(ui |Xi) = 0EXAMPLE:If X= 3,6,9Then the mean =6,after applying this formula (mean-X) = (6-3)+(6-6)+(6-9) = 0 so 1st moment is always zero.
3
By Ammara Aftab
3Mean =ovariance= constant
By Ammara Aftab
ASSUMPTION # 04
By Ammara Aftab
4Homoscedasticity or equal variance of ui. Given the value of X, the varianceof ui is the same for all observations. That is, the conditional variances of ui are identical.Symbolically, we have
var (ui |Xi) = E[ui − E(ui |Xi)]2 = E(ui2 | Xi ) because of Asp3 = σ2By Ammara Aftab
4Homo= samescedasticity=spreadness
By Ammara Aftab
ASSUMPTION # 05
By Ammara Aftab
5No autocorrelation between the disturbances. Given any two X values,
Xi and Xj (i = j), the correlation between any two ui and uj (i = j) is zero.
Symbolically,cov (ui, uj |Xi, Xj) = E{[ui − E(ui)] | Xi }{[uj − E(uj)] | Xj } = E(ui |Xi)(uj | Xj) (why?) = 0
By Ammara Aftab
No Auto-Correlation
Positive correlation
Negative correlation
No correlation
5
By Ammara Aftab
ASSUMPTION # 06
By Ammara Aftab
6Zero covariance between ui and Xi,or E(uiXi) = 0.
We assumed that X and u have separate effect on Y,But if X and u are corelated,it is not possible to assess their individual effect on Y as well as they will be directly proportional. X u Formally, cov (ui, Xi) = E[ui − E(ui)][Xi − E(Xi)] = E[ui (Xi − E(Xi))] since E(ui) = 0 = E(uiXi) − E(Xi)E(ui) since E(Xi) is nonstochastic ( NON RANDOM) = E(uiXi) since E(ui) = 0 = 0By Ammara Aftab
ASSUMPTION # 07
By Ammara Aftab
7The number of observations n must be greater than the number of parameters to be estimated.
Example:Yi=β1+(β2)X1+(β3)X2+µi As u can see that we have 3 parameters here so the number of observation will be grater then 3.Alternatively,The number of observations n must be greater than the number of regressors. From this single observation there is no way to estimate the two unknowns, β1 and β2. We need at least two pairs of observations to estimate the two unknowns.By Ammara Aftab
ASSUMPTION # 08
By Ammara Aftab
8Variability in X values.
The X values in a given sample must not all be the same.
If the X values are same then the Variance is equal to zero and
regression can’t be run
Technically, var (X) must be a finite positive
number.By Ammara Aftab
8If we have same values of X:
X = 2,2,2
Then, according to the variance formula
σ = [(x-µ)^2]/n σ = [(2-2)^2]/3+[(2-2)^2]/3+[(2-2)^2]/3 σ = 0As σ= o then regression could not be run Variance = 0 ,regression can not run..
Positive Finite means:
variance could not be(-ve),because it has squaring σ^2 variance could not be 0 , regression could not run. variance should be (+ve) , positive finite.
By Ammara Aftab
ASSUMPTION # 09
By Ammara Aftab
9The regression model is correctly specified.
Alternatively, There is no specification bias or error in the model used in empirical analysis.
Yi = α1 + α2Xi + ui Yi = β1 + β2(1/Xi)+ ui
By Ammara Aftab
9Yi = β1 + β2(1/Xi)+ ui
By Ammara Aftab
ASSUMPTION # 10
By Ammara Aftab
10
There is no perfect multicollinearity.Perfect multicollinearity=Strong Correlation b/w regressors.
That is, there are no perfect linearRelation b/w regressors.
Yi = β1 + β2X2i + β3X3i + ui
where Y is the dependent variable, X2 and X3 the explanatory variables (orregressors) or non random variables.
By Ammara Aftab