Source: rua.ua.es/dspace/bitstream/10045/41902/1/unit6.pdf


Chapter 6: The Simple Regression Model

Statistics and Introduction to Econometrics

M. Angeles Carnero

Departamento de Fundamentos del Análisis Económico

Year 2014-15

M. Angeles Carnero (UA) Chapter 6: SRM Year 2014-15 1 / 81


Introduction

Econometrics is a branch of Economics that uses and develops statistical methods to estimate relationships between economic variables, to test economic theories, and to evaluate the policies of governments and firms.

Examples of econometric applications:

- Effects on employment of a training programme for unemployed people.
- Advice on different investment strategies.
- Effects on sales of an advertising campaign.

Econometrics is applied in many economic disciplines:

- Macroeconomics ⇒ Prediction of variables such as GNP and inflation, or quantifying the relationship between the interest rate and inflation.
- Microeconomics ⇒ Quantifying the relationship between education and wages, production and inputs, R&D investment and firms' profits.
- Finance ⇒ Volatility analysis of assets, asset pricing models.


Stages of the empirical economic analysis

The first stage of an econometric analysis is to formulate clearly and precisely the question to be studied (a test of an economic theory, the analysis of the effect of a public policy, etc.). In many cases a formal economic model is built.

Example

- To describe the consumption decisions of individuals subject to budget constraints, we assume that individuals make their choices in order to maximise their utility.
- This model implies a set of demand equations in which the quantity demanded of each good depends on its own price, the prices of substitute and complementary goods, consumer income, and the individual characteristics that affect preferences.
- These equations model individual consumption decisions and are the basis for the econometric analysis of consumer demand.


Crime Economic Model (Gary Becker (1968))

- This model describes individual participation in crime and is based on utility maximisation.
- Crimes imply economic rewards and costs.
- The decision to participate in criminal activities is a problem of allocating resources in order to maximise utility, where the costs and benefits of the alternative decisions must be taken into account.
- Costs:
  - Costs linked to the possibility of being arrested and convicted.
  - Opportunity cost of not participating in other activities such as legal jobs.


Crime Economic Model (cont.)

Equation describing the time invested in crime activities

\[ y = f(x_1, x_2, x_3, x_4, x_5, x_6, x_7) \]

- $y$ → Hours devoted to criminal activities.
- $x_1$ → Hourly "wage" from criminal activities.
- $x_2$ → Hourly wage from legal work.
- $x_3$ → Other income that arises neither from criminal activities nor from paid work.
- $x_4$ → Probability of being arrested.
- $x_5$ → Probability of being convicted in case of being arrested.
- $x_6$ → Expected sentence in case of being arrested.
- $x_7$ → Age.

The function $f$ depends on the underlying utility function, which is rarely known. However, we can use economic theory, and sometimes common sense, to predict the effect of each variable on criminal activity.


Crime Economic Model (cont.)

- Once the economic model has been established, we must transform it into an econometric model.
- Following the previous example, in order to construct the econometric model we should:
  - Specify the functional form of the function $f$.
  - Analyse which variables can be observed, which can be approximated, which are not observed, and how to take into account the many other factors affecting criminal behaviour.


Crime Economic Model (cont.)

Consider the following particular econometric model for the economic model of criminal behaviour:

\[ \mathit{crime} = \beta_0 + \beta_1 w + \beta_2\,\mathit{othinc} + \beta_3\,\mathit{farr} + \beta_4\,\mathit{fconv} + \beta_5\,\mathit{avgsen} + \beta_6\,\mathit{age} + u \]

- crime → Frequency of criminal activity.
- w → Wage that could be obtained in a legal job.
- othinc → Other income.
- farr → Frequency of arrests for previous infractions.
- fconv → Frequency of convictions.
- avgsen → Average duration of sentences.
- age → Age.
- $u$ → Error term reflecting all the unobserved factors affecting criminal activity, such as the wage from criminal activities, the family environment of the individual, etc. It also captures measurement errors in the variables included in the model.
- $\beta_0, \beta_1, \dots, \beta_6$ → Parameters of the econometric model describing the relationship between crime and the factors used to determine crime in the model.


- Once the econometric model is specified, hypotheses of interest can be formulated in terms of the unknown parameters of the model.
  - For example, we can ask whether the wage obtained in a legal job (w) has no effect on criminal activity. This hypothesis is equivalent to $\beta_1 = 0$.
- Once the econometric model has been established, we have to collect data on the variables appearing in it.
- Finally, we use appropriate statistical techniques to estimate the unknown parameters and to test the hypotheses of interest about these parameters.


The structure of the economic data

Cross-Section Data

- They arise from surveys of families, individuals or firms at a given point in time.
- In many cases we can assume that this is a random sample, that is, that the observations are independent and identically distributed (iid).
- Examples: Encuesta de Presupuestos Familiares (EPF), Encuesta de Población Activa (EPA).

Time Series

- We observe one or more variables over time.
- The observations are usually dependent over time, so they cannot be treated as an iid sample.
- Annual, quarterly, monthly or daily frequency, etc.
- Examples: monthly series of price indices, annual GNP series, daily IBEX-35 series.


Panel Data

- This is a time series for each member of a cross-section.
- Panel data ≠ repeated cross-sections.
- Examples: Encuesta Continua de Presupuestos Familiares, Survey of Income and Living Conditions (EU-SILC).


Causality and the concept of ceteris paribus in econometric analysis

In most applications, we are interested in analysing whether one variable has a causal effect on another variable.

Examples:

- Would an increase in the price of a good cause a decrease in its demand?
- If sentences become harsher, would this have a causal effect on crime?
- Does education have a causal effect on the productivity of workers?
- Does participation in a certain training programme cause an increase in the wages of the workers who attend?


- The fact that two variables are correlated does not imply that a causal relationship can be inferred.
  - For example, observing that workers who participated in a certain training programme have higher wages than those who did not participate is not enough to establish a causal relationship.
- Inferring causality is difficult because in Economics we usually do not have experimental data.
- For causality, the concept of "ceteris paribus" (the rest of the relevant factors are held fixed) is very important.
  - For example, in order to analyse consumer demand, we are interested in quantifying the effect that a change in the price of the good has on the quantity demanded, holding fixed the rest of the factors such as income, the prices of other goods, the preferences of consumers, etc.
- Econometric methods are used to estimate ceteris paribus effects and therefore to infer causality between variables.


Definition of the simple regression model

- The simple regression model is used to analyse the relationship between two variables.
- Although the simple regression model has many limitations, it is useful to learn to estimate and interpret this model before moving on to the multiple regression model.
- In the simple regression model, we consider two random variables $y$ and $x$ that represent a population, and we are interested in explaining $y$ in terms of $x$. For example, $y$ can be the hourly wage and $x$ the years of education.


We need to establish an equation relating $y$ and $x$, and the simplest choice is to assume a linear relationship:

\[ y = \beta_0 + \beta_1 x + u \tag{1} \]

This equation defines the simple regression model, and it is assumed to hold in the population of interest.

- $y$ → Dependent variable, explained variable or response variable.
- $x$ → Independent variable, explanatory variable, control variable or regressor.
- $u$ → Error term or random shock that captures the effect of other factors affecting $y$. In simple regression analysis, all these other factors affecting $y$ are treated as unobserved.
- $\beta_1$ is the slope parameter and $\beta_0$ is the intercept. $\beta_1$ and $\beta_0$ are unknown parameters that we want to estimate using a random sample of $(x, y)$.


- $\beta_1$ reflects the change in $y$ given a one-unit increase in $x$, holding fixed the rest of the factors affecting $y$, which are included in $u$.
- Note that the linearity assumption implies that a one-unit increase in $x$ has the same effect on $y$ regardless of the initial value of $x$. This assumption is not very realistic in some cases, and we will relax it later.

Example 1

Let's consider a simple regression model relating the wage of an individual to his level of education:

\[ \mathit{wage} = \beta_0 + \beta_1\,\mathit{educ} + u \]

- If wage is measured in dollars per hour and educ in years of education, $\beta_1$ reflects the change in the hourly wage given an increase of one year of education, holding the rest of the factors fixed.
- The error term $u$ contains all the other factors affecting wage, such as work experience, innate ability, tenure in the current job, etc.


Example 2

Assume that soya production is determined by the model

\[ \mathit{yield} = \beta_0 + \beta_1\,\mathit{fertilizer} + u \]

where yield is the soya production and fertilizer is the quantity of fertilizer.

The error term $u$ contains other factors affecting soya production, such as the quality of the land, the amount of rain, etc.


- Whether we obtain a good estimate of the parameter $\beta_1$ in model (1) depends on the relationship between the error term $u$ and the variable $x$.
- Formally, the assumption we need to impose on the relationship between $x$ and $u$ in order to obtain a credible estimate of $\beta_1$ is that the mean of $u$ conditional on $x$ is zero for any value of $x$:

\[ E(u \mid x) = E(u) = 0 \tag{2} \]

- Recall that the mean of $u$ conditional on $x$ is just the mean of the distribution of $u$ conditional on $x$.
- Note that, as long as the model has an intercept, the assumption $E(u) = 0$ is not very restrictive, since it is just a normalisation obtained by defining

\[ \beta_0 = E(y) - \beta_1 E(x) \]

- The real assumption is that the mean of the distribution of $u$ conditional on $x$ is constant, that is, it does not change with $x$.
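What assumption (2) buys can be seen in a small simulation. The sketch below is only illustrative: the population (wage = 1 + 0.5 educ + u, with u playing the role of ability), the parameter values and the helper names `ols_slope` and `simulate` are all made up for this example, and the slope is computed with the OLS formula that the chapter derives later as equation (10).

```python
import random

def ols_slope(x, y):
    """Slope of the fitted line: sum of cross-deviations over sum of squared x-deviations."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

def simulate(dependence, n=20000, seed=0):
    """Draw (educ, wage) from wage = 1 + 0.5*educ + u; all numbers are invented.

    u is unobserved ability. `dependence` sets how strongly education responds
    to ability, so dependence = 0 means E(u | educ) = 0 holds.
    """
    rng = random.Random(seed)
    educ, wage = [], []
    for _ in range(n):
        ability = rng.gauss(0.0, 1.0)  # plays the role of the error term u
        e = 12.0 + dependence * ability + rng.gauss(0.0, 2.0)
        educ.append(e)
        wage.append(1.0 + 0.5 * e + ability)
    return ols_slope(educ, wage)

slope_ok = simulate(dependence=0.0)   # assumption (2) holds
slope_bad = simulate(dependence=1.0)  # more able individuals study longer
print(f"true slope 0.5, estimate when (2) holds: {slope_ok:.3f}")
print(f"true slope 0.5, estimate when (2) fails: {slope_bad:.3f}")
```

In this made-up population the estimate stays close to 0.5 when ability is unrelated to education, and drifts towards 0.5 + Cov(educ, u)/Var(educ) = 0.7 when it is not, which is the sense in which the estimate stops being credible.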


How should assumption (2) be interpreted in the context of the previous examples?

Example 1 (cont.)

- To simplify, we assume that the error term $u$ only represents innate ability.
- Assumption (2) implies that the mean level of ability does not depend on the years of education.
- Under this assumption, the mean ability of individuals with 10 years of education is the same as that of individuals with 16 years of education.
- However, if individuals with higher innate ability choose to acquire more education, the average innate ability of individuals with 16 years of education will be higher than that of individuals with 10 years of education, and assumption (2) is not satisfied.
- Since innate ability is unobserved, it is very difficult to know whether its mean depends on the level of education or not; but this is a question we should think about before starting the empirical analysis.


Example 2 (cont.)

- To simplify, assume that in this example the error term $u$ is only the quality of the land.
- In this case, if the quantity of fertilizer applied to the different plots is random and does not depend on the quality of the land, then assumption (2) holds: the average quality of the land does not depend on the quantity of fertilizer.
- On the other hand, if the best plots receive more fertilizer, the mean value of $u$ depends on the quantity of fertilizer and assumption (2) does not hold.


We now obtain the expression for the mean of $y$ conditional on $x$ under assumption (2). Taking the expected value (conditional on $x$) in equation (1), we have

\[ E(y \mid x) = E(\beta_0 + \beta_1 x + u \mid x) = \beta_0 + \beta_1 x + E(u \mid x) \]

and under assumption (2)

\[ E(y \mid x) = \beta_0 + \beta_1 x \tag{3} \]

This equation shows that, under assumption (2), the population regression function, $E(y \mid x)$, is a linear function of $x$.

From equation (3) it can be deduced that:

- $\beta_0$ is the mean of $y$ when $x$ is equal to zero.
- $\beta_1$ is the change in the mean of $y$ given a one-unit increase in $x$.


The Ordinary Least Squares (OLS) estimator. Interpretation.

- In this section we first review how to estimate the parameters $\beta_0$ and $\beta_1$ of the simple regression model using a random sample from the population. Later on, we will see how to interpret the results of the estimation for a given sample.
- Let $\{(x_i, y_i) : i = 1, 2, \dots, n\}$ be a random sample from the population. Given that these data arise from a population defined by the simple regression model, for each observation $i$ we can write

\[ y_i = \beta_0 + \beta_1 x_i + u_i \tag{4} \]

where $u_i$ is the error term of observation $i$, containing all the factors affecting $y_i$ other than $x_i$.


- We use assumption (2) to obtain the estimators of the parameters $\beta_0$ and $\beta_1$.
- Since $E(u) = 0$, using equation (1) to write $u$ as a function of the observed variables, we have

\[ E(y - \beta_0 - \beta_1 x) = 0 \tag{5} \]

- On the other hand, it can be shown that

\[ E(u \mid x) = 0 \;\Rightarrow\; E(xu) = 0 \]

and, again using equation (1) to write $u$ as a function of the observed variables, we have

\[ E\big(x\,(y - \beta_0 - \beta_1 x)\big) = 0 \tag{6} \]


- Equations (5) and (6) allow us to obtain good estimators of the parameters $\beta_0$ and $\beta_1$.
- Replacing the population expectations in equations (5) and (6) by sample means, the estimators $\hat\beta_0$ and $\hat\beta_1$ are obtained as the solutions to the equations

\[ \frac{1}{n}\sum_{i=1}^{n} \big(y_i - \hat\beta_0 - \hat\beta_1 x_i\big) = 0 \tag{7} \]

\[ \frac{1}{n}\sum_{i=1}^{n} x_i \big(y_i - \hat\beta_0 - \hat\beta_1 x_i\big) = 0 \tag{8} \]

- Note that equations (7) and (8) are the sample counterparts of equations (5) and (6). Estimates obtained as the sample counterparts of population moments are called method of moments estimates.
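Equations (7) and (8) are two linear equations in the two unknowns, so they can be solved directly. The following is a minimal sketch of that idea; the function name `method_of_moments` and the toy data are assumptions made for illustration, not part of the course material.

```python
def method_of_moments(x, y):
    """Solve the sample moment conditions (7) and (8) for (b0_hat, b1_hat).

    Rearranged, the two conditions form a 2x2 linear system:
        b0 * 1       + b1 * mean(x)   = mean(y)     # from (7)
        b0 * mean(x) + b1 * mean(x^2) = mean(x*y)   # from (8)
    """
    n = len(x)
    m_x = sum(x) / n
    m_y = sum(y) / n
    m_xx = sum(xi * xi for xi in x) / n
    m_xy = sum(xi * yi for xi, yi in zip(x, y)) / n
    det = m_xx - m_x * m_x  # determinant of the system; zero when all x_i are equal
    if det == 0:
        raise ValueError("all x values are equal; no unique solution")
    b1 = (m_xy - m_x * m_y) / det
    b0 = m_y - b1 * m_x
    return b0, b1

# Toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0_hat, b1_hat = method_of_moments(x, y)

# Plug the solution back into the left-hand sides of (7) and (8).
n = len(x)
m7 = sum(yi - b0_hat - b1_hat * xi for xi, yi in zip(x, y)) / n
m8 = sum(xi * (yi - b0_hat - b1_hat * xi) for xi, yi in zip(x, y)) / n
print(b0_hat, b1_hat, m7, m8)
```

Both `m7` and `m8` come out as zero up to floating-point rounding, confirming that the solution satisfies the two sample moment conditions.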


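After some algebra, we can isolate $\hat\beta_0$ and $\hat\beta_1$ in equations (7) and (8), obtaining:

\[ \hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x} \tag{9} \]

\[ \hat\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{S_{xy}}{S_x^2} \tag{10} \]

where $S_{xy} = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})$ is the sample covariance between $x$ and $y$, and $S_x^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$ is the sample variance of $x$.

Note that in order for the OLS estimators to be defined we need that $\sum_{i=1}^{n} (x_i - \bar{x})^2 > 0$.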
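The algebra leading from (7) and (8) to (9) and (10) is short. One way to fill it in, using only the definitions of the sample means $\bar{x}$ and $\bar{y}$, is the following sketch:

```latex
% Step 1: equation (7) says \bar{y} - \hat\beta_0 - \hat\beta_1 \bar{x} = 0, which is (9):
\hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x}

% Step 2: substitute (9) into (8) and multiply by n:
\sum_{i=1}^{n} x_i \big[ (y_i - \bar{y}) - \hat\beta_1 (x_i - \bar{x}) \big] = 0
\;\Longrightarrow\;
\sum_{i=1}^{n} x_i (y_i - \bar{y}) = \hat\beta_1 \sum_{i=1}^{n} x_i (x_i - \bar{x})

% Step 3: since \sum_i (y_i - \bar{y}) = 0 and \sum_i (x_i - \bar{x}) = 0,
% subtracting \bar{x} times these sums changes nothing:
\sum_{i=1}^{n} x_i (y_i - \bar{y}) = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}),
\qquad
\sum_{i=1}^{n} x_i (x_i - \bar{x}) = \sum_{i=1}^{n} (x_i - \bar{x})^2

% Step 4: dividing by \sum_i (x_i - \bar{x})^2 > 0 gives (10):
\hat\beta_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
```

The division in the last step is exactly where the condition $\sum_{i=1}^{n} (x_i - \bar{x})^2 > 0$ is needed.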


- The estimates defined by equations (9) and (10) are called the Ordinary Least Squares (OLS) estimates of the constant term and the slope of the simple regression model.
- The OLS estimates are computed for a particular sample and therefore, for a given sample, $\hat\beta_0$ and $\hat\beta_1$ are two real numbers.
- If the OLS estimates are computed with a different sample, one obtains different values of $\hat\beta_0$ and $\hat\beta_1$. Therefore, since $\hat\beta_0$ and $\hat\beta_1$ are functions of the sample, we can also think of $\hat\beta_0$ and $\hat\beta_1$ as random variables, that is, as estimators of the population parameters $\beta_0$ and $\beta_1$.
- In this section and in sections 4 and 5 we analyse the properties of the OLS estimates for a given sample. In section 6 we study the statistical properties of the random variables $\hat\beta_0$ and $\hat\beta_1$, that is, their properties as estimators of the population parameters.
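The distinction between an estimate (two numbers for one sample) and an estimator (a random variable across samples) can be illustrated by applying formulas (9) and (10) to two samples from the same population. This is a hedged sketch: the population y = 2 + 3x + u, the sample size and the helper names `ols` and `draw_sample` are assumptions chosen for the example.

```python
import random

def ols(x, y):
    """Return (b0_hat, b1_hat) computed from equations (9) and (10)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("OLS is not defined when all x_i are equal")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def draw_sample(n, rng):
    """Draw n observations from y = 2 + 3x + u (a made-up population)."""
    x = [rng.uniform(0.0, 10.0) for _ in range(n)]
    y = [2.0 + 3.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
    return x, y

rng = random.Random(42)
# Two independent samples of size 50 from the same population.
estimates = [ols(*draw_sample(50, rng)) for _ in range(2)]
for k, (b0, b1) in enumerate(estimates, start=1):
    print(f"sample {k}: b0_hat = {b0:.3f}, b1_hat = {b1:.3f}")
```

Each sample yields its own pair of numbers, both near the population values 2 and 3 but not equal to them or to each other.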


- Although we have derived the expressions for the OLS estimates from assumption (2), this assumption is not required in order to compute the estimates. The only condition needed to compute the OLS estimates for a given sample is that

\[ \sum_{i=1}^{n} (x_i - \bar{x})^2 > 0. \]

- In fact, $\sum_{i=1}^{n} (x_i - \bar{x})^2 > 0$ is hardly an assumption at all, since all it requires is that the $x_i$ in the sample are not all equal.


- We now give a graphical interpretation of the OLS estimates of the simple regression model that justifies the name least squares.
- To do so, we draw the cloud of points associated with a given sample of size $n$, together with an arbitrary line

\[ y = b_0 + b_1 x \]

- We show that the OLS estimates defined in equations (9) and (10) are the "best" choice for the values $b_0$ and $b_1$ if the objective is that the line is as "close" as possible to the cloud of points according to a given proximity criterion.
- In particular, the proximity criterion that delivers the OLS estimates is to minimise the sum of the squared vertical distances from the points in the cloud to the line.


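[Figure: scatter plot of the sample points with an arbitrary line $y = b_0 + b_1 x$. For a point $(x_i, y_i)$, the axes mark $x_i$ and $y_i$, and the value $b_0 + b_1 x_i$ on the line is shown.]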


Graphically, we can see that the vertical distance from the point $(x_i, y_i)$ to the line $y = b_0 + b_1 x$ is given by

\[ y_i - b_0 - b_1 x_i \]

and therefore the objective function to be minimised is

\[ s(b_0, b_1) = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i)^2 \tag{11} \]

The partial derivatives are:

\[ \frac{\partial s(b_0, b_1)}{\partial b_0} = -2 \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i) \]

\[ \frac{\partial s(b_0, b_1)}{\partial b_1} = -2 \sum_{i=1}^{n} x_i (y_i - b_0 - b_1 x_i) \]


The estimated coefficients are obtained by setting the partial derivatives of the objective function equal to zero:

\[ \sum_{i=1}^{n} \big(y_i - \hat\beta_0 - \hat\beta_1 x_i\big) = 0 \]

\[ \sum_{i=1}^{n} x_i \big(y_i - \hat\beta_0 - \hat\beta_1 x_i\big) = 0 \]

These two equations are called the first order conditions of the OLS estimates, and they are equivalent to equations (7) and (8), from which they differ only by the factor $1/n$. Therefore, the estimates obtained by minimising the objective function (11) are the OLS estimates defined in equations (9) and (10).
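The claim that formulas (9) and (10) minimise the objective function (11) can be checked numerically on a small sample. The sketch below uses invented toy data and hypothetical helper names (`ssr`, `ols`); it evaluates the two partial derivatives at the OLS estimates and compares the objective with its value on a few nearby lines.

```python
def ssr(b0, b1, x, y):
    """Objective function (11): sum of squared vertical distances to the line."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

def ols(x, y):
    """Return (b0_hat, b1_hat) computed from equations (9) and (10)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Toy data, invented for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b0_hat, b1_hat = ols(x, y)

# Partial derivatives of s(b0, b1) evaluated at the OLS estimates.
d_b0 = -2 * sum(yi - b0_hat - b1_hat * xi for xi, yi in zip(x, y))
d_b1 = -2 * sum(xi * (yi - b0_hat - b1_hat * xi) for xi, yi in zip(x, y))

# Objective at the OLS line and on eight slightly perturbed lines.
best = ssr(b0_hat, b1_hat, x, y)
nearby = [ssr(b0_hat + db0, b1_hat + db1, x, y)
          for db0 in (-0.1, 0.0, 0.1) for db1 in (-0.1, 0.0, 0.1)
          if (db0, db1) != (0.0, 0.0)]
print(d_b0, d_b1, best, min(nearby))
```

Both derivatives vanish up to rounding, and every perturbed line has a larger sum of squares than the OLS line, as expected for the minimum of a convex quadratic function.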


- We define the fitted value for $y$ when $x = x_i$ as

\[ \hat{y}_i = \hat\beta_0 + \hat\beta_1 x_i \]

This is the predicted value for $y$ when $x = x_i$. Note that there is a fitted value for each observation in the sample.

- We define the residual for each observation in the sample as the difference between the observed value $y_i$ and the fitted value $\hat{y}_i$:

\[ \hat{u}_i = y_i - \hat{y}_i \]

and there is a residual for each observation in the sample.

- Note that the residual for each observation is the vertical distance (with its corresponding sign) from the point to the regression line $y = \hat\beta_0 + \hat\beta_1 x$, and therefore the OLS criterion is to minimise the sum of squared residuals. If a point is above the regression line, the residual is positive, and if the point is below the regression line, the residual is negative.


- Why is the criterion of minimising the sum of squared residuals used?
- The answer is that it is an easy criterion to work with and it delivers estimators with good properties under certain assumptions.
- Note that a criterion consisting of minimising the sum of the residuals would not be appropriate, since the residuals can be positive or negative and would cancel each other out.
- We could consider alternative criteria, such as minimising the sum of the absolute values of the residuals:

\[ \min_{b_0, b_1} \sum_{i=1}^{n} |y_i - b_0 - b_1 x_i| \]

The problem with this criterion is that the objective function is not differentiable, and therefore it is more complicated to compute the minimum.
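To make the comparison concrete, the absolute-value criterion can be minimised on a small sample by brute force, with no closed-form formula available. The sketch below is an illustration under stated assumptions, not a practical algorithm: it relies on the known property that some line minimising the sum of absolute deviations passes through at least two sample points, and the function names and toy data are invented.

```python
from itertools import combinations

def sad(b0, b1, x, y):
    """Sum of absolute deviations for the line y = b0 + b1*x."""
    return sum(abs(yi - b0 - b1 * xi) for xi, yi in zip(x, y))

def lad_brute_force(x, y):
    """Minimise the sum of absolute deviations by trying every pair of points.

    Some minimising line passes through at least two sample points, so it is
    enough to evaluate the line through each pair. Cost grows like n^3.
    """
    best = None
    for i, j in combinations(range(len(x)), 2):
        if x[i] == x[j]:
            continue  # a vertical line cannot be written as y = b0 + b1*x
        b1 = (y[j] - y[i]) / (x[j] - x[i])
        b0 = y[i] - b1 * x[i]
        value = sad(b0, b1, x, y)
        if best is None or value < best[0]:
            best = (value, b0, b1)
    if best is None:
        raise ValueError("all x values are equal")
    return best[1], best[2]

def ols(x, y):
    """Return (b0_hat, b1_hat) computed from equations (9) and (10)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Toy data: five points exactly on y = 1 + 2x, plus one outlier at the end.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [3.0, 5.0, 7.0, 9.0, 11.0, 40.0]
lad_b0, lad_b1 = lad_brute_force(x, y)
ols_b0, ols_b1 = ols(x, y)
print(f"absolute-value criterion: b0 = {lad_b0:.3f}, b1 = {lad_b1:.3f}")
print(f"least squares criterion:  b0 = {ols_b0:.3f}, b1 = {ols_b1:.3f}")
```

On these data the two criteria choose clearly different lines, and the least squares line is obtained from a formula while the other requires a search.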


Interpretation of the results of the regression

- The regression line or sample regression function is defined as

\[ \hat{y} = \hat\beta_0 + \hat\beta_1 x \]

and it is the estimated version of the population regression function $E(y \mid x) = \beta_0 + \beta_1 x$.

- The constant term or intercept, $\hat\beta_0$, is the predicted value for $y$ when $x = 0$.
  - In many cases it does not make sense to consider $x = 0$, and in these cases $\hat\beta_0$ is of no interest in itself. However, it is important not to forget to include $\hat\beta_0$ when predicting $y$ for any value of $x$.
  - $\hat\beta_0$ is also the estimated value of the mean of $y$ when $x = 0$.
- The slope, $\hat\beta_1$, measures the change in $\hat{y}$ when $x$ increases by one unit.
  - In fact, if $x$ changes by $\Delta x$ units, the predicted change in $y$ is $\Delta\hat{y} = \hat\beta_1 \Delta x$ units.
  - $\hat\beta_1$ measures the estimated change in the mean of $y$ when $x$ increases by one unit.


Example 1 (cont.) Given a sample of $n = 526$ individuals (file WAGE1 from Wooldridge) for which the hourly wage in dollars, wage, and the years of education, educ, are observed, the following OLS regression line has been obtained:

\[ \widehat{\mathit{wage}} = -0.90 + 0.54\,\mathit{educ} \]

- The estimated intercept of $-0.90$ literally means that the predicted wage for individuals with 0 years of education is $-90$ cents ($-0.90$ dollars) per hour, which does not make sense.
- The reason this prediction is poor at low levels of education is that the sample contains very few individuals with few years of education.
- The estimated slope indicates that one more year of education increases the predicted hourly wage by 54 cents (0.54 dollars). If education increases by 3 years, the predicted wage increases by $3 \times 0.54 = 1.62$ dollars.
- Regarding predictions for different values of educ, the predicted hourly wage for individuals with 10 years of education is $\widehat{\mathit{wage}} = -0.90 + 0.54 \times 10 = 4.50$ dollars per hour.
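The arithmetic in this example can be reproduced with a few lines of code. The coefficients are the ones reported above; the function name `predicted_wage` and the extra break-even calculation are additions made for illustration.

```python
# Estimated coefficients reported above for the WAGE1 sample.
B0_HAT = -0.90
B1_HAT = 0.54

def predicted_wage(educ):
    """Predicted hourly wage in dollars for a given number of years of education."""
    return B0_HAT + B1_HAT * educ

wage_0 = predicted_wage(0)    # the intercept: a negative predicted wage
wage_10 = predicted_wage(10)  # prediction at 10 years of education
change_3_years = predicted_wage(13) - predicted_wage(10)  # equals 3 * B1_HAT

# Years of education below which the fitted line predicts a negative wage.
break_even_educ = -B0_HAT / B1_HAT

print(f"{wage_0:.2f} {wage_10:.2f} {change_3_years:.2f} {break_even_educ:.2f}")
```

The last quantity shows that the fitted line gives negative predictions only for fewer than about 1.7 years of education, a region where, as noted above, the sample has very few observations.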


Fitted values and residuals. Goodness of fit.

Algebraic properties of the OLS regression

1. The sum of the residuals is zero

\[ \sum_{i=1}^{n} \hat{u}_i = 0 \tag{12} \]

and therefore the sample mean of the residuals is zero.

2. The sum of the products of the observed values of $x$ and the residuals is zero

\[ \sum_{i=1}^{n} x_i \hat{u}_i = 0 \tag{13} \]

and therefore, since the mean of the residuals is zero by property 1, the sample covariance between the observed values of $x$ and the residuals is zero.


3. The point $(\bar{x}, \bar{y})$ lies on the sample regression line.

4. The mean of the fitted values coincides with the mean of the observed values

\[ \bar{y} = \bar{\hat{y}} \]

5. The sample covariance between the fitted values and the residuals is zero.
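The five algebraic properties hold for any sample, so they can be verified numerically. The following sketch does so on invented toy data; the helper name `ols` and the variable names are assumptions made for the example.

```python
def ols(x, y):
    """Return (b0_hat, b1_hat) computed from equations (9) and (10)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1

# Toy data, invented for illustration only.
x = [8.0, 10.0, 12.0, 12.0, 16.0, 18.0]
y = [3.1, 4.0, 5.5, 4.8, 8.2, 9.0]
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

b0_hat, b1_hat = ols(x, y)
fitted = [b0_hat + b1_hat * xi for xi in x]       # one fitted value per observation
resid = [yi - fi for yi, fi in zip(y, fitted)]    # one residual per observation

sum_resid = sum(resid)                                     # property 1, equation (12)
sum_x_resid = sum(xi * ui for xi, ui in zip(x, resid))     # property 2, equation (13)
gap_at_means = (b0_hat + b1_hat * x_bar) - y_bar           # property 3
gap_fitted_mean = sum(fitted) / n - y_bar                  # property 4
cov_fitted_resid = sum((fi - y_bar) * ui
                       for fi, ui in zip(fitted, resid)) / (n - 1)  # property 5

print(sum_resid, sum_x_resid, gap_at_means, gap_fitted_mean, cov_fitted_resid)
```

All five quantities are zero up to floating-point rounding, whatever data are used, because they follow from the first order conditions rather than from any assumption about the population.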


Goodness of fit

In what follows we present a measure of the capacity of the explanatory variable to explain the variability of the dependent variable. This measure reflects the quality of the fit, that is, whether the OLS regression line fits the data well.

Definitions:

Total Sum of Squares (SST):

\[ SST = \sum_{i=1}^{n} (y_i - \bar{y})^2 \]

Explained Sum of Squares (SSE):

\[ SSE = \sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2 = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 \]

where the second equality holds since $\bar{y} = \bar{\hat{y}}$.

Sum of Squared Residuals (SSR):

\[ SSR = \sum_{i=1}^{n} \hat{u}_i^2 \]


- These three values are nonnegative, since they are sums of squares.
- SST, SSE and SSR are measures of the degree of variability of the dependent variable, of the fitted values and of the residuals, respectively, since they are the numerators of the sample variance of each of these variables.
- These three measures are related to each other, since it can be shown that

\[ SST = SSE + SSR \]

- Assuming that SST is not zero, which is equivalent to saying that the observations of the dependent variable are not all the same, and dividing the three terms in the sum above by SST, we have:

\[ 1 = \frac{SSE}{SST} + \frac{SSR}{SST} \]


We define the coefficient of determination of the model as

$$R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST}$$

The R-squared represents the proportion of the variability of the dependent variable that is explained by the model.

$R^2$ satisfies the following condition:

$$0 \le R^2 \le 1$$

It is nonnegative because SSE and SST are nonnegative. It is less than or equal to 1 because SSR is nonnegative.

Sometimes $R^2$ is also expressed as a percentage, multiplying its value by 100.
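The decomposition SST = SSE + SSR and the two equivalent expressions for the coefficient of determination can be illustrated with a short computation. This is only a sketch in plain Python on made-up data; the helper names `mean` and `ols` are assumptions of the sketch.

```python
# Made-up illustrative data (not taken from the course data files).
x = [8.0, 10.0, 12.0, 12.0, 14.0, 16.0, 18.0]
y = [3.1, 4.0, 5.9, 5.2, 7.4, 7.0, 10.3]


def mean(values):
    """Sample mean of a list of numbers."""
    return sum(values) / len(values)


def ols(x, y):
    """Return (intercept, slope) of the OLS regression line of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


b0, b1 = ols(x, y)
y_bar = mean(y)
fitted = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)                # total sum of squares
sse = sum((fi - y_bar) ** 2 for fi in fitted)           # explained sum of squares
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # sum of squared residuals

r2 = sse / sst          # first expression for R-squared
r2_alt = 1 - ssr / sst  # second expression for R-squared

print(f"SST = {sst:.4f}, SSE + SSR = {sse + ssr:.4f}")
print(f"R2 = {r2:.4f}, 1 - SSR/SST = {r2_alt:.4f}")
```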


In order to better understand the role of the coefficient of determination, it is useful to consider the two extreme cases:

The coefficient of determination is 1 if and only if SSR = 0; in this case, all the residuals must be exactly equal to 0, thus $y_i = \hat{y}_i$ for all the observations and therefore all the observations lie on the OLS regression line: there is a perfect fit.

The coefficient of determination is 0 if and only if SSE = 0; in this case, all the fitted values must be exactly equal to $\bar{y}$, that is, the fitted values do not depend on the value of the independent variable, thus the OLS regression line is the horizontal line $y = \bar{y}$. In this case, knowing the value of the independent variable does not provide any information on the dependent variable.

In practice, we usually obtain intermediate values of $R^2$. The closer $R^2$ is to 1, the better the fit.


It is important to point out that in the social sciences a low $R^2$ is often found, especially when, as we do in this course, we work with cross sections.

The fact that $R^2$ is low does not mean that the OLS estimate is not useful. The OLS estimate can still provide a good estimate of the effect of x on y even if $R^2$ is low.

Example 1 (cont.) In the regression of wage on the years of education we have

$$\widehat{wage} = -0.90 + 0.54\, educ, \qquad n = 526, \quad R^2 = 0.165$$

The years of education explain 16.5% of the variation of wages.


Measurement units and functional form

Measurement units

It is very important to take into account the measurement units when interpreting the results of a regression.

The estimated values of the parameters of a regression model depend on the measurement units of the dependent variable and the explanatory variable.

If we have already estimated the parameters of the model using certain units for the variables, the estimates corresponding to different measurement units can be obtained easily, without estimating the model again.


If we change the measurement units of the dependent variable and we measure it in different units $y^* = cy$, substituting in the estimated model, we have

$$\hat{y}^* = c\hat{\beta}_0 + c\hat{\beta}_1 x = \hat{\beta}_0^* + \hat{\beta}_1^* x$$

where $\hat{\beta}_0^* = c\hat{\beta}_0$ and $\hat{\beta}_1^* = c\hat{\beta}_1$ and therefore, the new estimated coefficients are equal to the previously estimated coefficients multiplied by c.

If we change the measurement units of the explanatory variable and we measure this variable in different units $x^* = cx$, substituting $x = x^*/c$ in the estimated model we have

$$\hat{y} = \hat{\beta}_0 + \frac{\hat{\beta}_1}{c} x^* = \hat{\beta}_0 + \hat{\beta}_1^* x^*$$

where $\hat{\beta}_1^* = \hat{\beta}_1 / c$ and therefore the estimated constant does not change and the new estimated slope is equal to the previously estimated slope divided by c.
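Both rescaling rules can be confirmed by re-estimating the line on rescaled data. The sketch below uses plain Python, made-up data and an arbitrary constant `c = 100`; the helper name `ols` is an assumption of the sketch.

```python
# Made-up illustrative data (not taken from the course data files).
x = [8.0, 10.0, 12.0, 12.0, 14.0, 16.0, 18.0]
y = [3.1, 4.0, 5.9, 5.2, 7.4, 7.0, 10.3]
c = 100.0  # arbitrary change-of-units constant chosen for illustration


def ols(x, y):
    """Return (intercept, slope) of the OLS regression line of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


b0, b1 = ols(x, y)

# Rescale the dependent variable: y* = c * y.
b0_y, b1_y = ols(x, [c * yi for yi in y])

# Rescale the explanatory variable: x* = c * x.
b0_x, b1_x = ols([c * xi for xi in x], y)

print(f"original:   b0 = {b0:.4f}, b1 = {b1:.4f}")
print(f"y* = c*y:   b0 = {b0_y:.4f}, b1 = {b1_y:.4f}  (both multiplied by c)")
print(f"x* = c*x:   b0 = {b0_x:.4f}, b1 = {b1_x:.6f}  (slope divided by c)")
```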


Example 1 (cont.)

In the regression of wage on the years of education, with the variable wage measured in dollars per hour and the variable educ measured in years, we obtained the following regression line:

$$\widehat{wage} = -0.90 + 0.54\, educ$$

$$n = 526, \quad R^2 = 0.165$$

Which values would be obtained for the constant and the slope of the regression line if wage is measured in cents per hour?

Let wagec be the wage in cents. The relationship between wage and wagec is

$$wagec = 100 \cdot wage$$

so that the estimated model using wage in cents per hour is obtained by multiplying by 100 the estimated coefficients we obtained when wage is measured in dollars per hour

$$\widehat{wagec} = -90 + 54\, educ$$


Example 1 (cont.)

In this way, we see that the interpretation of the regression results does not change when the measurement units are changed, since an increase of one year of education implies an increase of 54 cents per hour in the predicted wage.

Regarding $R^2$, intuition tells us that since it provides information on the goodness of fit it should not depend on the measurement units of the variables. In fact, it can be shown, using the definition, that $R^2$ does not depend on the measurement units. In this example, $R^2$ is also 0.165 when wage is measured in cents per hour.


Example 3

Using a sample (file CEOSAL1 from Wooldridge) of n = 209 executive directors for whom we observe their annual wage in thousands of dollars, salary, and the average return (in percentage) of the shares of their company, roe, the following OLS regression line has been obtained

$$\widehat{salary} = 963.19 + 18.50\, roe$$

$$n = 209, \quad R^2 = 0.013$$

From this model, we have that an increase of one percentage point in the shares return increases the predicted wage of the executive director by 18,500 dollars (18.5 thousands of dollars).

If we change the measurement units of the explanatory variable, for example, if the return is expressed as a decimal instead of as a percentage, what would the new estimated coefficients be?


Example 3 (cont.)

Let roe1 be the share return expressed as a decimal. The relationship between roe and roe1 is

$$roe1 = \frac{1}{100}\, roe$$

so that the estimated model using the shares return in decimals is obtained by multiplying by 100 the estimated slope we obtained when the return is measured as a percentage

$$\widehat{salary} = 963.19 + 1850\, roe1$$

$$n = 209, \quad R^2 = 0.013$$

In this way, we see again that the interpretation of the regression results does not change when the measurement units change, since as before an increase of one percentage point in the company shares return implies an increase in the predicted wage of the executive director of $1850 \times 0.01 = 18.5$ thousands of dollars.

$R^2$ does not change when we change the measurement units of the independent variable. In this example $R^2$ is still equal to 0.013.


Example 3 (cont.)

If we now change the measurement units of both the dependent and the explanatory variable, for example, if we express the return as a decimal and the wage in hundreds of dollars, what would the new estimated coefficients be?

On the one hand, we have just seen that the change of units in the shares return implies that we need to multiply the estimated slope by 100.

On the other hand, if salary100 denotes the wage in hundreds of dollars

$$salary100 = 10 \cdot salary$$

This change of units implies that we need to multiply by 10 both the constant and the slope of the regression line.

If we make both unit changes, the regression line is

$$\widehat{salary100} = 9631.9 + 18500\, roe1$$

$$n = 209, \quad R^2 = 0.013$$
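The arithmetic of Example 3 can be reproduced by applying the two rescaling rules directly to the reported coefficients. The function `rescale` below is a hypothetical helper written for this sketch; it only encodes the rules stated earlier (multiply both coefficients by the constant applied to y, divide the slope by the constant applied to x).

```python
def rescale(b0, b1, c_y=1.0, c_x=1.0):
    """Coefficients after measuring y* = c_y * y and x* = c_x * x.

    Rescaling y multiplies both coefficients by c_y;
    rescaling x divides the slope by c_x and leaves the constant unchanged.
    """
    return c_y * b0, c_y * b1 / c_x


# Reported estimates: salary in thousands of dollars, roe in percent.
b0, b1 = 963.19, 18.50

# Return as a decimal only: roe1 = roe / 100.
b0_dec, b1_dec = rescale(b0, b1, c_x=0.01)

# Return as a decimal and wage in hundreds of dollars: salary100 = 10 * salary.
b0_both, b1_both = rescale(b0, b1, c_y=10.0, c_x=0.01)

print(f"roe1 only:            {b0_dec:.2f} + {b1_dec:.1f} roe1")
print(f"salary100 and roe1:   {b0_both:.1f} + {b1_both:.1f} roe1")
```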


Functional form

So far we have considered linear relationships between two variables. As seen above, when we establish a linear relationship between y and x we are assuming that the effect on y of a one-unit change in x does not depend on the initial level of x. This assumption is not very realistic in some applications.

For example, in example 1, where wage is a function of the years of education, the estimated model predicts that an additional year of education increases the wage by 54 cents whether it is the first year of education, the fifth, the sixteenth, etc., and this is not quite reasonable.


Assume that each additional year of education implies a constant percentage increase in wage. Can this effect be taken into account in the context of the simple regression model? The answer is yes, and it is enough to consider the logarithm of wage as the dependent variable of the model.

Assume that the regression model relating wage and years of education is:

$$\log(wage) = \beta_0 + \beta_1 educ + u \qquad (14)$$

In this model, if we hold fixed all the other factors affecting wage and captured by the error term u, an additional year of education implies an increase of $\beta_1$ in the logarithm of wage.


Therefore, since a percentage increase is approximately equal to the difference of logs multiplied by 100, this model implies that, holding fixed all the factors affecting wage and captured in the error term u, an additional year of education implies an increase in wage of $100 \cdot \beta_1$%.

Note that equation (14) implies a nonlinear relationship between wage and years of education. An additional year of education implies a higher increase in wage (in absolute terms) the higher the initial number of years of education is.

The model where the dependent variable is in logarithms and the explanatory variable is in levels is called the log-level model.


The model (14) can be estimated by OLS using the logarithm of wage as the dependent variable.

Using the data in example 1, the following results have been obtained

$$\widehat{\log(wage)} = 0.584 + 0.083\, educ, \qquad n = 526, \quad R^2 = 0.186$$

Therefore, this estimated model implies that for each additional year of education the hourly wage increases by 8.3%. Economists call this effect the return to an additional year of education.

There is another important nonlinearity not included in this application. This nonlinearity would reflect a "certification" effect. It could be the case that year 12, that is, finishing secondary education, has a much larger impact on wage than finishing year 11, since the latter does not imply the degree. In chapter 5 we will see how to take into account this type of nonlinearity.
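Returning to the 8.3% reading of the log-level slope: it relies on the approximation that 100 times a difference of logs is close to a percentage change. A minimal sketch of how good that approximation is for the estimated slope 0.083 follows. The exact formula $100(e^{\beta_1} - 1)$ is a standard result that is not stated in these slides, so it should be read as an addition of this sketch.

```python
import math

b1 = 0.083  # estimated slope of the log-level wage equation

# Approximation used in the text: percentage change ~ 100 * change in logs.
approx_pct = 100 * b1

# Exact percentage change implied by a change of b1 in log(wage):
# wage_new / wage_old = exp(b1), so the change is 100 * (exp(b1) - 1).
exact_pct = 100 * (math.exp(b1) - 1)

print(f"approximate return to education: {approx_pct:.2f}%")
print(f"exact return to education:       {exact_pct:.2f}%")
```

The two figures are close because the slope is small; the approximation deteriorates as the change in logs grows.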


We analyse here how to use the logarithmic transformation in order to obtain a model with constant elasticity.

Example 4

Using the same data as in example 3, we can estimate a model with constant elasticity that relates the wage of executive directors to the sales of the firm. The population model we have to estimate is

$$\log(salary) = \beta_0 + \beta_1 \log(sales) + u$$

where sales are the annual sales of the firm in millions of dollars and salary is the annual wage of the executive director of the firm in thousands of dollars.

In this model, $\beta_1$ is the elasticity of the wage of executive directors with respect to the sales of the firm.

This model can be estimated by OLS using the log of wage as the dependent variable and the log of sales as the explanatory variable.


Example 4 (cont.)

The estimated regression line is

$$\widehat{\log(salary)} = 4.822 + 0.257 \log(sales)$$

$$n = 209, \quad R^2 = 0.211$$

The estimated elasticity is 0.257, which means that an increase of 1% in sales implies an increase of 0.257% in the wage of the executive director (this is the usual interpretation of an elasticity).


The model where both the dependent variable and the explanatory variable are in logarithms is called the log-log model.

We now see how a change in the units of a variable that is expressed in logs affects the constant and the slope of the model.

Consider the log-level model

$$\log(y) = \beta_0 + \beta_1 x + u \qquad (15)$$

If we change the measurement units of y and define $y^* = cy$, taking logarithms we have that $\log(y^*) = \log(c) + \log(y)$. Substituting in (15) we have

$$\log(y^*) = \beta_0 + \log(c) + \beta_1 x + u = \beta_0^* + \beta_1 x + u$$

where $\beta_0^* = \beta_0 + \log(c)$, and therefore this change of units does not affect the slope, only the constant of the model.

Similarly, if the explanatory variable is in logarithms and we change its measurement units, this change does not affect the slope of the model, but only the constant term.
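The invariance of the slope in the log-level model can be checked by estimating the line with y in two different units. The sketch below uses plain Python with made-up positive data and an arbitrary constant `c = 100` (for instance, dollars to cents); the helper name `ols` is an assumption of the sketch.

```python
import math

# Made-up illustrative data; y must be strictly positive to take logs.
x = [8.0, 10.0, 12.0, 12.0, 14.0, 16.0, 18.0]
y = [3.1, 4.0, 5.9, 5.2, 7.4, 7.0, 10.3]
c = 100.0  # arbitrary change-of-units constant, e.g. dollars to cents


def ols(x, y):
    """Return (intercept, slope) of the OLS regression line of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Log-level regression in the original units of y.
b0, b1 = ols(x, [math.log(yi) for yi in y])

# Same regression after the change of units y* = c * y.
b0_star, b1_star = ols(x, [math.log(c * yi) for yi in y])

print(f"slope:    {b1:.6f} -> {b1_star:.6f}  (unchanged)")
print(f"constant: {b0:.6f} -> {b0_star:.6f}  (shifted by log(c) = {math.log(c):.6f})")
```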


Finally, we can also consider a model where the dependentvariable is in levels and the explanatory variable is in logs. Thismodel is denoted as level-log model.

y = β0 + β1 log(x) + u

In this model, β1/100 is the variation in units of y given anincrease of 1% in x.
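The factor 1/100 follows from the same log approximation used for the log-level model. As a short sketch of the reasoning, holding u fixed and writing $\%\Delta x = 100\,\Delta x / x$ (notation introduced here only for illustration):

$$\Delta y = \beta_1 \,\Delta \log(x) \approx \beta_1 \frac{\Delta x}{x} = \frac{\beta_1}{100}\left(100\,\frac{\Delta x}{x}\right) = \frac{\beta_1}{100}\,\%\Delta x$$

so that a 1% increase in x, that is $\%\Delta x = 1$, changes y by approximately $\beta_1/100$ units.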


The model we studied in this chapter is called the simple linear regression model, although we have seen that this model also allows one to establish some nonlinear relationships between variables.

The adjective "linear" is due to the linearity of the model in terms of the parameters $\beta_0$ and $\beta_1$.

The variables y and x can be any type of transformation of other variables.

We studied in detail the logarithmic transformations since they are the most interesting ones in Economics, but in the context of the simple regression model the following transformations could also have been considered

$$y = \beta_0 + \beta_1 x^2 + u$$

$$y = \beta_0 + \beta_1 \sqrt{x} + u$$

It is important to take into account that the fact that the variables are transformations of other variables does not affect the estimation method but does affect the interpretation of the parameters, as seen above for the logarithmic transformations.


Statistical Properties of the OLS estimators

The algebraic properties of the OLS estimates have been studied so far.

In this section, we go back to the population model in order to study the statistical properties of the OLS estimators.

We now consider $\hat{\beta}_0$ and $\hat{\beta}_1$ as random variables, that is, as estimators of the population parameters $\beta_0$ and $\beta_1$, and we study some of the properties of their distributions.


Unbiasedness of the OLS estimators

We study under which assumptions the OLS estimators are unbiased.

Assumption RLS.1 (linearity in parameters)
The dependent variable y is related in the population to the explanatory variable x and the error term u through the population model

$$y = \beta_0 + \beta_1 x + u \qquad (16)$$

Assumption RLS.2 (random sample)
The data arise from a random sample of size n, $\{(x_i, y_i) : i = 1, 2, \ldots, n\}$, from the population model.

Assumption RLS.3 (zero conditional mean)

$$E(u \mid x) = 0$$

Assumption RLS.4 (sample variation of the independent variable)
The values of $x_i$, $i = 1, 2, \ldots, n$, in the sample are not all the same.


Assumptions RLS.1 and RLS.2 imply that we can write (16) in terms of the random sample as

$$y_i = \beta_0 + \beta_1 x_i + u_i, \qquad i = 1, 2, \ldots, n \qquad (17)$$

where $u_i$ is the error term of observation i and contains those unobservables affecting $y_i$.

Note that the error term $u_i$ is not the same as the residual $\hat{u}_i$.

Assumptions RLS.2 and RLS.3 imply that for each observation i

$$E(u_i \mid x_i) = 0, \qquad i = 1, 2, \ldots, n$$

and

$$E(u_i \mid x_1, x_2, \ldots, x_n) = 0, \qquad i = 1, 2, \ldots, n \qquad (18)$$

Note that if assumption RLS.4 does not hold, the OLS estimator cannot be computed.


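Before showing the statistical properties of the OLS estimators, it is useful to write $\hat{\beta}_1$ as a function of the errors of the model.

Expression for $\hat{\beta}_1$ as a function of the error terms:

Using the definition of $\hat{\beta}_1$ in equation (10)

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, y_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(\beta_0 + \beta_1 x_i + u_i)}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

where the last equality uses (17).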


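Expression for $\hat{\beta}_1$ as a function of the error terms (cont.):

$$\hat{\beta}_1 = \beta_0 \frac{\sum_{i=1}^{n} (x_i - \bar{x})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} + \beta_1 \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, x_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2} + \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, u_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

Since

$$\sum_{i=1}^{n} (x_i - \bar{x}) = 0 \qquad \text{and} \qquad \sum_{i=1}^{n} (x_i - \bar{x})\, x_i = \sum_{i=1}^{n} (x_i - \bar{x})^2$$

we have that

$$\hat{\beta}_1 = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, u_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \qquad (19)$$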
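Expression (19) is an exact algebraic identity, so it can be checked on simulated data in which the errors are known. The sketch below uses plain Python; the parameter values, the sample size and the distributions used to draw x and u are arbitrary choices made for illustration.

```python
import random

random.seed(0)  # fixed seed so the simulated sample is reproducible

# Hypothetical population parameters chosen for illustration.
beta0, beta1 = 1.0, 0.5
n = 50

# Simulated sample: in a simulation the errors u_i are known.
x = [random.uniform(0.0, 10.0) for _ in range(n)]
u = [random.gauss(0.0, 1.0) for _ in range(n)]
y = [beta0 + beta1 * xi + ui for xi, ui in zip(x, u)]

x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# OLS slope from its definition.
b1_hat = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx

# Right-hand side of (19): true slope plus a weighted sum of the errors.
b1_from_errors = beta1 + sum((xi - x_bar) * ui for xi, ui in zip(x, u)) / sxx

print(f"OLS slope:               {b1_hat:.10f}")
print(f"beta1 + error component: {b1_from_errors:.10f}")
```

With real data the second quantity cannot be computed, because the errors are not observed; the identity is a tool for deriving properties of the estimator, not a way of calculating it.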


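Under assumptions RLS.1 to RLS.4, $\hat{\beta}_0$ and $\hat{\beta}_1$ are unbiased estimators of the parameters $\beta_0$ and $\beta_1$, that is

$$E(\hat{\beta}_0) = \beta_0 \qquad \text{and} \qquad E(\hat{\beta}_1) = \beta_1$$

Proof

We are going to show that $\hat{\beta}_1$ is an unbiased estimator of $\beta_1$, that is, $E(\hat{\beta}_1) = \beta_1$.

In this proof, the expectations are conditional on the observed values of the explanatory variable in the sample, that is, they are conditional expectations given $x_1, x_2, \ldots, x_n$. Therefore, conditioning on the observed values of x, all those terms that are a function of $x_1, x_2, \ldots, x_n$ are not random.

Using (19)

$$E(\hat{\beta}_1) = \beta_1 + E\left( \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, u_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \right) = \beta_1 + \frac{1}{\sum_{i=1}^{n} (x_i - \bar{x})^2}\, E\left( \sum_{i=1}^{n} (x_i - \bar{x})\, u_i \right)$$

$$= \beta_1 + \frac{1}{\sum_{i=1}^{n} (x_i - \bar{x})^2} \sum_{i=1}^{n} (x_i - \bar{x})\, E(u_i) = \beta_1$$

where the last equality uses (18).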


Some Comments on Assumptions RLS.1 to RLS.4

Generally, if one of the four assumptions we consider does not hold, then the estimator is not unbiased.

As mentioned before, if assumption RLS.4 fails it is not possible to obtain the OLS estimates.

Assumption RLS.1 requires that the relationship between y and x is linear with an additive error; we have already discussed that we mean linear in parameters, since the variables x and y can be nonlinear transformations of the variables of interest. If assumption RLS.1 fails and the model is nonlinear in parameters, the estimation is more complicated and it is beyond the contents of this course.

Regarding assumption RLS.2, this is suitable for many applications (although not for all of them) when we work with cross-sectional data.

Finally, assumption RLS.3 is a crucial assumption for the unbiasedness of the OLS estimator. If this assumption fails, the estimators are generally biased. In chapter 3, we will see that we can determine the direction and the size of the bias.


Some Comments on Assumptions RLS.1 to RLS.4 (cont.)

In the analysis of simple regression with non-experimental data, there is always the possibility that x is correlated with u. When u contains factors that affect y and that are correlated with x, the result of the OLS estimation can reflect the effect that those factors have on y and not the ceteris paribus relationship between x and y.

Example 5

Suppose we are interested in analysing the effect of a public school lunch programme on school performance.

It is expected that this programme has a positive ceteris paribus effect on school performance, since if a student without the economic resources to pay for the meal benefits from this programme, his or her performance in school should improve.

We have data on 408 secondary schools in the state of Michigan (file MEAP93 from Wooldridge) and for each school we observe the percentage of students that pass a standardised math exam (math10) and the percentage of students that benefit from the school lunch programme (lnchprg).


Some Comments on Assumptions RLS.1 to RLS.4 (cont.)

Example 5 (cont.)

Given these data, the following results have been obtained:

$$\widehat{math10} = 32.14 - 0.319\, lnchprg$$

$$n = 408, \quad R^2 = 0.171$$

The estimated model predicts that if access to the programme increases by 10 percentage points, the percentage of students passing the exam decreases by approximately 3.2 percentage points. Is this result credible?

The answer is NO. It is more likely that this result is due to the error term being correlated with lnchprg. The error term contains other factors (different from access to the school lunch programme) affecting the result of the exam. Among these factors is the socioeconomic level of the students' families, which affects school performance and is obviously correlated with participation in the lunch programme.


Interpretation of the concept unbiasedness of an estimator

Recall that the fact that an estimator is unbiased does not mean that for our particular sample the value of the estimate is close to the true value of the parameter.

The fact that an estimator is unbiased implies that if we had access to many random samples from the population and the value of the estimator was computed for each of them, then, if the number of samples is very large, the sample mean of the estimates would be very close to the true value of the parameter we want to estimate.

Since in practice we only have access to one sample, the unbiasedness property is not very useful unless some other property guarantees that the dispersion of the distribution of the OLS estimator is small.

In addition, a dispersion measure of the distribution of the estimators allows us to choose the best estimator as the one with the lowest dispersion.

As a way of measuring the dispersion we use the variance, or its square root, the standard deviation.
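The "many random samples" interpretation of unbiasedness can be illustrated with a small Monte Carlo experiment. This is only a sketch in plain Python: the parameter values, the fixed design for x (the experiment conditions on the same x values in every sample) and the normal distribution of the errors are all assumptions chosen for illustration.

```python
import random

random.seed(12345)  # fixed seed so the experiment is reproducible

# Hypothetical population parameters chosen for illustration.
beta0, beta1, sigma = 1.0, 0.5, 2.0

# Fixed values of x: the experiment conditions on x_1, ..., x_n.
x = [float(i) for i in range(30)]
n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)


def slope_estimate(y):
    """OLS slope, using sum((x_i - x_bar) * y_i) / sum((x_i - x_bar)^2)."""
    return sum((xi - x_bar) * yi for xi, yi in zip(x, y)) / sxx


reps = 2000  # number of simulated samples
estimates = []
for _ in range(reps):
    # Errors with mean zero given x, so that RLS.3 holds by construction.
    y = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in x]
    estimates.append(slope_estimate(y))

mean_estimate = sum(estimates) / reps

print(f"true slope:                 {beta1}")
print(f"mean of the {reps} estimates: {mean_estimate:.4f}")
print(f"smallest and largest estimate: {min(estimates):.3f}, {max(estimates):.3f}")
```

The average of the estimates is close to the true slope, while individual estimates can be noticeably above or below it, which is why a measure of dispersion is needed as well.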


Variances of the OLS estimators

In this chapter we are going to compute the variance of the OLS estimators under an additional assumption known as the homoskedasticity assumption. This assumption establishes that the variance of the error term u conditional on x is constant, that is, it does not depend on x.

The variance of the OLS estimator can be computed without any additional assumption, that is, using only assumptions RLS.1 to RLS.4. However, the expressions for the variances in the general case are more complicated and they are beyond the scope of this course.

Assumption RLS.5 (homoskedasticity)

$$Var(u \mid x) = \sigma^2$$

When $Var(u \mid x)$ depends on x we say that the errors are heteroskedastic.


It is important to point out that assumption RLS.5 does not play any role in the unbiasedness of $\hat{\beta}_0$ and $\hat{\beta}_1$.

We add assumption RLS.5 to simplify the computation of the variance of the OLS estimators. Additionally, as we will see in Chapter 7, under the additional assumption of homoskedasticity the OLS estimators have some efficiency properties.

Since assumption RLS.3 establishes that $E(u \mid x) = 0$ and since $Var(u \mid x) = E(u^2 \mid x) - (E(u \mid x))^2$, we can write assumption RLS.5 as

$$E(u^2 \mid x) = \sigma^2$$

Assumption RLS.5 can also be written in terms of y, since conditional on x the only random component of y is u:

$$Var(y \mid x) = \sigma^2$$


Example 1

Let's consider again the simple regression model relating the wage of a person to his/her level of education

$$wage = \beta_0 + \beta_1 educ + u$$

In this model the assumption of homoskedasticity is $Var(wage \mid educ) = \sigma^2$, i.e., the variance of wage does not depend on the number of years of education.

This assumption may not be very realistic, since it is likely that individuals with higher levels of education have a wider range of job opportunities, which can lead to a higher variability of wages for high education levels.

On the contrary, individuals with low levels of education have fewer job opportunities and many of them work for the minimum wage, which implies that the variability of wages is small for low levels of education.


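Variance of the sampling distribution of the OLS estimators

Under assumptions RLS.1 to RLS.5

$$Var(\hat{\beta}_1) = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sigma^2}{(n-1) S_x^2}$$

$$Var(\hat{\beta}_0) = \frac{\sigma^2\, \frac{1}{n} \sum_{i=1}^{n} x_i^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sigma^2\, \overline{x^2}}{(n-1) S_x^2}$$

where $S_x^2$ is the sample variance of x, $\overline{x^2}$ is the sample mean of the $x_i^2$, and the variances are conditional on the observed values of the explanatory variable in the sample, i.e. they are conditional variances given $x_1, x_2, \ldots, x_n$.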


Proof

We derive the formula for $Var(\hat{\beta}_1)$.

Recall the expression of $\hat{\beta}_1$ as a function of the errors of the model in equation (19)

$$\hat{\beta}_1 = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar{x})\, u_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

The variance we have to compute is conditional on the $x_i$; therefore, $(x_i - \bar{x})$, $i = 1, 2, \ldots, n$, and $\sum_{i=1}^{n} (x_i - \bar{x})^2$ are not random, and $\beta_1$ is not random either.


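Proof (cont.)

Additionally, using assumption RLS.2, the errors $u_i$ are independent and therefore, using the following properties of the variance:

The variance of the sum of independent random variables is the sum of the variances.

The variance of a constant times a random variable equals the squared constant times the variance of the random variable.

The variance of the sum of a random variable and a constant is the variance of the random variable.

we have that

$$Var(\hat{\beta}_1) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2\, Var(u_i)}{\left( \sum_{i=1}^{n} (x_i - \bar{x})^2 \right)^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2\, \sigma^2}{\left( \sum_{i=1}^{n} (x_i - \bar{x})^2 \right)^2} \qquad \text{(using RLS.5)}$$

$$= \frac{\sigma^2 \sum_{i=1}^{n} (x_i - \bar{x})^2}{\left( \sum_{i=1}^{n} (x_i - \bar{x})^2 \right)^2} = \frac{\sigma^2}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\sigma^2}{(n-1) S_x^2}$$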


According to the expression that we have obtained for the variance of $\hat{\beta}_1$ we have that:

The higher the variance of the error term, $\sigma^2$, the higher the variance of $\hat{\beta}_1$: if the variance of the unobservables affecting y is very large, it is very difficult to estimate $\beta_1$ precisely.

The higher the variance of $x_i$, the smaller the variance of $\hat{\beta}_1$: if $x_i$ has a low dispersion, it is very difficult to estimate $\beta_1$ precisely.

The larger the sample size, the smaller the variance of $\hat{\beta}_1$.
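The variance formula for the slope estimator can be compared with the dispersion of the estimates across simulated samples. The following sketch uses plain Python; the parameter values, the fixed design for x and the normal, homoskedastic errors are assumptions chosen for illustration.

```python
import random
import statistics

random.seed(2024)  # fixed seed so the experiment is reproducible

# Hypothetical population parameters chosen for illustration.
beta0, beta1, sigma = 1.0, 0.5, 2.0

# Fixed values of x: the variance is conditional on x_1, ..., x_n.
x = [float(i) for i in range(30)]
n = len(x)
x_bar = sum(x) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)

# Variance of the slope estimator under RLS.1 to RLS.5.
theoretical_var = sigma ** 2 / sxx

reps = 4000  # number of simulated samples
estimates = []
for _ in range(reps):
    # Homoskedastic errors: same variance sigma^2 for every value of x.
    y = [beta0 + beta1 * xi + random.gauss(0.0, sigma) for xi in x]
    b1_hat = sum((xi - x_bar) * yi for xi, yi in zip(x, y)) / sxx
    estimates.append(b1_hat)

# Sample variance of the simulated slope estimates.
empirical_var = statistics.variance(estimates)

print(f"sigma^2 / sum((x_i - x_bar)^2): {theoretical_var:.6f}")
print(f"variance across simulations:    {empirical_var:.6f}")
```

Changing `sigma`, the spread of `x` or the number of observations in the sketch moves both figures in the directions described above.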


Estimation of the variance of the error term

The variances of $\hat\beta_0$ and $\hat\beta_1$ depend on the sample values of $x_i$, which are observable, and on the variance of the error term, $\sigma^2$, which is an unknown parameter. Therefore, in order to estimate the variances of $\hat\beta_0$ and $\hat\beta_1$ we have to obtain an estimator of $\sigma^2$.

Since $\sigma^2$ is the variance of the error term $u$, which as we saw above equals the expectation of $u^2$ (given that the mean of $u$ is zero by assumption RLS.3), we could think of using the sample mean of the squared errors

$$w = \frac{1}{n}\sum_{i=1}^{n} u_i^2$$

as an estimator of $\sigma^2$.

If we could compute $w$ as a function of the sample, $w$ would be an unbiased estimator of $\sigma^2$ since

$$E\left(\frac{1}{n}\sum_{i=1}^{n} u_i^2\right) = \frac{1}{n}\sum_{i=1}^{n} E(u_i^2) = \sigma^2$$


The problem is that $w$ is not an estimator: it cannot be computed as a function of the sample because the errors are not observable.

What we can compute as a function of the sample are the residuals $\hat u_i$. In what follows, we see that the residuals are estimates of the errors and how to obtain an unbiased estimator of $\sigma^2$ as a function of the squared residuals.

Recall that the residual of observation $i$ is defined as

$$\hat u_i = y_i - \hat y_i = y_i - \hat\beta_0 - \hat\beta_1 x_i$$

and since the error of observation $i$ is

$$u_i = y_i - \beta_0 - \beta_1 x_i$$

we can think of the residuals as estimates of the errors.

In this way, we can define the following estimator of $\sigma^2$:

$$\hat w = \frac{1}{n}\sum_{i=1}^{n} \hat u_i^2$$
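The sense in which residuals estimate the errors can be made explicit by substituting the model $y_i = \beta_0 + \beta_1 x_i + u_i$ into the definition of the residual. This short worked step is an addition for illustration and uses only the two definitions above:

```latex
\hat u_i = y_i - \hat\beta_0 - \hat\beta_1 x_i
         = (\beta_0 + \beta_1 x_i + u_i) - \hat\beta_0 - \hat\beta_1 x_i
         = u_i - (\hat\beta_0 - \beta_0) - (\hat\beta_1 - \beta_1)\,x_i
```

The residual therefore differs from the error only through the estimation errors of the two coefficients.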


$\hat w$ is an estimator of $\sigma^2$, but it is not unbiased. The reason is that, unlike the errors, which are independent, the residuals are not independent, since they satisfy the two linear restrictions seen in Section 4 (equations (12) and (13)).

Therefore, since the $n$ residuals satisfy two linear restrictions, the residuals have $n - 2$ degrees of freedom and the unbiased estimator of $\sigma^2$ is

$$\hat\sigma^2 = \frac{1}{n-2}\sum_{i=1}^{n} \hat u_i^2$$

(proof on page 62 of Wooldridge)
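The difference between dividing by $n$ and by $n - 2$ can be seen in a small simulation. This is a sketch under assumed values (a hypothetical true model $y = 1 + 0.5x + u$ with normal errors and $\sigma^2 = 4$), not a proof. Since the expected sum of squared residuals is $(n-2)\sigma^2$, the average of $\hat w$ should be close to $\sigma^2 (n-2)/n$ and the average of $\hat\sigma^2$ close to $\sigma^2$.

```python
import random


def ssr_from_ols(xs, ys):
    """Fit y = b0 + b1*x by OLS and return the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    b0 = y_bar - b1 * x_bar
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))


def average_estimators(xs, sigma, reps, seed):
    """Average SSR/n and SSR/(n-2) over `reps` simulated samples, xs held fixed."""
    rng = random.Random(seed)
    n = len(xs)
    total_ssr = 0.0
    for _ in range(reps):
        # Hypothetical true model: y = 1 + 0.5*x + u, with u ~ N(0, sigma^2).
        ys = [1.0 + 0.5 * x + rng.gauss(0.0, sigma) for x in xs]
        total_ssr += ssr_from_ols(xs, ys)
    mean_ssr = total_ssr / reps
    return mean_ssr / n, mean_ssr / (n - 2)


xs = [float(i) for i in range(1, 11)]   # n = 10 fixed regressor values
w_hat_mean, sigma2_hat_mean = average_estimators(xs, sigma=2.0, reps=20000, seed=1)
print(f"average of SSR/n:     {w_hat_mean:.3f}  (target sigma^2 * (n-2)/n = 3.2)")
print(f"average of SSR/(n-2): {sigma2_hat_mean:.3f}  (target sigma^2 = 4.0)")
```

The bias of $\hat w$ is most visible in small samples; as $n$ grows, $(n-2)/n$ approaches one and the two estimators become nearly identical.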

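Using this estimator of $\sigma^2$, the estimated variances of $\hat\beta_1$ and $\hat\beta_0$ are defined as follows:

$$\widehat{\operatorname{Var}}(\hat\beta_1) = \frac{\hat\sigma^2}{(n-1)S_x^2} \quad\text{and}\quad \widehat{\operatorname{Var}}(\hat\beta_0) = \frac{\hat\sigma^2\,\overline{x^2}}{(n-1)S_x^2}$$

where $\overline{x^2} = \frac{1}{n}\sum_{i=1}^{n} x_i^2$ is the sample mean of the squared $x_i$.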


The Standard Error of Regression (SER) is defined as

$$\hat\sigma = \sqrt{\hat\sigma^2}$$

$\hat\sigma$ is an estimator of the standard deviation of the error term, $\sigma$. Although $\hat\sigma$ is not an unbiased estimator of $\sigma$, we see below that it has other good properties when the sample is large.

The standard error of $\hat\beta_1$, denoted by $\operatorname{se}(\hat\beta_1)$, is defined as

$$\operatorname{se}(\hat\beta_1) = \frac{\hat\sigma}{\sqrt{(n-1)S_x^2}}$$

$\operatorname{se}(\hat\beta_1)$ is an estimator of the standard deviation of $\hat\beta_1$ and therefore a measure of the precision of $\hat\beta_1$.


Analogously, the standard error of $\hat\beta_0$, denoted by $\operatorname{se}(\hat\beta_0)$, is defined as

$$\operatorname{se}(\hat\beta_0) = \frac{\hat\sigma\sqrt{\overline{x^2}}}{\sqrt{(n-1)S_x^2}}$$

$\operatorname{se}(\hat\beta_0)$ is an estimator of the standard deviation of $\hat\beta_0$ and therefore a measure of the dispersion of $\hat\beta_0$.

$\operatorname{se}(\hat\beta_1)$ is a random variable since, given the values of $x_i$, it takes different values for different samples of $y$. For a given sample, $\operatorname{se}(\hat\beta_1)$ is a number, just as $\hat\beta_1$ is when computed from a particular sample. The same holds for $\operatorname{se}(\hat\beta_0)$.

The standard errors play a very important role in inference, that is, when testing restrictions on the parameters of the model or when computing confidence intervals.
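The definitions above can be put together in a few lines. The following sketch computes the OLS estimates, $\hat\sigma^2$, the SER and both standard errors for a small made-up data set; the function name `simple_ols_with_se` and the data are hypothetical and only meant to illustrate the formulas.

```python
import math


def simple_ols_with_se(xs, ys):
    """OLS for y = b0 + b1*x with the standard errors defined above."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sst_x = sum((x - x_bar) ** 2 for x in xs)      # equals (n - 1) * S_x^2
    b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sst_x
    b0 = y_bar - b1 * x_bar
    residuals = [y - b0 - b1 * x for x, y in zip(xs, ys)]
    sigma2_hat = sum(r ** 2 for r in residuals) / (n - 2)   # unbiased estimator
    mean_x_sq = sum(x ** 2 for x in xs) / n                 # sample mean of x_i^2
    return {
        "b0": b0,
        "b1": b1,
        "sigma2_hat": sigma2_hat,
        "ser": math.sqrt(sigma2_hat),
        "se_b1": math.sqrt(sigma2_hat / sst_x),
        "se_b0": math.sqrt(sigma2_hat * mean_x_sq / sst_x),
    }


# Made-up data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.0, 3.0, 2.0, 5.0, 4.0]
result = simple_ols_with_se(xs, ys)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

For this toy sample the estimates are $\hat\beta_0 = 0.6$ and $\hat\beta_1 = 0.8$, with $\hat\sigma^2 = 1.2$, $\operatorname{se}(\hat\beta_1) \approx 0.346$ and $\operatorname{se}(\hat\beta_0) \approx 1.149$. A different sample of $y$ with the same $x$ values would give different numbers, which is the sense in which the standard errors are random variables.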


Example 1 (cont.) Using the data of Example 1, the following model has been estimated

$$\log(wage) = \beta_0 + \beta_1\, educ + u$$

and the standard errors have been computed.

The results of the estimation, including the standard errors, are usually presented as follows:

$$\widehat{\log(wage)} = \underset{(0.097)}{0.584} + \underset{(0.0076)}{0.0827}\; educ$$

$$n = 526, \qquad R^2 = 0.186$$
