C22: The Method of Least Squares
CIS 2033, based on Dekking et al., A Modern Introduction to Probability and Statistics, 2007
Instructor Longin Jan Latecki
C22: The Method of Least Squares
22.1 – Least Squares

We are given a bivariate dataset (x₁, y₁), …, (xₙ, yₙ), where x₁, …, xₙ are nonrandom and Yᵢ = α + βxᵢ + Uᵢ are random variables for i = 1, 2, …, n. The random variables U₁, U₂, …, Uₙ have zero expectation and variance σ².
Method of Least Squares: choose values for α and β such that

S(α, β) = ∑ᵢ₌₁ⁿ (yᵢ − α − βxᵢ)²

is minimal.
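As a sketch, the criterion S(α, β) can be evaluated directly for candidate parameter values; the helper name and the toy dataset below are illustrative assumptions, not from the slides:

```python
# Evaluate the least squares criterion S(alpha, beta) for a dataset.
# The helper name and the toy data are illustrative assumptions.
def S(alpha, beta, xs, ys):
    """Sum of squared vertical deviations from the line y = alpha + beta*x."""
    return sum((y - alpha - beta * x) ** 2 for x, y in zip(xs, ys))

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
exact = S(0.0, 2.0, xs, ys)  # the line y = 2x fits these points exactly, so S = 0
worse = S(1.0, 2.0, xs, ys)  # shifting the intercept away from the fit increases S
```

The least squares method picks the (α, β) pair at which this function is smallest.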
22.1 – Regression

The observed value yᵢ corresponding to xᵢ, and the value α + βxᵢ on the regression line y = α + βx.
22.1 – Estimation

After some calculus magic, we get two equations to estimate α and β. To find the least squares estimates, we differentiate S(α, β) = ∑ᵢ₌₁ⁿ (yᵢ − α − βxᵢ)² with respect to α and β, and we set the derivatives equal to 0:

n·α + β ∑ᵢ₌₁ⁿ xᵢ = ∑ᵢ₌₁ⁿ yᵢ

α ∑ᵢ₌₁ⁿ xᵢ + β ∑ᵢ₌₁ⁿ xᵢ² = ∑ᵢ₌₁ⁿ xᵢyᵢ
22.1 – Estimation

After some simple algebraic rearranging, we obtain:

β̂ = (n ∑ᵢ₌₁ⁿ xᵢyᵢ − (∑ᵢ₌₁ⁿ xᵢ)(∑ᵢ₌₁ⁿ yᵢ)) / (n ∑ᵢ₌₁ⁿ xᵢ² − (∑ᵢ₌₁ⁿ xᵢ)²)   (slope)

α̂ = ȳₙ − β̂ x̄ₙ   (intercept)
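These closed-form estimates translate directly into code; a minimal sketch, assuming plain Python lists as input (the function name `least_squares` and the example data are illustrative):

```python
# Closed-form least squares estimates for simple linear regression.
# The function name and example data are illustrative assumptions.
def least_squares(xs, ys):
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    beta = (n * sxy - sx * sy) / (n * sxx - sx ** 2)  # slope
    alpha = sy / n - beta * sx / n                    # intercept: y-bar minus beta * x-bar
    return alpha, beta

# Points lying exactly on y = 1 + 2x should be recovered exactly.
alpha_hat, beta_hat = least_squares([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```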
Regression line y = 0.25x − 2.35 for the data points.
![Page 8: C22: The Method of Least Squares](https://reader035.vdocuments.site/reader035/viewer/2022062217/5681329b550346895d99376d/html5/thumbnails/8.jpg)
Recall that Var(X) = E[X²] − (E[X])².
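This identity can be checked numerically on a small discrete distribution; the support values and probabilities below are made up for illustration:

```python
# Check Var(X) = E[X^2] - (E[X])^2 for a small discrete distribution.
# The support and probabilities here are made-up illustrative values.
values = [1.0, 2.0, 3.0]
probs = [0.2, 0.5, 0.3]

ex = sum(p * x for x, p in zip(values, probs))        # E[X]
ex2 = sum(p * x * x for x, p in zip(values, probs))   # E[X^2]
var_direct = sum(p * (x - ex) ** 2 for x, p in zip(values, probs))
var_identity = ex2 - ex ** 2
```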
![Page 9: C22: The Method of Least Squares](https://reader035.vdocuments.site/reader035/viewer/2022062217/5681329b550346895d99376d/html5/thumbnails/9.jpg)
22.1 – Least Squares Estimators are Unbiased

The estimators for α and β are unbiased. For the simple linear regression model, the random variable

σ̂² = 1/(n − 2) · ∑ᵢ₌₁ⁿ (Yᵢ − α̂ − β̂xᵢ)²

is an unbiased estimator for σ².
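A sketch of this estimator, assuming the fitted values α̂ and β̂ are already available (the function name, data, and fitted parameters below are illustrative):

```python
# Unbiased estimator of sigma^2: residual sum of squares over n - 2.
# The function name, data, and fitted parameters are illustrative.
def sigma2_hat(xs, ys, alpha_hat, beta_hat):
    n = len(xs)
    rss = sum((y - alpha_hat - beta_hat * x) ** 2 for x, y in zip(xs, ys))
    return rss / (n - 2)  # n - 2: two parameters (alpha, beta) were estimated

# Residuals here are [0, 0, -1, 1], so rss = 2 and sigma2_hat = 2 / (4 - 2) = 1.
estimate = sigma2_hat([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 4.0, 8.0], 1.0, 2.0)
```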
22.2 – Residuals

A way to explore whether the linear regression model is appropriate for a given bivariate dataset is to inspect a scatter plot of the so-called residuals rᵢ against the xᵢ. The ith residual rᵢ is defined as the vertical distance between the ith point and the estimated regression line:

rᵢ = yᵢ − α̂ − β̂xᵢ

We always have ∑ᵢ₌₁ⁿ rᵢ = 0.
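The zero-sum property of the residuals can be verified numerically; the fitting helper and the toy data below are illustrative assumptions:

```python
# Fit by least squares, then check that the residuals sum to (nearly) zero.
# The helper name and the data are illustrative assumptions.
def fit(xs, ys):
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    beta = (n * sxy - sx * sy) / (n * sxx - sx ** 2)
    return sy / n - beta * sx / n, beta

xs = [1.0, 2.0, 3.0, 4.0]
ys = [1.5, 3.9, 5.1, 8.2]
alpha_hat, beta_hat = fit(xs, ys)
residuals = [y - alpha_hat - beta_hat * x for x, y in zip(xs, ys)]
total = sum(residuals)  # zero up to floating point rounding
```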
22.2 – Heteroscedasticity

Homoscedasticity: the assumption of equal variance of the Uᵢ (and therefore of the Yᵢ). In case the variance of Yᵢ depends on the value of xᵢ, we speak of heteroscedasticity. For instance, heteroscedasticity occurs when Yᵢ with a large expected value have a larger variance than those with small expected values. This produces a "fanning out" effect, which can be observed in the figure.
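The fanning-out pattern can be simulated by letting the noise standard deviation grow with xᵢ; the model constants, sample size, and seed below are arbitrary illustrative choices:

```python
import random

# Simulate heteroscedastic data: the noise standard deviation grows with x,
# producing the "fanning out" effect. All constants are arbitrary choices.
random.seed(0)  # fixed seed for reproducibility
xs = [i / 10 for i in range(1, 101)]
ys = [2.0 + 0.5 * x + random.gauss(0.0, 0.3 * x) for x in xs]

def mean_square(vals):
    return sum(v * v for v in vals) / len(vals)

# Deviations from the true line, for small x versus large x:
dev_small = mean_square([y - 2.0 - 0.5 * x for x, y in zip(xs[:50], ys[:50])])
dev_large = mean_square([y - 2.0 - 0.5 * x for x, y in zip(xs[50:], ys[50:])])
# dev_large should be noticeably bigger: the spread grows with x
```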
22.3 – Relation with Maximum Likelihood

What are the maximum likelihood estimates for α and β? To apply the method of least squares, no assumption is needed about the type of distribution of the Uᵢ. In case the type of distribution of the Uᵢ is known, the maximum likelihood principle can be applied. In particular, when the Uᵢ are independent with an N(0, σ²) distribution, Yᵢ has an N(α + βxᵢ, σ²) distribution, with probability density function

f(y) = 1/(σ√(2π)) · e^(−(y − α − βxᵢ)² / (2σ²))
When the Yᵢ are independent, each Yᵢ has an N(α + βxᵢ, σ²) distribution, and the linear model is appropriate for a given bivariate dataset, the residuals rᵢ should look like the realization of a random sample from a normal distribution. An example is shown in the figure.
22.3 – Maximum Likelihood

For fixed σ > 0, the loglikelihood ℓ(α, β, σ) attains its maximum when

∑ᵢ₌₁ⁿ (yᵢ − α − βxᵢ)²

is minimal. Hence, when the random variables Uᵢ are independent with an N(0, σ²) distribution, the maximum likelihood principle and the least squares method yield the same estimators for α and β.

The maximum likelihood estimator for σ² is:

σ̂² = 1/n · ∑ᵢ₌₁ⁿ (Yᵢ − α̂ − β̂xᵢ)²
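The ML estimator divides the residual sum of squares by n, while the unbiased estimator from 22.1 divides by n − 2, so the two differ by the factor (n − 2)/n. A quick sketch (function name and data are illustrative):

```python
# Compare the ML variance estimator (divide by n) with the unbiased
# estimator (divide by n - 2). Function name and data are illustrative.
def variance_estimates(xs, ys, alpha_hat, beta_hat):
    n = len(xs)
    rss = sum((y - alpha_hat - beta_hat * x) ** 2 for x, y in zip(xs, ys))
    return rss / n, rss / (n - 2)  # (maximum likelihood, unbiased)

ml, unbiased = variance_estimates([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 4.0, 8.0], 1.0, 2.0)
# With n = 4, the ML estimate is (4 - 2)/4 = 1/2 of the unbiased one.
```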