simple linear regression
![Page 1: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/1.jpg)
SIMPLE LINEAR REGRESSION
Reporters: Atty. Gener R. Gayam, CPA; Agapito “Pete” M. Cagampang, PM; Raymond B. Cabling, MD
![Page 2: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/2.jpg)
A. The Scatter Diagram
In solving problems that concern estimation and forecasting, a scatter diagram can be used as a graphical approach. This technique consists of plotting the points corresponding to the paired scores of the independent and dependent variables, commonly represented by X and Y respectively, on the X–Y coordinate system.
Below is an illustration of a scatter diagram using the data in Table 6.1. This table shows the years of working experience and the income of eight employees in a big industrial corporation.
![Page 3: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/3.jpg)
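Table 6.1 – Working Experience and Income of Eight Employees

| Employee | Years of Working Experience (X) | Income in Thousands of Pesos (Y) |
|---|---|---|
| A | 2 | 8 |
| B | 8 | 10 |
| C | 4 | 11 |
| D | 11 | 13 |
| E | 5 | 9 |
| F | 13 | 17 |
| G | 4 | 8 |
| H | 15 | 14 |

ΣX = 62, ΣY = 90, X̄ = 7.75, Ȳ = 11.25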
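The totals and means under Table 6.1 can be checked with a few lines of Python. This is only a sketch for verification; the variable names (`table_6_1`, `xs`, `ys`) are illustrative and not part of the original slides.

```python
# Table 6.1 rows: (employee, years of working experience X, income in thousand pesos Y)
table_6_1 = [
    ("A", 2, 8), ("B", 8, 10), ("C", 4, 11), ("D", 11, 13),
    ("E", 5, 9), ("F", 13, 17), ("G", 4, 8), ("H", 15, 14),
]

xs = [x for _, x, _ in table_6_1]
ys = [y for _, _, y in table_6_1]

# Column totals and means, as reported under the table
sum_x, sum_y = sum(xs), sum(ys)
mean_x, mean_y = sum_x / len(xs), sum_y / len(ys)

print(f"sum X = {sum_x}, sum Y = {sum_y}")      # sum X = 62, sum Y = 90
print(f"mean X = {mean_x}, mean Y = {mean_y}")  # mean X = 7.75, mean Y = 11.25
```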
![Page 4: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/4.jpg)
Figure 6.1 – A Scatter Diagram for Table 6.1 Data (horizontal axis: X; vertical axis: Y)
![Page 5: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/5.jpg)
To roughly predict the value of the dependent variable, income, from the independent variable, years of working experience, your next step is to draw a trend line. This is a line passing through the series of points such that the total vertical distance of the points below the line is more or less equal to the total vertical distance of the points above it. If this requirement is satisfied, you have drawn a correct trend line. The illustration is shown in Figure 6.2.
![Page 6: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/6.jpg)
Figure 6.2 – A trend line drawn on the linear direction between working experience and income of eight employees (horizontal axis: X; vertical axis: Y)
![Page 7: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/7.jpg)
Using the trend line drawn in Figure 6.2 above, the value estimated for Y when X is 16 is 18. Remember that if a straight line appears to describe the relationship, the algebraic approach called the regression formula can be used, as explained in the next topic.
![Page 8: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/8.jpg)
B. The Least Squares Linear Regression Equation
The least squares linear regression equation can be understood through this formula known from algebra:

$$Y = a + bX$$

For instance, the least squares line Y = a + bX for the data in Figure 6.1 is the line that gives the smallest sum of the squares of the vertical distances of the points from the line.
To solve the regression equation, you first need to solve for the slope b and then for the intercept a:

$$b = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} \qquad \text{and} \qquad a = \bar{Y} - b\bar{X}$$
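The two formulas translate directly into code. The sketch below is one possible implementation; the function name `least_squares_line` and the small data set (four points lying exactly on Y = 1 + 2X) are hypothetical and chosen only so the result is easy to check by hand.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the least squares line Y = a + bX."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: sum of cross-products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: sum of squared deviations of X
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all X values are equal, so the slope is undefined")
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data lying exactly on Y = 1 + 2X
a, b = least_squares_line([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)  # 1.0 2.0
```

Computing b first and then a mirrors the order of the formulas, since the intercept depends on the slope.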
![Page 9: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/9.jpg)
Example: Solve the least squares regression line for the data scores in Table 6.1.

| Employee | X | Y | (X − X̄) | (Y − Ȳ) | (X − X̄)(Y − Ȳ) | (X − X̄)² |
|---|---|---|---|---|---|---|
| A | 2 | 8 | -5.75 | -3.25 | 18.6875 | 33.0625 |
| B | 8 | 10 | 0.25 | -1.25 | -0.3125 | 0.0625 |
| C | 4 | 11 | -3.75 | -0.25 | 0.9375 | 14.0625 |
| D | 11 | 13 | 3.25 | 1.75 | 5.6875 | 10.5625 |
| E | 5 | 9 | -2.75 | -2.25 | 6.1875 | 7.5625 |
| F | 13 | 17 | 5.25 | 5.75 | 30.1875 | 27.5625 |
| G | 4 | 8 | -3.75 | -3.25 | 12.1875 | 14.0625 |
| H | 15 | 14 | 7.25 | 2.75 | 19.9375 | 52.5625 |
| Σ | 62 | 90 | | | 93.8125 − 0.3125 = 93.50 | 159.50 |

ΣX = 62, ΣY = 90
X̄ = 7.75, Ȳ = 11.25
![Page 10: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/10.jpg)
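Solution:

$$b = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\sum (X_i - \bar{X})^2} = \frac{93.50}{159.50} = 0.5862068 \approx 0.59 \quad \text{(Answer)}$$

$$a = \bar{Y} - b\bar{X} = 11.25 - 0.59(7.75) = 11.25 - 4.57 = 6.68 \quad \text{(Answer)}$$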
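The hand computation can be reproduced as a check. The sketch below follows the slide's procedure of rounding b to two decimals before computing a; the names `b_exact` and `a_exact` are illustrative and show what the coefficients would be if no intermediate rounding were applied.

```python
# Table 6.1 data
xs = [2, 8, 4, 11, 5, 13, 4, 15]   # years of working experience
ys = [8, 10, 11, 13, 9, 17, 8, 14]  # income in thousand pesos

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Column totals of the deviation table
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)

# Slide procedure: round b first, then use the rounded b to get a
b_exact = sxy / sxx
b = round(b_exact, 2)
a = round(mean_y - b * mean_x, 2)

# Same formulas without intermediate rounding (for comparison only)
a_exact = mean_y - b_exact * mean_x

print(sxy, sxx)  # 93.5 159.5
print(b, a)      # 0.59 6.68
print(round(b_exact, 4), round(a_exact, 4))  # 0.5862 6.7069
```

The intercept comes out near 6.71 when the unrounded slope is carried through, so the 6.68 in the solution reflects the rounding of b to 0.59.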
![Page 11: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/11.jpg)
After solving for the values of b and a, the regression equation obtained from Table 6.1 is:
Y = 6.68 + .59 X
Now, letting X = 16, what is Y?
Solution:
Y = 6.68 + .59 (16)
= 6.68 + 9.44
= 16.12
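As a small sketch, the prediction step can be wrapped in a function. The name `predict_income` is hypothetical; the default coefficients are the rounded values from the solution above.

```python
def predict_income(years, a=6.68, b=0.59):
    """Estimate income (thousand pesos) from years of experience via Y = a + bX."""
    return a + b * years


estimate = predict_income(16)
print(round(estimate, 2))  # 16.12
```

The regression estimate of 16.12 is lower than the value of 18 read from the hand-drawn trend line, which illustrates how much a graphical estimate can differ from the computed one.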
![Page 12: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/12.jpg)
C. The Standard Error of Estimate
Now, we are interested in the distance of the Y values from Ŷ, the corresponding ordinate of the regression line. Here, we base our measure of dispersion or variation around the regression line on the squared distance (Yi − Ŷ)². This can be understood through the standard error of estimate formula given below.

$$S_e = \sqrt{\frac{\sum (Y_i - \hat{Y})^2}{n - 2}}$$
![Page 13: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/13.jpg)
However, this formula entails a very tedious process of computing the standard error of estimate, so the formula by Basil P. Korin (1977), which is easier to solve, is suggested as follows:

$$S_e = \sqrt{\frac{\sum Y_i^2 - a\sum Y_i - b\sum X_i Y_i}{n - 2}}$$

Note: The symbols a and b stand for the intercept and the slope of the regression line.
![Page 14: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/14.jpg)
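Example: Solve the standard error of estimate for the regression line which was derived from the data in Table 6.1.

$$S_e = \sqrt{\frac{\sum (Y_i - \hat{Y})^2}{n - 2}}$$

| Y | X | Ŷ | (Y − Ŷ) | (Y − Ŷ)² |
|---|---|---|---|---|
| 8 | 2 | 7.86 | 0.14 | 0.0196 |
| 10 | 8 | 11.4 | -1.4 | 1.96 |
| 11 | 4 | 9.04 | 1.96 | 3.8416 |
| 13 | 11 | 13.17 | -0.17 | 0.0289 |
| 9 | 5 | 9.63 | -0.63 | 0.3969 |
| 17 | 13 | 14.35 | 2.65 | 7.0225 |
| 8 | 4 | 9.04 | -1.04 | 1.0816 |
| 14 | 15 | 15.53 | -1.53 | 2.3409 |
| | | | Σ | 16.692 |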
![Page 15: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/15.jpg)
Step 1 – Compute the value of Ŷ at each of the X values.
Example: Ŷ = 6.68 + .59 (2) = 6.68 + 1.18 = 7.86
Do the rest by following the same procedure.
Step 2 – Get the difference (Yi ‒ Ŷ).
Example: 8 – 7.86 = .14
Step 3 – Square all the differences (Yi ‒ Ŷ).
Example: (.14)² = .0196
![Page 16: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/16.jpg)
Step 4 – Apply the formula.

$$S_e = \sqrt{\frac{\sum (Y_i - \hat{Y})^2}{n - 2}} = \sqrt{\frac{16.692}{8 - 2}} = \sqrt{\frac{16.692}{6}} = \sqrt{2.782} = 1.67$$
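Steps 1 to 4 can be sketched in Python as follows. The code assumes the rounded coefficients a = 6.68 and b = 0.59 and rounds each predicted value to two decimals, as the worked table does; the list names are illustrative.

```python
import math

# Table 6.1 data
xs = [2, 8, 4, 11, 5, 13, 4, 15]
ys = [8, 10, 11, 13, 9, 17, 8, 14]
a, b = 6.68, 0.59  # rounded intercept and slope from the regression example

# Step 1: predicted value at each X (rounded to 2 decimals, as in the table)
predicted = [round(a + b * x, 2) for x in xs]
# Step 2: difference between observed and predicted values
residuals = [round(y - y_hat, 2) for y, y_hat in zip(ys, predicted)]
# Step 3: square the differences and add them up
sse = sum(r ** 2 for r in residuals)
# Step 4: divide by n - 2 and take the square root
se = math.sqrt(sse / (len(xs) - 2))

print(predicted)      # [7.86, 11.4, 9.04, 13.17, 9.63, 14.35, 9.04, 15.53]
print(round(sse, 3))  # 16.692
print(round(se, 2))   # 1.67
```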
![Page 17: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/17.jpg)
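Solution 2:

$$S_e = \sqrt{\frac{\sum Y_i^2 - a\sum Y_i - b\sum X_i Y_i}{n - 2}}$$

| Xi | Yi | Yi² | XiYi |
|---|---|---|---|
| 2 | 8 | 64 | 16 |
| 8 | 10 | 100 | 80 |
| 4 | 11 | 121 | 44 |
| 11 | 13 | 169 | 143 |
| 5 | 9 | 81 | 45 |
| 13 | 17 | 289 | 221 |
| 4 | 8 | 64 | 32 |
| 15 | 14 | 196 | 210 |

ΣY = 90, ΣY² = 1084, ΣXY = 791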
![Page 18: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/18.jpg)
Step 1 – Square each Yi.
Example: 8² = 64
Step 2 – Multiply Xi by Yi.
Example: 2 × 8 = 16
Step 3 – Get the sums of Yi² and XiYi.
Step 4 – Apply the formula.

$$S_e = \sqrt{\frac{1084 - 6.68(90) - 0.59(791)}{8 - 2}} = \sqrt{\frac{1084 - 601.2 - 466.69}{6}} = \sqrt{\frac{1084 - 1067.89}{6}} = \sqrt{\frac{16.11}{6}} = \sqrt{2.685} = 1.64$$
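The shortcut computation can be sketched the same way. The helper name `shortcut_se` is hypothetical. The second call is an addition for comparison only: it passes the unrounded coefficients, which suggests that the gap between 1.67 (first solution) and 1.64 (shortcut) comes from rounding a and b rather than from the formulas themselves.

```python
import math

# Table 6.1 data
xs = [2, 8, 4, 11, 5, 13, 4, 15]
ys = [8, 10, 11, 13, 9, 17, 8, 14]
n = len(xs)

# Steps 1 to 3: column totals
sum_y = sum(ys)
sum_y2 = sum(y * y for y in ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))


def shortcut_se(a, b):
    """Step 4: standard error of estimate from the totals and the line Y = a + bX."""
    return math.sqrt((sum_y2 - a * sum_y - b * sum_xy) / (n - 2))


# With the rounded coefficients used on the slides
se_rounded = shortcut_se(6.68, 0.59)

# With unrounded least squares coefficients (comparison only)
mean_x, mean_y = sum(xs) / n, sum_y / n
b_exact = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
           / sum((x - mean_x) ** 2 for x in xs))
a_exact = mean_y - b_exact * mean_x
se_exact = shortcut_se(a_exact, b_exact)

print(sum_y, sum_y2, sum_xy)  # 90 1084 791
print(round(se_rounded, 2))   # 1.64
print(round(se_exact, 2))     # 1.67
```

Because the shortcut subtracts large, nearly equal quantities, small rounding in a and b shifts the result noticeably; carrying more decimals in the coefficients brings both solutions into agreement.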
![Page 19: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/19.jpg)
The standard error of estimate is interpreted like a standard deviation. For example, if we measure three standard errors vertically above and below the regression line, we will find that, for the same value of X, practically all observed values of Y fall between the upper and lower 3Se limits.
In the example above, the standard error of estimate is 1.64, so three standard errors give (3)(1.64) = 4.92 units above and below the regression line. These bounds of 4.92 units above and below the regression line are expected to contain practically all observations taken for that particular sample. If you draw two parallel lines, each of them lying one Se from the regression line, you can expect about two thirds of the observations to fall between these bounds. See Figure 6.3 for the illustration of the data in Table 6.1.
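The bounds can be checked against the eight observations. This sketch assumes the rounded line Y = 6.68 + .59 X and Se = 1.64 from the examples above; the counter names are illustrative.

```python
# Table 6.1 data
xs = [2, 8, 4, 11, 5, 13, 4, 15]
ys = [8, 10, 11, 13, 9, 17, 8, 14]
a, b, se = 6.68, 0.59, 1.64  # rounded values from the worked examples

# Vertical distance of each observation from the regression line
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Count observations inside the one-Se and three-Se bands
within_1se = sum(abs(r) <= se for r in residuals)
within_3se = sum(abs(r) <= 3 * se for r in residuals)

print(f"within 1 Se: {within_1se} of {len(xs)}")  # within 1 Se: 6 of 8
print(f"within 3 Se: {within_3se} of {len(xs)}")  # within 3 Se: 8 of 8
```

In this small sample, six of the eight observations (75%) lie within one Se of the line, roughly in line with the two-thirds expectation, and all eight lie within the 4.92-unit bounds.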
![Page 20: Simple linear regression](https://reader038.vdocuments.site/reader038/viewer/2022103116/558727f9d8b42ae7138b45d1/html5/thumbnails/20.jpg)
Figure 6.3 – A Regression Line with One Standard Error Distance (horizontal axis: X; vertical axis: Y; regression line: Y = 6.68 + .59 X)