Ch11 Curve Fitting
Dr. Deshi Ye
yedeshi@zju.edu.cn
Outline
The method of Least Squares
Inferences based on the Least Squares Estimators
Curvilinear Regression
Multiple Regression
11.1 The Method of Least Squares
Study the case where a dependent variable is to be predicted in terms of a single independent variable.
The random variable Y depends on a random variable X.
Regression curve of Y on x: the relationship between x and the mean of the corresponding distribution of Y.
Linear regression
Linear regression: for any x, the mean of the distribution of the Y's is given by $\alpha + \beta x$.
In general, Y will differ from this mean, and we denote this difference by $\varepsilon$:
$$Y = \alpha + \beta x + \varepsilon$$
$\varepsilon$ is a random variable, and we can choose $\alpha$ so that the mean of the distribution of this random variable is equal to zero.
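EX

| x | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| y | 16 | 35 | 45 | 64 | 86 | 96 | 106 | 124 | 134 | 156 | 164 | 182 |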
Analysis
$$\hat{y}_i = a + b x_i, \qquad e_i = y_i - \hat{y}_i$$
$\sum_{i=1}^{n} e_i$ as close as possible to zero.
Principle of least squares
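Choose a and b so that
$$\sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} \bigl(y_i - (a + b x_i)\bigr)^2$$
is a minimum. The procedure of finding the equation of the line which best fits a given set of paired data is called the method of least squares. Some notation:
$$S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$$
$$S_{yy} = \sum_{i=1}^{n} (y_i - \bar{y})^2 = \sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}$$
$$S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}$$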
Least squares estimators
$$b = \frac{S_{xy}}{S_{xx}} \quad \text{and} \quad a = \bar{y} - b\bar{x},$$
where $\bar{x}$, $\bar{y}$ are the means of the $x_i$ and the $y_i$.
Fitted (or estimated) regression line: $\hat{y} = a + bx$
Residuals: observation - fitted value = $y_i - (a + b x_i)$
The minimum value of the sum of squares is called the residual sum of squares or error sum of squares. We will show that
$$SSE = \text{residual sum of squares} = \sum_{i=1}^{n} (y_i - a - b x_i)^2 = S_{yy} - \frac{S_{xy}^2}{S_{xx}}$$
EX solution
Y = 14.8 X + 4.35
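
The solution above can be checked numerically. The following is a minimal sketch in plain Python that applies $b = S_{xy}/S_{xx}$ and $a = \bar{y} - b\bar{x}$ to the EX data; the function name `least_squares_line` is an illustrative choice, not something taken from the slides.

```python
# Least squares fit of a straight line y = a + b*x (pure Python, no libraries).
# Data: the x/y table from the EX slide.
xs = list(range(1, 13))
ys = [16, 35, 45, 64, 86, 96, 106, 124, 134, 156, 164, 182]


def least_squares_line(xs, ys):
    """Return (a, b) minimizing sum((y - (a + b*x))**2)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx          # slope
    a = y_bar - b * x_bar    # intercept
    return a, b


a, b = least_squares_line(xs, ys)
print(f"y = {b:.1f} x + {a:.2f}")  # prints: y = 14.8 x + 4.35
```

Rounded to the same precision, the result agrees with the fitted line stated on the slide.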
X-and-Y
X-axis Y-axis
independent dependent
predictor predicted
carrier response
input output
Example
You’re a marketing analyst for Hasbro Toys. You gather the following data:

| Ad $ | Sales (Units) |
|---|---|
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
| 4 | 2 |
| 5 | 4 |

What is the relationship between sales & advertising?
[Figure: Scattergram, Sales vs. Advertising. Vertical axis: Sales (0 to 4); horizontal axis: Advertising (0 to 5).]
the Least Squares Estimators
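
The content of this slide did not survive in the transcript. As a hedged illustration only, the sketch below applies the estimators from Section 11.1 to the advertising data and also checks the identity $SSE = S_{yy} - S_{xy}^2/S_{xx}$ against a direct sum of squared residuals. The variable names are illustrative assumptions.

```python
# Least squares estimators for the advertising example (pure Python).
ad = [1, 2, 3, 4, 5]        # Ad $
sales = [1, 1, 2, 2, 4]     # Sales (Units)

n = len(ad)
x_bar = sum(ad) / n
y_bar = sum(sales) / n

s_xx = sum((x - x_bar) ** 2 for x in ad)
s_yy = sum((y - y_bar) ** 2 for y in sales)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(ad, sales))

b = s_xy / s_xx            # slope estimate
a = y_bar - b * x_bar      # intercept estimate

# Residual sum of squares, computed two ways.
sse_direct = sum((y - (a + b * x)) ** 2 for x, y in zip(ad, sales))
sse_identity = s_yy - s_xy ** 2 / s_xx

print(f"fitted line: sales = {a:.1f} + {b:.1f} * ad")
print(f"SSE (direct) = {sse_direct:.2f}, SSE (identity) = {sse_identity:.2f}")
```

Both SSE computations should agree up to floating-point rounding, which is a quick sanity check on the hand formulas.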
11.2 Inference based on the Least Squares Estimators
We assume that the regression is linear in x and, furthermore, that the n random variables $Y_i$ are independently normally distributed with the means $\alpha + \beta x_i$.
Statistical model for straight-line regression:
$$Y_i = \alpha + \beta x_i + \varepsilon_i$$
The $\varepsilon_i$ are independent normally distributed random variables having zero means and the common variance $\sigma^2$.
Standard error of estimate
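The i-th deviation is $y_i - (a + b x_i)$, and the estimate of $\sigma^2$ is
$$s_e^2 = \frac{1}{n-2} \sum_{i=1}^{n} \bigl[y_i - (a + b x_i)\bigr]^2$$
The estimate of $\sigma^2$ can also be written as follows:
$$s_e^2 = \frac{S_{yy} - \dfrac{S_{xy}^2}{S_{xx}}}{n-2}$$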
Statistics for inferences: based on the assumption made concerning the distribution of the values of Y, the following theorem holds.
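Theorem. The statistics
$$t = \frac{(a - \alpha)}{s_e} \sqrt{\frac{n S_{xx}}{S_{xx} + n\bar{x}^2}} \quad \text{and} \quad t = \frac{(b - \beta)}{s_e} \sqrt{S_{xx}}$$
are values of random variables having the t distribution with n-2 degrees of freedom.
Confidence intervals:
$$\alpha: \quad a \pm t_{\alpha/2} \, s_e \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}}$$
$$\beta: \quad b \pm t_{\alpha/2} \, s_e \frac{1}{\sqrt{S_{xx}}}$$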
Example
The following data pertain to number of computer jobs per day and the central processing unit (CPU) time required.

| Number of jobs, x | CPU time, y |
|---|---|
| 1 | 2 |
| 2 | 5 |
| 3 | 4 |
| 4 | 9 |
| 5 | 10 |

EX
1) Obtain a least squares fit of a line to the observations on CPU time.
$$b = \frac{S_{xy}}{S_{xx}} = 2, \qquad a = \bar{y} - b\bar{x} = 0$$
$$\hat{y} = 2x$$
Example
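2) Construct a 95% confidence interval for α.
$$s_e^2 = \frac{S_{yy} - S_{xy}^2 / S_{xx}}{n-2} = \frac{46 - 400/10}{3} = 2$$
The 95% confidence interval of α, with $t_{\alpha/2} = t_{0.025} = 3.182$:
$$a \pm t_{\alpha/2} \, s_e \sqrt{\frac{1}{n} + \frac{\bar{x}^2}{S_{xx}}} = 0 \pm 3.182 \cdot \sqrt{2} \cdot \sqrt{\frac{1}{5} + \frac{9}{10}} = 0 \pm 4.72$$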
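
The interval arithmetic can be reproduced with a short script. This is a sketch under the assumption that the critical value 3.182 quoted on the slide (t distribution, n-2 = 3 degrees of freedom) is entered by hand rather than computed; the variable names are illustrative.

```python
import math

# CPU example data: number of jobs (x) and CPU time (y).
x = [1, 2, 3, 4, 5]
y = [2, 5, 4, 9, 10]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b = s_xy / s_xx
a = y_bar - b * x_bar

# Standard error of estimate: s_e^2 = (S_yy - S_xy^2 / S_xx) / (n - 2).
s_e = math.sqrt((s_yy - s_xy ** 2 / s_xx) / (n - 2))

# Critical value taken from the slide (t_0.025 with 3 degrees of freedom).
t_crit = 3.182
half_width = t_crit * s_e * math.sqrt(1 / n + x_bar ** 2 / s_xx)

print(f"a = {a:.2f}, s_e^2 = {s_e ** 2:.2f}")
print(f"95% CI for alpha: {a - half_width:.2f} to {a + half_width:.2f}")
```

With these data the half-width comes out at about 4.72, matching the hand calculation.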
Example
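3) Test the null hypothesis $\beta = 1$ against the alternative hypothesis $\beta > 1$ at the 0.05 level of significance.
Solution: the t statistic is given by
$$t = \frac{(b - \beta)}{s_e} \sqrt{S_{xx}} = \frac{2 - 1}{\sqrt{2}} \sqrt{10} = 2.236$$
Criterion: reject the null hypothesis if $t > t_{0.05} = 2.353$ (3 degrees of freedom).
Decision: since 2.236 < 2.353, we cannot reject the null hypothesis.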
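
The test can be sketched in a few lines of Python. The hypothesized slope of 1 and the one-sided critical value 2.353 are read off the worked numbers in this example and entered by hand; the variable names are illustrative assumptions.

```python
import math

# CPU example data: number of jobs (x) and CPU time (y).
x = [1, 2, 3, 4, 5]
y = [2, 5, 4, 9, 10]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b = s_xy / s_xx
s_e = math.sqrt((s_yy - s_xy ** 2 / s_xx) / (n - 2))

beta_0 = 1.0      # slope under the null hypothesis (from the worked example)
t_crit = 2.353    # t_0.05 with n - 2 = 3 degrees of freedom (from the slide)

# Test statistic: t = (b - beta_0) / s_e * sqrt(S_xx).
t_stat = (b - beta_0) / s_e * math.sqrt(s_xx)
reject = t_stat > t_crit

print(f"t = {t_stat:.3f}, critical value = {t_crit}, reject H0: {reject}")
```

The statistic falls just short of the critical value, so the script reports that the null hypothesis is not rejected.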
11.3 Curvilinear Regression
Regression curve is nonlinear.
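Polynomial regression:
$$Y = \beta_0 + \beta_1 x + \beta_2 x^2 + \cdots + \beta_p x^p$$
If the regression of Y on x is exponential, the mean of the distribution of values of Y is given by
$$y = \alpha \beta^x$$
Taking logarithms, we have
$$\log y = \log \alpha + x \log \beta$$
Thus, we can estimate $\log \alpha$ and $\log \beta$ from the pairs of values $(x_i, \log y_i)$.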
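
The log transformation can be sketched as follows: fit a straight line to the pairs $(x_i, \log y_i)$, then exponentiate the intercept and slope. The data below are hypothetical, noise-free values generated from $y = 3 \cdot 2^x$ purely for illustration, and `fit_exponential` is an assumed helper name.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = alpha * beta**x by least squares on (x, log y).

    Returns (alpha, beta). Requires all y values to be positive.
    """
    logs = [math.log(y) for y in ys]
    n = len(xs)
    x_bar = sum(xs) / n
    l_bar = sum(logs) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xl = sum((x - x_bar) * (l - l_bar) for x, l in zip(xs, logs))
    slope = s_xl / s_xx                 # estimates log(beta)
    intercept = l_bar - slope * x_bar   # estimates log(alpha)
    return math.exp(intercept), math.exp(slope)


# Hypothetical data (not from the slides): exact values of 3 * 2**x.
xs = [0, 1, 2, 3, 4]
ys = [3.0 * 2.0 ** x for x in xs]

alpha, beta = fit_exponential(xs, ys)
print(f"alpha = {alpha:.3f}, beta = {beta:.3f}")
```

Note that this minimizes squared error in $\log y$, not in $y$, so with noisy data the result generally differs from a direct nonlinear fit.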
Polynomial regression
If there is no clear indication about the functional form of the regression of Y on x, we assume it is a polynomial regression:
$$Y = a_0 + a_1 x + a_2 x^2 + \cdots + a_k x^k$$
Polynomial Fitting
• Really just a generalization of the previous case
• Exact solution
• Just big matrices
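
One way to make the "big matrices" point concrete is to build the normal equations for a degree-k polynomial and solve them directly. This is a minimal sketch in plain Python; the helper names and the test data are hypothetical, and for high degrees the normal equations become ill-conditioned, so a library routine would usually be preferable.

```python
def solve_linear_system(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution.
    sol = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (M[i][n] - s) / M[i][i]
    return sol


def polyfit_normal_equations(xs, ys, degree):
    """Return [a_0, ..., a_degree] minimizing the sum of squared errors."""
    m = degree + 1
    # Entry (j, k) is sum(x**(j+k)); right-hand side j is sum(y * x**j).
    A = [[sum(x ** (j + k) for x in xs) for k in range(m)] for j in range(m)]
    rhs = [sum(y * x ** j for x, y in zip(xs, ys)) for j in range(m)]
    return solve_linear_system(A, rhs)


# Hypothetical data lying exactly on y = 1 + 2x + 3x^2.
xs = [0, 1, 2, 3, 4]
ys = [1 + 2 * x + 3 * x ** 2 for x in xs]
coeffs = polyfit_normal_equations(xs, ys, 2)
print([round(c, 6) for c in coeffs])
```

Setting `degree=1` reduces this to the straight-line case from Section 11.1.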
11.4 Multiple Regression
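The mean of Y on x is given by
$$b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_k x_k$$
Minimize
$$\sum_{i=1}^{n} \bigl[y_i - (b_0 + b_1 x_{i1} + \cdots + b_k x_{ik})\bigr]^2$$
We can solve it when k = 2 by the following equations:
$$\sum y = n b_0 + b_1 \sum x_1 + b_2 \sum x_2$$
$$\sum x_1 y = b_0 \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2$$
$$\sum x_2 y = b_0 \sum x_2 + b_1 \sum x_1 x_2 + b_2 \sum x_2^2$$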
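
For two independent variables, the three normal equations form a 3×3 linear system in $b_0$, $b_1$, $b_2$. The sketch below solves it with Cramer's rule in plain Python. The data are hypothetical values generated exactly from $y = 1 + 2x_1 + 3x_2$, and the function names are illustrative assumptions.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_two_predictors(x1, x2, y):
    """Solve the normal equations for y = b0 + b1*x1 + b2*x2."""
    n = len(y)
    s1, s2 = sum(x1), sum(x2)
    s11 = sum(u * u for u in x1)
    s22 = sum(v * v for v in x2)
    s12 = sum(u * v for u, v in zip(x1, x2))
    sy = sum(y)
    s1y = sum(u * w for u, w in zip(x1, y))
    s2y = sum(v * w for v, w in zip(x2, y))

    A = [[n, s1, s2],
         [s1, s11, s12],
         [s2, s12, s22]]
    rhs = [sy, s1y, s2y]

    d = det3(A)  # zero if the predictors are linearly dependent
    coeffs = []
    for col in range(3):
        # Cramer's rule: replace one column of A by the right-hand side.
        A_col = [row[:] for row in A]
        for r in range(3):
            A_col[r][col] = rhs[r]
        coeffs.append(det3(A_col) / d)
    return tuple(coeffs)


# Hypothetical data (not from the slides).
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [1 + 2 * u + 3 * v for u, v in zip(x1, x2)]

b0, b1, b2 = fit_two_predictors(x1, x2, y)
print(b0, b1, b2)
```

Cramer's rule is used here only because the system is small; for more predictors, elimination or a matrix library scales better.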
Example
P365.
Multiple Linear Fitting
$X_1(x), \ldots, X_M(x)$ are arbitrary fixed functions of x (can be nonlinear), called the basis functions.
Normal equations of the least squares problem
Can be put in matrix form and solved
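
The formulas on this slide did not survive in the transcript. As a hedged reconstruction, the standard unweighted form of the general linear least squares problem is shown below; the symbols $a_k$ (coefficients), $A$ (design matrix), and $\mathbf{y}$ (vector of observations) are assumptions introduced here, not notation confirmed by the source.

```latex
% Model: a linear combination of the M basis functions.
y(x) = \sum_{k=1}^{M} a_k X_k(x)

% Quantity to minimize over a_1, ..., a_M, given n data points (x_i, y_i).
\sum_{i=1}^{n} \Bigl[ y_i - \sum_{k=1}^{M} a_k X_k(x_i) \Bigr]^2

% Design matrix (n rows, M columns).
A_{ik} = X_k(x_i)

% Normal equations in matrix form.
(A^{\mathsf{T}} A)\,\mathbf{a} = A^{\mathsf{T}} \mathbf{y}
```

Choosing $X_k(x) = x^{k-1}$ recovers polynomial fitting, and $X_1 = 1$, $X_2 = x$ recovers the straight-line case.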
Correlation Models
1. How strong is the linear relationship between 2 variables?
2. Coefficient of correlation used
   Population correlation coefficient denoted $\rho$
   Values range from -1 to +1
Correlation
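Standardized observation:
$$\frac{\text{Observation} - \text{Sample mean}}{\text{Sample standard deviation}} = \frac{x_i - \bar{x}}{s_x}$$
The sample correlation coefficient r:
$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right) \left(\frac{y_i - \bar{y}}{s_y}\right)$$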
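
The definition of r as an average product of standardized observations translates directly into code. The sketch below applies it to the advertising data from the earlier example; `sample_correlation` is an illustrative name, and the sample standard deviations use the n-1 divisor to match the formula.

```python
import math


def sample_correlation(xs, ys):
    """Sample correlation coefficient via standardized observations."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sample standard deviations (divisor n - 1).
    s_x = math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)


ad = [1, 2, 3, 4, 5]        # Ad $
sales = [1, 1, 2, 2, 4]     # Sales (Units)

r = sample_correlation(ad, sales)
print(f"r = {r:.3f}")
```

The same value follows from $r = S_{xy} / \sqrt{S_{xx} S_{yy}}$, which here is $7/\sqrt{60}$, or roughly 0.904.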
Coefficient of Correlation Values
[Figure: scale of correlation coefficient values marked at -1.0, -.5, 0, +.5, +1.0. A value of 0 means no correlation; values toward -1.0 show an increasing degree of negative correlation; values toward +1.0 show an increasing degree of positive correlation.]