Linear Regression (Part II)
nanopoulos@ismll.de
11/3/2008
Outline
• Generalize the linear regression
• Simple multiple regression
• Regularization
• Bias vs. Variance tradeoff
• Model selection
Linear regression in D dimensions
From D = 1:  y(x, w) = w₀ + w₁x
To D > 1, linear in x (multiple regression):  y(x, w) = w₀ + w₁x₁ + … + w_D x_D
To D > 1, with any set of nonlinear functions of x
All of these models remain linear in w.

Linear basis function models:  y(x, w) = ∑ⱼ wⱼ φⱼ(x)
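As a sketch of the general form, a model linear in w can be built from any nonlinear basis; the Gaussian basis, its width, and the toy data below are illustrative choices, not from the slides:

```python
import numpy as np

# Linear basis function model: y(x, w) = sum_j w_j * phi_j(x).
def gaussian_basis(x, centers, s=0.1):
    """Design matrix with a bias column plus Gaussian bumps phi_j(x)."""
    Phi = np.exp(-(x[:, None] - centers[None, :]) ** 2 / (2 * s ** 2))
    return np.hstack([np.ones((len(x), 1)), Phi])

x = np.linspace(0, 1, 25)
t = np.sin(2 * np.pi * x)                       # noise-free toy targets
Phi = gaussian_basis(x, centers=np.linspace(0, 1, 9))

# The model stays linear in w even though phi_j(x) is nonlinear in x,
# so ordinary least squares still applies.
w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
y_hat = Phi @ w
```

The same fitting machinery works for any basis (polynomials, sigmoids, splines); only the design matrix changes.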
Example of basis functions transformation
Maximum likelihood and least squares
Assuming normally distributed noise, t = y(x, w) + ε with ε ~ N(0, β⁻¹), the likelihood of observing the N target values t from the set X of D-dimensional predictors is
p(t | X, w, β) = ∏ₙ N(tₙ | wᵀφ(xₙ), β⁻¹)
Maximizing the log of the likelihood with respect to w…
…is the same as minimizing the sum-of-squares error
E_D(w) = ½ ∑ₙ (tₙ − wᵀφ(xₙ))²
Solving the MLE
Find the optimal w…
Setting the gradient of E_D(w) to zero gives the normal equations, solved via the Moore-Penrose pseudo-inverse:
ŵ_ML = (ΦᵀΦ)⁻¹ Φᵀ t = Φ⁺ t
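The pseudo-inverse solution can be checked numerically; the toy sine data and degree-3 polynomial basis below are assumptions for illustration:

```python
import numpy as np

# Toy data: noisy samples of a sine curve (illustrative, not from the slides).
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=20)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.1, size=20)

# Design matrix Phi with polynomial basis phi_j(x) = x^j, j = 0..3.
Phi = np.vander(x, 4, increasing=True)

# Maximum-likelihood weights via the Moore-Penrose pseudo-inverse.
w_ml = np.linalg.pinv(Phi) @ t

# Equivalent least squares solve (numerically preferable in practice).
w_ls, *_ = np.linalg.lstsq(Phi, t, rcond=None)
```

Both routes give the same ŵ; `lstsq` avoids forming the pseudo-inverse explicitly.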
Outline
• Generalize the linear regression
• Simple multiple regression
• Regularization
• Bias vs. Variance tradeoff
• Model selection
The case of simple multiple regression
Y = w₀ + w₁X₁ + … + w_D X_D = w₀ + ∑ᵢ wᵢXᵢ
The coefficients w₀, w₁, …, w_D form the parameter vector w.
Least squares estimate
In matrix form, the least squares estimate minimizes ‖t − Xw‖² and is obtained from the normal equations:
ŵ = (XᵀX)⁻¹ Xᵀ t
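A small numeric check of the estimate ŵ = (XᵀX)⁻¹Xᵀt; the data matrix below is made up for illustration:

```python
import numpy as np

# Hypothetical data: 5 observations, intercept column plus two predictors.
X = np.array([[1.0, 2.0, 1.0],
              [1.0, 1.0, 3.0],
              [1.0, 0.0, 2.0],
              [1.0, 3.0, 0.0],
              [1.0, 2.0, 2.0]])
t = np.array([5.0, 8.0, 5.0, 4.0, 7.0])

# Least squares estimate from the normal equations: (X^T X) w = X^T t.
w_hat = np.linalg.solve(X.T @ X, X.T @ t)

# Residuals of the fit.
residuals = t - X @ w_hat
```

A defining property of the least squares fit is that the residuals are orthogonal to every column of X.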
Example
(a numerical data set is given on the slide)
Solution
(the normal equations are set up and solved on the slide)
Solution (cont.)
(the resulting estimate ŵ is shown on the slide)
What does it look like?
(plot of the fitted regression surface)
Outline
• Generalize the linear regression
• Simple multiple regression
• Regularization
• Bias vs. Variance tradeoff
• Model selection
Regularization
Recall the case of
• a single predictor
• and polynomial basis functions φⱼ(x) = xʲ
1st order polynomial
3rd order polynomial
9th order polynomial
(fits of increasing degree to the same data; the 9th order fit passes through every point but oscillates wildly)
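The effect of the polynomial degree can be reproduced with a short simulation; the noisy sin(2πx) data set is an assumed stand-in for the one on the slides:

```python
import numpy as np

# N = 10 noisy samples of sin(2*pi*x), plus a separate test set.
rng = np.random.default_rng(1)
x = np.linspace(0, 1, 10)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, size=10)
x_test = np.linspace(0, 1, 100)
t_test = np.sin(2 * np.pi * x_test) + rng.normal(0, 0.2, size=100)

def fit_poly(degree):
    """Least squares fit with polynomial basis phi_j(x) = x^j."""
    Phi = np.vander(x, degree + 1, increasing=True)
    w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    return w

def rms_error(w, xs, ts):
    """RMS error of the fitted polynomial w on data (xs, ts)."""
    Phi = np.vander(xs, len(w), increasing=True)
    return np.sqrt(np.mean((Phi @ w - ts) ** 2))

for M in (1, 3, 9):
    w = fit_poly(M)
    print(M, rms_error(w, x, t), rms_error(w, x_test, t_test))
```

With M = 9 and 10 points the fit interpolates, so training error vanishes while test error blows up.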
Over-fitting
Root-mean-square (RMS) error:  E_RMS = √(2E(w★)/N)
Training error keeps falling as M grows, but test error rises once the model starts to over-fit.
Polynomial coefficients
(table: the fitted coefficients grow rapidly in magnitude as M increases)
Data set size: 9th order polynomial
(shown for two increasingly large data sets: with more data, the same 9th order model over-fits less)
Regularization
Penalize large coefficient values by adding a weight penalty to the error function:
Ẽ(w) = ½ ∑ₙ (tₙ − wᵀφ(xₙ))² + (λ/2)‖w‖²
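The penalized error has the closed-form minimizer ŵ = (λI + ΦᵀΦ)⁻¹Φᵀt. A sketch, with an illustrative λ and assumed toy data:

```python
import numpy as np

# Noisy sine samples and a 9th order polynomial basis (illustrative setup).
rng = np.random.default_rng(2)
x = np.linspace(0, 1, 10)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, size=10)
Phi = np.vander(x, 10, increasing=True)

def ridge_fit(lam):
    """Regularized least squares: w = (lam*I + Phi^T Phi)^{-1} Phi^T t."""
    return np.linalg.solve(lam * np.eye(Phi.shape[1]) + Phi.T @ Phi, Phi.T @ t)

w_unreg, *_ = np.linalg.lstsq(Phi, t, rcond=None)  # lambda = 0
w_reg = ridge_fit(1e-8)                            # small illustrative lambda
```

Even a tiny λ shrinks the coefficient vector sharply, taming the wild 9th order fit.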
Regularization: fit of the 9th order polynomial for a small value of λ
Regularization: the same fit for a much larger λ (over-regularized, the model under-fits)
Regularization: small vs. large λ — training and test E_RMS as a function of λ
Polynomial coefficients
(with regularization, the coefficient magnitudes stay small)
Outline
• Generalize the linear regression
• Simple multiple regression
• Regularization
• Bias vs. Variance tradeoff
• Model selection
Bias vs. Variance
The expected squared loss decomposes into three terms:
expected loss = (bias)² + variance + noise
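The decomposition can be estimated by Monte Carlo: fit many data sets drawn from a known true function and measure how the average prediction deviates (bias) and how individual fits scatter (variance). The true function, noise level, and degrees below are assumptions:

```python
import numpy as np

rng = np.random.default_rng(3)

def true_f(x):
    return np.sin(2 * np.pi * x)

x_train = np.linspace(0, 1, 10)         # fixed design points
x_grid = np.linspace(0.05, 0.95, 50)    # where bias/variance are measured
n_datasets, sigma = 500, 0.2

stats = {}
for M in (1, 9):
    preds = np.empty((n_datasets, x_grid.size))
    for d in range(n_datasets):
        # Fresh noisy data set from the same true function.
        t = true_f(x_train) + rng.normal(0, sigma, x_train.size)
        w, *_ = np.linalg.lstsq(np.vander(x_train, M + 1, increasing=True),
                                t, rcond=None)
        preds[d] = np.vander(x_grid, M + 1, increasing=True) @ w
    bias2 = np.mean((preds.mean(axis=0) - true_f(x_grid)) ** 2)
    variance = np.mean(preds.var(axis=0))
    stats[M] = (bias2, variance)
```

The rigid degree-1 model shows high bias and low variance; the flexible degree-9 model shows the reverse.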
How does this apply to regression?
For regression under squared loss, rigid models (small M, large λ) have high bias and low variance, while flexible models (large M, small λ) have low bias and high variance; the best model balances the two.
Outline
• Generalize the linear regression
• Simple multiple regression
• Regularization
• Bias vs. Variance tradeoff
• Model selection
Model Selection
M (the degree of the polynomial) corresponds to the complexity of the model.
The larger M, the greater the risk of over-fitting.
Regularization (λ) helps by controlling the effective complexity.
But how do we find good combinations of M and λ?
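A standard answer is cross-validation over a grid of (M, λ) pairs; a sketch with assumed synthetic data (the grid values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(0, 1, 30)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, size=30)

def cv_error(M, lam, K=5):
    """Mean held-out squared error of a degree-M polynomial ridge fit."""
    folds = np.array_split(rng.permutation(len(x)), K)
    errs = []
    for k in range(K):
        test = folds[k]
        train = np.concatenate([folds[j] for j in range(K) if j != k])
        Phi_tr = np.vander(x[train], M + 1, increasing=True)
        w = np.linalg.solve(lam * np.eye(M + 1) + Phi_tr.T @ Phi_tr,
                            Phi_tr.T @ t[train])
        Phi_te = np.vander(x[test], M + 1, increasing=True)
        errs.append(np.mean((Phi_te @ w - t[test]) ** 2))
    return float(np.mean(errs))

# Pick the (M, lambda) pair with the lowest cross-validated error.
grid = [(M, lam) for M in range(1, 10) for lam in (0.0, 1e-6, 1e-3, 1e-1)]
best_M, best_lam = min(grid, key=lambda p: cv_error(*p))
```

The held-out error proxies the test error, so the selected pair guards against over-fitting the training set.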
Information criteria
p(D|w_ML) is the likelihood; M is the number of parameters.
• AIC (Akaike Information Criterion)
Choose the model that maximizes ln p(D|w_ML) − M
Tends to favor overly simple models.
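For the Gaussian linear model, the criterion can be computed directly from the residuals; a sketch with assumed synthetic data:

```python
import numpy as np

rng = np.random.default_rng(5)
x = rng.uniform(0, 1, 30)
t = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, size=30)

def aic(M):
    """ln p(D|w_ML) - M for a degree-M polynomial with Gaussian noise."""
    Phi = np.vander(x, M + 1, increasing=True)
    w, *_ = np.linalg.lstsq(Phi, t, rcond=None)
    resid = t - Phi @ w
    n = len(t)
    beta = n / np.sum(resid ** 2)              # ML estimate of noise precision
    loglik = (0.5 * n * (np.log(beta) - np.log(2 * np.pi))
              - 0.5 * beta * np.sum(resid ** 2))
    return loglik - (M + 1)                    # penalty: number of parameters

# Score each candidate degree and keep the maximizer.
scores = {M: aic(M) for M in range(1, 10)}
best_M = max(scores, key=scores.get)
```

Higher degrees raise the likelihood but pay a per-parameter penalty, so the criterion trades fit against complexity.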