REGULARIZATION (ADDITIONAL)
Soft Computing Week 4
The problem of overfitting
Example: Linear regression (housing prices)
Overfitting: If we have too many features, the learned hypothesis may fit the training set very well ($J(\theta) \approx 0$), but fail to generalize to new examples (e.g., fail to predict prices on new examples).
[Figure: three Price vs. Size plots, ranging from underfitting to overfitting.]
Example: Logistic regression
$h_\theta(x) = g(\theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots)$ ($g$ = sigmoid function)

[Figure: three decision boundaries in the $(x_1, x_2)$ plane, ranging from underfitting to overfitting.]
Addressing overfitting:

Options:
1. Reduce the number of features.
   - Manually select which features to keep.
   - Model selection algorithm (later in course).
2. Regularization.
   - Keep all the features, but reduce the magnitude/values of the parameters $\theta_j$.
   - Works well when we have a lot of features, each of which contributes a bit to predicting $y$.
Regularization: Cost function
Intuition
Suppose the hypothesis is $\theta_0 + \theta_1 x + \theta_2 x^2 + \theta_3 x^3 + \theta_4 x^4$, and we penalize and make $\theta_3$, $\theta_4$ really small:

$$\min_\theta\; \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + 1000\,\theta_3^2 + 1000\,\theta_4^2$$

With $\theta_3 \approx 0$ and $\theta_4 \approx 0$, the quartic hypothesis behaves almost like a quadratic.

[Figure: two Price vs. size-of-house plots, the fit before and after penalizing $\theta_3$ and $\theta_4$.]
Regularization.
Small values for the parameters $\theta_0, \theta_1, \dots, \theta_n$:
- “Simpler” hypothesis
- Less prone to overfitting

Housing:
- Features: $x_1, x_2, \dots, x_{100}$
- Parameters: $\theta_0, \theta_1, \dots, \theta_{100}$
Regularization.
[Figure: Price vs. size of house.]
Regularization
In regularized linear regression, we choose $\theta$ to minimize

$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]$$

What if $\lambda$ is set to an extremely large value (perhaps too large for our problem, say $\lambda = 10^{10}$)?
- Algorithm works fine; setting $\lambda$ to be very large can't hurt it.
- Algorithm fails to eliminate overfitting.
- Algorithm results in underfitting (fails to fit even the training data well).
- Gradient descent will fail to converge.
Regularization
In regularized linear regression, we choose $\theta$ to minimize the cost $J(\theta)$ above. What if $\lambda$ is set to an extremely large value (say $\lambda = 10^{10}$)? The penalty then drives $\theta_1, \dots, \theta_n$ toward zero, leaving $h_\theta(x) \approx \theta_0$: a flat line that underfits the data.

[Figure: Price vs. size of house, with a nearly horizontal hypothesis line.]
Regularized linear regression
Cost function:

$$J(\theta) = \frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \lambda\sum_{j=1}^{n}\theta_j^2\right]$$

We choose $\theta$ to minimize $J(\theta)$; note that the regularization sum starts at $j = 1$, so $\theta_0$ is not penalized.
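As a concrete illustration, here is a minimal Octave sketch of this cost (not from the slides; the names linearRegCost, X, y, and lambda are assumed, with X an m x (n+1) design matrix whose first column is all ones):

function J = linearRegCost(theta, X, y, lambda)
  % Regularized linear regression cost J(theta)
  m = length(y);                                    % number of training examples
  h = X * theta;                                    % predictions h_theta(x)
  reg = (lambda / (2*m)) * sum(theta(2:end) .^ 2);  % penalty term; skips theta_0
  J = (1 / (2*m)) * sum((h - y) .^ 2) + reg;
end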
Gradient descent

Repeat {
  $\theta_0 := \theta_0 - \alpha \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)}$
  $\theta_j := \theta_j - \alpha\left[\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j\right] \qquad (j = 1, 2, \dots, n)$
}

Equivalently, $\theta_j := \theta_j\left(1 - \alpha\frac{\lambda}{m}\right) - \alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$, so each step shrinks $\theta_j$ slightly before applying the usual update.
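A vectorized Octave sketch of this loop (again with assumed names; the values of alpha and num_iters are illustrative):

m = length(y);
alpha = 0.01;                      % learning rate (assumed value)
num_iters = 400;                   % number of iterations (assumed value)
for iter = 1:num_iters
  h = X * theta;                                          % current predictions
  grad = (1/m) * (X' * (h - y));                          % unregularized gradient
  grad(2:end) = grad(2:end) + (lambda/m) * theta(2:end);  % regularize j >= 1 only
  theta = theta - alpha * grad;                           % simultaneous update
end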
Regularized logistic regression
Regularized logistic regression.

Cost function:

$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right)\log\left(1 - h_\theta(x^{(i)})\right)\right] + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$

[Figure: decision boundary in the $(x_1, x_2)$ plane.]
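A minimal Octave sketch of this cost together with its gradient (assumed names as before; logisticRegCost is hypothetical):

function [J, grad] = logisticRegCost(theta, X, y, lambda)
  % Regularized logistic regression cost and gradient
  m = length(y);
  h = 1 ./ (1 + exp(-X * theta));   % sigmoid hypothesis
  J = -(1/m) * sum(y .* log(h) + (1 - y) .* log(1 - h)) ...
      + (lambda / (2*m)) * sum(theta(2:end) .^ 2);
  grad = (1/m) * (X' * (h - y));
  grad(2:end) = grad(2:end) + (lambda/m) * theta(2:end);  % skip theta_0
end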
Gradient descent

Repeat {
  $\theta_0 := \theta_0 - \alpha \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_0^{(i)}$
  $\theta_j := \theta_j - \alpha\left[\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)} + \frac{\lambda}{m}\theta_j\right] \qquad (j = 1, 2, \dots, n)$
}

The update rule looks identical to regularized linear regression, but here $h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}$.
Advanced optimization
function [jVal, gradient] = costFunction(theta)
  jVal = [code to compute J(theta)];
  gradient(1) = [code to compute dJ/dtheta_0];
  gradient(2) = [code to compute dJ/dtheta_1];
  gradient(3) = [code to compute dJ/dtheta_2];
  ...
  gradient(n+1) = [code to compute dJ/dtheta_n];
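One common way to use this skeleton is with Octave's fminunc. A sketch, under the assumption that the hypothetical logisticRegCost above is the cost being minimized (initialTheta and options are illustrative):

options = optimset('GradObj', 'on', 'MaxIter', 100);  % use our gradient; cap iterations
initialTheta = zeros(n+1, 1);                         % n = number of features (assumed defined)
[optTheta, functionVal, exitFlag] = ...
    fminunc(@(t) logisticRegCost(t, X, y, lambda), initialTheta, options);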