Boosting
Rong Jin
Inefficiency with Bagging
[Figure: bagging. Bootstrap sampling draws D1, D2, …, Dk from the training set D; classifiers h1, h2, …, hk are trained on the samples and combined through $\Pr(c \mid h_i, x)$.]
Inefficient bootstrap sampling:
• Every example has an equal chance of being sampled
• No distinction between “easy” examples and “difficult” examples
Inefficient model combination:
• A constant weight for each classifier
• No distinction between accurate classifiers and inaccurate classifiers
Improve the Efficiency of Bagging
Better sampling strategy
• Focus on the examples that are difficult to classify
Better combination strategy
• Accurate models should be assigned larger weights
Intuition
[Figure: training examples (X1, Y1), (X2, Y2), (X3, Y3), (X4, Y4). Classifier1, mistakes: (X1, Y1), (X3, Y3). Classifier2, mistakes: (X1, Y1). Classifier1 + Classifier2 + Classifier3: no training mistakes!]
May overfit!
AdaBoost Algorithm
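The algorithm itself did not survive in this transcript, so the following is only a minimal sketch of the standard AdaBoost loop, not necessarily the exact variant on the slide. It assumes labels in {+1, -1}, one-dimensional inputs, threshold stumps as the weak learner, and weighted training instead of the resampling used in the example slide. The names `train_stump`, `stump_predict`, `adaboost`, and `predict` are hypothetical helpers introduced for illustration.

```python
import math

def stump_predict(thr, sign, x):
    # A threshold stump: predicts `sign` for x >= thr and `-sign` otherwise.
    return sign if x >= thr else -sign

def train_stump(X, y, w):
    # Weak learner (assumed): pick the stump with the lowest weighted error.
    best = None
    for thr in sorted(set(X)):
        for sign in (1, -1):
            err = sum(wi for wi, xi, yi in zip(w, X, y)
                      if stump_predict(thr, sign, xi) != yi)
            if best is None or err < best[0]:
                best = (err, thr, sign)
    return best

def adaboost(X, y, rounds):
    n = len(X)
    w = [1.0 / n] * n                      # D_0: uniform distribution
    ensemble = []                          # list of (alpha_t, thr, sign)
    for _ in range(rounds):
        err, thr, sign = train_stump(X, y, w)
        err = min(max(err, 1e-12), 1 - 1e-12)      # keep the log finite
        alpha = 0.5 * math.log((1 - err) / err)    # classifier weight
        ensemble.append((alpha, thr, sign))
        # Raise the weight of misclassified examples, lower the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(thr, sign, xi))
             for wi, xi, yi in zip(w, X, y)]
        z = sum(w)
        w = [wi / z for wi in w]           # renormalize to get D_{t+1}
    return ensemble

def predict(ensemble, x):
    # H_T(x): sign of the weighted vote of the base classifiers.
    score = sum(a * stump_predict(thr, sign, x) for a, thr, sign in ensemble)
    return 1 if score >= 0 else -1

# Toy data (made up for illustration): no single stump fits it.
X = list(range(10))
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = adaboost(X, y, rounds=3)
train_preds = [predict(ensemble, x) for x in X]
```

On this toy set, three rounds are enough for the weighted vote to classify every training point correctly, even though each individual stump makes mistakes.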
AdaBoost Example: $\alpha_t = \ln 2$
Training examples: (x1, y1), (x2, y2), (x3, y3), (x4, y4), (x5, y5)
D0: 1/5 1/5 1/5 1/5 1/5
Sample from D0: (x1, y1), (x3, y3), (x5, y5); train h1; update weights with h1
D1: 2/7 1/7 2/7 1/7 1/7
Sample from D1: (x1, y1), (x3, y3); train h2; update weights with h2
D2: 2/9 1/9 4/9 1/9 1/9
Sample from D2 …
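The weights in this example can be reproduced with a few lines of exact arithmetic. The sketch below assumes an update rule inferred from the numbers rather than stated in the transcript: each misclassified example has its weight multiplied by e^α = 2 (since α = ln 2) and the distribution is then renormalized. It also assumes that h1 misclassifies x1 and x3 and that h2 misclassifies x3, which is what the doubled weights suggest. The helper name `reweight` is hypothetical.

```python
from fractions import Fraction

def reweight(dist, misclassified, factor=2):
    # Multiply the weight of each misclassified example by `factor`
    # (e^alpha with alpha = ln 2), then renormalize so the weights sum to 1.
    scaled = [w * factor if i in misclassified else w
              for i, w in enumerate(dist)]
    total = sum(scaled)
    return [w / total for w in scaled]

D0 = [Fraction(1, 5)] * 5
D1 = reweight(D0, misclassified={0, 2})   # assumed: h1 is wrong on x1, x3
D2 = reweight(D1, misclassified={2})      # assumed: h2 is wrong on x3
```

Using `Fraction` keeps the results as exact ratios, so they can be compared directly with the 2/7 and 4/9 entries in the example.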
How To Choose $\alpha_t$ in AdaBoost?
How to construct the best distribution $D_{t+1}(i)$:
1. $D_{t+1}(i)$ should be significantly different from $D_t(i)$
2. $D_{t+1}(i)$ should create a situation in which classifier $h_t$ performs poorly
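The second requirement can be checked numerically. The sketch below assumes the standard AdaBoost update, which is not spelled out in this transcript: the classifier weight is half the log-odds of the weighted accuracy, correctly classified examples are scaled by e^(-α), and misclassified ones by e^(α). Under that rule the classifier just trained has weighted error exactly 1/2 on the new distribution, which is no better than chance. The function name `adaboost_reweight` and the ten-example setup are made up for illustration.

```python
import math

def adaboost_reweight(dist, correct):
    # Weighted error of the current classifier under `dist`
    # (must lie strictly between 0 and 1 for the log to be defined).
    eps = sum(w for w, ok in zip(dist, correct) if not ok)
    alpha = 0.5 * math.log((1 - eps) / eps)
    # Shrink weights of correct examples, grow weights of mistakes.
    scaled = [w * math.exp(-alpha if ok else alpha)
              for w, ok in zip(dist, correct)]
    z = sum(scaled)
    return alpha, [w / z for w in scaled]

d_t = [0.1] * 10                        # uniform D_t over 10 examples
correct = [True] * 7 + [False] * 3      # h_t is wrong on 3 of them
alpha, d_next = adaboost_reweight(d_t, correct)

# Weighted error of the same classifier h_t under D_{t+1}.
error_next = sum(w for w, ok in zip(d_next, correct) if not ok)
```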
How To Choose $\alpha_t$ in AdaBoost?
Optimization View for Choosing $\alpha_t$
$h_t(x): x \to \{1, -1\}$; a base (weak) classifier
$H_T(x)$: a linear combination of base classifiers
Goal: minimize training error
Approximate the error with an exponential function
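The formulas for this slide are missing from the transcript. In the standard formulation, which is assumed here, the combined classifier and the exponential upper bound on its training error can be written as follows, with $N$ the number of training examples and $\mathbb{1}[\cdot]$ the indicator function (both are notation introduced for this sketch):

$$
H_T(x) = \sum_{t=1}^{T} \alpha_t h_t(x)
$$

$$
\frac{1}{N} \sum_{i=1}^{N} \mathbb{1}\big[\, y_i \neq \operatorname{sign}(H_T(x_i)) \,\big]
\;\le\;
\frac{1}{N} \sum_{i=1}^{N} \exp\big( -y_i H_T(x_i) \big)
$$

The bound holds term by term: when example $i$ is misclassified, $y_i H_T(x_i) \le 0$, so the exponential is at least 1; when it is classified correctly, the indicator is 0 and the exponential is still positive.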
AdaBoost: Greedy Optimization
Fix $H_{T-1}(x)$, and solve for $h_T(x)$ and $\alpha_t$
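The greedy step can be worked out in closed form. The derivation below is the standard one and is offered as a sketch, since the slide's own equations are not in the transcript. Here $\alpha$ denotes the weight of the classifier added in round $T$, and $w_i$ and $\epsilon$ are notation introduced for this sketch.

$$
L(\alpha, h_T) = \sum_{i=1}^{N} \exp\big( -y_i \left[ H_{T-1}(x_i) + \alpha\, h_T(x_i) \right] \big)
= \sum_{i=1}^{N} w_i \exp\big( -\alpha\, y_i h_T(x_i) \big),
\qquad w_i = \exp\big( -y_i H_{T-1}(x_i) \big)
$$

Splitting the sum into correctly and incorrectly classified examples, and writing $\epsilon = \sum_{i:\, h_T(x_i) \neq y_i} w_i \,/\, \sum_i w_i$ for the weighted error of $h_T$:

$$
L(\alpha, h_T) = \Big( \sum_{i} w_i \Big) \big[ (1 - \epsilon)\, e^{-\alpha} + \epsilon\, e^{\alpha} \big]
$$

Setting the derivative with respect to $\alpha$ to zero gives

$$
-(1 - \epsilon)\, e^{-\alpha} + \epsilon\, e^{\alpha} = 0
\quad\Longrightarrow\quad
\alpha = \frac{1}{2} \ln \frac{1 - \epsilon}{\epsilon}
$$

Substituting back, the bracket becomes $2\sqrt{\epsilon (1 - \epsilon)}$, which is smallest when $\epsilon$ is as far from $1/2$ as possible, so the best $h_T$ is the base classifier with the lowest weighted error.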
Empirical Study of AdaBoost
AdaBoosting decision trees
• Generate 50 decision trees by AdaBoost
• Linearly combine the decision trees using the weights of AdaBoost
In general:
• AdaBoost = Bagging > C4.5
• AdaBoost usually needs fewer classifiers than Bagging
Bias-Variance Tradeoff for AdaBoost
• AdaBoost can reduce both variance and bias simultaneously
[Figure: bias and variance for a single decision tree, Bagging decision tree, and AdaBoosting decision trees]