Lessons from 2MM machine learning models
TRANSCRIPT
![Page 1: Lessons from 2MM machine learning models](https://reader035.vdocuments.site/reader035/viewer/2022062412/586fe9061a28ab92198b4989/html5/thumbnails/1.jpg)
Kaggle
The home of data science
![Page 2: Lessons from 2MM machine learning models](https://reader035.vdocuments.site/reader035/viewer/2022062412/586fe9061a28ab92198b4989/html5/thumbnails/2.jpg)
GE Flight Quest 2: Optimize flight routes based on weather & traffic
$250,000 prize, 122 teams
Hewlett Foundation: Automated Essay Scoring: Develop an automated scoring algorithm for student-written essays
$100,000 prize, 155 teams
Allstate Purchase Prediction Challenge: Predict which insurance policy a customer will purchase
$50,000 prize, 1,570 teams
Merck Molecular Activity Challenge: Help develop safe and effective medicines by predicting molecular activity
$40,000 prize, 236 teams
Higgs Boson Machine Learning Challenge: Use the ATLAS experiment to identify the Higgs boson
$13,000 prize, 1,302 teams
![Page 3: Lessons from 2MM machine learning models](https://reader035.vdocuments.site/reader035/viewer/2022062412/586fe9061a28ab92198b4989/html5/thumbnails/3.jpg)
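Training Data

| Age | Income | Default |
|---|---|---|
| 58 | $95,824 | True |
| 73 | $20,708 | False |
| 59 | $82,152 | False |
| 66 | $25,334 | True |

Test Data

| Age | Income | Default |
|---|---|---|
| 73 | $53,445 | |
| 61 | $36,679 | |
| 47 | $90,422 | |
| 44 | $79,040 | |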
The Kaggle Approach
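The training/test split on this slide can be sketched in a few lines of Python. This is only an illustration: the rows are the ones shown on the slide, but the nearest-neighbour rule, the function names, and the `accuracy` scorer are assumptions chosen for brevity, not a method named in the talk.

```python
# Rows copied from the slide. Training rows carry the Default label;
# test rows do not, because the competition host withholds it.
TRAIN = [
    (58, 95824, True),
    (73, 20708, False),
    (59, 82152, False),
    (66, 25334, True),
]
TEST = [(73, 53445), (61, 36679), (47, 90422), (44, 79040)]


def feature_ranges(rows):
    """Per-feature (min, max) over the training rows, used for scaling."""
    ages = [r[0] for r in rows]
    incomes = [r[1] for r in rows]
    return [(min(ages), max(ages)), (min(incomes), max(incomes))]


def scale(features, ranges):
    """Rescale each feature by the training range so age and income are comparable."""
    return [(x - lo) / (hi - lo) if hi > lo else 0.0
            for x, (lo, hi) in zip(features, ranges)]


def predict(train, test):
    """Label each test row with the label of its nearest training row."""
    ranges = feature_ranges(train)
    scaled_train = [(scale(row[:2], ranges), row[2]) for row in train]
    predictions = []
    for row in test:
        point = scale(row, ranges)
        _, label = min(
            scaled_train,
            key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], point)),
        )
        predictions.append(label)
    return predictions


def accuracy(predicted, actual):
    """Fraction of predictions that match the withheld labels (host side)."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# What a participant would submit: one prediction per test row.
submission = predict(TRAIN, TEST)
```

Participants only ever see `TRAIN` and the unlabeled `TEST` rows; something like `accuracy` would run on the host's side against labels the participants never receive.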
![Page 4: Lessons from 2MM machine learning models](https://reader035.vdocuments.site/reader035/viewer/2022062412/586fe9061a28ab92198b4989/html5/thumbnails/4.jpg)
![Page 5: Lessons from 2MM machine learning models](https://reader035.vdocuments.site/reader035/viewer/2022062412/586fe9061a28ab92198b4989/html5/thumbnails/5.jpg)
Mapping Dark Matter: Competition Progress
[Chart: accuracy (lower is better) by week, Week 1 through End; y-axis ticks at .0150 and .0170]
Martin O’Leary, PhD student in Glaciology, Cambridge U
![Page 6: Lessons from 2MM machine learning models](https://reader035.vdocuments.site/reader035/viewer/2022062412/586fe9061a28ab92198b4989/html5/thumbnails/6.jpg)
“In less than a week, Martin O’Leary, a PhD student in glaciology, outperformed the state-of-the-art algorithms”
“The world’s brightest physicists have been working for decades on solving one of the great unifying problems of our universe”
![Page 7: Lessons from 2MM machine learning models](https://reader035.vdocuments.site/reader035/viewer/2022062412/586fe9061a28ab92198b4989/html5/thumbnails/7.jpg)
Mapping Dark Matter: Competition Progress
[Chart: accuracy (lower is better) by week, Week 1 through End; y-axis ticks at .0150 and .0170; legend entries follow]
Martin O’Leary, PhD student in Glaciology, Cambridge U
Marius Cobzarenco, Grad student in computer vision, UC London
Ali Haissaine & Eu Jin Loc, Signature Verification, Qatar U & Grad Student @ Deloitte
Other
deepZot (David Kirkby & Daniel Margala), Particle Physicist & Cosmologist
![Page 8: Lessons from 2MM machine learning models](https://reader035.vdocuments.site/reader035/viewer/2022062412/586fe9061a28ab92198b4989/html5/thumbnails/8.jpg)
We’ve worked with many of the world’s largest companies
Sectors: Healthcare & Pharma, Consumer Internet, Finance, Industrial, Consumer Marketing, Oil & Gas
Anonymized clients shown: $50b+ Beverage Co., Global Bank, Top Credit Card Issuer, Top 5 E&P, Top 20 E&P
![Page 9: Lessons from 2MM machine learning models](https://reader035.vdocuments.site/reader035/viewer/2022062412/586fe9061a28ab92198b4989/html5/thumbnails/9.jpg)
![Page 10: Lessons from 2MM machine learning models](https://reader035.vdocuments.site/reader035/viewer/2022062412/586fe9061a28ab92198b4989/html5/thumbnails/10.jpg)
Competitors submit over 100K machine learning models per month
[Chart: Monthly Submissions to Kaggle Competitions, May-10 through May-15; y-axis from 0 to 160,000]
![Page 11: Lessons from 2MM machine learning models](https://reader035.vdocuments.site/reader035/viewer/2022062412/586fe9061a28ab92198b4989/html5/thumbnails/11.jpg)
There’s a cookbook for winning competitions on structured data. It starts with exploring the data.
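As a rough sketch of what a first exploration pass might look like, the snippet below summarizes each column and compares its mean across the two outcome classes. The four rows are the illustrative training rows from the earlier slide; the helper names `summarize` and `mean_by_label` are hypothetical, and a real competition would use far more data and plotting.

```python
from statistics import mean, median

# Illustrative rows only (taken from the training table on an earlier slide).
ROWS = [
    {"age": 58, "income": 95824, "default": True},
    {"age": 73, "income": 20708, "default": False},
    {"age": 59, "income": 82152, "default": False},
    {"age": 66, "income": 25334, "default": True},
]


def summarize(rows, column):
    """Basic distribution summary for one numeric column."""
    values = [r[column] for r in rows]
    return {
        "min": min(values),
        "median": median(values),
        "mean": mean(values),
        "max": max(values),
    }


def mean_by_label(rows, column, label="default"):
    """Mean of a column within each value of the target label."""
    groups = {}
    for r in rows:
        groups.setdefault(r[label], []).append(r[column])
    return {key: mean(values) for key, values in groups.items()}


for column in ("age", "income"):
    print(column, summarize(ROWS, column), mean_by_label(ROWS, column))
```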
![Page 12: Lessons from 2MM machine learning models](https://reader035.vdocuments.site/reader035/viewer/2022062412/586fe9061a28ab92198b4989/html5/thumbnails/12.jpg)
2. Create and select features
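One way to picture this step, as a minimal sketch: derive a couple of new columns from the raw ones, then rank all columns by how strongly they correlate with the target. The derived features (`income_per_year_of_age`, `is_retirement_age`) and the correlation-based ranking are assumptions made for illustration; the talk does not say which features or selection criteria winners use, and four rows are far too few for the ranking to mean anything.

```python
from math import sqrt

# Illustrative rows only (taken from the training table on an earlier slide).
ROWS = [
    {"age": 58, "income": 95824, "default": True},
    {"age": 73, "income": 20708, "default": False},
    {"age": 59, "income": 82152, "default": False},
    {"age": 66, "income": 25334, "default": True},
]


def add_features(row):
    """Create hypothetical derived features from the raw columns."""
    out = dict(row)
    out["income_per_year_of_age"] = row["income"] / row["age"]
    out["is_retirement_age"] = 1 if row["age"] >= 65 else 0
    return out


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def rank_features(rows, target="default"):
    """Feature names sorted by absolute correlation with the target."""
    ys = [1.0 if r[target] else 0.0 for r in rows]
    names = [k for k in rows[0] if k != target]
    scores = {
        name: abs(pearson([float(r[name]) for r in rows], ys))
        for name in names
    }
    return sorted(scores, key=scores.get, reverse=True)


def select_top(rows, k, target="default"):
    """Keep the k highest-ranked feature names."""
    return rank_features(rows, target)[:k]


ENRICHED = [add_features(r) for r in ROWS]
print(select_top(ENRICHED, 2))
```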
![Page 13: Lessons from 2MM machine learning models](https://reader035.vdocuments.site/reader035/viewer/2022062412/586fe9061a28ab92198b4989/html5/thumbnails/13.jpg)
3. Parameter tuning and ensembling
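A toy version of both ideas, under stated assumptions: the model is a one-dimensional k-nearest-neighbour scorer, "tuning" is a grid search over `k` scored on held-out data, and the "ensemble" is a plain average of several models' scores. The data points and function names are hypothetical; the talk does not specify which models or tuning procedures are used.

```python
# Hypothetical 1-D data: (feature value, label). Two labels are deliberately noisy.
TRAIN = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 0),
         (6.0, 1), (7.0, 0), (8.0, 1), (9.0, 1)]
VALID = [(3.1, 0), (6.9, 1), (1.5, 0), (8.5, 1)]


def knn_score(train, x, k):
    """Fraction of positive labels among the k training points nearest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(label for _, label in nearest) / k


def validation_accuracy(train, valid, k):
    """Accuracy on held-out data when scores >= 0.5 are predicted positive."""
    hits = sum((knn_score(train, x, k) >= 0.5) == bool(y) for x, y in valid)
    return hits / len(valid)


def tune_k(train, valid, candidates):
    """Grid search: return the first candidate k with the best held-out accuracy."""
    return max(candidates, key=lambda k: validation_accuracy(train, valid, k))


def ensemble_score(train, x, ks):
    """Average the scores of several k-NN models, one per value of k."""
    return sum(knn_score(train, x, k) for k in ks) / len(ks)


best_k = tune_k(TRAIN, VALID, [1, 3, 5])
blended = ensemble_score(TRAIN, 3.1, [3, 5])
```

Scoring candidates on `VALID` rather than `TRAIN` matters here: with `k = 1` every training point is its own nearest neighbour, so training accuracy would look perfect while the noisy labels hurt held-out accuracy.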
![Page 14: Lessons from 2MM machine learning models](https://reader035.vdocuments.site/reader035/viewer/2022062412/586fe9061a28ab92198b4989/html5/thumbnails/14.jpg)
A second cookbook is emerging on computer vision and speech problems. It involves using convolutional neural networks.
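For readers unfamiliar with the building blocks, here is a minimal pure-Python sketch of one convolutional stage: a sliding-window filter, a ReLU non-linearity, and 2x2 max pooling. The tiny image and the hand-written edge kernel are hypothetical; in a real CNN the kernel weights are learned during training and a library would do this work on a GPU.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    Like most deep learning libraries, this computes cross-correlation:
    the kernel is not flipped.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(feature_map):
    """Replace negative activations with zero."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool2x2(feature_map):
    """Keep the maximum of each non-overlapping 2x2 block."""
    return [
        [
            max(feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1])
            for j in range(0, len(feature_map[0]) - 1, 2)
        ]
        for i in range(0, len(feature_map) - 1, 2)
    ]


# Hypothetical 5x5 image with a vertical edge between columns 1 and 2.
IMAGE = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-written kernel that responds to dark-to-bright vertical edges.
EDGE_KERNEL = [[-1, 1],
               [-1, 1]]

features = max_pool2x2(relu(conv2d(IMAGE, EDGE_KERNEL)))
```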
![Page 15: Lessons from 2MM machine learning models](https://reader035.vdocuments.site/reader035/viewer/2022062412/586fe9061a28ab92198b4989/html5/thumbnails/15.jpg)
The vast majority of time is spent training algorithms when CNNs are applied.
![Page 16: Lessons from 2MM machine learning models](https://reader035.vdocuments.site/reader035/viewer/2022062412/586fe9061a28ab92198b4989/html5/thumbnails/16.jpg)
Then there are the problems that land in the middle…
![Page 17: Lessons from 2MM machine learning models](https://reader035.vdocuments.site/reader035/viewer/2022062412/586fe9061a28ab92198b4989/html5/thumbnails/17.jpg)
Anthony [email protected] 283 9781