Stats 5 Seminar: Machine Learning
Winter 2018
Professor Padhraic Smyth
Departments of Computer Science and Statistics
University of California, Irvine
P. Smyth: Stats 5: Data Science Seminar, Winter 2018: 2
Class Organization
• Meet weekly for 40-minute seminar with 5-10 minute discussion
• 8 topics (with guest speakers), weeks 2 through 9
  – You are encouraged to ask questions during and after the talks
• Intro and wrap-up talks in weeks 1 and 10
• Class website is at www.ics.uci.edu/~smyth/courses/stats5
  – Slides and related materials will be posted during the quarter
Schedule of Lectures

| Date | Speaker | Department or Organization | Topic |
|---|---|---|---|
| Jan 9 | Padhraic Smyth | Computer Science | Introduction to Data Science |
| Jan 16 | Padhraic Smyth | Computer Science | Classification Algorithms in Machine Learning |
| Jan 23 | Michael Carey | Computer Science | Databases and Data Management |
| Jan 30 | Sameer Singh | Computer Science | Statistical Natural Language Processing |
| Feb 6 | Zhaoxia Yu | Statistics | An Introduction to Cluster Analysis |
| Feb 13 | Erik Sudderth | Computer Science | Computer Vision and Machine Learning |
| Feb 20 | John Brock | Cylance, Inc | Data Science and CyberSecurity |
| Feb 27 | Video Lecture (Kate Crawford) | Microsoft Research and NYU | Bias in Machine Learning |
| Mar 6 | Matt Harding | Economics | Data Science in Economics and Finance |
| Mar 13 | Padhraic Smyth | Computer Science | Review: Past and Future of Data Science |
Submission of Review Forms (Weeks 2 to 10)
• Submit review forms for Lectures 2 through 10
• Available at http://www.ics.uci.edu/~smyth/courses/stats5/Forms/
• Review forms will be available online at the start of each class
  – A few relatively short questions based on the lecture that day
  – Needs to be submitted to EEE by 12:15 for each lecture
  – Bring your laptop or other device
• Requirements to pass the class
  – Attend and submit a review form for at least 8 lectures in weeks 2 through 10 (allowed to miss one if you need to for some reason)
• No final exam: pass/fail based on attendance and review forms
Outline of Today’s Topic
• What is machine learning?
• Classification algorithms
• Examples from image and sequence classification
• Conclusions and discussion
[Acknowledgement to Professor Alex Ihler for various slides and figures in this lecture]
What is Machine Learning?
Machine learning (ML)
• Learning models from data
• Making predictions (or decisions)
• Getting better with experience (data)
• Problems whose solutions are “hard to describe”
Types of machine learning problems
• Supervised learning
  – “Labeled” training data
  – Every example has a desired target value (a “known answer”)
  – Reward predictions close to target; penalize predictions with large errors
  – Classification: a discrete-valued prediction
  – Regression: a continuous-valued prediction
Types of machine learning problems
• Supervised learning
  – “Labeled” training data
  – Every example has a desired target value (a “best answer”)
  – Reward prediction being close to target
  – Classification: a discrete-valued prediction
  – Regression: a continuous-valued prediction
  – Recommender systems
[Figure: ratings matrix of users by movies, with observed ratings and one unknown entry “?” to be predicted]
Types of machine learning problems
• Supervised learning
  – Training data has labels or target values
• Unsupervised learning
  – Training data has no labels or target values
  – Interested in discovering natural structure in data
  – Often used in exploration of data, e.g., in science, in business
  – Example:
    • Clustering customers or medical patients into groups
    • Discovering a numerical representation of words or movies
Data in 2 Dimensions with 5 Clusters
See lecture by Prof. Zhaoxia Yu later this quarter on Clustering Algorithms
Embeddings of Words as Vectors
From: https://www.mathworks.com/help/examples/textanalytics/
Figure from Koren, Bell, Volinsky, IEEE Computer, 2009
Types of machine learning problems
• Supervised learning
• Unsupervised learning
• Reinforcement learning
  – Algorithm gets indirect feedback on its progress (rather than correct/incorrect)
  – E.g., a program learning to play chess, or Go, or a video game
  – E.g., an autonomous vehicle learning how to navigate a city
  – Mathematical models for delayed reward, credit assignment, explore/exploit
Classification using Supervised Learning
Learning a Classification Model

Training Data

| Patient ID | Zipcode | Age | … | Test Score | Diagnosis |
|---|---|---|---|---|---|
| 18261 | 92697 | 55 | | 83 | 1 |
| 42356 | 92697 | 19 | | 99 | 1 |
| 00219 | 90001 | 35 | | 21 | 0 |
| 83726 | 24351 | 0 | | 35 | 0 |

Learning algorithm learns a function that takes values on the left to predict the value (diagnosis) on the right
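Making Predictions with a Classification Model

| Patient ID | Zipcode | Age | … | Test Score | Diagnosis |
|---|---|---|---|---|---|
| 18261 | 92697 | 55 | | 83 | 1 |
| 42356 | 92697 | 19 | | 99 | 1 |
| 00219 | 90001 | 35 | | 21 | 0 |
| 83726 | 24351 | 0 | | 35 | 0 |
| 12837 | 92697 | 40 | | 70 | ?? |
| 72623 | 92697 | 32 | | 44 | ?? |

Training Data: the first four rows (diagnosis known)
Test Data: the last two rows (diagnosis unknown)
We can then use the model to make predictions when target values are unknown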
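As a rough illustration of "learn a function from the training rows, then predict the test rows", here is a minimal Python sketch. It uses only the Age, Test Score, and Diagnosis columns shown in the table, and the learning rule (pick the test-score cutoff with the fewest training errors) is an assumption chosen for simplicity, not the method used in the lecture.

```python
# Training rows from the table: (age, test_score, diagnosis).
train = [(55, 83, 1), (19, 99, 1), (35, 21, 0), (0, 35, 0)]
# Test rows: (age, test_score); the diagnosis is unknown.
test = [(40, 70), (32, 44)]

def learn_threshold(rows):
    """Hypothetical learner: choose the score cutoff with the fewest training errors."""
    scores = sorted(score for _, score, _ in rows)
    # Candidate cutoffs are the midpoints between neighbouring training scores.
    candidates = [(a + b) / 2 for a, b in zip(scores, scores[1:])]

    def errors(cut):
        return sum((score > cut) != bool(label) for _, score, label in rows)

    return min(candidates, key=errors)

def predict(cutoff, score):
    """Predict diagnosis 1 when the test score is above the learned cutoff."""
    return 1 if score > cutoff else 0

cutoff = learn_threshold(train)
predictions = [predict(cutoff, score) for _, score in test]
print(cutoff, predictions)  # 59.0 [1, 0]
```

A real classifier would typically use all the columns at once; a single cutoff is just the smallest model that fits these four rows.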
[Figure: scatter plot of AGE (x-axis, 0 to 90) versus MONTHLY INCOME (y-axis, 0 to 14000)]
Each dot is a 2-dimensional point representing one person = [AGE, MONTHLY INCOME]
[Figure: scatter plot of AGE (x-axis, 0 to 90) versus MONTHLY INCOME (y-axis, 0 to 14000) with two candidate boundaries, labeled “Good boundary?” and “Better boundary?”]
Blue dots = good loans
Red dots = bad loans
[Figure: scatter plot of AGE (x-axis, 0 to 90) versus MONTHLY INCOME (y-axis, 0 to 14000) with a curved boundary]
A much more complex boundary, but perhaps overfitting to noise?
Basic Concepts
• The curve represents a classifier (a model, a predictor)
  – Points on one side of the line get classified as one class
  – Points on the other side get classified as the other class
  – Once we know the curve we can take new points and classify them
• The curve is represented internally by a set of coefficients
  – These are also known as “parameters” or “weights”
• The algorithm systematically adjusts the coefficients on training data to reduce the error as much as it can
• This process of finding the weights is known as “learning a model”
• Foundational ideas are from statistics and optimization
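To make "a curve represented by coefficients" concrete, the following Python sketch classifies a person by which side of a straight line the point [AGE, MONTHLY INCOME] falls on. The coefficient values and the "good"/"bad" labels are made up for illustration; in practice the coefficients would be learned from training data.

```python
# Hypothetical coefficients ("weights") for a straight-line boundary in the
# [AGE, MONTHLY INCOME] plane. The numbers are invented for this example.
weights = {"age": 50.0, "income": 1.0, "bias": -6000.0}

def classify(age, income, w=weights):
    """Return 'good' or 'bad' depending on the side of the line the point is on."""
    score = w["age"] * age + w["income"] * income + w["bias"]
    return "good" if score > 0 else "bad"

print(classify(40, 5000))  # 50*40 + 5000 - 6000 = 1000, positive side: good
print(classify(25, 3000))  # 50*25 + 3000 - 6000 = -1750, negative side: bad
```

Changing any of the three numbers moves or tilts the line, which is what "adjusting the coefficients" means geometrically.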
[Figure: scatter plot of AGE (x-axis, 0 to 90) versus MONTHLY INCOME (y-axis, 0 to 14000) with two boundary lines]
Initial guess for coefficients (not very good, high error)
Final solution for coefficients (much better, low error)
[Figure: 3-D scatter plot with axes AGE (0 to 90), MONTHLY INCOME (0 to 14000), and ASSETS (-3 to 5)]
Now each dot is a 3-dimensional point representing one person = [AGE, MONTHLY INCOME, ASSETS]
Our boundary line will now become a plane
How Does this Work in Practice?
• We use computer algorithms to search for the best line or curve
• These search algorithms are quite simple
  1. Start with an initial random guess for coefficients
  2. Change the coefficients slightly to reduce the error (can use calculus to do this)
  3. Move to the new coefficients
  4. Keep repeating until “convergence”
• This search can be done in 10, 100, 1000, or 1 million “dimensions”… with tens of millions of examples
• This search process is at the core of machine learning algorithms
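The four steps can be sketched in a few lines of Python. This sketch swaps the calculus-based update for small random changes that are kept only when the error does not increase, and it uses a fixed number of steps as a stand-in for a convergence test. The data set, step count, and perturbation size are all assumptions made for illustration.

```python
import random

# Made-up training set: ([age, monthly income in $1000s], label), 1 = good loan.
DATA = [
    ([25, 2.0], 0), ([30, 3.0], 0), ([60, 2.5], 0),
    ([35, 7.0], 1), ([45, 8.0], 1), ([50, 9.5], 1),
]

def count_errors(coeffs, rows):
    """Count training points that fall on the wrong side of the line."""
    w_age, w_income, bias = coeffs
    return sum(
        (w_age * x[0] + w_income * x[1] + bias > 0) != bool(label)
        for x, label in rows
    )

def search(rows, steps=2000, seed=0):
    rng = random.Random(seed)
    coeffs = [rng.uniform(-1, 1) for _ in range(3)]      # 1. random initial guess
    initial = best = count_errors(coeffs, rows)
    for _ in range(steps):                                # 4. keep repeating
        trial = [c + rng.gauss(0, 0.5) for c in coeffs]   # 2. change slightly
        trial_errors = count_errors(trial, rows)
        if trial_errors <= best:                          # 3. move if no worse
            coeffs, best = trial, trial_errors
    return coeffs, initial, best

coeffs, initial_errors, final_errors = search(DATA)
print(initial_errors, "->", final_errors)
```

By construction the final error count can never be higher than the initial one; gradient-based methods reach the same goal far more efficiently in high dimensions.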
Key Points
• We represent our training data as points in a multi-dimensional space
  – How do we obtain the labels for the data points?
• We want to find a boundary curve that can separate points into two classes
• The curves are represented by sets of coefficients (or weights)
• Machine learning algorithms use search (or optimization) to automatically find the coefficients with the lowest error on the training data
If the Model is too Complex it can Overfit
[Figure: four panels plotting y against x, labeled “Data”, “Too simple?”, “Too complex?”, and “About right?”]
Neural Network Classifiers
Machine Learning Notation
Features x: e.g., pixel inputs (usually a multidimensional vector)
Targets y: e.g., true label for an image: “cat” or “no cat”
Predictions ŷ: e.g., model’s prediction given inputs, e.g., “cat”
Error e(y, ŷ): e.g., e = 0 if prediction matches target, 1 otherwise
Parameters θ: e.g., weights, coefficients specifying the model
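The error e(y, ŷ) from this notation can be written directly as a small Python function. The target and prediction lists below are invented for illustration, and averaging the per-example errors into an error rate is a common convention assumed here rather than something stated on the slide.

```python
def zero_one_error(y, y_hat):
    """e(y, y_hat): 0 if the prediction matches the target, 1 otherwise."""
    return 0 if y == y_hat else 1

targets = ["cat", "no cat", "cat", "cat"]        # y (made-up true labels)
predictions = ["cat", "cat", "cat", "no cat"]    # y_hat (made-up model outputs)

errors = [zero_one_error(y, y_hat) for y, y_hat in zip(targets, predictions)]
error_rate = sum(errors) / len(errors)
print(errors, error_rate)  # [0, 1, 0, 1] 0.5
```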
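Example: A Simple Linear Model
[Figure: diagram with inputs x1, x2, x3 and a constant +1, each connected by an arrow to the output f(x)]
The machine learning algorithm will learn a weight for each arrow in the diagram
This is a simple model: one weight per input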
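A minimal Python sketch of such a model follows, assuming f(x) is the weighted sum of the three inputs plus a weight on the constant +1 input. The weight values are hypothetical; a learning algorithm would normally set them.

```python
def linear_model(x, weights, bias_weight):
    """Weighted sum of the inputs plus the weight on the constant +1 input."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias_weight * 1

x = [1.0, 2.0, 3.0]           # inputs x1, x2, x3
weights = [0.5, -1.0, 2.0]    # hypothetical weights, one per input arrow
bias_weight = 0.25            # hypothetical weight on the +1 arrow

f = linear_model(x, weights, bias_weight)
print(f)  # 0.5 - 2.0 + 6.0 + 0.25 = 4.75
```

For classification, the sign of f(x) (or a threshold on it) would decide the class.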
A Simple Neural Network
[Figure: network diagram with Inputs (x1, x2, x3 and a constant +1), a Hidden Layer, and an Output f(x)]
Here the model learns 3 different functions and then combines the outputs of the 3 to make a prediction
This is more complex and has more parameters than the simple model
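The idea of "3 functions combined into one prediction" can be sketched in Python as below. The sigmoid squashing function and every weight value are assumptions made for illustration; the slide does not say which nonlinearity is used.

```python
import math

def sigmoid(z):
    """Squash any real number into the range (0, 1). Assumed nonlinearity."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, weights, bias):
    """One unit: weighted sum of its inputs plus a bias, passed through sigmoid."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def network(x, hidden, output):
    """Compute the hidden-unit outputs, then combine them in the output unit."""
    h = [neuron(x, w, b) for w, b in hidden]
    w_out, b_out = output
    return neuron(h, w_out, b_out)

# Hypothetical weights: 3 hidden units, each with 3 input weights and a bias.
hidden = [([0.2, -0.4, 0.1], 0.0), ([-0.3, 0.8, 0.5], 0.1), ([0.7, 0.2, -0.6], -0.2)]
output = ([1.0, -1.0, 0.5], 0.0)

y_hat = network([1.0, 2.0, 3.0], hidden, output)
n_params = sum(len(w) + 1 for w, _ in hidden) + len(output[0]) + 1
print(round(y_hat, 3), n_params)
```

With these shapes the network has 16 parameters, compared with 4 for a linear model on the same three inputs.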
Deep Learning: Models with More Hidden Layers
We can build on this idea to create “deep models” with many hidden layers
[Figure: network diagram with Inputs (x1, x2, x3 and a constant +1), Hidden Layer 1, Hidden Layer 2, and an Output f(x)]
Very flexible and complex functions
Figure from http://parse.ele.tue.nl/
Example of a Network for Image Recognition
Mathematically this is just a function (a complicated one)
A Brief History of Neural Networks…
• The Perceptron Era: 1950s and 60s
  – Great optimism with perceptrons (linear models)…
  – …until Minsky, 1969: perceptrons had limited representation power
  – Hard problems require hidden layers… but there was no training algorithm
• The Backpropagation Era: late 1980s to mid-90s
  – Invention of backpropagation: training of models with hidden layers
  – Wild enthusiasm (in the US at least)… NIPS conference, funding, etc.
  – Mid-1990s: enthusiasm dies out: training deep NNs is hard
• The Deep Learning Era: 2010-present
  – 3rd wave of neural network enthusiasm
  – What happened since the mid-90s?
    • Much larger data sets
    • Much greater computational power
    • Fast optimization techniques
Learning via Gradient Descent
Finding good parameters
• Want to find parameters θ which minimize our error…
• Think of a cost “surface”: error residual for that θ…
Gradient descent
• How to change θ to improve J(θ)?
• Choose a direction in which J(θ) is decreasing
• Derivative
  – Positive => increasing
  – Negative => decreasing
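A one-dimensional Python sketch of this rule follows. The cost J(θ) = (θ − 3)², the step size, and the number of steps are all assumptions chosen so the arithmetic is easy to follow; they are not taken from the slides.

```python
def J(theta):
    """Hypothetical cost with its minimum at theta = 3."""
    return (theta - 3.0) ** 2

def dJ(theta):
    """Derivative of J: positive means J is increasing, negative means decreasing."""
    return 2.0 * (theta - 3.0)

def gradient_descent(theta, step_size=0.1, n_steps=50):
    for _ in range(n_steps):
        # Step against the sign of the derivative, so J goes down.
        theta = theta - step_size * dJ(theta)
    return theta

theta_from_left = gradient_descent(0.0)     # derivative negative: theta increases
theta_from_right = gradient_descent(10.0)   # derivative positive: theta decreases
print(theta_from_left, theta_from_right)
```

Both starting points end up very close to θ = 3, because each step shrinks the distance to the minimum by a constant factor for this particular cost.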
Gradient descent in more dimensions
• Gradient vector
• Indicates direction of steepest ascent (negative = steepest descent)
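In standard notation, the gradient vector and the resulting update can be written as below. The symbols d (number of parameters) and α (step size) are introduced here as assumptions; they do not appear in the extracted slide text.

```latex
\nabla J(\theta) =
\left[ \frac{\partial J}{\partial \theta_1}, \ldots, \frac{\partial J}{\partial \theta_d} \right]^{\top},
\qquad
\theta \leftarrow \theta - \alpha \, \nabla J(\theta)
```

The minus sign is what turns the direction of steepest ascent into the direction of steepest descent.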
Comments on gradient descent
• Simple and general algorithm
  – Usable in broad variety of models
• Local minima
  – Sensitive to starting point
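Sensitivity to the starting point can be seen on a small example. The Python sketch below runs the same descent procedure from two starting values on a hypothetical cost with two valleys; the function, step size, and step count are assumptions picked for illustration.

```python
def J(theta):
    """Hypothetical cost with a shallow valley near 1.13 and a deeper one near -1.30."""
    return theta**4 - 3 * theta**2 + theta

def dJ(theta):
    """Derivative of J."""
    return 4 * theta**3 - 6 * theta + 1

def descend(theta, step_size=0.01, n_steps=1000):
    for _ in range(n_steps):
        theta -= step_size * dJ(theta)
    return theta

right = descend(2.0)    # settles in the shallow (local) valley
left = descend(-2.0)    # settles in the deeper valley
print(round(right, 2), round(left, 2))
print(J(right) > J(left))
```

The run started at 2.0 stops in the shallower valley even though a lower cost exists elsewhere, which is why practitioners often try several random starting points.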
Image Classification Examples
Example: Classifying Handwritten Digits
What the data looks like to the human eye
Inputs: pixel values from each image
Output: 10 possible classes (0, 1, …, 9)
Pixel Inputs Represented Numerically
From https://www.tensorflow.org/get_started/mnist/beginners
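As a small illustration of how an image becomes a feature vector x, the Python sketch below flattens a made-up 3-by-3 grid of pixel intensities into a list of 9 numbers. The grid and its values are invented; the MNIST digit images referenced in the linked tutorial are 28 by 28 pixels, so the same flattening would give 784 inputs per image.

```python
# A made-up 3x3 grayscale "image": 0.0 is a blank pixel, values near 1.0 are ink.
image = [
    [0.0, 0.9, 0.0],
    [0.0, 0.8, 0.0],
    [0.0, 0.7, 0.0],
]

# Flatten row by row into one feature vector x, as a classifier would see it.
x = [pixel for row in image for pixel in row]

print(len(x), x)
```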
Example: Classifying Handwritten Digits
Classification accuracy has gone from 93% to 99.9% in the past 10 years
Examples of Errors made by the Neural Network Classifier
Image from http://neuralnetworksanddeeplearning.com/chap6.html
Human label (“truth”)
Label predicted by the classifier
Russakovsky et al., ImageNet Large Scale Visual Recognition Challenge, 2015
Training data
  inputs x = raw pixel values
  labels y = values from 1 to 1000
Trained on millions of images
How is network structure determined? Essentially trial-and-error (expensive!)
Deep network architecture for GoogLeNet network, 27 layers
Figure from Kevin Murphy, Google, 2016
Figure from Krizhevsky, Sutskever, Hinton, 2012
Figure from Krizhevsky, Sutskever, Hinton, 2012
Figure from Lee et al., ICML 2009
Sequence Prediction Examples
Learning by Predicting what’s Next
• Examples
  – Predict the next word a person will type or speak, given words up to this point
  – Predict the value of the Dow Jones tomorrow afternoon, given history
• We can use the same general methodologies as before
  – Model now uses past data to predict next event
• Applications
  – Speech recognition
  – Auto-suggest in human typing
  – Machine translation
  – Consumer modeling
  – Chatbots
  – …and more
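A very small Python sketch of "use past data to predict the next event" follows. It counts which character most often follows each character in a training string and predicts that one. This counting approach is a deliberately simple stand-in, not the neural network method discussed in the lecture, and the training text is made up.

```python
from collections import Counter, defaultdict

def train_next_char(text):
    """Count, for every character, which characters follow it in the text."""
    counts = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, char):
    """Return the most frequent follower of char, or None if char was never seen."""
    if char not in counts:
        return None
    return counts[char].most_common(1)[0][0]

model = train_next_char("the theory then thawed")
print(predict_next(model, "t"))  # h
print(predict_next(model, "h"))  # e
```

A recurrent network plays the same role but conditions on a much longer history than the single previous character.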
Example: Predicting the Next Character
Figure from http://cs.stanford.edu/people/karpathy/recurrentjs/
Example: Predicting Characters with a Recurrent Network
Figure from http://cs.stanford.edu/people/karpathy/recurrentjs/
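A minimal sketch of what a recurrent network computes at each character, assuming a plain ("vanilla") recurrent layer with no bias terms and random, untrained weights. The four-letter vocabulary, the hidden size, and all names below are illustrative assumptions; training (adjusting the weights so the probabilities match real text) is omitted.

```python
import math
import random


def rnn_step(x, h_prev, W_xh, W_hh, W_hy):
    """One recurrent step: update the hidden state, then score each possible next character."""
    hidden_size = len(h_prev)
    # New hidden state mixes the current input with the previous hidden state.
    h = [
        math.tanh(
            sum(W_xh[i][j] * x[j] for j in range(len(x)))
            + sum(W_hh[i][j] * h_prev[j] for j in range(hidden_size))
        )
        for i in range(hidden_size)
    ]
    scores = [sum(W_hy[k][i] * h[i] for i in range(hidden_size)) for k in range(len(W_hy))]
    # Softmax turns the scores into a probability distribution over the next character.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    return h, probs


def one_hot(index, size):
    """Encode a character index as a vector with a single 1."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Hypothetical toy setup: a 4-character vocabulary and 3 hidden units.
vocab = ["h", "e", "l", "o"]
hidden_size = 3
rng = random.Random(0)


def random_matrix(rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


W_xh = random_matrix(hidden_size, len(vocab))    # input -> hidden
W_hh = random_matrix(hidden_size, hidden_size)   # hidden -> hidden (the recurrence)
W_hy = random_matrix(len(vocab), hidden_size)    # hidden -> output scores

# Feed the characters one at a time; the hidden state carries the history forward.
h = [0.0] * hidden_size
for ch in "hell":
    h, probs = rnn_step(one_hot(vocab.index(ch), len(vocab)), h, W_xh, W_hh, W_hy)
```

After the loop, `probs` holds the (untrained) model's probabilities for the character that follows "hell". Text can then be generated by sampling a character from `probs`, feeding it back in as the next input, and repeating.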
![Page 60: Stats5 Seminar: Machine Learningsmyth/courses/stats5/onlineslides/...Feb 20 John Brock Cylance, Inc Data Science and CyberSecurity Feb 27 Video Lecture (Kate Crawford) Microsoft Research](https://reader033.vdocuments.site/reader033/viewer/2022042909/5f3a94665c2c7b0fac0604e3/html5/thumbnails/60.jpg)
Output from a Model Learned on Shakespeare
KING LEAR: O, if you were a feeble sight, the courtesy of your law, Your sight and several breath, will wear the gods With his heads, and my hands are wonder'd at the deeds, So drop upon your lordship's head, and your opinion Shall be against your honour.
Second Senator: They are away this miseries, produced upon my soul, Breaking and strongly should be buried, when I perish The earth and thoughts of many states.
DUKE VINCENTIO: Well, your wit is in the care of side and that.
Examples from “The Unreasonable Effectiveness of Recurrent Neural Networks”, Andrej Karpathy, blog, http://karpathy.github.io/2015/05/21/rnn-effectiveness/
![Page 61: Stats5 Seminar: Machine Learningsmyth/courses/stats5/onlineslides/...Feb 20 John Brock Cylance, Inc Data Science and CyberSecurity Feb 27 Video Lecture (Kate Crawford) Microsoft Research](https://reader033.vdocuments.site/reader033/viewer/2022042909/5f3a94665c2c7b0fac0604e3/html5/thumbnails/61.jpg)
Output from a Model Learned on Cooking Recipes
From https://gist.github.com/nylki/1efbaa36635956d35bcc
![Page 62: Stats5 Seminar: Machine Learningsmyth/courses/stats5/onlineslides/...Feb 20 John Brock Cylance, Inc Data Science and CyberSecurity Feb 27 Video Lecture (Kate Crawford) Microsoft Research](https://reader033.vdocuments.site/reader033/viewer/2022042909/5f3a94665c2c7b0fac0604e3/html5/thumbnails/62.jpg)
Output from a Model Learned on Source Code
Examples from “The Unreasonable Effectiveness of Recurrent Neural Networks”, Andrej Karpathy, blog, http://karpathy.github.io/2015/05/21/rnn-effectiveness/
![Page 63: Stats5 Seminar: Machine Learningsmyth/courses/stats5/onlineslides/...Feb 20 John Brock Cylance, Inc Data Science and CyberSecurity Feb 27 Video Lecture (Kate Crawford) Microsoft Research](https://reader033.vdocuments.site/reader033/viewer/2022042909/5f3a94665c2c7b0fac0604e3/html5/thumbnails/63.jpg)
Output from a Model Learned on Mathematics Papers
Examples from “The Unreasonable Effectiveness of Recurrent Neural Networks”, Andrej Karpathy, blog, http://karpathy.github.io/2015/05/21/rnn-effectiveness/
![Page 64: Stats5 Seminar: Machine Learningsmyth/courses/stats5/onlineslides/...Feb 20 John Brock Cylance, Inc Data Science and CyberSecurity Feb 27 Video Lecture (Kate Crawford) Microsoft Research](https://reader033.vdocuments.site/reader033/viewer/2022042909/5f3a94665c2c7b0fac0604e3/html5/thumbnails/64.jpg)
Output from a Model Learned from US President Speeches
From https://medium.com/@samim/
![Page 65: Stats5 Seminar: Machine Learningsmyth/courses/stats5/onlineslides/...Feb 20 John Brock Cylance, Inc Data Science and CyberSecurity Feb 27 Video Lecture (Kate Crawford) Microsoft Research](https://reader033.vdocuments.site/reader033/viewer/2022042909/5f3a94665c2c7b0fac0604e3/html5/thumbnails/65.jpg)
Limitations of Classification Algorithms
![Page 66: Stats5 Seminar: Machine Learningsmyth/courses/stats5/onlineslides/...Feb 20 John Brock Cylance, Inc Data Science and CyberSecurity Feb 27 Video Lecture (Kate Crawford) Microsoft Research](https://reader033.vdocuments.site/reader033/viewer/2022042909/5f3a94665c2c7b0fac0604e3/html5/thumbnails/66.jpg)
A Deep Neural Network for Image Recognition
From Nguyen, Yosinski, Clune, CVPR 2015
![Page 67: Stats5 Seminar: Machine Learningsmyth/courses/stats5/onlineslides/...Feb 20 John Brock Cylance, Inc Data Science and CyberSecurity Feb 27 Video Lecture (Kate Crawford) Microsoft Research](https://reader033.vdocuments.site/reader033/viewer/2022042909/5f3a94665c2c7b0fac0604e3/html5/thumbnails/67.jpg)
A Deep Neural Network for Image Recognition
Images used for Training    New Images
From Nguyen, Yosinski, Clune, CVPR 2015
![Page 68: Stats5 Seminar: Machine Learningsmyth/courses/stats5/onlineslides/...Feb 20 John Brock Cylance, Inc Data Science and CyberSecurity Feb 27 Video Lecture (Kate Crawford) Microsoft Research](https://reader033.vdocuments.site/reader033/viewer/2022042909/5f3a94665c2c7b0fac0604e3/html5/thumbnails/68.jpg)
A Deep Neural Network for Image Recognition
From Nguyen, Yosinski, Clune, CVPR 2015
![Page 69: Stats5 Seminar: Machine Learningsmyth/courses/stats5/onlineslides/...Feb 20 John Brock Cylance, Inc Data Science and CyberSecurity Feb 27 Video Lecture (Kate Crawford) Microsoft Research](https://reader033.vdocuments.site/reader033/viewer/2022042909/5f3a94665c2c7b0fac0604e3/html5/thumbnails/69.jpg)
A Deep Neural Network for Image Recognition
From Nguyen, Yosinski, Clune, CVPR 2015
![Page 70: Stats5 Seminar: Machine Learningsmyth/courses/stats5/onlineslides/...Feb 20 John Brock Cylance, Inc Data Science and CyberSecurity Feb 27 Video Lecture (Kate Crawford) Microsoft Research](https://reader033.vdocuments.site/reader033/viewer/2022042909/5f3a94665c2c7b0fac0604e3/html5/thumbnails/70.jpg)
![Page 71: Stats5 Seminar: Machine Learningsmyth/courses/stats5/onlineslides/...Feb 20 John Brock Cylance, Inc Data Science and CyberSecurity Feb 27 Video Lecture (Kate Crawford) Microsoft Research](https://reader033.vdocuments.site/reader033/viewer/2022042909/5f3a94665c2c7b0fac0604e3/html5/thumbnails/71.jpg)
Schedule of Lectures

| Date | Speaker | Department or Organization | Topic |
| --- | --- | --- | --- |
| Jan 9 | Padhraic Smyth | Computer Science | Introduction to Data Science |
| Jan 16 | Padhraic Smyth | Computer Science | Machine Learning |
| Jan 23 | Michael Carey | Computer Science | Databases and Data Management |
| Jan 30 | Sameer Singh | Computer Science | Statistical Natural Language Processing |
| Feb 6 | Zhaoxia Yu | Statistics | An Introduction to Cluster Analysis |
| Feb 13 | Erik Sudderth | Computer Science | Computer Vision and Machine Learning |
| Feb 20 | John Brock | Cylance, Inc | Data Science and CyberSecurity |
| Feb 27 | Video Lecture (Kate Crawford) | Microsoft Research and NYU | Bias in Machine Learning |
| Mar 6 | Matt Harding | Economics | Data Science in Economics and Finance |
| Mar 13 | Padhraic Smyth | Computer Science | Review: Past and Future of Data Science |