Lecture 1 Recap · PowerPoint Presentation · Author: Laura LT · Created Date: 4/22/2018 5:59:36 PM
Lecture 1 Recap
Prof. Leal-Taixé and Prof. Niessner 1
Machine learning
• Unsupervised learning
• Supervised learning
• Reinforcement learning: agents interact with an environment
Nearest Neighbor
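• Compute the distance between the query and every training sample
• NN classifier: the query gets the label of the closest training sample (here: dog)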
Decision boundaries (figure courtesy of Stanford course cs231n)
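The nearest-neighbor rule can be sketched in a few lines of NumPy. This is a minimal illustration, not the course's implementation: the 2-D feature vectors, the labels, and the choice of the L2 (Euclidean) distance are all assumptions made for the example.

```python
import numpy as np

def nearest_neighbor_predict(train_x, train_labels, query):
    """Return the label of the training sample closest to `query` (L2 distance)."""
    distances = np.linalg.norm(train_x - query, axis=1)
    return train_labels[int(np.argmin(distances))]

# Hypothetical 2-D features for four training images.
train_x = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.2]])
train_labels = ["cat", "cat", "dog", "dog"]

# The query lies closest to the third training sample, so it is labeled "dog".
prediction = nearest_neighbor_predict(train_x, train_labels, np.array([0.8, 0.9]))
```

The decision boundary of such a classifier is the set of points that are equally far from the closest samples of two different classes.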
Linear Regression
• Supervised learning
• Find a linear model that explains a target given the inputs
Linear regression: training and testing
• Training: data points, i.e. inputs (e.g. image, measurement) with labels (e.g. cat/dog) → learner → model parameters
• Testing: data points + model parameters → predictor → estimation
Linear prediction
• A linear model is expressed in the form $\hat{y}_i = \theta_0 + \sum_{j=1}^{d} x_{ij}\,\theta_j$
• $x_{ij}$: input data, features ($d$ inputs per sample)
• $\theta_j$: weights, model parameters
• $\theta_0$: bias (the weight of a constant input equal to 1)
Linear prediction: example
• Output: temperature of a building
• Inputs: outside temperature, number of people, sun exposure, level of humidity
Linear prediction: input features × model parameters → prediction
Linear prediction: outside temperature, humidity, number of people, and sun exposure (%) → MODEL → temperature of the building
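As a rough sketch of what the MODEL box computes, the building example can be written as a bias plus a weighted sum of the inputs. All numbers below (feature values, weights, and bias) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np

# Hypothetical inputs: [outside temperature (°C), humidity (%), number of people, sun exposure (%)]
x = np.array([10.0, 40.0, 20.0, 50.0])
theta = np.array([0.5, 0.05, 0.1, 0.02])  # one weight per input feature (assumed values)
theta_0 = 12.0                            # bias (assumed value)

def predict(x, theta, theta_0):
    """Linear prediction: bias plus the weighted sum of the input features."""
    return theta_0 + float(np.dot(x, theta))

# 12 + (5 + 2 + 2 + 1) = 22 degrees for these made-up numbers.
temperature = predict(x, theta, theta_0)
```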
How to obtain the model?
• Data points and model parameters produce an estimation; the loss function compares it with the labels (ground truth), and the optimization updates the model parameters.
• Loss function: measures how good my estimation is (how good my model is) and tells the optimization method how to make it better.
• Optimization: changes the model in order to improve the loss function (i.e. to improve my estimation).
Linear regression: loss function
• Prediction: temperature of the building
• Minimizing the loss function, also called objective function, energy, or cost function
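As a sketch, assuming the usual mean-squared-error form of the least-squares loss, the quantity to minimize can be written as follows. The notation is an assumption made here: $n$ training samples, labels $y_i$, and predictions $\hat{y}_i$ from the linear model with parameters $\theta$.

```latex
J(\theta) = \frac{1}{n} \sum_{i=1}^{n} \left( \hat{y}_i - y_i \right)^2
```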
Optimization: linear least squares
• Linear least squares: an approach to fit a mathematical model to the data
• Convex problem: there exists a closed-form solution, which is unique as long as the input features are linearly independent
• The estimation comes from the linear model and is evaluated on the training samples
• Matrix notation: labels and training samples, each input vector has size $d$
• Thursday April 19th: exercise session, more on matrix notation
• Convex: a single optimum
Optimization
More info on Thursday!
No theta there!
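A closed-form least-squares solution can be sketched with NumPy. The snippet assumes the standard normal-equation form $\theta = (X^\top X)^{-1} X^\top y$, whose right-hand side contains only data and no theta. The data matrix `X`, the parameters `true_theta`, and the labels `y` are hypothetical.

```python
import numpy as np

# Hypothetical data: n = 5 samples, d = 2 features, generated from known parameters.
X = np.array([[1.0, 2.0], [2.0, 0.5], [3.0, 1.5], [4.0, 3.0], [5.0, 2.5]])
true_theta = np.array([2.0, -1.0])
y = X @ true_theta

# Normal equations: solve (X^T X) theta = X^T y instead of forming the inverse explicitly.
theta_hat = np.linalg.solve(X.T @ X, X.T @ y)
```

Solving the linear system is generally preferred over computing the matrix inverse, since it is cheaper and numerically more stable.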
Optimization
• Inputs: outside temperature, number of people…
• Output: temperature of the building
Is this the best estimate?
• Least squares estimate
Maximum Likelihood
Maximum Likelihood Estimate
• True underlying distribution of the data
• Parametric family of distributions, controlled by a parameter
• A method of estimating the parameters of a statistical model given observations from the true distribution (the training samples), by finding the parameter values that maximize the likelihood of making the observations given the parameters.
• Logarithmic property: the logarithm turns the product of per-sample likelihoods into a sum
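A sketch of the maximum likelihood estimate follows, assuming $n$ independent training samples $(x_i, y_i)$ and a model distribution $p(y \mid x, \theta)$. The symbol names are assumptions made for this illustration. Because the logarithm is monotonically increasing, taking it does not change the maximizer.

```latex
\theta_{ML} = \arg\max_{\theta} \prod_{i=1}^{n} p(y_i \mid x_i, \theta)
            = \arg\max_{\theta} \sum_{i=1}^{n} \log p(y_i \mid x_i, \theta)
```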
Back to linear regression
• Assuming the labels follow a Gaussian (Normal) distribution whose mean is the linear prediction
• Plug this assumption into the original optimization problem
• Matrix notation: the log and e cancel
• How can we find the estimate of theta?
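The steps above can be sketched as a short derivation. It assumes Gaussian noise with a fixed variance $\sigma^2$ around the linear prediction $x_i\theta$, with $X$ stacking the inputs $x_i$ as rows and $y$ stacking the labels. These symbols are assumptions made for this illustration.

```latex
\begin{aligned}
p(y_i \mid x_i, \theta) &= \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left( -\frac{(y_i - x_i\theta)^2}{2\sigma^2} \right) \\
\sum_{i=1}^{n} \log p(y_i \mid x_i, \theta) &= -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - x_i\theta)^2 \\
&= -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2} (y - X\theta)^\top (y - X\theta) \\
\frac{\partial}{\partial \theta}\left[\,\cdot\,\right] = 0 \;&\Rightarrow\; X^\top X\,\theta = X^\top y \;\Rightarrow\; \theta = (X^\top X)^{-1} X^\top y
\end{aligned}
```

Maximizing this log-likelihood over $\theta$ is the same as minimizing the sum of squared errors, which is why the two estimates coincide under these assumptions.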
Linear regression
• Maximum Likelihood Estimate (MLE) corresponds to the Least Squares Estimate (given the assumptions)
• Introduced the concepts of loss function and optimization to obtain the best model for regression
Image classification
Regression vs. Classification
• Regression: predicts a continuous output value (e.g. temperature of a room)
• Classification: predicts a discrete value
– Binary classification: output is either 0 or 1
– Multi-class classification: set of N classes
Logistic regression
CAT classifier
Sigmoid for binary predictions
• The output lies between 0 and 1, so it can be interpreted as a probability
• Spoiler alert: this is a 1 layer Neural Network
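A minimal sketch of the sigmoid applied to a linear score is shown below. The feature vector `x` and the weights `theta` are hypothetical values chosen for the example.

```python
import numpy as np

def sigmoid(z):
    """Logistic sigmoid: maps any real score to a value in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical features and weights for a binary (e.g. cat / not cat) classifier.
x = np.array([0.5, -1.0, 2.0])
theta = np.array([1.0, 0.5, 0.25])

# Linear score 0.5 - 0.5 + 0.5 = 0.5, squashed to a probability of about 0.62.
probability = sigmoid(np.dot(x, theta))
```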
Logistic regression
• Probability of a binary output
• Model for coins: Bernoulli trial
Logistic regression: loss function
• Probability of a binary output
• Maximum Likelihood Estimate: maximize!
• When the label is 1: we want the log of the estimate $\hat{y}_i$ to be large; since the logarithm is a monotonically increasing function, we also want $\hat{y}_i$ to be large (1 is the largest value my estimate can take!)
• When the label is 0: we want the log of $1 - \hat{y}_i$ to be large, so we want $\hat{y}_i$ to be small (0 is the smallest value my estimate can take!)
• Referred to as cross-entropy loss
• Related to the multi-class loss you will see in this course (also called softmax loss)
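The binary cross-entropy loss can be sketched as follows. The function name, the clipping constant `eps`, and the sample labels and predictions are assumptions made for this illustration; the loss is written as a negative mean log-likelihood so that it is minimized rather than maximized.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean negative log-likelihood of Bernoulli labels (cross-entropy loss)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)  # avoid log(0)
    per_sample = y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)
    return float(-np.mean(per_sample))

y_true = np.array([1.0, 0.0, 1.0, 0.0])
# Estimates close to the labels give a low loss, estimates far from them a high loss.
good = binary_cross_entropy(y_true, np.array([0.9, 0.1, 0.8, 0.2]))
bad = binary_cross_entropy(y_true, np.array([0.2, 0.8, 0.1, 0.9]))
```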
Logistic regression: optimization
• Loss function
• Cost function (to be minimized)
• No closed-form solution
• Make use of an iterative method: gradient descent
Gradient descent
Following the slope
• Start from an initialization
• Follow the slope of the DERIVATIVE towards the optimum
Gradient steps
• From derivative to gradient: the gradient points in the direction of greatest increase of the function
• Gradient steps in direction of negative gradient, scaled by the learning rate
• SMALL learning rate: small steps; LARGE learning rate: large steps
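The effect of the learning rate can be illustrated on a toy problem. The function $f(x) = x^2$, the starting point, and the three learning rates are hypothetical choices made for this sketch; they are not taken from the slides.

```python
def gradient_descent(grad, x0, learning_rate, steps):
    """Repeatedly step against the gradient; returns the final iterate."""
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x

# Toy function f(x) = x^2 with derivative 2x and optimum at x = 0.
grad = lambda x: 2.0 * x

small = gradient_descent(grad, x0=1.0, learning_rate=0.01, steps=20)   # slow progress
medium = gradient_descent(grad, x0=1.0, learning_rate=0.4, steps=20)   # converges quickly
large = gradient_descent(grad, x0=1.0, learning_rate=1.1, steps=20)    # overshoots and diverges
```

With the small rate the iterate is still far from the optimum after 20 steps, while the too-large rate moves it further away with every step.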
Convergence
• What is the gradient when we reach this point?
• Not guaranteed to reach the optimum
Numerical gradient
• Approximate
• Slow evaluation
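A numerical gradient can be sketched with central differences. The step size `h` and the toy function are assumptions made for this example. The loop shows why evaluation is slow: every parameter needs two extra function evaluations.

```python
import numpy as np

def numerical_gradient(f, theta, h=1e-5):
    """Central-difference approximation; needs two evaluations of f per parameter."""
    grad = np.zeros_like(theta, dtype=float)
    for j in range(theta.size):
        step = np.zeros_like(theta, dtype=float)
        step[j] = h
        grad[j] = (f(theta + step) - f(theta - step)) / (2.0 * h)
    return grad

f = lambda t: float(np.sum(t ** 2))   # toy function with known gradient 2 * t
theta = np.array([1.0, -2.0, 0.5])
approx = numerical_gradient(f, theta)
```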
Analytical gradient
• Exact and fast
• Recall Linear Regression: compute the analytical gradient and take a step in that direction
Gradient descent for least squares
• Convex: always converges to the same solution (given a suitably small learning rate)
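A minimal sketch of gradient descent on a least-squares problem follows. It assumes the mean-squared-error cost $J(\theta) = \frac{1}{n}\lVert X\theta - y\rVert^2$; the data, the learning rate, and the number of steps are hypothetical. Two different initializations end up at the same parameters, as expected for a convex problem.

```python
import numpy as np

# Hypothetical data generated from known parameters.
X = np.array([[1.0, 2.0], [2.0, 0.5], [3.0, 1.5], [4.0, 3.0], [5.0, 2.5]])
true_theta = np.array([2.0, -1.0])
y = X @ true_theta

def least_squares_gd(X, y, theta_init, learning_rate=0.02, steps=5000):
    """Gradient descent on J(theta) = mean((X theta - y)^2)."""
    theta = theta_init.astype(float)
    n = X.shape[0]
    for _ in range(steps):
        gradient = (2.0 / n) * (X.T @ (X @ theta - y))  # analytical gradient of J
        theta = theta - learning_rate * gradient
    return theta

theta_a = least_squares_gd(X, y, np.zeros(2))
theta_b = least_squares_gd(X, y, np.array([10.0, -10.0]))
```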
Gradient descent for logistic regression
• Explained in the following lectures, when the following topics are introduced:
– Computational graphs
– Backpropagation of the gradients
Regularization
Regularization techniques added on top of the loss:
• L2 regularization
• L1 regularization
• Max norm regularization
• Dropout
Regularization: example
• Input: 3 features
• Two linear classifiers that give the same result:
– One ignores 2 features: L1 regularization enforces sparsity
– One takes information from all features: L2 regularization enforces that the weights have similar values
• Minimizing the loss plus an L2 or an L1 regularization term decides which of the two is preferred
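The example can be made concrete with a small sketch. The input `x` and the two weight vectors are hypothetical numbers picked so that both classifiers produce the same output; the slide's own values are not reproduced here.

```python
import numpy as np

# Hypothetical numbers: 3 input features and two weight vectors with the same output.
x = np.array([1.0, 2.0, 1.0])
theta_sparse = np.array([0.0, 0.75, 0.0])   # ignores 2 features
theta_spread = np.array([0.25, 0.5, 0.25])  # takes information from all features

def l1_penalty(theta):
    """Sum of absolute weights."""
    return float(np.sum(np.abs(theta)))

def l2_penalty(theta):
    """Sum of squared weights."""
    return float(np.sum(theta ** 2))

same_output = bool(np.isclose(x @ theta_sparse, x @ theta_spread))        # both give 1.5
l1_prefers_sparse = l1_penalty(theta_sparse) < l1_penalty(theta_spread)   # 0.75 < 1.0
l2_prefers_spread = l2_penalty(theta_spread) < l2_penalty(theta_sparse)   # 0.375 < 0.5625
```

Since the data term is identical for both classifiers, only the regularization term differs: the L1 penalty is lower for the sparse weights, and the L2 penalty is lower for the evenly spread weights.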
Regularization: effect
• Dog classifier takes different inputs: furry, has two eyes, has a tail, has paws, has two ears
• L1 regularization will focus all the attention on a few key features
• L2 regularization will take all information into account to make decisions
Regularization
• What is the goal of regularization?
• What happens to the training error?
• Any strategy that aims to lower the validation error, possibly at the cost of increasing the training error
Credits: University of Washington
Beyond linear
This lecture Next lectures
Next lectures
• Next Monday: Introduction to Neural Networks
• This Thursday: Math refresh #2, particularly useful for the jump to NN!
Useful references
• General Machine Learning book:
– Pattern Recognition and Machine Learning. C. Bishop.