Introduction to Deep Learning - University of Toronto (fidler/slides/2015/CSC420/lecture15.pdf)
TRANSCRIPT
Introduction to Deep Learning
A. G. Schwing & S. Fidler
University of Toronto, 2015
A. G. Schwing & S. Fidler (UofT) CSC420: Intro to Image Understanding 2015 1 / 39
Outline
1 Universality of Neural Networks
2 Learning Neural Networks
3 Deep Learning
4 Applications
5 References
What are neural networks?
Let’s ask
• Biological
• Computational
[Figure: a feed-forward network with four inputs (Input #1 to Input #4), one hidden layer, and a single output; the layers are labelled input layer, hidden layer, and output layer]
What are neural networks?
...Neural networks (NNs) are computational models inspired by biological neural networks [...] and are used to estimate or approximate functions... [Wikipedia]
What are neural networks?
Origins:
Traced back to threshold logic [W. McCulloch and W. Pitts, 1943]
Perceptron [F. Rosenblatt, 1958]
What are neural networks? Use cases
Classification
Playing video games
Captcha
Neural Turing Machine (e.g., learn how to sort) [Alex Graves]
http://www.technologyreview.com/view/532156/googles-secretive-deepmind-startup-unveils-a-neural-turing-machine/
What are neural networks?
Example: input x, parameters w1, w2, b
[Figure: x ∈ R feeds a hidden unit h1 (weight w1, bias b ∈ R), which feeds the output f (weight w2)]
How to compute the function?
Forward propagation/pass, inference, prediction:
Given input x and parameters w, b
Compute (latent variables/) intermediate results in a feed-forward manner
Until we obtain the output function f
[Figure: the same network x → h1 → f with weights w1, w2 and bias b]
How to compute the function?
Example: input x, parameters w1, w2, b
h1 = σ(w1 · x + b)
f = w2 · h1
Sigmoid function: σ(z) = 1/(1 + exp(−z))
x = ln 2, b = ln 3, w1 = 2, w2 = 2. h1 = ? f = ?
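The forward pass above can be sketched in a few lines of Python (a sanity check, not part of the original slides):

```python
import math

def sigmoid(z):
    # sigma(z) = 1 / (1 + exp(-z))
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, w2, b):
    # Forward pass of the one-hidden-unit network:
    # h1 = sigma(w1 * x + b), f = w2 * h1
    h1 = sigmoid(w1 * x + b)
    f = w2 * h1
    return h1, f

# The slide's numbers: x = ln 2, b = ln 3, w1 = 2, w2 = 2
h1, f = forward(math.log(2), 2.0, 2.0, math.log(3))
print(h1, f)
```

With these numbers, w1 · x + b = 2 ln 2 + ln 3 = ln 12, so h1 = σ(ln 12) = 12/13 ≈ 0.923 and f = 24/13 ≈ 1.846.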
How to compute the function?
Given parameters, what is f for x = 0, x = 1, x = 2, ...
f = w2σ(w1 · x + b)
[Plot: f = w2 σ(w1 · x + b) over x ∈ [−5, 5]]
Let’s mess with parameters:
h1 = σ(w1 · x + b)
f = w2 · h1
σ(z) = 1/(1 + exp(−z))
[Plot, left: w1 = 1.0 fixed, b changes (b = −2, b = 0, b = 2); the sigmoid shifts along x]
[Plot, right: b = 0 fixed, w1 changes (w1 = 0, 0.5, 1.0, 100); the sigmoid flattens out or sharpens toward a step]
Keep in mind the step function.
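The effect of b and w1 can be checked numerically; this sketch simply evaluates f = w2 σ(w1 · x + b) at a few points:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def f(x, w1=1.0, b=0.0, w2=1.0):
    # f = w2 * sigma(w1 * x + b)
    return w2 * sigmoid(w1 * x + b)

# b shifts the curve: sigma(x + b) crosses 0.5 at x = -b
mid = f(2.0, b=-2.0)        # exactly 0.5
# a large w1 sharpens the sigmoid toward a step function
right = f(0.1, w1=100.0)    # ~1 just right of the step
left = f(-0.1, w1=100.0)    # ~0 just left of the step
print(mid, right, left)
```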
How to use Neural Networks for binary classification?
Feature/Measurement: x
Output: How likely is the input to be a cat?
[Plot: sigmoid-shaped output y over x ∈ [−5, 5]]
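Reading the output as a probability, a hard decision just thresholds it at 0.5; the weight and bias below are placeholder values, not learned parameters:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def p_cat(x, w=1.0, b=0.0):
    # Network output read as P(cat | x); w and b are placeholders
    return sigmoid(w * x + b)

def classify(x, w=1.0, b=0.0):
    # Threshold the probability at 0.5 for a hard decision
    return "cat" if p_cat(x, w, b) >= 0.5 else "not cat"

print(classify(3.0))
print(classify(-3.0))
```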
How to use Neural Networks for binary classification?
Shifted feature/measurement: x
Output: How likely is the input to be a cat?
[Plots: previous features vs. shifted features; the sigmoid has to shift along x accordingly]
Learning/Training means finding the right parameters.
So far we are able to scale and translate sigmoids.
How well can we approximate an arbitrary function? With the simple model we are obviously not going very far.
[Plots: where features are good, a simple classifier suffices; where features are noisy, a more complex classifier is needed]
How can we generalize?
Let’s use more hidden variables:
[Figure: x ∈ R feeds two hidden units h1 (weight w1, bias b1) and h2 (weight w3, bias b2); both feed the output f with weights w2 and w4]
h1 = σ(w1 · x + b1)
h2 = σ(w3 · x + b2)
f = w2 · h1 + w4 · h2
Combining two step functions gives a bump.
[Plot: f over x ∈ [−5, 5], a bump rising from 1 to 2 and back]
w1 = −100, b1 = 40, w3 = 100, b2 = 60, w2 = 1, w4 = 1
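With these parameters, h1 steps down at x = 0.4 and h2 steps up at x = −0.6, so their sum forms a bump; a quick numerical check:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def f(x):
    # Two steep sigmoids with the slide's parameters, summed with w2 = w4 = 1
    h1 = sigmoid(-100.0 * x + 40.0)   # ~1 for x < 0.4, ~0 for x > 0.4
    h2 = sigmoid(100.0 * x + 60.0)    # ~1 for x > -0.6, ~0 for x < -0.6
    return 1.0 * h1 + 1.0 * h2

print(f(-5.0))  # ~1 (outside the bump)
print(f(0.0))   # ~2 (inside the bump)
print(f(5.0))   # ~1 (outside the bump)
```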
So let’s simplify:
[Figure: the two-hidden-unit network is collapsed into a single node f = Bump(x1, x2, h)]
We simplify a pair of hidden nodes to a “bump” function:
Starts at x1
Ends at x2
Has height h
Now we can represent “bumps” very well. How can we generalize?
[Figure: the output f sums five bump units: Bump(0.0, 0.2, h1), Bump(0.2, 0.4, h2), Bump(0.4, 0.6, h3), Bump(0.6, 0.8, h4), Bump(0.8, 1.0, h5)]
[Plot: a target function on [0, 1] and its piecewise-constant approximation by the five bumps]
More bumps give a more accurate approximation.
Corresponds to a single-layer network.
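This construction can be sketched directly, with steep sigmoids standing in for the steps and a hypothetical target x² on [0, 1] (the slides do not specify the target):

```python
import math

def sigmoid(z):
    # Numerically stable sigmoid (avoids overflow for large negative z)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def bump(x, x1, x2, h, k=1000.0):
    # Bump of height h on [x1, x2], built from two steep sigmoids
    return h * (sigmoid(k * (x - x1)) - sigmoid(k * (x - x2)))

def target(x):
    # Hypothetical target function, just for illustration
    return x * x

def approx(x):
    # Five bumps on [0, 1], each set to the target's value at the
    # interval midpoint: a piecewise-constant approximation
    edges = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
    return sum(bump(x, a, b, target((a + b) / 2))
               for a, b in zip(edges, edges[1:]))

print(approx(0.1), target(0.1))
print(approx(0.5), target(0.5))
```

More (narrower) bumps would shrink the gap between approx and target, which is the universality argument in miniature.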
Universality: theoretically we can approximate an arbitrary function
So we can learn a really complex cat classifier
Where is the catch?
Complexity: we might need quite a few hidden units
Overfitting: we might just memorize the training data
Generalizations are possible to
include more input dimensions
capture more output dimensions
employ multiple layers for more efficient representations
See http://neuralnetworksanddeeplearning.com/chap4.html for a great read!
How do we find the parameters to obtain a good approximation? How do we tell a computer to do that?
Intuitive explanation:
Compute the approximation error at the output
Propagate the error back by computing the individual contributions of the parameters to the error
[Fig. from H. Lee]
Example for backpropagation of error:
Target function: 5x²
Approximation: f(x, w)
Domain of interest: x ∈ {0, 1, 2, 3}
Error:
e(w) = Σ_{x ∈ {0,1,2,3}} (5x² − f(x, w))²
Program of interest:
min_w e(w) = min_w Σ_{x ∈ {0,1,2,3}} (5x² − f(x, w))², where ℓ(x, w) = (5x² − f(x, w))² denotes the summand
How to optimize? Gradient descent
Gradient descent: min_w e(w)
Algorithm: start with w0, t = 0
1 Compute gradient g_t = ∂e/∂w |_{w = w_t}
2 Update w_{t+1} = w_t − η g_t
3 Set t ← t + 1
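A minimal sketch of this loop on the error e(w) above, assuming a hypothetical one-parameter model f(x, w) = w · x² (the slides leave f unspecified):

```python
def f(x, w):
    # Hypothetical one-parameter model (an assumption for illustration)
    return w * x * x

def grad_e(w):
    # de/dw = sum_x d/dw (5x^2 - f(x, w))^2 = sum_x -2 (5x^2 - w x^2) x^2
    return sum(-2.0 * (5.0 * x * x - w * x * x) * x * x
               for x in (0, 1, 2, 3))

# Gradient descent: w_{t+1} = w_t - eta * g_t
w, eta = 0.0, 0.005
for t in range(50):
    w = w - eta * grad_e(w)

print(w)  # converges to w = 5, where f(x, w) = 5 x^2 exactly
```

With this f, the error is minimized at w = 5, and the iterates contract toward it geometrically for a small enough step size η.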
Chain rule is important to compute gradients:
min_w e(w) = min_w Σ_{x ∈ {0,1,2,3}} (5x² − f(x, w))², where ℓ(x, w) = (5x² − f(x, w))²
Loss function ℓ(x, w), e.g.:
Squared loss
Log loss
Hinge loss
Derivatives:
∂e(w)/∂w = Σ_{x ∈ {0,1,2,3}} ∂ℓ(x, w)/∂w = Σ_{x ∈ {0,1,2,3}} (∂ℓ(x, w)/∂f) · (∂f(x, w)/∂w)
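The factorization ∂ℓ/∂w = (∂ℓ/∂f) · (∂f/∂w) can be checked against a finite difference; f(x, w) = w · x² is again just an assumed stand-in:

```python
def f(x, w):
    return w * x * x          # assumed model; df/dw = x^2

def loss(x, w):
    # Squared loss from the slide: (5x^2 - f(x, w))^2
    return (5.0 * x * x - f(x, w)) ** 2

def dloss_dw(x, w):
    dl_df = -2.0 * (5.0 * x * x - f(x, w))  # d loss / d f
    df_dw = x * x                            # d f / d w
    return dl_df * df_dw                     # chain rule

# Check against a central finite difference at x = 2, w = 1
x, w, eps = 2.0, 1.0, 1e-6
numeric = (loss(x, w + eps) - loss(x, w - eps)) / (2 * eps)
print(dloss_dw(x, w), numeric)
```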
Slightly more complex example: a composite function represented as a directed acyclic graph
ℓ(x, w) = f1(w1, f2(w2, f3(. . .)))
[Figure: chain of nodes f1(w1, f2) → f2(w2, f3) → f3(. . .), with gradients ∂f1/∂w1 and ∂f1/∂f2 at f1, and (∂f1/∂f2) · (∂f2/∂w2), (∂f1/∂f2) · (∂f2/∂f3) propagated onward]
Repeated application of the chain rule allows efficient computation of all gradients
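A tiny sketch of this idea: the forward pass caches intermediates, and the backward pass reuses ∂f1/∂f2; the concrete choices f2 = σ(w2 · x) and f1 = w1 · f2 are illustrative assumptions, not the slide's functions:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward_backward(x, w1, w2):
    # Forward pass: cache intermediates
    f2 = sigmoid(w2 * x)
    f1 = w1 * f2
    # Backward pass: repeated chain rule, reusing df1/df2
    df1_dw1 = f2
    df1_df2 = w1
    df2_dw2 = f2 * (1.0 - f2) * x   # sigmoid'(z) = sigma(z)(1 - sigma(z))
    df1_dw2 = df1_df2 * df2_dw2     # chain rule through f2
    return f1, df1_dw1, df1_dw2

f1, g1, g2 = forward_backward(x=1.0, w1=2.0, w2=0.0)
print(f1, g1, g2)
```

The key efficiency point: ∂f1/∂f2 is computed once and reused for every parameter deeper in the graph.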
Backpropagation doesn't work well for deep sigmoid networks:
- Diffusion of the gradient signal (multiplication of many small numbers)
- Attraction to many poor local minima (random initialization is very far from good points)
- Requires a lot of training samples
- Needs significant computational power

Solution: a 2-step approach
- Greedy layer-wise pre-training
- Full fine-tuning at the end
Why go deep?
- Representation efficiency (fewer computational units for the same function)
- Hierarchical representation (non-local generalization)
- Combinatorial sharing (re-use of earlier computation)
- Works very well
[Fig. from H. Lee]
To obtain more flexibility/non-linearity we use additional function prototypes:
- Sigmoid
- Rectified linear unit (ReLU)
- Pooling
- Dropout
- Convolutions
Convolutions
What do the numbers mean?
See Sanja's lecture 14 for the answers...

[Fig. adapted from A. Krizhevsky]
[Figure: fconv(image, filter) = response map]
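As a rough sketch of what fconv computes (a "valid" sliding-window cross-correlation, as used in convolutional layers; the exact variant is not specified here):

```python
import numpy as np

def fconv(image, kernel):
    """'Valid' 2D cross-correlation: slide the kernel over the image
    and take the inner product at every position."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out
```

A diagonal-difference kernel such as [[1, 0], [0, -1]] responds to diagonal intensity changes; in a conv layer the kernel entries are learned rather than hand-picked.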
![Page 62: Introduction to Deep Learning - University of Torontofidler/slides/2015/CSC420/lecture15.pdf · Introduction to Deep Learning A. G. Schwing & S. Fidler University of Toronto, 2015](https://reader031.vdocuments.site/reader031/viewer/2022030419/5aa5ff277f8b9afa758df6f0/html5/thumbnails/62.jpg)
Max Pooling
What is happening here?

[Fig. adapted from A. Krizhevsky]
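A minimal sketch of max pooling: each output cell keeps only the largest activation in its window (the 2x2 window and stride 2 defaults are an assumption, not fixed by the slide).

```python
import numpy as np

def max_pool(x, size=2, stride=2):
    """Keep the maximum of each size x size window, moving by `stride`."""
    H, W = x.shape
    oh = (H - size) // stride + 1
    ow = (W - size) // stride + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = x[i * stride:i * stride + size,
                          j * stride:j * stride + size].max()
    return out
```

Pooling shrinks the spatial resolution while keeping the strongest responses, which buys a degree of translation invariance.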
Rectified Linear Unit (ReLU)
- Drops information if smaller than zero
- Fixes the problem of vanishing gradients to some degree

Dropout
- Drops information at random
- A kind of regularization, enforcing redundancy
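Both operations are essentially one-liners; a sketch (the "inverted" scaling by 1/(1−p) is a common implementation choice, not stated on the slide):

```python
import numpy as np

def relu(x):
    """Drop (zero out) everything smaller than zero."""
    return np.maximum(0.0, x)

def dropout(x, p=0.5, train=True, rng=np.random.default_rng(0)):
    """Zero each entry with probability p at training time; 'inverted'
    scaling by 1/(1 - p) keeps the expected activation unchanged,
    so no rescaling is needed at test time."""
    if not train:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)
```

Because the surviving activations are scaled up during training, the test-time network can simply skip the dropout step.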
A famous deep learning network called “AlexNet.”
The network won the ImageNet competition in 2012.
- How many parameters?
- Given an image, what is happening?
- Inference time: about 2 ms per image when processing many images in parallel on the GPU
- Training time: a few days on a single recent GPU
[Fig. adapted from A. Krizhevsky]
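To start answering "how many parameters?": a conv layer holds one kernel per output channel plus a bias. A sketch, using the commonly cited shape of AlexNet's first layer (96 filters of 11x11 over 3 input channels):

```python
def conv_params(out_channels, in_channels, kh, kw):
    """Parameter count of a conv layer: each of the out_channels filters
    holds in_channels * kh * kw weights plus one bias."""
    return out_channels * (in_channels * kh * kw + 1)

print(conv_params(96, 3, 11, 11))  # 34944
```

Counting all layers this way shows the bulk of AlexNet's roughly 60 million parameters actually sits in the fully connected layers, not the convolutions.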
Demo
Neural networks have been used for many applications:
- Classification and Recognition in Computer Vision
- Text Parsing in Natural Language Processing
- Playing Video Games
- Stock Market Prediction
- Captcha

Demos:
- Russ's website
- Antonio's Places website
Classification in Computer Vision: ImageNet Challenge
http://deeplearning.cs.toronto.edu/
Since it’s the end of the semester, let’s find the beach...
Classification in Computer Vision: ImageNet Challenge
http://deeplearning.cs.toronto.edu/
A place to maybe prepare for exams...
Links:
- Tutorials: http://deeplearning.net/tutorial/deeplearning.pdf
- Toronto Demo by Russ and students: http://deeplearning.cs.toronto.edu/
- MIT Demo by Antonio and students: http://places.csail.mit.edu/demo.html
- Honglak Lee: http://deeplearningworkshopnips2010.files.wordpress.com/2010/09/nips10-workshop-tutorial-final.pdf
- Yann LeCun: http://www.cs.nyu.edu/~yann/talks/lecun-ranzato-icml2013.pdf
- Richard Socher: http://lxmls.it.pt/2014/socher-lxmls.pdf
Videos:
- Video games: https://www.youtube.com/watch?v=mARt-xPablE
- Captcha: http://singularityhub.com/2013/10/29/tiny-ai-startup-vicarious-says-its-solved-captcha/ and https://www.youtube.com/watch?v=lge-dl2JUAM#t=27
- Stock exchange: http://cs.stanford.edu/people/eroberts/courses/soco/projects/neural-networks/Applications/stocks.html