unsupervised learning - ntu speech processing...
TRANSCRIPT
![Page 1: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/1.jpg)
Unsupervised Learning:Deep Auto-encoder
![Page 2: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/2.jpg)
Unsupervised Learning
“We expect unsupervised learning to become far more
important in the longer term. Human and animal learning is
largely unsupervised: we discover the structure of the world by observing it, not by being told the name of every object.”
– LeCun, Bengio, Hinton, Nature 2015
As I've said in previous statements: most of human and animal
learning is unsupervised learning. If intelligence was a cake,
unsupervised learning would be the cake, supervised learning
would be the icing on the cake, and reinforcement learning
would be the cherry on the cake. We know how to make the
icing and the cherry, but we don't know how to make the cake.
- Yann LeCun, March 14, 2016 (Facebook)
![Page 3: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/3.jpg)
Auto-encoder
NNEncoder
NNDecoder
codeCompact
representation of the input object
codeCan reconstruct
the original object
Learn together28 X 28 = 784
Usually <784
![Page 4: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/4.jpg)
Recap: PCA
𝑥
Input layer
𝑊
ො𝑥
𝑊𝑇
output layerhidden layer
(linear)
𝑐
As close as possible
Minimize 𝑥 − ො𝑥 2
Bottleneck later
Output of the hidden layer is the code
encode decode
![Page 5: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/5.jpg)
Initialize by RBM layer-by-layer
Reference: Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. "Reducing the dimensionality of data with neural networks." Science 313.5786 (2006): 504-507
• Of course, the auto-encoder can be deep
Deep Auto-encoder
Inp
ut Layer
Layer
Layer
bo
ttle
Ou
tpu
t Layer
Layer
Layer
Layer
Layer
… …
Code
As close as possible
𝑥 ො𝑥
𝑊1𝑊1
𝑇𝑊2
𝑊2𝑇
Symmetric is not necessary.
![Page 6: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/6.jpg)
Deep Auto-encoder
Original Image
PCA
DeepAuto-encoder
78
4
78
4
78
4
10
00
50
0
25
0
30
25
0
50
0
10
00
78
4
30
![Page 7: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/7.jpg)
78
4
78
4
78
4
10
00
50
0
25
0
2
2
25
0
50
0
10
00
78
4
![Page 8: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/8.jpg)
Auto-encoder
• De-noising auto-encoder
𝑥 ො𝑥𝑐
encode decode
Add noise
𝑥′
As close as possible
More: Contractive auto-encoderRef: Rifai, Salah, et al. "Contractive
auto-encoders: Explicit invariance
during feature extraction.“ Proceedings
of the 28th International Conference on
Machine Learning (ICML-11). 2011.
Vincent, Pascal, et al. "Extracting and composing robust features with denoising autoencoders." ICML, 2008.
![Page 9: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/9.jpg)
Deep Auto-encoder - Example
Pixel -> tSNE
𝑐NN
Encoder
PCA 降到32-dim
![Page 10: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/10.jpg)
Auto-encoder – Text Retrieval
word string:“This is an apple”
…
this
is
a
an
apple
pen
1
1
0
1
1
0
Bag-of-word
Semantics are not considered.
Vector Space Model
document
query
![Page 11: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/11.jpg)
Auto-encoder – Text Retrieval
Bag-of-word(document or query)
query
LSA: project documents to 2 latent topics
2000
500
250
125
2
The documents talking about the same thing will have close code.
![Page 12: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/12.jpg)
Auto-encoder –Similar Image Search
Retrieved using Euclidean distance in pixel intensity space
(Images from Hinton’s slides on Coursera)
Reference: Krizhevsky, Alex, and Geoffrey E. Hinton. "Using very deep
autoencoders for content-based image retrieval." ESANN. 2011.
![Page 13: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/13.jpg)
Auto-encoder –Similar Image Search
32x32
81
92
40
96
20
48
10
24
51
2
25
6
code
(crawl millions of images from the Internet)
![Page 14: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/14.jpg)
retrieved using 256 codes
Retrieved using Euclidean distance in pixel intensity space
![Page 15: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/15.jpg)
Auto-encoder for CNN
Convolution
Pooling
Convolution
Pooling
Deconvolution
Unpooling
Deconvolution
Unpooling
As close as possible
Deconvolution
code
![Page 16: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/16.jpg)
CNN -Unpooling
14 x 14 28 x 28
Source of image : https://leonardoaraujosantos.gitbooks.io/artificial-inteligence/content/image_segmentation.html
Alternative: simply repeat the values
![Page 17: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/17.jpg)
CNN - Deconvolution
=
Actually, deconvolution is convolution.
+
+ +
+
![Page 18: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/18.jpg)
Auto-encoder – Pre-training DNN
• Greedy Layer-wise Pre-training again
Targ
et
784
1000
1000
10
Input
output
Input 784
1000
784
W1
𝑥
ො𝑥
500
W1’
![Page 19: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/19.jpg)
Auto-encoder – Pre-training DNN
• Greedy Layer-wise Pre-training again
Targ
et
784
1000
1000
500
10
Input
output
Input 784
1000
W1
1000
1000
fix
𝑥
𝑎1
ො𝑎1
W2
W2’
![Page 20: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/20.jpg)
Auto-encoder – Pre-training DNN
• Greedy Layer-wise Pre-training again
Targ
et
784
1000
1000
10
Input
output
Input 784
1000
W1
1000
fix
𝑥
𝑎1
ො𝑎2
W2fix
𝑎2
1000
W3
500 500W3’
![Page 21: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/21.jpg)
Auto-encoder – Pre-training DNN
• Greedy Layer-wise Pre-training again
Targ
et
784
1000
1000
10
Input
output
Input 784
1000
W1
1000
𝑥
W2
W3
500 500
10output
W4Random
init
Find-tune by backpropagation
![Page 22: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/22.jpg)
Learning More - Restricted Boltzmann Machine• Neural networks [5.1] : Restricted Boltzmann machine –
definition• https://www.youtube.com/watch?v=p4Vh_zMw-
HQ&index=36&list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH
• Neural networks [5.2] : Restricted Boltzmann machine –inference
• https://www.youtube.com/watch?v=lekCh_i32iE&list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH&index=37
• Neural networks [5.3] : Restricted Boltzmann machine - free energy
• https://www.youtube.com/watch?v=e0Ts_7Y6hZU&list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH&index=38
![Page 23: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/23.jpg)
Learning More - Deep Belief Network• Neural networks [7.7] : Deep learning - deep belief network
• https://www.youtube.com/watch?v=vkb6AWYXZ5I&list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH&index=57
• Neural networks [7.8] : Deep learning - variational bound
• https://www.youtube.com/watch?v=pStDscJh2Wo&list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH&index=58
• Neural networks [7.9] : Deep learning - DBN pre-training
• https://www.youtube.com/watch?v=35MUlYCColk&list=PL6Xpj9I5qXYEcOhn7TqghAJ6NAPrNmUBH&index=59
![Page 24: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/24.jpg)
Next …..
• Can we use decoder to generate something?
NNDecoder
code
![Page 25: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/25.jpg)
Next …..
• Can we use decoder to generate something?
NNDecoder
code
![Page 26: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/26.jpg)
Appendix
![Page 27: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/27.jpg)
Pokémon
• http://140.112.21.35:2880/~tlkagk/pokemon/pca.html
• http://140.112.21.35:2880/~tlkagk/pokemon/auto.html
• The code is modified from
• http://jkunst.com/r/pokemon-visualize-em-all/
![Page 28: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/28.jpg)
Add: Ladder Network
• http://rinuboney.github.io/2016/01/19/ladder-network.html
• https://mycourses.aalto.fi/pluginfile.php/146701/mod_resource/content/1/08%20semisup%20ladder.pdf
• https://arxiv.org/abs/1507.02672
![Page 29: Unsupervised Learning - NTU Speech Processing …speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/auto.pdf · Unsupervised Learning “We expect unsupervised learning to become](https://reader034.vdocuments.site/reader034/viewer/2022052319/5b7bb2157f8b9ab87f8e88ec/html5/thumbnails/29.jpg)