DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition
ECE 289G: Paper Presentation #3 Philipp Gysel
Autonomous Car
ECE 289G Paper Presentation, Philipp Gysel Slide 2
Real-time object recognition
Source: maps.google.com
Object classification
Source: imagenet.stanford.ed
• Car
• Traffic Light
• Street Sign
• …
Convolutional Neural Network
Training
CNN for Object Recognition
Lines
Dots
Rectangles
Gradients
Leaf
Building
Feature extraction Classification
Source: [2]
Object category
Feature extraction
Source: https://developer.apple.com
From features to object classes
Feature extraction
Object category
High-level features:
• Shape of a car
• Road marking
• Face with eyes and ears
• Cat skin
Classes:
• Cat
• Car
• …
Classification
Source: [2]
Visualization of high dimensional feature space
▪ LLC [5] vs GIST [6] vs DeCAF [1]
▪ Visualization with the t-SNE algorithm [4]
Source: [1]
Repurpose Features from CNN
Source: imagenet.stanford.ed
Convolutional Neural Network → Object class
Convolutional Neural Network → Learned Features
Classification with small training dataset
Source: [2]
ILSVRC 2012
Freeze trained convolution kernels
Logistic Regression
SVM
High-level features
Target database
Classify new database
DeCAF5 DeCAF6 DeCAF7
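The idea on this slide (freeze the trained kernels, then train only a simple classifier on the high-level features of the target database) can be sketched roughly as follows. This is a minimal illustration, not the DeCAF implementation: `FROZEN_KERNELS`, `extract_features`, and the toy 2-D "target database" are hypothetical stand-ins, chosen so the example runs without a trained AlexNet. In the paper's setting, the frozen part would be the network trained on ILSVRC 2012 and the features would be DeCAF5, DeCAF6, or DeCAF7 activations.

```python
import math

# Hypothetical stand-in for frozen, pretrained layers. In DeCAF these would be
# AlexNet layers trained on ILSVRC 2012; a tiny fixed matrix is used here so
# the sketch runs without any trained model.
FROZEN_KERNELS = [
    [1.0, 0.0],
    [0.0, 1.0],
    [-1.0, 0.0],
    [0.0, -1.0],
]


def extract_features(x):
    """Forward pass through the frozen layer, ReLU(W x). W is never updated."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in FROZEN_KERNELS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(features, labels, epochs=200, lr=0.1):
    """Full-batch gradient descent on the logistic loss.

    Only the classifier weights and bias are trained; the feature extractor
    above stays fixed.
    """
    n, dim = len(features), len(features[0])
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for f, y in zip(features, labels):
            error = sigmoid(sum(w * v for w, v in zip(weights, f)) + bias) - y
            grad_b += error
            for j in range(dim):
                grad_w[j] += error * f[j]
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(x, weights, bias):
    """Classify a new input: frozen features first, then the trained classifier."""
    f = extract_features(x)
    return 1 if sigmoid(sum(w * v for w, v in zip(weights, f)) + bias) >= 0.5 else 0


# Small labeled "target database" (toy 2-D points standing in for images).
target_inputs = [(2.0, 1.0), (1.0, 2.0), (2.0, 2.0), (1.5, 1.5),
                 (-2.0, -1.0), (-1.0, -2.0), (-2.0, -2.0), (-1.5, -1.5)]
target_labels = [1, 1, 1, 1, 0, 0, 0, 0]

target_features = [extract_features(x) for x in target_inputs]
weights, bias = train_logistic_regression(target_features, target_labels)
predictions = [predict(x, weights, bias) for x in target_inputs]
accuracy = sum(p == y for p, y in zip(predictions, target_labels)) / len(target_labels)
print("training accuracy:", accuracy)
```

Because only the small classifier is trained, a handful of labeled target examples is enough here. Swapping the logistic regression for an SVM would leave the frozen feature extraction unchanged.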
Experiments: Are features transferable to solve new tasks?
▪ Train AlexNet [2] on ILSVRC 2012 object recognition dataset
▪ Reuse extracted features for new tasks:
▪ Experiment #1: Basic Object Recognition
▪ Experiment #2: Domain Adaptation
▪ Experiment #3: Fine-grained recognition
▪ Experiment #4: Scene recognition
Experiment #1: Basic object recognition
Source: [1]
▪ Classify new objects on new dataset (Caltech-101 dataset)
▪ 2.6% better than the state of the art
Experiment #2: Domain adaptation
▪ Train object recognition for a different environment; only a few labeled examples are available in the target domain
▪ Office dataset
Source: [1]
Experiment #3: Subcategory recognition
▪ Caltech-UCSD birds dataset
▪ 8% better than the state of the art
Source: [1]
Experiment #4: Scene recognition
▪ Classes like abbey, diner, mosque, stadium
▪ SUN-397 dataset
▪ >2% better than the state of the art
Source: [1]
Conclusions
▪ Reuse features from a CNN trained on the ILSVRC dataset to solve new classification tasks
▪ State-of-the-art performance in 4 different tasks
▪ CNN features are generic enough to solve completely new problems
▪ Bigger datasets yield better accuracy
▪ Release of DeCAF (predecessor of Caffe)
Conclusions cont.
Source: maps.google.com
Conclusions cont.
Source: imagenet.stanford.ed
• Car
• Traffic Light
• Street Sign
• …
Convolutional Neural Network
Training
▪ Challenges:
▪ Find labeled data
▪ Training time of CNN
Q&A
References
[1] Donahue, J., et al. DeCAF: A deep convolutional activation feature for generic visual recognition. arXiv preprint arXiv:1310.1531, 2013.
[2] Krizhevsky, A., Sutskever, I., and Hinton, G. E. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, 2012.
[3] Chopra, S., Balakrishnan, S., and Gopalan, R. DLID: Deep learning for domain adaptation by interpolating between domains. In ICML Workshop on Challenges in Representation Learning, 2013.
[4] van der Maaten, L. and Hinton, G. Visualizing data using t-SNE. JMLR, 9, 2008.
[5] Wang, J., Yang, J., Yu, K., Lv, F., Huang, T., and Gong, Y. Locality-constrained linear coding for image classification. In CVPR, 2010.
[6] Oliva, A. and Torralba, A. Modeling the shape of the scene: A holistic representation of the spatial envelope. IJCV, 2001.
Computing time of forward propagation
Source: [1] and [2]