Deep Learning for Computer Vision · 2020-03-13
TRANSCRIPT
Deep Learning for Computer Vision
Lecture 11: Building ConvNets in Keras, Pre-trained Neural Networks, Performing Network Surgery, Domain Transfer and Fine-Tuning
Peter Belhumeur
Computer Science, Columbia University
CIFAR-10 [Krizhevsky 2009]
10 categories x 6,000 instances = 60,000 images (32x32x3)
Small Convolutional Net
32x32x3
Conv layer: k=3x3 stride=1 pad=1 no pooling depth=32
32x32x32
Conv layer: k=3x3 stride=1 pad=1 max pool=2 depth=32
16x16x32
Conv layer: k=3x3 stride=1 pad=1 no pooling depth=64
16x16x64
Conv layer: k=3x3 stride=1 pad=0 max pool=2 depth=64
7x7x64
FC layer: dim=512
1x1x512
Output + Softmax
1x1x10
Open cifar10_keras.html in browser
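The CIFAR-10 architecture above can be sketched in Keras as follows (a minimal sketch following the layer spec on the slide; the activations and optimizer are illustrative choices, and the actual cifar10_keras notebook may differ):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Small ConvNet for CIFAR-10, following the slide's layer spec.
model = keras.Sequential([
    keras.Input(shape=(32, 32, 3)),
    layers.Conv2D(32, 3, padding="same", activation="relu"),   # -> 32x32x32
    layers.Conv2D(32, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(2),                                    # -> 16x16x32
    layers.Conv2D(64, 3, padding="same", activation="relu"),   # -> 16x16x64
    layers.Conv2D(64, 3, padding="valid", activation="relu"),  # pad=0 -> 14x14x64
    layers.MaxPooling2D(2),                                    # -> 7x7x64
    layers.Flatten(),
    layers.Dense(512, activation="relu"),                      # -> 1x1x512
    layers.Dense(10, activation="softmax"),                    # -> 1x1x10
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Note how `padding="same"` (pad=1 for a 3x3 kernel) preserves spatial size, while the `padding="valid"` layer (pad=0) shrinks 16x16 to 14x14 before pooling down to 7x7.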
Cats vs. Dogs
Data from [Kaggle.com]
2 categories x 11,000 instances = 22,000 images
Small Convolutional Net
150x150x3
Conv layer: k=3x3 stride=1 pad=0 max pool=2 depth=32
74x74x32
Conv layer: k=3x3 stride=1 pad=0 max pool=2 depth=32
36x36x32
Conv layer: k=3x3 stride=1 pad=0 max pool=2 depth=64
17x17x64
FC layer: dim=64
1x1x64
Output + Sigmoid
1x1x1
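The Cats vs. Dogs net above, sketched in Keras (layer spec from the slide; since the output is a single unit for a binary problem, a sigmoid output with binary cross-entropy is used; optimizer choice is illustrative):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Small ConvNet for Cats vs. Dogs, following the slide's layer spec.
# All convolutions use pad=0 ("valid"), so each one shrinks the map by 2.
model = keras.Sequential([
    keras.Input(shape=(150, 150, 3)),
    layers.Conv2D(32, 3, activation="relu"),  # -> 148x148x32
    layers.MaxPooling2D(2),                   # -> 74x74x32
    layers.Conv2D(32, 3, activation="relu"),  # -> 72x72x32
    layers.MaxPooling2D(2),                   # -> 36x36x32
    layers.Conv2D(64, 3, activation="relu"),  # -> 34x34x64
    layers.MaxPooling2D(2),                   # -> 17x17x64
    layers.Flatten(),
    layers.Dense(64, activation="relu"),      # -> 1x1x64
    layers.Dense(1, activation="sigmoid"),    # -> 1x1x1 (cat vs. dog)
])
model.compile(optimizer="rmsprop",
              loss="binary_crossentropy",
              metrics=["accuracy"])
```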
Accuracy ≈ 0.94
Open imagenet_vgg16_fine-tuning.html in browser
Why can’t we use a much deeper ConvNet for this problem?
We usually don’t have enough data!
…but here is a FANTASTIC trick!
Domain Transfer
• Train a deep ConvNet using lots of data on a large image classification problem like ImageNet.
• Save the weights.
• Chop off the output layer (or final layers) of this ConvNet.
• Replace the output layer (or final layers) with one that fits your problem.
• Freeze the weights in the old layers and train on your data, allowing the weights in the new layer to settle.
• Unfreeze the whole network and train.
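The first few steps of this recipe might look like the following in Keras (a minimal sketch: `include_top=False` performs the "chop", and the 256-unit head and 150x150 input follow the Cats vs. Dogs setup; loading `weights="imagenet"` downloads the pre-trained weights):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Load VGG16 pre-trained on ImageNet; include_top=False chops off the
# original classifier layers, keeping only the convolutional base.
base = keras.applications.VGG16(weights="imagenet", include_top=False,
                                input_shape=(150, 150, 3))
base.trainable = False  # freeze the pre-trained weights

# Replace the tossed output layers with a head that fits our problem
# (here a binary cats-vs-dogs output).
model = keras.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-4),
              loss="binary_crossentropy", metrics=["accuracy"])
```

With the base frozen, only the four weight tensors of the new head (two Dense kernels and biases) are updated during the first training phase.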
VGG16 [Simonyan and Zisserman, 2015]
ImageNet [Deng et al. 2009]
20,000+ categories x ~1000 instances = 14,000,000+ images
Domain Transfer + Fine-Tuning
Network Surgery
Keep Toss
Network Surgery
Keep Add
1 x 1 x 256 1 x 1 x 1
First, train the top layers
Freeze Train
1 x 1 x 256 1 x 1 x 1
Then, train the whole network
Train Train
1 x 1 x 256 1 x 1 x 1
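The two training phases above can be sketched like this (mechanics only: `weights=None` here just keeps the sketch light — in practice you would use `weights="imagenet"`; the commented-out `train_images`/`train_labels` are hypothetical data):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Conv base plus a new head, as in the network-surgery step.
base = keras.applications.VGG16(weights=None, include_top=False,
                                input_shape=(150, 150, 3))
model = keras.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])

# Phase 1: freeze the base so only the new head's weights can move.
base.trainable = False
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-4),
              loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=5)   # train the head

# Phase 2: unfreeze everything and fine-tune at a lower learning rate.
# (Recompile after changing trainable flags so the change takes effect.)
base.trainable = True
model.compile(optimizer=keras.optimizers.RMSprop(learning_rate=1e-5),
              loss="binary_crossentropy", metrics=["accuracy"])
# model.fit(train_images, train_labels, epochs=5)   # fine-tune everything
```

The much smaller learning rate in phase 2 keeps the pre-trained weights from being destroyed while they settle into the new domain.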
Open imagenet_vgg16_fine-tuning.html in browser