![Page 1: Object Detection with DIGITS - gpucomputing.shef.ac.ukgpucomputing.shef.ac.uk/static/slides/2018-07-19-dl-cv/object... · • Object detection can identify and classify one or more](https://reader033.vdocuments.site/reader033/viewer/2022050601/5fa836c4790de8283861eb29/html5/thumbnails/1.jpg)
Object Detection with DIGITS
Twin Karmakharm
Certified Instructor, NVIDIA Deep Learning Institute
DEEP LEARNING INSTITUTE
DLI Mission
Helping people solve challenging problems using AI and deep learning.
• Developers, data scientists and engineers
• Self-driving cars, healthcare and robotics
• Training, optimizing, and deploying deep neural networks
TOPICS
• Lab Perspective
• Object Detection
• NVIDIA’s DIGITS
• Caffe
• Lab Discussion / Overview
• Lab Review
LAB PERSPECTIVE
WHAT THIS LAB IS
• Discussion/Demonstration of object detection using Deep Learning
• Hands-on exercises using Caffe and DIGITS
WHAT THIS LAB IS NOT
• Intro to machine learning from first principles
• Rigorous mathematical formalism of convolutional neural networks
• Survey of all the features and options of Caffe
ASSUMPTIONS
• You are familiar with convolutional neural networks (CNN)
• Helpful to have:
• Object detection experience
• Caffe experience
TAKE AWAYS
• You can set up your own object detection workflow in Caffe and adapt it to your use case
• Know where to go for more info
• Familiarity with Caffe
OBJECT DETECTION
COMPUTER VISION TASKS
• Image Classification
• Image Classification + Localization
• Object Detection
• Image Segmentation
(inspired by a slide found in cs231n lecture from Stanford University)
OBJECT DETECTION
• Object detection can identify and classify one or more objects in an image
• Detection is also about localizing the extent of an object in an image
• Bounding boxes / heat maps
• Training data must have objects within images labeled
• Can be hard to find / produce training dataset
OBJECT DETECTION IN REMOTE SENSING IMAGES
Broad applicability
• Commercial asset tracking
• Humanitarian crisis mapping
• Search and rescue
• Land usage monitoring
• Wildlife tracking
• Human geography
• Geospatial intelligence production
• Military target recognition
Vermeulen et al. (2013) Unmanned Aerial Survey of Elephants. PLoS ONE 8(2): e54700
Imagery ©2016 Google, Map data © 2016 Google
OBJECT DETECTION
GENERATE CANDIDATE DETECTIONS
EXTRACT PATCHES
CHALLENGES FOR OBJECT DETECTION
• Background clutter
• Occlusion
• Illumination
• Object variation
ADDITIONAL APPROACHES TO OBJECT DETECTION ARCHITECTURE
• R-CNN = Region CNN
• Fast R-CNN
• Faster R-CNN = Fast R-CNN + Region Proposal Network (RPN)
• RoI-Pooling = Region of Interest Pooling
NVIDIA’S DIGITS
NVIDIA’S DIGITS
Interactive Deep Learning GPU Training System
• Process Data
• Configure DNN
• Monitor Progress
• Visualization
CAFFE
WHAT IS CAFFE?
• Pure C++/CUDA architecture
• Command line, Python, MATLAB interfaces
• Fast, well-tested code
• Pre-processing and deployment tools, reference models and examples
• Image data management
• Seamless GPU acceleration
• Large community of contributors to the open-source project
An open framework for deep learning developed by the Berkeley Vision and Learning Center (BVLC)
caffe.berkeleyvision.org
http://github.com/BVLC/caffe
CAFFE FEATURES
Protobuf model format
• Strongly typed format
• Human readable
• Auto-generates and checks Caffe code
• Developed by Google
• Used to define network architecture and training parameters
• No coding required!
Deep Learning model definition
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 20
    kernel_size: 5
    stride: 1
    weight_filler {
      type: "xavier"
    }
  }
}
LAB DISCUSSION / OVERVIEW
TRAINING APPROACH 1 – SLIDING WINDOW
CONVOLUTION
[Figure: a convolution kernel (a.k.a. filter) is applied around a source pixel to produce the new pixel value (destination pixel)]
Center element of the kernel is placed over the source pixel. The source pixel is then replaced with a weighted sum of itself and nearby pixels.
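The weighted sum described on this slide can be sketched in a few lines of NumPy. The image and kernel values here are illustrative assumptions, not the numbers from the original figure, and `convolve_pixel` is a hypothetical helper name. As in most CNN frameworks, the kernel is applied without flipping.

```python
import numpy as np

def convolve_pixel(image, kernel, row, col):
    """Return the new value for the pixel at (row, col).

    The kernel is centred on the source pixel and the result is the
    weighted sum of that pixel and its neighbours. The pixel must be
    far enough from the border for the kernel to fit.
    """
    k = kernel.shape[0] // 2
    patch = image[row - k:row + k + 1, col - k:col + k + 1]
    return float(np.sum(patch * kernel))

# Illustrative 5x5 image (assumed values).
image = np.array([
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 2, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
], dtype=float)

# Illustrative 3x3 kernel (assumed values).
kernel = np.array([
    [4, 0, 0],
    [0, 0, 0],
    [0, 0, -4],
], dtype=float)

print(convolve_pixel(image, kernel, 1, 1))  # 4*0 + (-4)*2 = -8.0
```

Sliding the same kernel over every interior pixel produces the full output feature map.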
TRAINING APPROACH 1 – POOLING
• Pooling is a down-sampling technique
• Reduces the spatial size of the representation
• Reduces number of parameters and number of computations (in upcoming layer)
• Limits overfitting
• No parameters (weights) in the pooling layer
• Typically uses a MAX operation with a 2 × 2 filter and a stride of 2
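As a minimal sketch of the pooling bullets above, the following NumPy function applies a MAX operation over non-overlapping 2 × 2 blocks (stride 2). The function name `max_pool_2x2` and the input values are assumptions for illustration.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Down-sample a 2-D feature map with a 2x2 max filter and stride 2."""
    h, w = feature_map.shape
    h2, w2 = h // 2, w // 2
    # Drop a trailing odd row/column so the map splits into whole 2x2 blocks.
    trimmed = feature_map[:h2 * 2, :w2 * 2]
    blocks = trimmed.reshape(h2, 2, w2, 2)
    # Take the maximum inside each block; there are no weights to learn.
    return blocks.max(axis=(1, 3))

feature_map = np.array([
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
])

pooled = max_pool_2x2(feature_map)
print(pooled)  # 4x4 input becomes 2x2: [[4 2] [2 8]]
```

The output has a quarter of the input's spatial size, which is what reduces computation in the following layer.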
TRAINING APPROACH 1 - DATASETS
• Two datasets
• The first contains wide-area ocean shots that include the whales
• This dataset is located in data_336x224
• The second dataset is ~4500 crops of whale faces plus an additional 4500 random crops from the same images
• We are going to use this second dataset to train our classifier in DIGITS
• These are the “patches”
TRAINING APPROACH 1 - TRAINING
• Will train a simple two-class CNN classifier on the training dataset
• Customize the Image Classification model in DIGITS:
• Choose the Standard Network "AlexNet"
• Set the number of training epochs to 5
Activation functions
tanh Sigmoid ReLU
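For reference, the three activation functions named on this slide can be written directly in NumPy. This is a small sketch for illustration; the function names are chosen here and are not taken from the lab code.

```python
import numpy as np

def tanh(x):
    """Hyperbolic tangent: squashes inputs into (-1, 1)."""
    return np.tanh(x)

def sigmoid(x):
    """Logistic sigmoid: squashes inputs into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def relu(x):
    """Rectified linear unit: zero for negative inputs, identity otherwise."""
    return np.maximum(0.0, x)

x = np.array([-2.0, 0.0, 2.0])
print(tanh(x))
print(sigmoid(x))
print(relu(x))
```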
TRAINING APPROACH 1 – SLIDING WINDOW
• Will execute code shown below
• Example of how you feed new images to a model
• In practice, would write code in C++ and use TensorRT
import numpy as np
import matplotlib.pyplot as plt
import caffe
import time

MODEL_JOB_NUM = '20160920-092148-8c17'  ## Remember to set this to be the job number for your model
DATASET_JOB_NUM = '20160920-090913-a43d'  ## Remember to set this to be the job number for your dataset

MODEL_FILE = '/home/ubuntu/digits/digits/jobs/' + MODEL_JOB_NUM + '/deploy.prototxt'  # Do not change
PRETRAINED = '/home/ubuntu/digits/digits/jobs/' + MODEL_JOB_NUM + '/snapshot_iter_270.caffemodel'  # Do not change
MEAN_IMAGE = '/home/ubuntu/digits/digits/jobs/' + DATASET_JOB_NUM + '/mean.jpg'  # Do not change

# load the mean image
mean_image = caffe.io.load_image(MEAN_IMAGE)

# Choose a random image to test against
RANDOM_IMAGE = str(np.random.randint(10))
IMAGE_FILE = 'data/samples/w_' + RANDOM_IMAGE + '.jpg'
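The slide's code stops after choosing a test image. The sliding-window step itself could look roughly like the sketch below, which is not the lab's code: it is self-contained NumPy, `sliding_window_predict` is a hypothetical helper, and `bright_patch_classifier` is a stand-in assumption for the trained two-class CNN.

```python
import numpy as np

def sliding_window_predict(image, classify_patch, window=4, stride=4):
    """Classify every window-sized patch and return a grid of class labels."""
    rows = (image.shape[0] - window) // stride + 1
    cols = (image.shape[1] - window) // stride + 1
    predictions = np.zeros((rows, cols), dtype=int)
    for r in range(rows):
        for c in range(cols):
            top, left = r * stride, c * stride
            patch = image[top:top + window, left:left + window]
            predictions[r, c] = classify_patch(patch)
    return predictions

def bright_patch_classifier(patch):
    """Hypothetical stand-in for the trained CNN: 1 = object, 0 = background."""
    return int(patch.mean() > 0.5)

# Toy wide-area image with one bright object in the lower middle.
image = np.zeros((8, 12))
image[4:8, 4:8] = 1.0

grid = sliding_window_predict(image, bright_patch_classifier)
print(grid)
```

Each patch needs its own forward pass through the classifier, which is why this approach is slow on large images.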
TRAINING APPROACH 2
Fully-Convolutional Network (FCN)
“CONVOLUTIONIZATION” / “NET SURGERY”
[Figure: a patch classifier (Conv/Pool × 3, then Fully connected × 2) maps PATCHES to CLASS PREDICTIONS (CAR, TRUCK, DIGGER, BACKGROUND); after conversion (Conv/Pool × 3, then 1x1 Conv × 2) the network maps a WIDE AREA IMAGE to a CLASS PREDICTION HEATMAP]
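The idea behind "convolutionization" can be illustrated with NumPy: a fully connected layer that acts on one feature vector gives the same scores as a 1x1 convolution applied at every spatial location, so a larger input yields a heatmap of class scores. This is a sketch under assumed sizes and random weights, not the lab's Caffe net surgery code.

```python
import numpy as np

rng = np.random.default_rng(0)
num_features, num_classes = 8, 4  # assumed sizes for illustration

# Weights of a (hypothetical) trained fully connected layer.
fc_weights = rng.normal(size=(num_classes, num_features))
fc_bias = rng.normal(size=num_classes)

def fully_connected(feature_vector):
    """Class scores for a single patch's feature vector."""
    return fc_weights @ feature_vector + fc_bias

def conv_1x1(feature_map):
    """Apply the same weights at every location.

    feature_map has shape (num_features, H, W);
    the result has shape (num_classes, H, W).
    """
    scores = np.einsum("kc,chw->khw", fc_weights, feature_map)
    return scores + fc_bias[:, None, None]

# Features computed from a wide-area image: one vector per location.
feature_map = rng.normal(size=(num_features, 3, 5))
heatmap = conv_1x1(feature_map)

# Every heatmap cell matches what the fully connected layer would output.
print(np.allclose(heatmap[:, 1, 2], fully_connected(feature_map[:, 1, 2])))
```

The whole heatmap is produced in one forward pass, instead of one pass per patch as in the sliding-window approach.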
TRAINING APPROACH 2 - EXAMPLE
AlexNet converted to FCN for four-class classification
TRAINING APPROACH 2 - FALSE ALARM MINIMIZATION
• Imbalanced dataset and InfogainLoss
• Data augmentation
• Random scale, crop, flip, rotate
• Transfer learning
• Pre-training: ImageNet data → ImageNet classes
• Extract pre-trained CNN weights
• Fine-tuning: Kespry data → Kespry classes
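A minimal sketch of the data augmentation step listed on this slide, covering random flip, rotate and crop. Random scaling is left out to keep the example NumPy-only, and the `augment` helper and its parameters are assumptions rather than the lab's implementation.

```python
import numpy as np

def augment(image, rng, crop_size):
    """Return a randomly flipped, rotated and cropped copy of an image."""
    # Random horizontal flip.
    if rng.random() < 0.5:
        image = image[:, ::-1]
    # Random rotation by a multiple of 90 degrees.
    image = np.rot90(image, k=int(rng.integers(4)))
    # Random crop of size crop_size x crop_size.
    max_top = image.shape[0] - crop_size
    max_left = image.shape[1] - crop_size
    top = int(rng.integers(max_top + 1))
    left = int(rng.integers(max_left + 1))
    return image[top:top + crop_size, left:left + crop_size]

rng = np.random.default_rng(0)
image = np.arange(36).reshape(6, 6)  # toy single-channel image
sample = augment(image, rng, crop_size=4)
print(sample.shape)  # (4, 4)
```

Calling `augment` repeatedly on the same training image yields different variants, which gives the network more varied examples to learn from.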
TRAINING APPROACH 2 - INCREASING FCN PRECISION
Multi-scale and shifted inputs
Slide credit: Fei-Fei Li & Andrej Karpathy, Stanford cs231n
TRAINING APPROACH 3 - DETECTNET
• Train a CNN to simultaneously
• Classify the most likely object present at each location within an image
• Predict the corresponding bounding box for that object through regression
• Benefits:
• Simple one-shot detection, classification and bounding box regression pipeline
• Very low latency
• Very low false alarm rates due to strong, voluminous background training data
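To make "classify at each location and regress a bounding box" concrete, here is an illustrative sketch of turning per-cell outputs into boxes. The output encoding (a coverage score per grid cell plus pixel offsets from the cell centre) is an assumption made for this example and is not claimed to be DetectNet's exact format; `decode_grid` is a hypothetical helper.

```python
import numpy as np

def decode_grid(coverage, boxes, cell_size, threshold=0.5):
    """Convert per-cell predictions into (x1, y1, x2, y2, score) tuples.

    coverage: (rows, cols) object score per grid cell.
    boxes:    (4, rows, cols) assumed offsets (x1, y1, x2, y2) in pixels,
              measured from the centre of each grid cell.
    """
    detections = []
    rows, cols = coverage.shape
    for r in range(rows):
        for c in range(cols):
            score = float(coverage[r, c])
            if score < threshold:
                continue  # treat low-score cells as background
            cx = (c + 0.5) * cell_size
            cy = (r + 0.5) * cell_size
            x1, y1, x2, y2 = (float(v) for v in boxes[:, r, c])
            detections.append((cx + x1, cy + y1, cx + x2, cy + y2, score))
    return detections

# Toy 2x2 grid with one confident cell (assumed values).
coverage = np.zeros((2, 2))
coverage[1, 0] = 0.9
boxes = np.zeros((4, 2, 2))
boxes[:, 1, 0] = [-10.0, -6.0, 10.0, 6.0]

detections = decode_grid(coverage, boxes, cell_size=16)
print(detections)
```

A real pipeline would typically also merge overlapping boxes that refer to the same object.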
TRAINING APPROACH 3 - DETECTNET
Train on wide-area images with bounding box annotations
NAVIGATING TO QWIKLABS
1. Navigate to: https://nvlabs.qwiklab.com
2. Log in or create a new account
ACCESSING LAB ENVIRONMENT
1. Select the event-specific In-Session Class in the upper left
2. Click the “Approaches to Object Detection Using DIGITS” Class from the list
*** Model building may take some time and may appear to initially not be progressing ***
LAB REVIEW
TRAINING APPROACHES
• Approach 1:
• Patches to build model
• Sliding window looks for location of whale face
TRAINING APPROACHES
• Approach 2:
• Fully-convolutional network (FCN)
TRAINING APPROACHES
• Approach 3:
• DetectNet
WHAT’S NEXT
• Use / practice what you learned
• Discuss with peers practical applications of DNN
• Reach out to NVIDIA and the Deep Learning Institute
• Attend local meetup groups
• Follow people like Andrej Karpathy and Andrew Ng
WHAT’S NEXT
TAKE SURVEY
…for the chance to win an NVIDIA SHIELD TV. Check your email for a link.
ACCESS ONLINE LABS
Check your email for details to access more DLI training online.
ATTEND WORKSHOP
Visit www.nvidia.com/dli for workshops in your area.
JOIN DEVELOPER PROGRAM
Visit https://developer.nvidia.com/join for more.
www.nvidia.com/dli
Instructor: Charles Killam, LP.D.