mobilenet - pr044

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications

29th October, 2017

PR12 Paper Review

Jinwon Lee

Samsung Electronics

MobileNets

Key Requirements for Commercial Computer Vision Usage

• Data-centers (Clouds)

Rarely safety-critical

Low power is nice to have

Real-time is preferable

• Gadgets – Smartphones, Self-driving cars, Drones, etc.

Usually safety-critical (except smartphones)

Low power is a must-have

Real-time is required

Slide Credit : Small Deep Neural Networks - Their Advantages, and Their Design by Forrest Iandola

What’s the “Right” Neural Network for Use in a Gadget?

• Desirable Properties

Sufficiently high accuracy

Low computational complexity

Low energy usage

Small model size

Why Small Deep Neural Networks?

• Small DNNs train faster on distributed hardware

• Small DNNs are more deployable on embedded processors

• Small DNNs are easily updatable Over-The-Air(OTA)

Techniques for Small Deep Neural Networks

• Remove Fully-Connected Layers

• Kernel Reduction (3x3 → 1x1)

• Channel Reduction

• Evenly Spaced Downsampling

• Depthwise Separable Convolutions

• Shuffle Operations

• Distillation & Compression

Key Idea : Depthwise Separable Convolution!

Recap – Convolution Operation

Input (5x5, 3 channels):

1 1 1 1 0
1 1 1 1 0
0 0 0 1 1
0 1 1 1 0
0 1 1 0 0

0 1 1 0 1
0 1 1 0 0
0 0 1 1 0
0 0 1 1 1
1 1 1 0 0

1 1 1 0 0
0 1 1 1 0
0 0 1 1 1
0 0 1 1 0
0 1 1 0 0

Filter kernels (3x3), in slide order:

-1 0 0
0 1 0
0 0 -1

0 -1 0
-1 1 -1
0 -1 0

1 0 1
0 1 0
1 0 1

-1 0 -1
0 1 0
0 0 -1

0 -1 0
-1 1 0
0 -1 0

1 0 1
0 -1 0
1 0 1

Output of the convolution (3x3, 2 channels), in slide order:

1 -1 1
0 -1 -1
3 1 0

3 0 1
-2 0 2
0 2 3

Input channels : 3

Output channels : 2

# of filters : 2
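The recap above can be sketched in plain Python. This is a minimal illustration, assuming no padding and stride 1; the function name `conv2d` and the all-ones toy data are hypothetical and are not the numbers from the slide.

```python
def conv2d(x, filters):
    """Valid (no padding), stride-1 convolution as used in CNNs.

    x:       input indexed as x[channel][row][col]
    filters: indexed as filters[n][channel][row][col]
    returns: out[n][row][col], one output channel per filter
    """
    channels, height, width = len(x), len(x[0]), len(x[0][0])
    k = len(filters[0][0])  # square kernel size
    out_h, out_w = height - k + 1, width - k + 1
    out = []
    for f in filters:
        fmap = [[0] * out_w for _ in range(out_h)]
        for i in range(out_h):
            for j in range(out_w):
                # Each output value sums over ALL input channels
                # and all kernel positions.
                fmap[i][j] = sum(
                    f[c][di][dj] * x[c][i + di][j + dj]
                    for c in range(channels)
                    for di in range(k)
                    for dj in range(k)
                )
        out.append(fmap)
    return out


# Hypothetical toy data: 3 input channels of 5x5 ones, 2 filters of 3x3x3 ones.
x = [[[1] * 5 for _ in range(5)] for _ in range(3)]
filters = [[[[1] * 3 for _ in range(3)] for _ in range(3)] for _ in range(2)]
y = conv2d(x, filters)
print(len(y), len(y[0]), len(y[0][0]))  # 2 3 3
```

The number of filters fixes the number of output channels, and every filter carries one kernel per input channel.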

Recap – VGG, Inception-v3

• VGG – uses only 3x3 convolutions

Stack of 3x3 conv layers has same effective receptive field as 5x5 or 7x7 conv layer

Deeper means more non-linearities

Fewer parameters: 2 x (3 x 3 x C) vs (5 x 5 x C)

Regularization effect

• Inception-v3

Factorization of filters
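The VGG parameter and receptive-field claims can be checked with a few lines of arithmetic. This is a sketch under assumptions: C = 64 is an arbitrary illustrative channel count, the helper names are hypothetical, and weights are counted for C input and C output channels (the slide counts per output filter, which gives the same 18 : 25 ratio).

```python
def conv_weights(kernel, channels):
    """Weights in one conv layer with `channels` inputs and outputs (bias ignored)."""
    return kernel * kernel * channels * channels


def stacked_receptive_field(kernel, layers):
    """Receptive field of `layers` stacked stride-1 convs with the same kernel size."""
    return 1 + layers * (kernel - 1)


C = 64  # hypothetical channel count
two_3x3 = 2 * conv_weights(3, C)
one_5x5 = conv_weights(5, C)
print(two_3x3, one_5x5)  # 73728 102400
print(stacked_receptive_field(3, 2), stacked_receptive_field(3, 3))  # 5 7
```

Two stacked 3x3 layers cover the same 5x5 window with 18/25 of the weights, and three cover a 7x7 window.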

Why should we always consider all channels?

Standard Convolution

Depthwise convolution

Figures from http://machinethink.net/blog/googles-mobile-net-architecture-on-iphone/

Depthwise Separable Convolution

• Depthwise Convolution + Pointwise Convolution(1x1 convolution)

Depthwise convolution Pointwise convolution

Figures from http://machinethink.net/blog/googles-mobile-net-architecture-on-iphone/
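The two steps can be sketched in plain Python as follows. This is a minimal illustration, assuming no padding and stride 1; the function names, the toy input, and the weights are hypothetical and do not come from the paper or the linked figures.

```python
def depthwise_conv(x, kernels):
    """Apply one k x k kernel per input channel; channels are NOT mixed."""
    k = len(kernels[0])
    out_h, out_w = len(x[0]) - k + 1, len(x[0][0]) - k + 1
    return [
        [
            [
                sum(kern[di][dj] * chan[i + di][j + dj]
                    for di in range(k) for dj in range(k))
                for j in range(out_w)
            ]
            for i in range(out_h)
        ]
        for chan, kern in zip(x, kernels)
    ]


def pointwise_conv(x, weights):
    """1x1 convolution: weights[n][m] mixes the M input channels into N outputs."""
    h, w = len(x[0]), len(x[0][0])
    return [
        [[sum(wn[m] * x[m][i][j] for m in range(len(x))) for j in range(w)]
         for i in range(h)]
        for wn in weights
    ]


# Hypothetical toy data: M = 3 input channels of 5x5 ones, N = 2 outputs.
x = [[[1] * 5 for _ in range(5)] for _ in range(3)]
dw_kernels = [[[1] * 3 for _ in range(3)] for _ in range(3)]  # one 3x3 per channel
pw_weights = [[1, 1, 1], [1, 0, -1]]                          # N x M
y = pointwise_conv(depthwise_conv(x, dw_kernels), pw_weights)
print(len(y), len(y[0]), len(y[0][0]))  # 2 3 3
```

The depthwise step filters each channel spatially on its own, and the pointwise step is the only place where channels are combined.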

Standard Convolution vs Depthwise Separable Convolution

• Standard convolutions have the computational cost of

DK x DK x M x N x DF x DF

• Depthwise separable convolutions cost

DK x DK x M x DF x DF + M x N x DF x DF

• Reduction in computations

1 / N + 1 / DK^2

If we use 3x3 depthwise separable convolutions, we get between 8 and 9 times less computation

DK : width/height of filters

DF : width/height of feature maps

M : number of input channels

N : number of output channels (number of filters)
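The two cost formulas can be compared numerically. This is a sketch, assuming an illustrative layer with DK = 3, M = N = 512 and DF = 14; these values and the function names are hypothetical choices, not figures from the slides. Dividing the separable cost by the standard cost gives 1/N + 1/DK^2.

```python
def standard_cost(dk, m, n, df):
    """Mult-adds of a standard convolution: DK x DK x M x N x DF x DF."""
    return dk * dk * m * n * df * df


def separable_cost(dk, m, n, df):
    """Mult-adds of depthwise (DK x DK x M x DF x DF) plus pointwise (M x N x DF x DF)."""
    return dk * dk * m * df * df + m * n * df * df


dk, m, n, df = 3, 512, 512, 14  # hypothetical layer shape
ratio = separable_cost(dk, m, n, df) / standard_cost(dk, m, n, df)
print(abs(ratio - (1 / n + 1 / dk ** 2)) < 1e-12)  # True
print(round(1 / ratio, 2))  # 8.84
```

For 3x3 kernels the 1/DK^2 term dominates once N is large, which is why the saving approaches but never reaches 9x.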

Depthwise Separable Convolutions

Model Structure

Width Multiplier & Resolution Multiplier

• Width Multiplier – Thinner Models

For a given layer and width multiplier α, the number of input channels M becomes αM and the number of output channels N becomes αN, where α ∈ (0, 1] with typical settings of 1, 0.75, 0.5 and 0.25

• Resolution Multiplier – Reduced Representation

The second hyper-parameter to reduce the computational cost of a neural network is a resolution multiplier ρ

0 < ρ ≤ 1, which is typically set implicitly so that the input resolution of the network is 224, 192, 160 or 128 (ρ = 1, 0.857, 0.714, 0.571)

• Computational cost:

DK x DK x αM x ρDF x ρDF + αM x αN x ρDF x ρDF
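The effect of the two multipliers on this cost can be sketched as below. The layer shape (DK = 3, M = N = 512, DF = 14) and the function name are hypothetical; only the formula and the resolution values come from the slide.

```python
def mobilenet_layer_cost(dk, m, n, df, alpha=1.0, rho=1.0):
    """DK x DK x aM x pDF x pDF + aM x aN x pDF x pDF (a = alpha, p = rho)."""
    m_a, n_a, df_r = alpha * m, alpha * n, rho * df
    return dk * dk * m_a * df_r * df_r + m_a * n_a * df_r * df_r


base = mobilenet_layer_cost(3, 512, 512, 14)

# Resolution multiplier: cost scales by rho^2.
print(mobilenet_layer_cost(3, 512, 512, 14, rho=0.5) / base)  # 0.25

# Width multiplier: cost scales by roughly alpha^2 (the pointwise term dominates).
print(mobilenet_layer_cost(3, 512, 512, 14, alpha=0.5) / base)

# rho implied by each input resolution relative to 224.
rhos = [round(r / 224, 3) for r in (224, 192, 160, 128)]
print(rhos)  # [1.0, 0.857, 0.714, 0.571]
```

The depthwise term shrinks only linearly with α, so the width multiplier's saving is slightly less than α^2, while the resolution multiplier's saving is exactly ρ^2.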

Width Multiplier & Resolution Multiplier

Experiments – Model Choices

Model Shrinking Hyperparameters

Results

PlaNet : 52M parameters, 5.74B mult-adds

MobileNet : 13M parameters, 0.58M mult-adds

Results

TensorFlow Implementation

https://github.com/Zehaos/MobileNet/blob/master/nets/mobilenet.py
