deep learning with tensorflo · deep learning is conceptually very simple that is it, that is all...
TRANSCRIPT
![Page 1: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/1.jpg)
Deep Learning With TensorFlow
Yonghui Wu
![Page 2: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/2.jpg)
Agenda● Intro to deep learning● TensorFlow from 10000 feet up● TensorFlow key concepts● Concluding remarks
![Page 3: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/3.jpg)
Intro to deep learning
![Page 4: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/4.jpg)
Deep learningDeep learning is a branch of artificial intelligence that is based on neural networks
● loosely inspired by the brain● built from simple, trainable functions.
inputoutput
![Page 5: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/5.jpg)
Primitives: the neuron
y = F(w1x1 +...+ wnxn+b)
X1 Xn...
inputs
● w1...wn are weights,● b is a bias,● weights and biases are
parameters,● F is a “differentiable”
non-linear function.
![Page 6: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/6.jpg)
Primitives: the neuron — example
X1 X2
n = 2 w1 = 1 w2 = −1.2b = 0.1F(x) = max(0, x) “Relu”
y = max(0, x1− 1.2x2 + 0.1)
![Page 7: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/7.jpg)
A modern network [Szegedy et al., 2016]
Image: Fernanda Viégas
![Page 8: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/8.jpg)
A learning algorithmGiven training examples “(input, output)” pairsWhile not done:
1. Pick a batch of random training example (X, Y)2. Run the neural network on X, generate Y’3. Compute a loss by comparing Y’ to Y4. Adjust model parameters to reduce the error (the
“loss”)
![Page 9: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/9.jpg)
Gradient descent● Compute gradient of the loss with respects to params in
the neural network.● new_params = old_params - alpha * gradient● alpha is the learning rate
![Page 10: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/10.jpg)
Deep learning is conceptually very simple● That is it, that is all the key elements of deep learning● Typical procedure to solve a problem with deep learning
○ Define a network suitable to the problem one would like to solve.○ Define a loss function.○ Initialize the network with small random number○ Iteratively adjust network params following some optimization algorithm until loss doesn’t
decrease any more
![Page 11: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/11.jpg)
The renaissance of deep learning● The concept and main algorithm were actually invented 1960’s ~ 1980’s● It gained popularity only very recently
○ AlexNet won ImageNet 2012 competition, beating runner up by as much as 10% in top-5 accuracy
● People attributed the renaissance of NN to:○ Large and Larger datasets○ Advance in computing power
![Page 12: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/12.jpg)
Some attractive properties of deep learning● Applicable across many domains.● With a fairly simple conceptual core.
○ Back propagation○ SGD○ Neuron
● Benefits from having lots of data.○ Often requires little data curation.○ Tolerates inconsistent data.
![Page 13: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/13.jpg)
● Model gets better with more data, more compute○ Data and compute and sometimes easy to get
● Requires architectural choices but no detailed design of algorithms and data representations.○ Can learn intriguing and powerful data representation
More attractive properties of deep learning
![Page 14: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/14.jpg)
Deep Learning
Universal Machine Learning
...that works better than the alternatives!
Current State-of-the-art in:Speech RecognitionImage Recognition
Machine TranslationMolecular Activity Prediction
Road Hazard DetectionOptical Character Recognition
...
![Page 15: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/15.jpg)
Team Year Place Error (top-5) Params
XRCE (pre-neural-net explosion) 2011 1st 25.8%
Supervision (AlexNet) 2012 1st 16.4% 60M
Clarifai 2013 1st 11.7% 65M
MSRA 2014 3rd 7.35%
VGG 2014 2nd 7.32% 180M
GoogLeNet (Inception) 2014 1st 6.66% 5M
Andrej Karpathy (human) 2014 N/A 5.1% 100 trillion?
BN-Inception (Arxiv) 2015 N/A 4.9% 13M
Inception-v3 (Arxiv) 2015 N/A 3.46% 25M
Rapid Progress in Image Recognition
ImageNet classification challenge
![Page 16: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/16.jpg)
“How cold is it outside?”
DeepRecurrent
Neural NetworkAcoustic Input Text Output
Neural nets are rapidly replacing previous technologies
Speech Recognition
Google Research Blog - August 2012, August 2015
![Page 17: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/17.jpg)
Machine Translation
GNMT significantly reduced the gap between in translation quality between MT and human
![Page 18: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/18.jpg)
AlphaGo
AlphaGo dominated the Game of Go
![Page 19: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/19.jpg)
TensorFlow from 10000 feet up
![Page 20: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/20.jpg)
Software for machine learningand particularly for deep learning.
First open-source release in November 2015, under anApache 2.0 license.https://tensorflow.org/
and
https://github.com/tensorflow/tensorflow
![Page 21: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/21.jpg)
System structure
Core TensorFlow Execution System
CPU GPU TPU ... ...
C++ front end Python front end...
![Page 22: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/22.jpg)
Raspberry PiAndroidiOS
TPU
GPUCPU
TensorFlow supports many platforms
![Page 23: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/23.jpg)
Java
… and many languages
![Page 24: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/24.jpg)
TensorFlow provides great tools like TensorBoard
![Page 25: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/25.jpg)
![Page 26: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/26.jpg)
SearchGmailTranslateMapsAndroidPhotosSpeechYouTubePlay… many others ...
Production use in many areas:Internal TensorFlow launch
Research use for:
100s of projects and papers
# of Google directories containing model description files
![Page 27: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/27.jpg)
2013 20152014 2016 2017
12500
25000
37500
50000
0
50
GitHub Star Count
![Page 28: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/28.jpg)
TensorFlow key concepts
![Page 29: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/29.jpg)
TensorFlow key concepts● Data flow graph● Distributed computing● Model parallelism and data parallelism● Control flows● Functions
![Page 30: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/30.jpg)
Mul
Add Relu
biases
weights
examples
labels
Xent
Data: tensors (N-dimensional arrays)
Nodes: operations (ops) on tensors
Dataflow graph
![Page 31: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/31.jpg)
Mul
Add Relu
biases
weights
examples
labels
Xent
Dataflow computation: basic semanticsEdges hold tensors, at most one at a time.
![Page 32: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/32.jpg)
Mul
Add Relu
biases
weights
examples
labels
Xent
Dataflow computation: basic semanticsA node may fire when all its inputs are available.
![Page 33: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/33.jpg)
Mul
Add Relu
biases
weights
examples
labels
Xent
Dataflow computation: basic semanticsAfter firing, the node is done.
![Page 34: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/34.jpg)
Mul
Add Relu
biases
weights
examples
labels
Xent
Dataflow computation: basic semanticsComputation ends when all data is drained from the graph.
![Page 35: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/35.jpg)
Mul
Add Relu
biases
weights
examples
labels
Xent
Dataflow computation: basic semanticsThis simple model extends to other features (control edges, state, conditionals, loops, …).
![Page 36: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/36.jpg)
Feeding and Fetching
Run(input={“b”: ...}, outputs={“f:0”})
![Page 37: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/37.jpg)
import tensorflow as tf
examples = tf.placeholder(tf.float32)
labels = tf.placeholder(tf.int32)
weight = tf.Variable(tf.zeros([784,10]))
bias = tf.Variable(tf.zeros([10]))
y = tf.matmul(examples, weight) + bias
loss = tf.cross_entropy(y, labels)
Python API -- Forward Computation
![Page 38: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/38.jpg)
weights_grad, bias_grad = tf.gradients(loss, [weights, bias])new_weights = tf.assign_sub(weight, lr * weights_grad)new_bias = tf.assign_sub(bias, lr * bias_grad) train_op = tf.group([new_weights, new_bias])
Python API -- Backward computation
![Page 39: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/39.jpg)
Add Mul...
learning rate
AssignSub
...
some ops compute losses and gradients
Dataflow computation also “backward”
biases
![Page 40: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/40.jpg)
Stateless ops vs stateful ops● Two types of TensorFlow ops
○ Stateless ops○ Stateful ops
● Stateless ops○ Pure functions○ Execution of the ops has no side effects○ Output is a deterministic function of the input○ Most mathematical ops, like MatMul, BiasAdd, Conv and etc are stateless ops
● Stateful ops○ Not pure function○ Op has states that persist across session runs○ Variables are stateful ops
![Page 41: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/41.jpg)
Add Mul
biases
...
learning rate
AssignSub
...
biases and other parameters are
variablessome ops update
parameterssome ops compute
losses and gradients
Dataflow computation with state
![Page 42: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/42.jpg)
Graph definition vs graph execution● Graph definition: define the data flow graph
○ Forward subgraph, backward subgraph, and a train subgraph○ Only specifies how computations should be done○ Typically done in some frontend, e.g. Python
● Graph execution is completely separate from graph definition○ The frontend sends GraphDef to the backend○ Backend will analyze the graph, and will carry out a few rounds of optimization:
■ constant folding, common subexpression elimination and etc○ Graph execution is done on the backend○ Actually executes necessary ops in a graph on real data
■ E.g. update the model params○ Graph execution is typically driven by a frontend as well
![Page 43: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/43.jpg)
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)
for i in range(1000):
batch_xs, batch_ys = mnist.train.next_batch(100)
sess.run(train_op, feed_dict={examples: batch_xs, labels:
batch_ys})
Python API - Drive the training loop
![Page 44: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/44.jpg)
Distributed execution of TF graph● With TensorFlow, you can easily distribute graph execution to multiple
devices, and/or multiple machines● It requires very minimal code change● TensorFlow will automatically insert nodes to the graph to enable distributed
execution of the graph
![Page 45: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/45.jpg)
GPU 0 CPU
Add Mul
biases
learning rate
AssignSub
......
Distributed dataflow computation
![Page 46: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/46.jpg)
GPU 0 CPU
Add Mul
biases
learning rate
AssignSub
......
Distributed dataflow computation
Send Recv
Send Recv
Send
Recv SendRecv
● TensorFlow adds Send/Recv ops to transport tensors.● Recv ops pull data from Send ops.
![Page 47: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/47.jpg)
/job:worker/gpu:0 /job:ps/cpu:0
Add Mul
biases
learning rate
AssignSub
......
Distributed dataflow computation● Communication across machines is abstracted just like
cross-device communication within a machine.
Send Recv
Send Recv
Send
Recv SendRecv
![Page 48: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/48.jpg)
/job:worker/gpu:0 /job:ps/cpu:0
Add Mul
biases
learning rate
AssignSub
......
Control edges for dataflow scheduling● Control edges impose additional execution ordering.● Critical-path analysis informs their addition.
Send Recv
Send Recv
Send
Recv SendRecv
control edge
control edge
![Page 49: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/49.jpg)
Python API - Distributed computationStacked LSTMs on one single devide
for i in range(8): for d in range(4): # d is depth input = x[i] if d is 0 else m[d-1] m[d], c[d] = LSTMCell( input, mprev[d], cprev[d]) mprev[d] = m[d] cprev[d] = c[d]
Stacked LSTMs on Multiple devices
for i in range(8): for d in range(4): # d is depth with tf.device("/gpu:%d" % d): input = x[i] if d is 0 else m[d-1] m[d], c[d] = LSTMCell( input, mprev[d], cprev[d]) mprev[d] = m[d] cprev[d] = c[d]
![Page 50: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/50.jpg)
A B C D __ A B C
GPU1
GPU2
GPU3
GPU4
![Page 51: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/51.jpg)
A B C D __ A B C
GPU1
GPU2
GPU3
GPU4
![Page 52: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/52.jpg)
A B C D __ A B C
GPU1
GPU2
GPU3
GPU4
![Page 53: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/53.jpg)
A B C D __ A B C
GPU1
GPU2
GPU3
GPU4
![Page 54: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/54.jpg)
A B C D __ A B C
GPU1
GPU2
GPU3
GPU4
![Page 55: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/55.jpg)
A B C D __ A B C
GPU1
GPU2
GPU3
GPU4
![Page 56: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/56.jpg)
A B C D __ A B C
GPU1
GPU2
GPU3
GPU4
![Page 57: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/57.jpg)
A B C D __ A B C
GPU1
GPU2
GPU3
GPU4
![Page 58: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/58.jpg)
A B C D __ A B C
GPU1
GPU2
GPU3
GPU4
![Page 59: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/59.jpg)
A B C D __ A B C
GPU1
GPU2
GPU3
GPU4
![Page 60: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/60.jpg)
A B C D __ A B C
GPU1
GPU2
GPU3
GPU4
![Page 61: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/61.jpg)
ParallelismTensorFlow enables model parallelism and data parallelism:
● A graph can be split across several devices.● Many graph replicas can process inputs in parallel.
......
...
input1
input2
...
......
...
......
...
...
......
...
![Page 62: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/62.jpg)
● Different levels of model parallelism:○ Across machines, across devices, across cores,
instruction parallelism● Across machines: limited by network bandwidth / latency● Across devices: for GPUs, often limited by PCIe bandwidth.● Across cores: thread parallelism. Almost free.● On a single core: Instruction parallelism (SIMD). Just free.
Exploiting Model Parallelism
![Page 63: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/63.jpg)
Data Parallelism● Use multiple model replicas to process different
examples at the same time○ All collaborate to update model state (parameters) in shared
parameter server(s)
● Speedups depend highly on kind of model○ Dense models: 10-40X speedup from 50 replicas○ Sparse models:
■ support many more replicas■ often can use as many as 1000 replicas
![Page 64: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/64.jpg)
Data Parallelism
Parameter Servers
...ModelReplicas
Data ...
p’Δp’
p’’ = p’ + Δp
![Page 65: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/65.jpg)
Data parallelism● Params access and update can be synchronous and asynchronous● In asynchronous model, this is the famous downpour SGD algorithm
○ No guarantee on the consistency of the params
● Synchronous params update usually works better○ At the cost of training speech due to synchronous cost
![Page 66: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/66.jpg)
Success of Data Parallelism● Data parallelism is really important for many of Google’s
problems (very large datasets, large models):○ RankBrain: 500 replicas○ ImageNet Inception: 50 GPUs, ~40X speedup○ SmartReply: 16 replicas, each with multiple GPUs○ Language model on “One Billion Word”: 32 GPUs
![Page 67: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/67.jpg)
Hours
10 replicas50 replicas
19.6 vs. 80.3 (4.1X)
5.6 vs. 21.8 (3.9X)
Image Model Training Time: 10 vs 50 GPUs
![Page 68: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/68.jpg)
Control-flow constructsConditionals and iteration can be useful in building dynamic models.
For example, Recursive Neural Networks (RNNs) are widely used for speech recognition, language modeling, translation, image captioning, and more.
Cells
y
while step < len(seq):
s, y = Cell(seq[step], s)
ys = ys.append(y)
step = step + 1
![Page 69: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/69.jpg)
Control-flow constructsWe want:
● Fit with the computational model● Parallel execution
(sometimes of the iterations of a loop)● Distributed execution ● Automatic generation of gradient code
Cells
y
![Page 70: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/70.jpg)
Dynamic dataflow architecture [Arvind et al., 1980s]
Switch p
d
Merge
d
d d
d
d
Enter(“L1”)
F T
d
d
Exit
d
d
NextIteration
Execution contexts identify different invocations of the same node. New operations manipulate these contexts.
![Page 71: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/71.jpg)
Python API - Condition● Conditional execution of a sub-graph
![Page 72: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/72.jpg)
Python API - While Loop● Repeated execution of a sub-graph● Very useful for RNN● TF essentially a programming language
![Page 73: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/73.jpg)
TF is essentially a programming language● With if, and while, TF is basically a programming langugage● TF can easily extended with custom kernels● Can be used to implement very complicated logic
○ E.g. BeamSearch can implemented entirely in TF
![Page 74: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/74.jpg)
Python API - Functions● You can define functions to combine primitive ops into a logical op
● TensorFlow executor treats functions as if they are primitive ops
![Page 75: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/75.jpg)
Function benefits● Reduce the size of the graphs● Benefits of monolithic op, but without the hassle of writing the gradient
functions.● Reduce the memory usage
○ Intermediate results are being discarded○ Could mean more computation during backprop.
![Page 76: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/76.jpg)
A Few TensorFlow Community Examples● DQN: github.com/nivwusquorum/tensorflow-deepq
● NeuralArt: github.com/woodrush/neural-art-tf
● Char RNN: github.com/sherjilozair/char-rnn-tensorflow
● Keras ported to TensorFlow: github.com/fchollet/keras
● Show and Tell: github.com/jazzsaxmafia/show_and_tell.tensorflow
● Mandarin translation: github.com/jikexueyuanwiki/tensorflow-zh
...
![Page 77: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/77.jpg)
More examples related to NLP● POS tagging:
https://github.com/rockingdingo/deepnlp/tree/master/deepnlp/pos● Syntaxnet:
https://research.googleblog.com/2016/05/announcing-syntaxnet-worlds-most.html
● Seq2seq: https://opensource.googleblog.com/2017/04/tf-seq2seq-sequence-to-sequence-framework-in-tensorflow.html
● Many tutorials: https://tensorflow.google.cn/tutorials/●
![Page 78: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/78.jpg)
Concluding remarks
![Page 79: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/79.jpg)
Conclusions● Ease of expression:
○ Lots of crazy ML ideas are just different Graphs○ Non-NN algorithms can also benefit if it maps to graph.
● Portability: can run on wide variety of platforms● Scalability:
○ Easy to scale ○ Much faster training
![Page 80: Deep Learning With TensorFlo · Deep learning is conceptually very simple That is it, that is all the key elements of deep learning Typical procedure to solve a problem with deep](https://reader033.vdocuments.site/reader033/viewer/2022042220/5ec680830ed9dc2fa0384c0a/html5/thumbnails/80.jpg)
Conclusions (cont.)● Open Sourcing of TensorFlow
○ Rapid exchange of research ideas (we hope!)○ Easy deployment of ML systems into products○ TensorFlow community doing interesting things!