TensorFlow: what and why?
Konstantin Shmelkov
Grenoble Data Science Meetup
14 Sep 2016
Konstantin Shmelkov (Grenoble Data Science Meetup)TensorFlow: what and why? 14 Sep 2016 1 / 20
What is TensorFlow?
From the whitepaper: “TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms”.
Data flow graph
Pictures are from colah.github.io
Back in the past: Theano (2009)
Framework mostly developed in LISA group at the University of Montreal.
Features:
symbolic differentiation,
transparent use of GPU,
dynamic code generation (C and CUDA),
everything in Python!
Main challenges for a deep learning framework: to be flexible, distributed, and easy to use.
Theano: flexible, distributed, easy-to-use.
TensorFlow features
distributed on all levels: real multithreading, multi-GPU, multiple cluster nodes,
symbolic differentiation,
freshly designed API,
Python API, C++ core (Eigen),
powerful visualization with TensorBoard,
model deployment with TensorFlow Serving,
integration with Google Cloud Platform.
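To make the "symbolic differentiation" item concrete, here is a minimal plain-Python sketch of the idea: an expression is a small tree, and differentiating it yields another tree that is evaluated later. The classes `Const`, `Var`, `Add`, and `Mul` are hypothetical names invented for this illustration; this is not how TensorFlow or Theano implement gradients internally.

```python
class Const:
    def __init__(self, value):
        self.value = value

    def eval(self, env):
        return self.value

    def diff(self, name):
        return Const(0.0)              # derivative of a constant is zero


class Var:
    def __init__(self, name):
        self.name = name

    def eval(self, env):
        return env[self.name]          # look the value up at evaluation time

    def diff(self, name):
        return Const(1.0 if name == self.name else 0.0)


class Add:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def eval(self, env):
        return self.left.eval(env) + self.right.eval(env)

    def diff(self, name):
        return Add(self.left.diff(name), self.right.diff(name))


class Mul:
    def __init__(self, left, right):
        self.left, self.right = left, right

    def eval(self, env):
        return self.left.eval(env) * self.right.eval(env)

    def diff(self, name):
        # Product rule: (u * v)' = u' * v + u * v'
        return Add(Mul(self.left.diff(name), self.right),
                   Mul(self.left, self.right.diff(name)))


x = Var("x")
f = Add(Mul(x, x), Mul(Const(3.0), x))   # f(x) = x*x + 3*x
df = f.diff("x")                          # a new expression tree, not a number

value = f.eval({"x": 2.0})
slope = df.eval({"x": 2.0})
```

The derivative `df` is itself an expression, so it can be evaluated at any point without differentiating again.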
Core abstractions
This is a data flow graph.
x is a Placeholder. W and b are Variables. Everything else is an intermediate Tensor.
import tensorflow as tf
x = tf.placeholder(tf.float32, shape=[None, 784])
W = tf.Variable(tf.random_normal([784, 10]))
b = tf.Variable(tf.zeros([10]))
C = tf.nn.relu(tf.matmul(x, W) + b)
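The split between building a graph and running it can be sketched in plain Python. The `Node` class and the `run` function below are hypothetical and exist only for this illustration; they mimic the roles of placeholders, variables, and intermediate tensors on scalars and are not TensorFlow's implementation.

```python
class Node:
    """A graph node: an operation name plus its input nodes."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value      # only used by "variable" nodes


def placeholder():
    return Node("placeholder")


def variable(initial):
    return Node("variable", value=initial)


def add(a, b):
    return Node("add", (a, b))


def mul(a, b):
    return Node("mul", (a, b))


def relu(a):
    return Node("relu", (a,))


def run(node, feed_dict):
    """Evaluate `node` recursively; placeholders read their value from feed_dict."""
    if node.op == "placeholder":
        return feed_dict[node]
    if node.op == "variable":
        return node.value
    args = [run(i, feed_dict) for i in node.inputs]
    if node.op == "add":
        return args[0] + args[1]
    if node.op == "mul":
        return args[0] * args[1]
    if node.op == "relu":
        return max(args[0], 0.0)
    raise ValueError("unknown op: " + node.op)


# Building the graph computes nothing yet.
x = placeholder()
W = variable(2.0)
b = variable(-1.0)
C = relu(add(mul(x, W), b))

# Only run() produces numbers, and only for the values that are fed in.
positive = run(C, {x: 3.0})    # relu(3 * 2 - 1)
clipped = run(C, {x: 0.0})     # relu(0 * 2 - 1)
```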
Let’s compute that!
import numpy as np
batch = np.random.randn(128, 784)
with tf.Session() as sess:
    # Later releases rename this call to tf.global_variables_initializer().
    sess.run(tf.initialize_all_variables())
    val = sess.run(C, feed_dict={x: batch})
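For readers without TensorFlow at hand, the value fetched by `sess.run(C, ...)` can be reproduced on a toy scale with nested lists. This sketch assumes a batch of 2 inputs with 3 features and 2 outputs instead of 128, 784, and 10, and fixed numbers instead of random ones; the helper names `matmul`, `add_bias`, and `relu` are made up for the example.

```python
def matmul(a, b):
    """Multiply an [n, k] matrix by a [k, m] matrix, both given as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def add_bias(m, bias):
    """Add the bias vector to every row, like broadcasting `+ b`."""
    return [[v + bias[j] for j, v in enumerate(row)] for row in m]


def relu(m):
    return [[max(v, 0.0) for v in row] for row in m]


batch = [[1.0, 2.0, 3.0],
         [-1.0, 0.0, 1.0]]      # shape [2, 3]
W = [[1.0, -1.0],
     [0.0, 1.0],
     [1.0, 0.0]]                # shape [3, 2]
b = [0.0, 0.0]                  # shape [2]

val = relu(add_bias(matmul(batch, W), b))   # shape [2, 2]
```

The result has one row per input in the batch and one column per output unit, which is why the slide's `val` has shape [128, 10].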
Simple network example
import tensorflow as tf
x = tf.placeholder(tf.float32, shape=[None, 784])
W = tf.Variable(tf.random_normal([784, 10]))
b = tf.Variable(tf.zeros([10]))
C = tf.matmul(x, W) + b  # raw logits; the loss below applies softmax itself
# C = tf.nn.relu(tf.matmul(x, W) + b)
y = tf.placeholder(tf.int64, shape=[None])
# Keyword arguments keep the call valid in releases that changed the argument order.
xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=C, labels=y)
loss = tf.reduce_mean(xentropy)
optimizer = tf.train.GradientDescentOptimizer(1e-3)
train_op = optimizer.minimize(loss)
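What the loss computes can be written out by hand. The sketch below, in plain Python, assumes the usual definition of softmax cross-entropy with integer class labels, averaged over the batch; the function name `sparse_softmax_xent` and the toy logits are hypothetical, and this is not TensorFlow's implementation.

```python
import math


def sparse_softmax_xent(logits, label):
    """Cross-entropy of one example: -log(softmax(logits)[label])."""
    m = max(logits)     # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[label]


logits = [[0.0, 0.0, 0.0],      # uniform prediction over 3 classes
          [10.0, 0.0, 0.0]]     # confident prediction of class 0
labels = [1, 0]                 # integer class ids, like the int64 placeholder y

per_example = [sparse_softmax_xent(row, label)
               for row, label in zip(logits, labels)]
loss = sum(per_example) / len(per_example)   # counterpart of tf.reduce_mean
```

A uniform prediction over three classes costs log(3), while a confident and correct prediction costs almost nothing.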
Let’s optimize that!
import numpy as np
batch = np.random.randn(128, 784)
labels = np.random.randint(10, size=128)
with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())
    # feed_dict is a dict, so each entry is written as key: value
    train_loss, _ = sess.run([loss, train_op],
                             feed_dict={x: batch, y: labels})
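The effect of running a training step repeatedly can be imitated without TensorFlow. The following plain-Python sketch applies gradient descent to a scalar toy model `w * x + b` with a mean squared error loss; the model, the data, and the learning rate are assumptions chosen for the illustration and differ from the softmax classifier on the slide.

```python
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x + 1.0 for x in xs]    # targets lie on the line y = 2x + 1

w, b = 0.0, 0.0                     # the "variables"
learning_rate = 0.1
losses = []

for step in range(300):
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    losses.append(sum(e * e for e in errors) / len(xs))
    # Gradients of the mean squared error with respect to w and b.
    grad_w = sum(2.0 * e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = sum(2.0 * e for e in errors) / len(xs)
    # One gradient descent update, the role played by train_op.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b
```

Each pass through the loop corresponds to one `sess.run` of the loss and the training op on the same batch.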
ConvNet example in TF Slim
import tensorflow as tf
import tensorflow.contrib.slim as slim
# `net` must already hold the input batch; its definition is not shown on the slide.
with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    activation_fn=tf.nn.relu,
                    weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
    net = slim.conv2d(net, 64, [3, 3], scope='conv1')
    net = slim.max_pool2d(net, [2, 2], scope='pool1')
    net = slim.conv2d(net, 128, [3, 3], scope='conv2')
    net = slim.max_pool2d(net, [2, 2], scope='pool2')
    net = slim.conv2d(net, 256, [3, 3], scope='conv3')
    net = slim.max_pool2d(net, [2, 2], scope='pool3')
    net = slim.fully_connected(net, 1024, scope='fc1')
    net = slim.dropout(net, 0.5, scope='dropout1')
    net = slim.fully_connected(net, 10, activation_fn=None,
                               scope='fc2')
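It can help to trace tensor shapes through the convolutional part of this stack. The sketch below is plain Python and rests on assumptions the slide does not state: a 32x32 RGB input, 'SAME' padding with stride 1 for the convolutions, and 2x2 pooling with stride 2 and no padding. These are believed to match the TF-Slim defaults but are worth checking against the installed version. The helper names are invented for the example.

```python
def conv_same(shape, channels):
    """A stride-1 convolution with 'SAME' padding keeps height and width."""
    height, width, _ = shape
    return (height, width, channels)


def max_pool_2x2(shape):
    """2x2 pooling with stride 2 and no padding halves height and width."""
    height, width, channels = shape
    return (height // 2, width // 2, channels)


shape = (32, 32, 3)             # assumed input: one 32x32 RGB image
trace = [shape]
for channels in (64, 128, 256): # conv1/pool1, conv2/pool2, conv3/pool3
    shape = conv_same(shape, channels)
    shape = max_pool_2x2(shape)
    trace.append(shape)

features = shape[0] * shape[1] * shape[2]   # values per image after pool3
```

Under these assumptions each image is reduced to a 4x4 grid with 256 channels before the fully connected layers.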
Too many GPUs?
import tensorflow as tf
x = tf.placeholder(tf.float32, shape=[None, 784])
W = tf.Variable(tf.random_normal([784, 10]))
b = tf.Variable(tf.zeros([10]))
C = tf.nn.relu(tf.matmul(x, W) + b)
How to stop worrying and start using multi-GPU?
Model parallelism
import tensorflow as tf
# Everything created inside the block is placed on the second GPU.
with tf.device('/gpu:1'):
    x = tf.placeholder(tf.float32, shape=[None, 784])
    W = tf.Variable(tf.random_normal([784, 10]))
    b = tf.Variable(tf.zeros([10]))
    C = tf.nn.relu(tf.matmul(x, W) + b)
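One common reading of model parallelism is that the weights of a single layer are split between devices. The plain-Python sketch below assumes that interpretation: the columns of W are divided between two pretend devices, each computes its part of the product, and the parts are concatenated. It imitates the arithmetic only; no real device placement happens, and the names `W_dev0` and `W_dev1` are hypothetical.

```python
def matmul(a, b):
    """Multiply an [n, k] matrix by a [k, m] matrix, both given as lists of rows."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


x = [[1.0, 2.0],
     [3.0, 4.0]]                    # [batch=2, features=2]
W = [[1.0, 0.0, 2.0, 1.0],
     [0.0, 1.0, 1.0, 3.0]]          # [features=2, outputs=4]

# Split the output columns of W between two pretend devices.
W_dev0 = [row[:2] for row in W]
W_dev1 = [row[2:] for row in W]

part0 = matmul(x, W_dev0)           # would run on the first device
part1 = matmul(x, W_dev1)           # would run on the second device

# Concatenate the partial outputs column-wise.
combined = [r0 + r1 for r0, r1 in zip(part0, part1)]
reference = matmul(x, W)            # the same layer on a single device
```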
Data parallelism scheme
Data parallelism code example
Data parallelism
import tensorflow as tf
with tf.device('/cpu:0'):
    x1 = tf.placeholder(tf.float32, shape=[None, 784])
    x2 = tf.placeholder(tf.float32, shape=[None, 784])
    W = tf.Variable(tf.random_normal([784, 10]))
    b = tf.Variable(tf.zeros([10]))
with tf.device('/gpu:0'):
    C1 = tf.nn.relu(tf.matmul(x1, W) + b)
with tf.device('/gpu:1'):
    C2 = tf.nn.relu(tf.matmul(x2, W) + b)
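A usual next step in data parallelism is to compute gradients on each device's share of the batch and average them before updating the shared variables. The sketch below checks that arithmetic in plain Python for a one-parameter model; the model, the data, and the function name `grad` are assumptions made for the illustration, and the two "towers" simply run one after the other here.

```python
def grad(w, xs, ys):
    """Gradient of the mean squared error of the model w * x with respect to w."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


w = 0.5                             # the shared variable, kept on the CPU
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# Each tower gets half of the batch, like x1 and x2 on the slide.
g1 = grad(w, xs[:2], ys[:2])
g2 = grad(w, xs[2:], ys[2:])

averaged = (g1 + g2) / 2.0          # combined where the variable lives
full_batch = grad(w, xs, ys)        # the same batch processed in one piece
```

The two results agree because both halves have the same size; with unequal shards the average would need to be weighted by shard size.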
Scopes in TF
Flexible system of hierarchical structures
arg scope: redefines default arguments for enclosed functions.
name scope: groups intermediate Tensors together.
variable scope: facilitates variable reuse to build complicated graphs with tied weights (implies a name scope).
Examples of variable scope:
/myNetwork/convLayer2/weights
/myNetwork/convLayer2/BatchNorm/gamma
/vgg16/fc7/biases
TensorBoard demo
LIVE
Fasten your seat belts and such
Other goodies
Queues and preprocessing: data loading, data augmentation, batch shuffling can be easily offloaded to TF threads.
Checkpoints: computational graph and variables can be easily saved or transferred across the network.
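The idea behind the queues item, preparing input in background threads while the training loop consumes it, can be sketched with Python's standard `queue` and `threading` modules. This is an analogy under simplifying assumptions, not TensorFlow's own queue machinery: the `preprocess` function and the fake data are hypothetical.

```python
import queue
import threading


def preprocess(sample):
    """Stand-in for decoding or augmentation: here it just doubles the value."""
    return sample * 2


def producer(samples, out_queue):
    for sample in samples:
        out_queue.put(preprocess(sample))   # blocks while the queue is full
    out_queue.put(None)                     # sentinel: no more data


batches = queue.Queue(maxsize=4)            # bounded, so the producer cannot run far ahead
worker = threading.Thread(target=producer, args=(range(10), batches))
worker.start()

consumed = []
while True:
    item = batches.get()                    # a training loop would consume here
    if item is None:
        break
    consumed.append(item)
worker.join()
```

With a single producer the items arrive in order; shuffling or several producers would change that.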
TensorFlow Serving
Comparison
Feature        Caffe                Theano   Torch     DL4J   TensorFlow
RNN            kind of              Yes      Yes       Yes    Yes
multi-GPU      C++ only             Yes      Yes       Yes    Yes
multi-node     No                   No       kind of   Yes    Yes
API            C++, Python, Matlab  Python   Lua       Java   C++, Python
autodiff       No                   Yes      Recently  No     Yes
extensibility  No                   Yes      Yes       ?      Yes
Thank you!
Thank you for your time! Any questions?
Contact info: [email protected]