TensorFlow
Gaurav Kumar
CLSP, JHU
2017/01/26
Some content is borrowed from Kevin Duh’s presentation “Theano tutorial” @ the JHU Neural Winter School. Other credits on their respective slides.
TensorFlow
• Represents computations as graphs
1. Nodes in the graph are operations (called ops)
2. Edges in this graph are tensors representing data in and out
3. A node may take zero or more tensors as input and produce zero or more tensors as output
https://www.tensorflow.org/get_started/basic_usage
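As a tiny illustration (the same three-op graph used on the device-placement slides later), two constant ops each produce one tensor, and a matmul op consumes both:

import tensorflow as tf

# Two ops that take no input and each produce one tensor (nodes with no in-edges)
matrix1 = tf.constant([[3., 3.]])
matrix2 = tf.constant([[2.], [2.]])
# An op that takes two tensors in and produces one tensor out
product = tf.matmul(matrix1, matrix2)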
TensorFlow Graph
• A TensorFlow graph is a description of computations
1. A graph must be launched in a Session
2. A Session is placed on a Device (CPU, GPU)
3. Nodes (Ops) in the graph take tensor input and produce tensor output.
https://www.tensorflow.org/get_started/basic_usage
TensorFlow : Features
• Automatic differentiation (autodiff)
• Helper functions for ingesting and pre-processing data, neural network operations, activations, loss functions, and optimizers
• Training visualization (TensorBoard)
• APIs for C, C++ and Python
• Flexible multi-device placement (more later)
• Clean methods for session saving and restoration
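For example, autodiff means you never write backpropagation by hand; a minimal sketch, assuming the 0.12-era API used throughout this tutorial:

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=(), name='x')
y = x * x + 3.0 * x
# tf.gradients symbolically builds the ops that compute dy/dx
dy_dx = tf.gradients(y, [x])[0]

sess = tf.Session()
print sess.run(dy_dx, feed_dict={x: 2.0})  # 2*x + 3 = 7.0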
TensorFlow : Usage
• Represents computations as graphs
• Executes graphs in the context of Sessions
• Represents data as tensors
• Maintains state with Variables
• Uses Feeds and Fetches to get data into and out of arbitrary operations
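Concretely, feeds fill placeholders and fetches name the ops whose outputs you want back; a minimal sketch (the names here are made up for illustration):

import tensorflow as tf

a = tf.placeholder(tf.float32, name='a')
b = tf.placeholder(tf.float32, name='b')
added = a + b
multiplied = a * b

sess = tf.Session()
# Feed both placeholders; fetch both results in a single run call
print sess.run([added, multiplied], feed_dict={a: 3.0, b: 4.0})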
Repository for this session
• https://github.com/noisychannel/tensorflow_tutorial
Installation
https://github.com/noisychannel/tensorflow_tutorial/blob/master/Makefile
install_venv:
	# Install a virtual environment
	virtualenv tf_cpu
	# Activate the virtual environment and install tensorflow into it
	# (each recipe line runs in its own shell, so chain the commands)
	source tf_cpu/bin/activate; \
	export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.12.1-cp27-none-linux_x86_64.whl; \
	pip install $$TF_BINARY_URL

install_no_venv:
	# Install tensorflow for the current user
	export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.12.1-cp27-none-linux_x86_64.whl; \
	pip install --user $$TF_BINARY_URL
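Assuming the install succeeded, a quick smoke test from the shell should print the version (0.12.1 here):

python -c "import tensorflow as tf; print tf.__version__"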
Quickstart
"""A quick start for TensorflowComputes y = \sum{W * x}Adapted for tensorflow from http://www.marekrei.com/blog/theano-tutorial/"""
import tensorflow as tfimport numpy as np
Placeholders
# A placeholder is tensorflow-talk for a container that will be filled later
# In this case, this is a container for values to be provided when evaluating y
# You need to specify the data type and shape for the values that will
# eventually end up in this placeholder
x = tf.placeholder(tf.float32, shape=(2,), name='x')
Variables
# A Variable is another type of container for values; unlike a placeholder,
# it is initialized with values when it is created
# You will typically use these for your model parameters
W = tf.Variable(tf.constant([0.2, 0.7]), name='W')
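Build Graph : Define the expression

# y is the sum over the elementwise product of x and W
# (this is the same expression that appears in the complete listing below)
y = tf.reduce_sum(x * W)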
Build Graph : Initialize variables
# First, initialize all variables (put values in the Variable containers we defined above)
# This creates the init op; run it inside a session before evaluating the graph
init = tf.global_variables_initializer()
Build Graph : Create Session
# Create and launch the graph within a session now
# Multiple sessions can be placed on multiple devices but
# we'll leave this complexity for later
sess = tf.Session()
sess.run(init)
Run session, feed dictionaries
# We want the value of y. We will use sess.run to get it.
# However, remember that y depends on x, which is currently empty.
# We will fill all dependencies (placeholders) for the expression
# we want to evaluate with a dictionary called feed_dict
# and pass it to sess.run()
print sess.run(y, feed_dict={x: [1.0, 1.0]})
A complete view
import tensorflow as tf
import numpy as np
x = tf.placeholder(tf.float32, shape=(2,), name='x')
W = tf.Variable(tf.constant([0.2, 0.7]), name='W')
y = tf.reduce_sum(x * W)
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
print sess.run(y, feed_dict={x: [1.0, 1.0]})
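With x = [1.0, 1.0] and W = [0.2, 0.7], this prints 0.9 (0.2 * 1.0 + 0.7 * 1.0).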
Linear Regression
Image from https://en.wikipedia.org/wiki/Linear_regression
Linear Regression in TensorFlow
• https://github.com/noisychannel/tensorflow_tutorial/blob/master/line_fit.py
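The linked file has the full version; a minimal sketch of the idea, with made-up toy data and hyperparameters, looks roughly like this:

import tensorflow as tf
import numpy as np

# Toy data drawn from y = 0.5x + 2 plus a little noise (made up for illustration)
x_data = np.random.rand(100).astype(np.float32)
y_data = 0.5 * x_data + 2.0 + np.random.normal(0.0, 0.05, 100).astype(np.float32)

# Model parameters to be learned
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b

# Mean squared error loss, minimized by gradient descent
loss = tf.reduce_mean(tf.square(y - y_data))
train_op = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

sess = tf.Session()
sess.run(tf.global_variables_initializer())
for step in range(200):
    sess.run(train_op)
print sess.run([W, b])  # should approach [0.5] and [2.0]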
Sessions and Devices
• By default, TensorFlow will use the first available CPU/GPU.
• If you have multiple GPUs, you could distribute sessions across multiple devices (model/data parallelism)
with tf.Session() as sess:
  with tf.device("/gpu:1"):
    matrix1 = tf.constant([[3., 3.]])
    matrix2 = tf.constant([[2.], [2.]])
    product = tf.matmul(matrix1, matrix2)
    ...
Sessions and Devices
• Devices are specified with strings
• "/cpu:0": The CPU of your machine
• "/gpu:0": The first GPU of your machine
• "/gpu:1": The second GPU of your machine
Saving and restoring sessions
• TensorFlow provides methods for saving sessions.
• To start, initialize a Saver object:
saver = tf.train.Saver()
• Call saver.save within your training loop. This will write a checkpoint file to the train_dir
saver.save(sess, FLAGS.train_dir, global_step=step)
• To restore a session from a file, call saver.restore:
saver.restore(sess, FLAGS.train_dir)
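Putting it together, a minimal save-then-restore sketch (the checkpoint path here is made up for illustration):

import tensorflow as tf

W = tf.Variable(tf.constant([0.2, 0.7]), name='W')
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # ... training steps would go here ...
    # Writes a checkpoint like ./ckpt/model-100 plus bookkeeping files
    saver.save(sess, './ckpt/model', global_step=100)

with tf.Session() as sess:
    # restore expects the checkpoint prefix that save wrote
    saver.restore(sess, './ckpt/model-100')
    print sess.run(W)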
TensorBoard
• Sometimes it helps to visualize what is happening during training.
• Here’s how you would use TensorBoard to do this.
• Use scalar_summary to record the operation output that you are interested in:
loss = tf.reduce_mean(cross_entropy, name='xentropy_mean')
tf.scalar_summary(loss.op.name, loss)
TensorBoard
• Compile all summaries:
summary = tf.merge_all_summaries()
• Write the summaries to a file with SummaryWriter:
summary_writer = tf.train.SummaryWriter(FLAGS.train_dir, sess.graph)
• Periodically add updates to the SummaryWriter object.
summary_str = sess.run(summary, feed_dict=feed_dict)
summary_writer.add_summary(summary_str, step)
TensorBoard
• Use TensorBoard to open the summary file.
https://www.tensorflow.org/tutorials/mnist/tf/
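Assuming the summaries went to FLAGS.train_dir, launch TensorBoard from the shell and point a browser at the address it prints (port 6006 by default):

tensorboard --logdir=/path/to/train_dir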
TensorFlow vs. Theano
• Both use static graph declarations
• TensorFlow has faster compile times than Theano
• Streamlined saving/restoration in TensorFlow
• Data/Model parallelism across multiple devices is easier with TensorFlow.
• TensorBoard visualization
• Theano has more pre-trained models and open source implementations of models.
• Dynamic computation graphs are hard for both TensorFlow and Theano.
• Debugging in TensorFlow is cumbersome.
• TensorFlow has a bigger user community and developer base.
• TensorFlow uses Eigen (instead of BLAS), which is easier to port across device types and architectures.
TensorFlow vs. Theano
• Read more here (maybe slightly dated):
• https://deeplearning4j.org/compare-dl4j-torch7-pylearn#tensorflow
• https://github.com/zer0n/deepframeworks/blob/master/README.md