TensorFlow

Gaurav Kumar

CLSP, JHU

2017/01/26

Some content is borrowed from Kevin Duh’s presentation “Theano tutorial” @ the JHU Neural Winter School. Other credits on their respective slides.

Remember computation graphs?

Under the hood : TensorFlow

Figure from: http://learningsys.org/papers/LearningSys_2015_paper_33.pdf

TensorFlow

• Represents computations as graphs

1. Nodes in the graph are operations (called ops)

2. Edges in this graph are tensors representing data in and out

3. A Node may take zero or more tensors and produce zero or more tensors

https://www.tensorflow.org/get_started/basic_usage
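The graph description above can be sketched with a toy example in plain Python. This is only an illustration of "nodes are ops, edges carry values"; it is not how TensorFlow is implemented, and every name in it (Node, placeholder, constant, multiply, reduce_sum, run) is hypothetical.

```python
# Toy illustration only: this is NOT TensorFlow's implementation.
# All class and function names here are hypothetical.

class Node(object):
    """An operation in the graph; its inputs are the edges carrying values."""
    def __init__(self, op, inputs=(), name=None):
        self.op = op                # a Python function, or None for a placeholder
        self.inputs = list(inputs)  # the nodes whose outputs feed this node
        self.name = name


def placeholder(name):
    # Takes zero inputs; its value is supplied at run time.
    return Node(op=None, name=name)


def constant(value):
    # Takes zero inputs and always produces the same value.
    return Node(op=lambda: value)


def multiply(a, b):
    # Elementwise product of two equal-length lists.
    return Node(op=lambda x, y: [xi * yi for xi, yi in zip(x, y)], inputs=[a, b])


def reduce_sum(a):
    return Node(op=lambda x: sum(x), inputs=[a])


def run(node, feed_dict):
    """Evaluate `node` by recursively evaluating the nodes it depends on."""
    if node.op is None:
        return feed_dict[node]
    args = [run(inp, feed_dict) for inp in node.inputs]
    return node.op(*args)


# Building the graph performs no arithmetic; it only records the structure.
x = placeholder("x")
w = constant([2.0, 3.0])
y = reduce_sum(multiply(x, w))

# Values are computed only when the graph is run with concrete inputs.
result = run(y, {x: [1.0, 10.0]})
print(result)
```

Separating graph construction from evaluation is what lets a framework inspect, differentiate, or place the computation before any numbers flow through it.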

TensorFlow Graph

• A TensorFlow graph is a description of computations

1. A graph must be launched in a Session

2. A Session is placed on a Device (CPU, GPU)

3. Nodes (Ops) in the graph take tensor input and produce tensor output.

https://www.tensorflow.org/get_started/basic_usage

TensorFlow : Features

• Other Features

• Autodiff

• Helper functions for ingesting and pre-processing data, neural network operations, activations, loss functions, optimizers

• Training visualization (TensorBoard)

• APIs for C, C++ and Python

• Flexible multi-device placement (more later)

• Clean methods for session saving and restoration
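To give a feel for what autodiff does, here is a minimal sketch of reverse-mode differentiation on scalars in plain Python. It is a toy under stated assumptions, not TensorFlow's mechanism; the Value class and its methods are hypothetical names introduced for illustration.

```python
# Toy reverse-mode automatic differentiation on scalars.
# Hypothetical sketch; not TensorFlow's implementation.

class Value(object):
    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        # Each parent is stored together with the local derivative d(self)/d(parent).
        self.parents = parents

    def __add__(self, other):
        return Value(self.data + other.data, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Value(self.data * other.data,
                     ((self, other.data), (other, self.data)))

    def backward(self):
        # Post-order traversal: every node is appended after the nodes it depends on.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Walk from the output back to the inputs, applying the chain rule.
        for node in reversed(order):
            for parent, local_grad in node.parents:
                parent.grad += local_grad * node.grad


# y = w0 * x0 + w1 * x1, so dy/dw0 = x0 and dy/dw1 = x1.
w = [Value(0.2), Value(0.7)]
x = [Value(1.0), Value(3.0)]
y = w[0] * x[0] + w[1] * x[1]
y.backward()
grads = [wi.grad for wi in w]
print(grads)
```

The gradients with respect to the weights come out equal to the inputs, which is what differentiating the weighted sum by hand gives.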

TensorFlow : Usage

• Represents computations as graphs

• Executes graphs in the context of Sessions

• Represents data as tensors

• Maintains state with Variables

• Uses Feeds and Fetches to get data into and out of arbitrary operations

Repository for this session

• https://github.com/noisychannel/tensorflow_tutorial

Installation

https://github.com/noisychannel/tensorflow_tutorial/blob/master/Makefile

install_venv:
    # Install a virtual environment
    virtualenv tf_cpu
    # Activate the virtual environment
    source tf_cpu/bin/activate
    # Install tensorflow
    export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.12.1-cp27-none-linux_x86_64.whl
    pip install $TF_BINARY_URL

install_no_venv:
    # Install tensorflow
    export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.12.1-cp27-none-linux_x86_64.whl
    pip install --user $TF_BINARY_URL

Quickstart

"""
A quick start for Tensorflow
Computes y = \sum{W * x}
Adapted for tensorflow from http://www.marekrei.com/blog/theano-tutorial/
"""

import tensorflow as tf
import numpy as np

Placeholders

# A placeholder is tensorflow-talk for a container that will be filled later
# In this case, this is a container for values to be provided when evaluating y
# You need to specify the data type and shape for the values that will
# eventually end up in this placeholder

x = tf.placeholder(tf.float32, shape=(2,), name='x')

Variables

# A Variable is another type of container for values that is initialized with values
# You will typically use these for your model parameters

W = tf.Variable(tf.constant([0.2, 0.7]), name='W')

Operations

# A symbolic mathematical operation which operates
# on tensors

y = tf.reduce_sum(x * W)

Build Graph : Initialize variables

# First initialize all variables (put values in the variable containers we defined above)
# Always run this before evaluating the computation graph

init = tf.global_variables_initializer()

Build Graph : Create Session

# Create and launch the graph within a session now
# Multiple sessions can be placed on multiple devices but
# we’ll leave this complexity for later

sess = tf.Session()

sess.run(init)

Run session, feed dictionaries

# We want the value of y. We will use sess.run to get it.
# However, remember that y depends on x which is currently empty
# We will fill all dependencies (placeholders) for the expression
# we want to evaluate with a dictionary called feed_dict
# and pass it to sess.run()

print sess.run(y, feed_dict={x: [1.0, 1.0]})

A complete view

import tensorflow as tf
import numpy as np

x = tf.placeholder(tf.float32, shape=(2,), name='x')

W = tf.Variable(tf.constant([0.2, 0.7]), name='W')

y = tf.reduce_sum(x * W)

init = tf.global_variables_initializer()

sess = tf.Session()
sess.run(init)

print sess.run(y, feed_dict={x: [1.0, 1.0]})
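As a sanity check on the script above, the same arithmetic can be worked out without TensorFlow. The snippet below is a plain-Python sketch that mirrors the elementwise product and sum; it uses the same W and x values as the slides and should come out at approximately 0.9 (TensorFlow computes in float32, so its printed digits may differ slightly).

```python
# Plain-Python cross-check of the quick-start computation (no TensorFlow needed).
W = [0.2, 0.7]   # the values used to initialise the Variable W
x = [1.0, 1.0]   # the values fed for the placeholder x

# Elementwise product followed by a sum, mirroring tf.reduce_sum(x * W).
y = sum(xi * wi for xi, wi in zip(x, W))
print(round(y, 6))
```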

Linear Regression

Image from https://en.wikipedia.org/wiki/Linear_regression

Linear Regression in TensorFlow

• https://github.com/noisychannel/tensorflow_tutorial/blob/master/line_fit.py
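The idea behind fitting a line by gradient descent can be sketched in plain Python. This is not the repository's line_fit.py; the data points, learning rate, and number of steps are made-up assumptions chosen so the example converges. In TensorFlow, w and b would be Variables and an optimizer would compute and apply the gradients.

```python
# Gradient-descent line fit in plain Python (illustrative; not the repository's code).
# The data, learning rate and step count below are made-up assumptions.

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]   # points that lie exactly on y = 2x + 1

w, b = 0.0, 0.0          # model parameters, starting from zero
learning_rate = 0.05
n = float(len(xs))

for step in range(2000):
    # Gradients of the mean squared error with respect to w and b.
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    # Move each parameter a small step against its gradient.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(round(w, 3), round(b, 3))
```

Because the synthetic points lie exactly on a line, the fitted slope and intercept should approach the values used to generate them.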

Sessions and Devices

• By default, TensorFlow will use the first available CPU/GPU.

• If you have multiple GPUs, you could distribute sessions across multiple devices (model/data parallelism)

with tf.Session() as sess:
    with tf.device("/gpu:1"):
        matrix1 = tf.constant([[3., 3.]])
        matrix2 = tf.constant([[2.],[2.]])
        product = tf.matmul(matrix1, matrix2)
        ...

Sessions and Devices

• Devices are specified with strings

• “/cpu:0”: The CPU of your machine

• “/gpu:0”: The GPU of your machine

• “/gpu:1”: The second GPU of your machine

with tf.Session() as sess:
    with tf.device("/gpu:1"):
        matrix1 = tf.constant([[3., 3.]])
        matrix2 = tf.constant([[2.],[2.]])
        product = tf.matmul(matrix1, matrix2)
        ...

Saving and restoring sessions

• TensorFlow provides methods for saving sessions.

• To start, initialize a Saver object

saver = tf.train.Saver()

• Call saver.save within your training loop. This will write a checkpoint file to the train_dir

saver.save(sess, FLAGS.train_dir, global_step=step)

• To restore a session from a file, call saver.restore

saver.restore(sess, FLAGS.train_dir)
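The save-in-the-loop, restore-later pattern can be illustrated with a toy checkpointing sketch in plain Python. It does not use tf.train.Saver; the JSON file format, the "model" prefix, and the save and restore helpers are all hypothetical, chosen only to show parameters being written once per step and read back afterwards.

```python
import json
import os
import tempfile

# Toy checkpointing in plain Python; file naming and helper names are hypothetical.

def save(params, prefix, global_step):
    """Write the parameters to '<prefix>-<global_step>.json' and return the path."""
    path = "%s-%d.json" % (prefix, global_step)
    with open(path, "w") as f:
        json.dump(params, f)
    return path


def restore(path):
    """Read parameters back from a checkpoint file."""
    with open(path) as f:
        return json.load(f)


train_dir = tempfile.mkdtemp()              # stand-in for FLAGS.train_dir
prefix = os.path.join(train_dir, "model")

params = {"W": [0.2, 0.7]}
last_path = None
for step in range(1, 4):
    params["W"] = [w + 0.1 for w in params["W"]]   # stand-in for a training update
    last_path = save(params, prefix, step)          # one checkpoint per step

# Later (for example after a restart), load the most recent checkpoint.
restored = restore(last_path)
print(sorted(os.listdir(train_dir)))
```

Tagging each file with the step number keeps earlier checkpoints around, so training can resume from any of them.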

TensorBoard

• Sometimes it helps to visualize stuff.

• Here’s how you would use TensorBoard to do this.

• Use scalar_summary to record the operation output that you are interested in:

loss = tf.reduce_mean(cross_entropy, name='xentropy_mean')

tf.scalar_summary(loss.op.name, loss)

TensorBoard

• Compile all summaries

summary = tf.merge_all_summaries()

• Write the summaries to a file with SummaryWriter

summary_writer = tf.train.SummaryWriter(FLAGS.train_dir, sess.graph)

• Periodically add updates to the SummaryWriter object.

summary_str = sess.run(summary, feed_dict=feed_dict)
summary_writer.add_summary(summary_str, step)

TensorBoard

• Use TensorBoard to open the summary file.

https://www.tensorflow.org/tutorials/mnist/tf/

TensorFlow vs. Theano

• Both use static graph declarations

• Faster compile times compared to Theano

• Streamlined saving/restoration in TensorFlow

• Data/Model parallelism across multiple devices is easier with TensorFlow.

• TensorBoard visualization

• Theano has more pre-trained models and open source implementations of models.

• Dynamic computation graphs are hard for both TensorFlow and Theano.

• Debugging in TensorFlow is cumbersome.

• TensorFlow has a bigger user community and developer base.

• TensorFlow uses Eigen (instead of BLAS), which is easier to port across device types and architectures.

TensorFlow vs. Theano

• Read more here (maybe slightly dated):

• https://deeplearning4j.org/compare-dl4j-torch7-pylearn#tensorflow

• https://github.com/zer0n/deepframeworks/blob/master/README.md

Exercise: Feed-forward NLM

Stub code in repository : nlm_stub.py