RNNs and TensorFlow
TRANSCRIPT
![Page 1: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/1.jpg)
Recurrent Neural Network: The Deepest of Deep Learning
Chen Liang
![Page 2: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/2.jpg)
Deep Learning
![Page 3: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/3.jpg)
![Page 4: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/4.jpg)
Does deep learning work like the human brain?
Demystify Deep Learning
![Page 5: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/5.jpg)
![Page 6: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/6.jpg)
![Page 7: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/7.jpg)
Deep Learning: Building Blocks
![Page 8: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/8.jpg)
Deep Learning: Deep Composition
![Page 9: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/9.jpg)
Deep Learning: Gradient Descent
![Page 10: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/10.jpg)
Deep Learning: Gradient Descent
![Page 12: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/12.jpg)
Deep Learning: Weight Sharing
![Page 13: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/13.jpg)
Recurrent Neural Network: Deepest of Deep Learning?
● Can be infinitely deep
Equation
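The equation on this slide is not transcribed; a common form of the vanilla-RNN recurrence it presumably shows (the symbol names here are my assumption, not taken from the slide):

```latex
h_t = \tanh\left(W_{hh} h_{t-1} + W_{xh} x_t + b_h\right), \qquad
y_t = W_{hy} h_t + b_y
```

The same weights $W_{hh}$, $W_{xh}$, $W_{hy}$ are shared across every time step, which is why an unrolled RNN can be viewed as an arbitrarily deep network.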
![Page 14: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/14.jpg)
BPTT: Backpropagation Through Time
![Page 15: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/15.jpg)
Recurrent Neural Network: Short-term dependency
Long-term dependency
Exploding/vanishing gradient
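The exploding/vanishing problem can be made precise via BPTT; a standard sketch (general background with assumed notation, not transcribed from the slide; $h_t$ is the hidden state, $a_i$ the pre-activation, $W_{hh}$ the recurrent weight matrix):

```latex
\frac{\partial L_t}{\partial h_k}
  = \frac{\partial L_t}{\partial h_t}
    \prod_{i=k+1}^{t} \frac{\partial h_i}{\partial h_{i-1}}
  = \frac{\partial L_t}{\partial h_t}
    \prod_{i=k+1}^{t} \operatorname{diag}\!\left(\tanh'(a_i)\right) W_{hh}
```

When the norm of each factor is below 1, the product shrinks exponentially in $t - k$ (vanishing gradient); when above 1, it can grow exponentially (exploding gradient). Long-term dependencies are therefore hard to learn with a plain RNN.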
![Page 16: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/16.jpg)
LSTM: Long Short-Term Memory
Add a direct pathway for the gradient
![Page 17: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/17.jpg)
LSTM: Long Short-Term Memory
Forget gate
![Page 18: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/18.jpg)
LSTM: Long Short-Term Memory
Input gate
![Page 19: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/19.jpg)
LSTM: Long Short-Term Memory
Update the memory using the forget gate and the input gate
![Page 20: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/20.jpg)
LSTM: Long Short-Term Memory
Output gate
![Page 21: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/21.jpg)
LSTM: Long Short-Term Memory
Putting it together
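The gate equations themselves are not transcribed from the slides; the standard LSTM update they walk through (forget gate, input gate, memory update, output gate) is, in the notation of Colah's blog cited in the references:

```latex
\begin{aligned}
f_t &= \sigma\!\left(W_f [h_{t-1}, x_t] + b_f\right) && \text{forget gate} \\
i_t &= \sigma\!\left(W_i [h_{t-1}, x_t] + b_i\right) && \text{input gate} \\
\tilde{C}_t &= \tanh\!\left(W_C [h_{t-1}, x_t] + b_C\right) && \text{candidate memory} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{memory update} \\
o_t &= \sigma\!\left(W_o [h_{t-1}, x_t] + b_o\right) && \text{output gate} \\
h_t &= o_t \odot \tanh(C_t) && \text{new hidden state}
\end{aligned}
```

The additive memory update $C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$ is the "direct pathway for the gradient" mentioned earlier: gradients can flow through the cell state without repeated matrix multiplication.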
![Page 22: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/22.jpg)
RNN: A General Framework
Machine Translation
Speech recognition
Language Modeling
Sentiment Analysis
Image Recognition
Image Caption Generation
![Page 23: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/23.jpg)
Char-RNN: How does it work?
Vocabulary:
[“h”, “e”, “l”, “o”]
Training sequence:
“hello”
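The training setup can be made concrete with a few lines of NumPy; the encoding below (variable and function names are my own, not from the slides) turns each character into a one-hot vector over the four-symbol vocabulary:

```python
import numpy as np

vocab = ["h", "e", "l", "o"]
char_to_ix = {ch: i for i, ch in enumerate(vocab)}

def one_hot(seq):
    # Each character becomes a 4-dimensional one-hot row vector.
    x = np.zeros((len(seq), len(vocab)), dtype=np.float32)
    for t, ch in enumerate(seq):
        x[t, char_to_ix[ch]] = 1.0
    return x

inputs = one_hot("hell")    # the network sees "h", "e", "l", "l"
targets = one_hot("ello")   # ...and is trained to predict the next character
```

At each step the network reads one input character and is trained to assign high probability to the next character in "hello", so the targets are simply the inputs shifted by one position.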
![Page 24: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/24.jpg)
Char-RNN: Linux
LaTeX
Wikipedia
Music
…
Check out the blog post:
The Unreasonable Effectiveness of Recurrent Neural Networks
![Page 25: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/25.jpg)
TensorFlow
TensorFlow™ is an open-source software library for numerical computation using data flow graphs.
![Page 26: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/26.jpg)
```python
import tensorflow as tf
import numpy as np

# Create 100 phony x, y data points in NumPy, y = x * 0.1 + 0.3
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data * 0.1 + 0.3

# Try to find values for W and b that compute y_data = W * x_data + b
# (We know that W should be 0.1 and b 0.3, but TensorFlow will
# figure that out for us.)
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b

# Minimize the mean squared errors.
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)
```
TensorFlow: Computation Graph
Import TensorFlow and NumPy
![Page 27: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/27.jpg)
TensorFlow: Computation Graph
Synthesize some noisy data from a linear model
![Page 28: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/28.jpg)
(Graph diagram: W and x_data feed a * node; its output and b feed a + node, producing y.)
TensorFlow: Computation Graph
![Page 29: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/29.jpg)
TensorFlow: Computation Graph
(Graph diagram: the * and + nodes from before, extended with y_data, a Loss node, and an Optimizer node.)
![Page 30: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/30.jpg)
TensorFlow: Session

```python
# Before starting, initialize the variables.  We will 'run' this first.
init = tf.initialize_all_variables()

# Launch the graph.
sess = tf.Session()
sess.run(init)

# Fit the line.
for step in xrange(201):
    sess.run(train)
    if step % 20 == 0:
        print(step, sess.run(W), sess.run(b))

# Learns best fit is W: [0.1], b: [0.3]
```
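For readers without TensorFlow installed, the same fit can be reproduced in plain NumPy; the learning rate and step count below mirror the slides, while the seed and variable names are my own choices:

```python
import numpy as np

np.random.seed(0)
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data * 0.1 + 0.3

# Random initial guesses, as in the TensorFlow version.
W = np.random.uniform(-1.0, 1.0)
b = 0.0
lr = 0.5

for step in range(201):
    err = W * x_data + b - y_data
    # Gradients of mean((W*x + b - y)^2) with respect to W and b.
    grad_W = 2.0 * np.mean(err * x_data)
    grad_b = 2.0 * np.mean(err)
    W -= lr * grad_W
    b -= lr * grad_b

# W and b converge toward 0.1 and 0.3.
```

This is exactly what `optimizer.minimize(loss)` does under the hood: TensorFlow derives the two gradient expressions from the graph automatically, whereas here they are written out by hand.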
![Page 31: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/31.jpg)
![Page 32: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/32.jpg)
TensorBoard Demo
![Page 33: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/33.jpg)
![Page 34: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/34.jpg)
![Page 35: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/35.jpg)
Now the part that everybody hates...
![Page 36: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/36.jpg)
Jon Snow is dead
![Page 37: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/37.jpg)
Homework
Part 1: Backpropagation and gradient check
● NumPy
Part 2: Char-RNN
● Undergrad/Grad Descent
○ Gradient descent => graduate descent
○ Systematic search of hyperparameters
● Do something fun with it!
Use gradients to find the best parameters.
Use graduate students to find the best hyperparameters.
![Page 38: RNNs and Tensorflow](https://reader033.vdocuments.site/reader033/viewer/2022052302/58a30bfe1a28abac578c2927/html5/thumbnails/38.jpg)
References
Christopher Olah's blog: http://colah.github.io/
Andrej Karpathy’s Blog: http://karpathy.github.io/2015/05/21/rnn-effectiveness/
David Silver’s Talk: http://videolectures.net/rldm2015_silver_reinforcement_learning/
Geoffrey Hinton's Coursera lectures:
https://class.coursera.org/neuralnets-2012-001/lecture