A summary of Neural Turing Machines(NTM)

Upload: yuzuru-kato

Post on 17-Jul-2015


This is a brief summary of the paper “Neural Turing Machines” (http://arxiv.org/abs/1410.5401), written by A. Graves, G. Wayne, and I. Danihelka (Google DeepMind, London, UK).

“Neural Turing Machines” are, in a single phrase, neural networks that can be coupled to external memories. The combined system is analogous to a Turing machine.

Introduction

Neural Network

・A neural network (NN) learns from a large amount of observational data. (Each data item is a pair of [External Input, External Output].)

・A recurrent neural network (RNN) adds directed cycles to an NN, which work as a sort of internal memory. (The current state is determined by the previous state and the External Input.)
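The recurrence described above can be sketched in a few lines. This is only an illustration: the tanh nonlinearity, the layer sizes, and the names rnn_step, W_h, W_x, and b are assumptions made for the example, not details taken from the slides.

```python
import numpy as np

def rnn_step(h_prev, x, W_h, W_x, b):
    """One recurrent update: the new state depends on the previous state and the input."""
    return np.tanh(W_h @ h_prev + W_x @ x + b)

rng = np.random.default_rng(0)
hidden, inputs = 4, 3  # illustrative sizes
W_h = rng.normal(scale=0.5, size=(hidden, hidden))  # state-to-state weights (the directed cycle)
W_x = rng.normal(scale=0.5, size=(hidden, inputs))  # input-to-state weights
b = np.zeros(hidden)

h = np.zeros(hidden)  # initial state
for x in rng.normal(size=(5, inputs)):  # five external inputs in sequence
    h = rnn_step(h, x, W_h, W_x, b)
```

The state h is the only thing carried from one step to the next, which is why it acts as an internal memory.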

[Figure: Recurrent Neural Network with a directed cycle]

・A “Neural Turing Machine” is an NN that can be coupled to external memories. (The controller is an NN whose parameters govern the coupling to the external memories.)

[Figure: Controller coupled to an External Memory]

Neural Turing Machine

・ Read/Write heads use weights to access the external memory.
・ The weights are determined by the parameters of the controller.
・ The parameters are learned from a large amount of external I/O data.

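[Figure: NTM architecture. The External Memory is an N × M matrix (N locations, each holding a vector of size M). The Controller (an NN with parameters for adjusting the weights) reaches it by weighted access through a Read head and a Write head. The Write head uses e to erase vectors and a to add new vectors.]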
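A minimal sketch of weighted reading and writing on an N × M memory is given below. It follows the erase-then-add form of the original paper as far as I understand it; the function names read and write and the example values are assumptions made for illustration.

```python
import numpy as np

def read(memory, w):
    """Weighted read: r = sum_i w[i] * memory[i]."""
    return w @ memory

def write(memory, w, e, a):
    """Erase, then add: M[i] <- M[i] * (1 - w[i] * e) + w[i] * a."""
    erased = memory * (1.0 - np.outer(w, e))
    return erased + np.outer(w, a)

N, M = 4, 3                          # N locations, vectors of size M
memory = np.ones((N, M))
w = np.array([0.0, 1.0, 0.0, 0.0])   # weighting focused on location 1
e = np.ones(M)                       # erase vector: clear the focused location fully
a = np.array([0.5, -1.0, 2.0])       # add vector: the new content

memory = write(memory, w, e, a)
r = read(memory, w)                  # reads back the vector just written
```

Because w is a soft weighting rather than an index, both operations are differentiable, which is what lets the parameters be learned from data.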

How to access external memories

[Figure: the controller between the External Input and the External Output]

How to update the weights

・ Content addressing: adjusts the weights based on the content at each location.
・ Interpolation: determines how much of the previous weight state is kept.
・ Convolutional shift and sharpening: adjust the weights based on the location in memory.
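The weight update steps listed above can be sketched as one pipeline. This is a hedged illustration that follows the original paper as I understand it: cosine similarity with a softmax for content addressing, a gate g for interpolation, a circular convolution for the shift, and a power gamma for sharpening. All function and parameter names here are assumptions chosen for the example.

```python
import numpy as np

def content_weights(memory, key, beta):
    """Softmax over cosine similarity between the key and each memory row."""
    sim = memory @ key / (np.linalg.norm(memory, axis=1) * np.linalg.norm(key) + 1e-8)
    z = np.exp(beta * (sim - sim.max()))
    return z / z.sum()

def interpolate(w_content, w_prev, g):
    """Gate g in [0, 1]: g = 1 keeps only the content weights, g = 0 only the previous ones."""
    return g * w_content + (1.0 - g) * w_prev

def shift(w, s):
    """Circular convolution; s holds the weights for offsets -k..+k (here -1, 0, +1)."""
    offsets = np.arange(len(s)) - len(s) // 2
    out = np.zeros_like(w)
    for weight, offset in zip(s, offsets):
        out += weight * np.roll(w, offset)
    return out

def sharpen(w, gamma):
    """Raise to the power gamma >= 1 and renormalize to counter blurring from the shift."""
    p = w ** gamma
    return p / p.sum()

def address(memory, w_prev, key, beta, g, s, gamma):
    w = content_weights(memory, key, beta)
    w = interpolate(w, w_prev, g)
    w = shift(w, s)
    return sharpen(w, gamma)

memory = np.eye(4)                       # four locations with distinct contents
w_prev = np.full(4, 0.25)
w = address(memory, w_prev, key=memory[1], beta=50.0, g=1.0,
            s=np.array([0.0, 0.0, 1.0]), gamma=2.0)
# The key matches location 1 and the shift moves the focus one step, to location 2.
```

The first two steps select a location by what it contains; the last two move and refocus the weighting by where it is, which is what allows sequential access such as stepping through a stored sequence.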

Application

Copy

Result of copy algorithm

・ NTM learns some form of copy algorithm.
・ NTM performs better than LSTM (a kind of RNN).
・ Even the NTM copy algorithm makes some mistakes on long sequences (as indicated by the red arrow).

[Figure: NTM results. Outputs are supposed to be a copy of targets.]

[Figure: LSTM results. Outputs are supposed to be a copy of targets.]
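To make the copy task concrete, the sketch below generates one training example and counts bit errors. The setup is an assumption made for illustration: a sequence of random 8-bit vectors, one extra input channel carrying a delimiter flag, and blank inputs while the network is expected to emit the copy. The names make_copy_example and bit_errors are hypothetical.

```python
import numpy as np

def make_copy_example(length, bits=8, rng=None):
    """Return (inputs, targets) for one copy example."""
    if rng is None:
        rng = np.random.default_rng()
    seq = rng.integers(0, 2, size=(length, bits)).astype(float)
    # Inputs: the sequence, then a delimiter step, then `length` blank steps.
    inputs = np.zeros((2 * length + 1, bits + 1))
    inputs[:length, :bits] = seq
    inputs[length, bits] = 1.0          # delimiter flag on the extra channel
    targets = seq.copy()                # expected outputs during the blank steps
    return inputs, targets

def bit_errors(outputs, targets):
    """Number of bits that differ after rounding the outputs to 0 or 1."""
    return int(np.sum(np.round(outputs) != targets))

inputs, targets = make_copy_example(length=5, rng=np.random.default_rng(0))
```

A perfect copy gives bit_errors(outputs, targets) == 0; the mistakes on long sequences mentioned above would show up as a nonzero count.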

How NTM uses an external memory for copy algorithm

・ Each weighting is focused on a single location.
・ The read locations exactly match the write locations.

[Figure panels: External Inputs/Outputs; Adds/Reads Vectors to Memory; Write/Read Weightings]

Repeat Copy

How NTM uses an external memory for repeat copy algorithm

・ Each weighting is focused on a single location.
・ The same locations are read repeatedly, once for each repetition of the copy.

Result of repeat copy algorithm

・ NTM learns some form of repeat copy algorithm.

Associative Recall

Results of associative recall algorithm

・ NTM correctly produces the red box item after it sees the green box item.

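Dynamic N-Grams (predicts the next bit from the N previous bits)

Results of dynamic N-grams

・ NTM predicts the next bit almost as well as the optimal estimator.

Optimal estimator: P(B = 1 | N1, N0, c) = (N1 + 1/2) / (N1 + N0 + 1), where c is the context of previous bits and N1, N0 are the numbers of ones and zeros observed so far after that context.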
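The optimal estimator can be sketched as a running count per context. This assumes the form given in the original paper, (N1 + 1/2) / (N1 + N0 + 1), with N1 and N0 counted for each context of previous bits; the function name optimal_predictions and the default context length are assumptions made for the example.

```python
from collections import defaultdict

def optimal_predictions(bits, context_len=5):
    """For each position, estimate P(next bit = 1) from counts seen so far after the same context."""
    counts = defaultdict(lambda: [0, 0])   # context -> [N0, N1]
    preds = []
    for t in range(context_len, len(bits)):
        c = tuple(bits[t - context_len:t])
        n0, n1 = counts[c]
        preds.append((n1 + 0.5) / (n1 + n0 + 1.0))
        counts[c][bits[t]] += 1            # update the count after predicting
    return preds

preds = optimal_predictions([1, 0, 1, 0, 1, 0, 1, 0], context_len=2)
```

With no observations for a context the estimate is 0.5, and it moves toward the empirical frequency as the counts grow.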

Priority Sort

Results of Priority Sort

・ The write head writes to locations according to a linear function of priority.
・ The read head reads from locations in increasing order.
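The strategy above can be imitated without any learning, which may help show why it sorts. This is only a sketch under stated assumptions: priorities lie in [-1, 1], higher priorities map to lower addresses, writes are hard (one location each) rather than soft weightings, and two items that map to the same location would overwrite each other. The name priority_sort_sketch is hypothetical.

```python
import numpy as np

def priority_sort_sketch(vectors, priorities, n_locations=128):
    """Write each vector at an address that is linear in its priority, then read in address order."""
    vectors = np.asarray(vectors, dtype=float)
    priorities = np.asarray(priorities, dtype=float)
    memory = np.zeros((n_locations, vectors.shape[1]))
    used = np.zeros(n_locations, dtype=bool)
    # Linear map from priority in [-1, 1] to an address in [0, n_locations - 1].
    locations = np.round((1.0 - priorities) / 2.0 * (n_locations - 1)).astype(int)
    for loc, v in zip(locations, vectors):
        memory[loc] = v
        used[loc] = True
    # Boolean indexing returns the used rows in increasing address order.
    return memory[used]

out = priority_sort_sketch([[1, 0], [0, 1], [1, 1]], [-0.5, 0.9, 0.1])
```

Here the items come back ordered by priority 0.9, 0.1, -0.5, because the write step already placed them in sorted positions and the read step only has to sweep the addresses.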

Conclusion

・ “Neural Turing Machines” are, in a single phrase, neural networks that can be coupled to external memories.

・ The applications to copy, repeat copy, associative recall, dynamic N-grams, and priority sort show that NTM is capable of using external memories.

・ Readers who are interested in the details are referred to the original paper (http://arxiv.org/abs/1410.5401).