
Multi Layer Perceptrons (MLP)

Course website: http://horn.tau.ac.il/course06.html

The back-propagation algorithm

Following Hertz chapter 6


Network Architecture

Feedforward Networks
- A connection is allowed from a node in layer i only to nodes in layer i + 1.
- The most widely used architecture.
- Conceptually, nodes at higher levels successively abstract features from the preceding layers.


Examples of binary-neuron feed-forward networks


MLP with sigmoid transfer-functions
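
For reference (the gain convention used on the slides may differ), the logistic sigmoid transfer function and its derivative are

    g(h) = 1 / (1 + exp(-h)),        g'(h) = g(h) * (1 - g(h)).

The simple form of the derivative is what makes the delta computations in back-propagation cheap.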


The backprop algorithm

1. Initialize weights to small random numbers.
2. Choose a pattern and apply it to the input layer.
3. Propagate the signals forward through the network.
4. Compute the deltas for the output layer by comparing the actual outputs with the desired ones.
5. Compute the deltas for the preceding layers by backpropagating the errors.
6. Update all weights.
7. Repeat from step 2 for the next pattern (a minimal code sketch of this loop follows below).
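
As an illustration only (this is not the notation of Hertz chapter 6), here is a minimal NumPy sketch of the loop above for a single-hidden-layer MLP with sigmoid units and a squared-error cost; the hidden-layer size, learning rate and initialization scale are arbitrary choices:

import numpy as np

def sigmoid(h):
    return 1.0 / (1.0 + np.exp(-h))

def train_backprop(X, T, n_hidden=8, eta=0.5, epochs=1000, seed=0):
    """Online back-propagation for a one-hidden-layer MLP.
    X: (n_patterns, n_in) inputs; T: (n_patterns, n_out) targets in [0, 1]."""
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], T.shape[1]
    # Step 1: initialize weights (and biases) to small random numbers
    W1 = rng.normal(0.0, 0.1, (n_hidden, n_in)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.1, (n_out, n_hidden)); b2 = np.zeros(n_out)
    for _ in range(epochs):
        for mu in rng.permutation(len(X)):        # step 2: choose a pattern
            x, t = X[mu], T[mu]
            # Step 3: propagate signals forward
            v = sigmoid(W1 @ x + b1)              # hidden-layer activities
            o = sigmoid(W2 @ v + b2)              # output-layer activities
            # Step 4: output deltas (sigmoid derivative is o * (1 - o))
            delta_out = (t - o) * o * (1.0 - o)
            # Step 5: backpropagate the errors to the hidden layer
            delta_hid = (W2.T @ delta_out) * v * (1.0 - v)
            # Step 6: update all weights
            W2 += eta * np.outer(delta_out, v); b2 += eta * delta_out
            W1 += eta * np.outer(delta_hid, x); b1 += eta * delta_hid
            # Step 7: the loop continues with the next pattern
    return W1, b1, W2, b2

Updating after every pattern, rather than after a full pass over the training set, is the online (incremental) variant of gradient descent.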


Application to data

- Data is divided into a training set and a test set.
- BP is based on minimizing the error on the training set.
- The generalization error is the error on the test set.
- Further training may lead to an increase in the generalization error – over-training.
- Know when to stop… one can use a cross-validation set (a mini test-set chosen out of the training set), as in the stopping-rule sketch below.
- Constrain the number of free parameters; this helps minimize over-training.
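
A hedged sketch of the stopping rule: set the cross-validation set aside from the training data and stop once its error stops improving. The callables train_one_epoch and validation_error are placeholders (e.g. one sweep of the back-propagation loop sketched above, and the squared error on the held-out set):

import numpy as np

def split_off_validation(X, T, val_fraction=0.2, seed=0):
    """Set aside a cross-validation ('mini test') set from the training data."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_val = int(val_fraction * len(X))
    val, train = idx[:n_val], idx[n_val:]
    return X[train], T[train], X[val], T[val]

def train_with_early_stopping(train_one_epoch, validation_error,
                              max_epochs=1000, patience=10):
    """Stop once the validation error has not improved for `patience` epochs."""
    best_err, best_epoch = np.inf, 0
    for epoch in range(max_epochs):
        train_one_epoch()                 # one sweep of back-propagation
        err = validation_error()          # error on the held-out set
        if err < best_err:
            best_err, best_epoch = err, epoch
        elif epoch - best_epoch >= patience:
            break                         # validation error keeps rising: over-training
    return best_epoch, best_err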


The sun-spots problem


Time-series in lag-space
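
In lag space each input pattern consists of the k most recent values of the series and the target is the next value. A small sketch of this embedding (the number of lags below is an arbitrary illustration, not necessarily the one used in the lecture):

import numpy as np

def lag_space(series, n_lags):
    """Embed a scalar time series in lag space: pattern t is
    (x[t], ..., x[t + n_lags - 1]) and its target is x[t + n_lags]."""
    series = np.asarray(series, dtype=float)
    X = np.stack([series[i:len(series) - n_lags + i] for i in range(n_lags)], axis=1)
    T = series[n_lags:].reshape(-1, 1)
    return X, T

# e.g. X, T = lag_space(sunspot_series, n_lags=12)
# where sunspot_series is a placeholder for the (rescaled) yearly sunspot numbers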


The MLP network and the cost function with complexity term
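
The exact complexity term on the slide is not reproduced here; a standard choice is to add a weight-decay penalty to the usual sum-of-squares error,

    E_total = E + λ Σ_ij w_ij²,

where λ trades off the fit to the training data against keeping the weights (and hence the effective number of free parameters) small. A common alternative in time-series work is the weight-elimination term, λ Σ_ij (w_ij/w0)² / (1 + (w_ij/w0)²), which costs roughly the same for every large weight while pushing small, redundant weights toward zero.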


First hidden layer – the resulting ‘receptive fields’


The second hidden layer


Exercise No 1. Submit answers electronically to Roy by April 21st.

Consider a 2-dimensional square divided into 16 black and white sub-squares, like a 4×4 chessboard (e.g. the plane of 0<x<1 and 0<y<1 is divided into sub-squares like 0<x<0.25, 0<y<0.25, etc.).

Build a feed-forward neural network whose input is composed of the coordinate values x and y, and whose output is a binary variable corresponding to the color associated with the input point.

Suggestion: use a sigmoid function throughout the network, even for the output, upon which you are free to later impose a binary decision.

1. Explain why one needs many hidden neurons to solve this problem.
2. Show how the performance of the network improves as a function of the number of training epochs.
3. Show how it improves as a function of the number of input points.
4. Display the 'visual fields' of the hidden neurons for your best solution. Discuss this result.
5. Choose a random set of training points and a random set of test points. These sets should have moderate sizes. Compute both the training error and the generalization error as a function of the number of training epochs.
6. Comment on any further insights you may have from this exercise.
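
For concreteness, one possible way to generate data for the 4×4 chessboard task (a sketch only; the sampling scheme, set sizes and seeds are arbitrary, and the resulting X, T can be fed to any feed-forward implementation, e.g. the back-propagation sketch earlier):

import numpy as np

def chessboard_data(n_points, seed=0):
    """Sample points uniformly in the unit square and label each one with the
    colour of its 4x4 chessboard cell (1 = 'black', 0 = 'white')."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n_points, 2))
    cells = np.floor(X * 4).astype(int)                    # column/row index, 0..3
    T = ((cells[:, 0] + cells[:, 1]) % 2).astype(float).reshape(-1, 1)
    return X, T

# A moderate-sized training set and an independent test set (question 5)
X_train, T_train = chessboard_data(400, seed=1)
X_test,  T_test  = chessboard_data(200, seed=2)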