feed-forward mapping networks - kaistraphe.kaist.ac.kr/lecture/2012springbis427/07feed... · 2013....

51
Feed-Forward mapping networks KAIST 바이오및뇌공학과 정재승

Upload: others

Post on 21-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Feed-Forward mapping networks

KAIST

바이오및뇌공학과

정재승

Page 2: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

How much energy do we need for brain functions?

Page 3: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Information processing: Trade-off between energy consumption and wiring cost

Page 4: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Trade-off between energy consumption (wiring cost) and maximal information processing

Page 5: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Optical character recognition (OCR)

• OCR is the process of optically scanning an image and interpreting this digital image so that the computer understands its meaning.

Page 6: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Two feed-forward process

• The perception of a letter, the physical sensing of an image of a letter.

• Process attaching meaning to such an image.

Page 7: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

The perception of a letter

• Sensory feature vector: the number of feature values defines the dimensionality of the feature space.

Page 8: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Mapping functions: the recognition process to a vector function

Page 9: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Perceptron

• The perceptron is a type of artificial neural network invented in 1957 at the Cornell Aeronautical Laboratory by Frank Rosenblatt. It can be seen as the simplest kind of feed-forward neural network: a linear classifier.

Page 10: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Population node as perceptron

Linear perceptron

Page 11: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Output manifold of population node with two input channels

Page 12: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Boolean algebra

• Boolean algebra (or Boolean logic) is a logical calculus of truth values, developed by George Boole in the 1840s.

• These turn out to coincide with the set of all operations on the set {0,1} that take only finitely many arguments; there are 22n such operations when there are n arguments.

Page 13: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Boolean algebra

• It resembles the algebra of real numbers, but with the numeric operations of multiplication xy, addition x + y, and negation −x replaced by the respective logical operations of conjunction x∧y, disjunction x∨y, and negation ¬x.

• The Boolean operations are these and all other operations that can be built from these, such as x∧(y∨z).

Page 14: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Boolean functions: the threshold node

Page 15: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Look-up table, graphical representation, and single threshold population nodes

Page 16: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

The history of perceptron

• A feed-forward neural network with two or more layers (i.e., a multilayer perceptron) had far greater processing power than perceptrons with one layer (i.e., a single layer perceptron).

• Single layer perceptrons are only capable of learning linearly separable patterns; in 1969 a famous book entitled Perceptrons by Marvin Minsky and Seymour Papert showed that it was not possible for these classes of network to learn an XOR function.

Page 17: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

The history of perceptron

• Both Minsky and Papert already knew that multi-layer perceptrons were capable of producing an XOR Function.

• Three years later, Stephen Grossberg published a series of papers introducing networks capable of modelling differential, contrast-enhancing and XOR functions.

Page 18: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

The history of perceptron

• Nevertheless the often-miscited Minsky/Papert text caused a significant decline in interest and funding of neural network research.

• It took ten more years until neural network research experienced a resurgence in the 1980s.

• This text was reprinted in 1987 as "Perceptrons - Expanded Edition" where some errors in the original text are shown and corrected.

Page 19: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

• The number of nonlinear separable functions grows rapidly with the dimension of the feature space and soon outgrows the number of linear separable functions

Page 20: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Learning delta rule

• The delta rule is a gradient descent learning rule for updating the weights of the artificial neurons in a single-layer perceptron.

• The weight matrix will be changed by small amounts in an attempt to find a better answer.

Page 21: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Supervised learning

• Given examples

RN 0,1

x1 y1

x2 y2

x3 y3

• Find perceptron such that

ya H wT xa

Page 22: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Example: handwritten digits

• Find a perceptron that detects “two”s.

Page 23: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Delta rule

• Learning from mistakes.

• “delta”: difference between desired and actual output.

• Also called “perceptron learning rule”

w yH wT x x

Page 24: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Two types of mistakes

• False positive

– Make w less like x.

• False negative

– Make w more like x.

• The update is always proportional to x.

y 0, H wT x 1

y 1, H wT x 0

w x

w x

Page 25: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Objective function

• Gradient update

• Stochastic gradient descent on

• E=0 means no mistakes.

e w,x,y yH wT x wT x

w e

w

E w e w,x,y

Page 26: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

If examples are nonseparable

• The delta rule does not converge.

• Objective function is not equal to the number of mistakes.

• No reason to believe that the delta rule minimizes the number of mistakes.

Page 27: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Contrast with Hebb rule

• Assume that the teacher can drive the perceptron to produce the desired output.

• What are the objective functions?

w yx

w y y x

Perceptron learning rule Hebb rule

Page 28: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Is the delta rule biological?

• Actual output: anti-Hebbian

• Desired output: Hebbian

• Contrastive

w yx

w H wT x x

Page 29: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer
Page 30: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Objective function

• Hebb rule

– distance from inputs

• Delta rule

– error in reproducing the output

Page 31: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Multi-layer Perceptrons

Page 32: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Multilayer Perceptrons: Architecture

Input

layer

Output

layer

Hidden Layers

Page 33: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

A solution for the XOR problem

1

1

-1

-1

x1

x2

x1 x2 x1 xor x2

-1 -1 -1

-1 1 1

1 -1 1

1 1 -1

+1

+1 +1

+1 -1

-1

x1

x2

1 if v > 0

(v) =

-1 if v 0

is the sign function.

-1

-1

0.1

Page 34: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

NEURON MODEL

• Sigmoidal Function

• induced field of neuron j

• Most common form of activation function

• a threshold function

• Differentiable

jave

1

1

j)(v

-10 -8 -6 -4 -2 2 4 6 8 10

jv

)( jv 1

Increasing a

i

,...,0

jijv ywmi

jv

Page 35: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Learning Algorithms

• Back-propagation algorithm

• It adjusts the weights of the NN in order to minimize the average squared error.

Function signals Forward Step

Error signals Backward Step

Page 36: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Average Squared Error

• Error signal of output neuron j at presentation of n-th training example:

• Total energy at time n:

• Average squared error:

• Measure of learning

performance:

• Goal: Adjust weights of NN to minimize EAV

(n)y-(n)d(n)e jjj

(n)eE(n)Cj

2

j2

1

N

1nN

1

AV (n)EE

C: Set of neurons in output layer

N: size of training set

Page 37: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Notation

je

jy

i

,...,0

jijv ywmi

Induced local field of neuron j

Error at output of neuron j

Output of neuron j

Page 38: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Weight Update Rule

ji

jiw

-w

E Step in direction opposite to the gradient

Update rule is based on the gradient descent method take a step in the direction yielding the maximum decrease of E

With weight associated to the link from neuron i to neuron j

jiw

Page 39: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Definition of the Local Gradient of neuron j

j

jv

-

E Local Gradient

)v(e jjj We obtain because

)v(')1(ev

y

y

e

evj

j

j

j

j

jj

j

EE

Page 40: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Update Rule

ij y jiw

i

jy

v

jiw

ji

j

jji w

v

vw

EE

We obtain

because

j

j

E

v

Page 41: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Error ej of output neuron

• Single layer Perceptron: j output neuron

jjj y-de

)(v')y-d( jjj j

Then

Page 42: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer

Multi-layer Perceptron

Page 43: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer
Page 44: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer
Page 45: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer
Page 46: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer
Page 47: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer
Page 48: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer
Page 49: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer
Page 50: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer
Page 51: Feed-Forward mapping networks - KAISTraphe.kaist.ac.kr/lecture/2012springbis427/07Feed... · 2013. 10. 9. · •A feed-forward neural network with two or more layers (i.e., a multilayer