
NN – cont.

Alexandra I. Cristea, USI intensive course "Adaptive Systems", April-May 2003

• We have seen how the neuron computes; let's now see:
  – What can it compute?
  – How can it learn?

What does the neuron compute?

Perceptron, discrete neuron

• First, a simple case:
  – no hidden layers
  – only one neuron
  – get rid of the threshold: b becomes w0
  – Y is a Boolean function: net input > 0 → fires, otherwise → doesn't fire

Threshold function f

[Figure: the step function f, with w0 = -t = -1]

Y = X1 or X2   (w1 = 1, w2 = 1, t = 1)

X1  X2 | Y
 0   0 | 0
 0   1 | 1
 1   0 | 1
 1   1 | 1

Y = X1 and X2   (w1 = 0.5, w2 = 0.5, t = 1)

X1  X2 | Y
 0   0 | 0
 0   1 | 0
 1   0 | 0
 1   1 | 1

Y = or(x1, …, xn)   with w1 = w2 = … = wn = 1 and t = 1

Y = and(x1, …, xn)   with w1 = w2 = … = wn = 1/n and t = 1
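As a quick check of the two constructions above, here is a small Python sketch (an editorial addition, not part of the original slides; it uses the convention of the later learning-constraints slide that the neuron fires iff Σ_i w_i·x_i ≥ t):

from itertools import product

def fires(weights, threshold, xs):
    # discrete neuron: output 1 iff sum_i w_i * x_i >= t
    # (equivalently, w0 + sum_i w_i * x_i >= 0 with w0 = -t)
    return 1 if sum(w * x for w, x in zip(weights, xs)) >= threshold else 0

n = 3
or_weights = [1.0] * n        # w1 = ... = wn = 1,   t = 1  ->  n-input OR
and_weights = [1.0 / n] * n   # w1 = ... = wn = 1/n, t = 1  ->  n-input AND

for xs in product([0, 1], repeat=n):
    assert fires(or_weights, 1.0, xs) == (1 if any(xs) else 0)
    assert fires(and_weights, 1.0, xs) == (1 if all(xs) else 0)
print("one neuron reproduces OR and AND on all", 2 ** n, "input combinations")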

What are we actually doing?

[Figure: three truth tables over inputs X1, X2 with output Y, each paired with a candidate weight set]

What the neuron evaluates is w0 + w1·X1 + w2·X2, for example with

W0 = -1, W1 = 7,   W2 = 9
W0 = -1, W1 = 0.7, W2 = 0.9
W0 = 1,  W1 = 7,   W2 = 9
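To make this concrete, the sketch below (an editorial illustration, not from the deck) evaluates w0 + w1·X1 + w2·X2 for the three weight sets on all four Boolean inputs, firing when the sum is positive; the first and second sets behave like OR and AND respectively, while the third always fires:

weight_sets = [
    (-1, 7, 9),
    (-1, 0.7, 0.9),
    (1, 7, 9),
]
for w0, w1, w2 in weight_sets:
    table = {(x1, x2): int(w0 + w1 * x1 + w2 * x2 > 0)
             for x1 in (0, 1) for x2 in (0, 1)}
    print((w0, w1, w2), table)
# (-1, 7, 9)     fires like OR
# (-1, 0.7, 0.9) fires like AND
# (1, 7, 9)      always fires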


Linearly Separable Set

[Figures: linearly separable point sets in the (x1, x2) plane, each correctly split by the line w0 + w1·x1 + w2·x2 = 0, with for example:
  w0 = -1, w1 = -0.67,  w2 = 1
  w0 = -1, w1 = 0.25,   w2 = -0.1
  w0 = -1, w1 = 0.25,   w2 = 0.04
  w0 = -1, w1 = 0.167,  w2 = 0.1]

Non-Linearly Separable Set

[Figures: point sets in the (x1, x2) plane that cannot be split by any line w0 + w1·x1 + w2·x2 = 0; the weight entries w0, w1, w2 are left blank because no such values exist]

Perceptron Classification Theorem

A finite set X can be classified correctly by a one-layer perceptron if and only if it is linearly separable.

Typical non-linearly separable set: Y = XOR(x1, x2)

[Figure: the four points in the (x1, x2) plane; (0,1) and (1,0) have Y = 1, while (0,0) and (1,1) have Y = 0, so no single line w0 + w1·x1 + w2·x2 = 0 separates the two classes]
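The theorem can be checked empirically. The sketch below (an editorial addition) brute-forces a coarse grid of weights for a single threshold unit y = [w0 + w1·x1 + w2·x2 > 0]: weights realising OR and AND are found, but none realising XOR, in line with the figure above.

from itertools import product

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = {
    "OR":  [0, 1, 1, 1],
    "AND": [0, 0, 0, 1],
    "XOR": [0, 1, 1, 0],
}
grid = [i / 2 for i in range(-6, 7)]   # candidate weights -3.0, -2.5, ..., 3.0

for name, wanted in targets.items():
    found = any(
        [int(w0 + w1 * x1 + w2 * x2 > 0) for (x1, x2) in points] == wanted
        for w0, w1, w2 in product(grid, repeat=3)
    )
    print(name, "realisable on this grid:", found)
# OR and AND are found; XOR is not (and by the theorem no weights exist at all)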

How does the neuron learn?

Learning: weight computation (here for Y = X1 and X2)

• W1·(X1=1) + W2·(X2=1) ≥ (t=1)
• W1·(X1=0) + W2·(X2=1) < (t=1)
• W1·(X1=1) + W2·(X2=0) < (t=1)
• W1·(X1=0) + W2·(X2=0) < (t=1)

[Figure: the four input points in the (X1, X2) plane and the line W1·X1 + W2·X2 = t]

Perceptron Learning Rule (incremental version)

FOR i := 0 TO n DO w_i := random initial value ENDFOR;
REPEAT
  select a pair (x, t) in X;   (* each pair must have a positive probability of being selected *)
  IF w^T · x' > 0 THEN y := 1 ELSE y := 0 ENDIF;
  IF y ≠ t THEN
    FOR i := 0 TO n DO w_i := w_i + (t - y)·x_i' ENDFOR
  ENDIF;
UNTIL X is correctly classified

ROSENBLATT (1962)
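A minimal Python rendering of this rule (an editorial sketch; the cap on presentations is an added safeguard, and x' is the input extended with a constant 1 so that w0 absorbs the threshold):

import random

def train_perceptron(pairs, n, max_presentations=10000):
    # pairs: list of (x, t) with x a tuple of n binary inputs and t in {0, 1}
    w = [random.uniform(-1, 1) for _ in range(n + 1)]          # w[0] plays the role of w0
    def output(x):
        xp = (1,) + tuple(x)                                   # x' = (1, x1, ..., xn)
        return 1 if sum(wi * xi for wi, xi in zip(w, xp)) > 0 else 0
    for _ in range(max_presentations):
        if all(output(x) == t for x, t in pairs):              # X is correctly classified
            break
        x, t = random.choice(pairs)                            # positive selection probability
        y = output(x)
        if y != t:
            xp = (1,) + tuple(x)
            w = [wi + (t - y) * xi for wi, xi in zip(w, xp)]
    return w

# Example: learning Y = X1 or X2 (linearly separable, so the rule converges)
or_pairs = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
print(train_perceptron(or_pairs, n=2))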

Idea of the Perceptron Learning Rule

w_i := w_i + (t - y)·x_i'

• If t = 1 and y = 0 (w^T·x' ≤ 0): w_new = w + x', i.e. w moves toward the input x'  (+)
• If t = 0 and y = 1 (w^T·x' > 0): w_new = w - x', i.e. w moves away from the input x'  (-)

[Figure: the weight vector w and the input x', with w_new drawn for both cases]

For multi-layered perceptrons with continuous neurons, a simple and successful learning algorithm exists: backpropagation (BKP).

BKP: Error

[Figure: a network with an input layer, a hidden layer, and an output layer; the outputs y1…y4 are compared with the desired values d1…d4]

e1 = d1 - y1
e2 = d2 - y2
e3 = d3 - y3
e4 = d4 - y4

Hidden layer error: ?

Synapse

[Figure: neuron1 connected to neuron2 through a synapse with weight w]

Values y1, y2 = internal activations
Forward propagation: y2 = w · y1
The weight serves as an amplifier!

Inverse Synapse

[Figure: the same synapse traversed backwards, from neuron2 to neuron1]

Values e1, e2 = errors
Backward propagation: e1 = ?
The weight serves as an amplifier!

Inverse Synapse

[Figure: the same synapse traversed backwards, from neuron2 to neuron1]

Values e1, e2 = errors
Backward propagation: e1 = w · e2
The weight serves as an amplifier!
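In numbers (a toy editorial example, not from the slides): the same weight that amplifies the activation on the way forward amplifies the error on the way back.

w = 0.8            # synapse weight
y1 = 1.5           # activation of neuron1
y2 = w * y1        # forward propagation:  y2 = w * y1 = 1.2
e2 = 0.3           # error known at neuron2
e1 = w * e2        # backward propagation: e1 = w * e2 = 0.24
print(y2, e1)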

BKP: Error (revisited)

[Figure: the same network as before; the output errors are e_i = d_i - y_i]

Hidden layer error: ?

Backpropagation to the hidden layer

[Figure: input I1 feeds the hidden layer (O2, I2), which feeds the output O1; a hidden unit j is connected to the output errors e1, e2, e3 through weights w1, w2, w3]

Backpropagation: ee[j] = Σ_i e[i] · w[j, i]
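A direct transcription of this formula (an editorial sketch with made-up numbers):

e = [0.2, -0.1, 0.4]            # output-layer errors e[i]
w = [[0.5, 1.0, -0.3],          # w[j][i]: weight from hidden unit j to output unit i
     [0.8, -0.2, 0.1]]
ee = [sum(ei * wji for ei, wji in zip(e, wj)) for wj in w]
print(ee)                       # hidden-layer errors ee[j], here approximately [-0.12, 0.22]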

Update rule for the two weight types

① weights between I2 (hidden layer) and O1 (system output)
② weights between I1 (system input) and O2 (hidden layer)

① Δw = α (d[i] - y[i]) f'(S[i]) f(S[i]) = α e[i] f(S[i])   (with the simplification f' = 1, e.g. for a repeater)
   S[i] = Σ_j w[j, i](t) h[j]

② Δw = α (Σ_i e[i] w[j, i]) f'(S[j]) f(S[j]) = α ee[j] f(S[j])
   S[j] = Σ_k w[k, j](t) x[k]

Backpropagation algorithm

FOR s := 1 TO r DO W_s := initial matrix (often random) ENDFOR;
REPEAT
  select a pair (x, t) in X;
  y_0 := x;
  # forward phase: compute the actual output y_s of the network with input x
  FOR s := 1 TO r DO y_s := F(W_s y_{s-1}) END;
  # y_r is the output vector of the network
  # backpropagation phase: propagate the errors back through the network
  # and adapt the weights of all layers
  d_r := F_r' (t - y_r);
  FOR s := r DOWNTO 2 DO
    d_{s-1} := F_{s-1}' W_s^T d_s;
    W_s := W_s + d_s y_{s-1}^T
  END;
  W_1 := W_1 + d_1 y_0^T
UNTIL stop criterion
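A NumPy sketch of the same matrix-form algorithm (an editorial addition; it assumes logistic units F(s) = 1/(1 + e^(-s)), so F' = y·(1 - y), adds an explicit learning rate alpha that the pseudocode leaves out, and omits bias handling):

import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

def backprop_step(Ws, x, t, alpha=0.5):
    """One presentation of a pair (x, t) to a network with weight matrices Ws[0..r-1]."""
    ys = [x]
    for W in Ws:                                        # forward phase: y_s := F(W_s y_{s-1})
        ys.append(sigmoid(W @ ys[-1]))
    d = ys[-1] * (1 - ys[-1]) * (t - ys[-1])            # d_r := F_r'(t - y_r)
    for s in range(len(Ws) - 1, -1, -1):                # backpropagation phase
        d_prev = ys[s] * (1 - ys[s]) * (Ws[s].T @ d) if s > 0 else None
        Ws[s] = Ws[s] + alpha * np.outer(d, ys[s])      # W_s := W_s + alpha * d_s y_{s-1}^T
        d = d_prev                                      # d_{s-1} := F_{s-1}' W_s^T d_s
    return Ws

# Toy usage: a 2-3-1 network, one learning step on a single input/target pair;
# in practice the step is repeated over all pairs until the stop criterion is met.
rng = np.random.default_rng(0)
Ws = [rng.uniform(-1, 1, (3, 2)), rng.uniform(-1, 1, (1, 3))]
x, t = np.array([1.0, 0.0]), np.array([1.0])
Ws = backprop_step(Ws, x, t)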

Conclusion

• We have seen how Boolean functions can be represented with a single-layer perceptron

• We have seen a learning algorithm for the single-layer perceptron (the perceptron learning rule)

• We have seen a learning algorithm for multi-layer perceptrons (backpropagation)

• So, neurons can represent knowledge AND learn!
