
Neural Networks

Radial Basis Function Networks

Laxmidhar Behera

Department of Electrical Engineering, Indian Institute of Technology, Kanpur

Intelligent Control – p.1/??


Radial Basis Function Networks

A Radial Basis Function Network (RBFN) consists of three layers:

an input layer

a hidden layer

an output layer

The hidden units provide a set of functions that constitute an arbitrary basis for the input patterns.

The hidden units are known as radial centers, represented by the vectors c1, c2, · · · , ch.

The transformation from input space to hidden-unit space is nonlinear, whereas the transformation from hidden-unit space to output space is linear.

The dimension of each center for a p-input network is p × 1.

Network Architecture

[Figure 1: Radial Basis Function Network. Inputs x1, x2, · · · , xp (input layer) feed a hidden layer of radial basis functions φ1, φ2, · · · , φh with centers c1, c2, · · · , ch; the hidden outputs are combined through weights w1, w2, · · · , wh to produce the output y (output layer).]


Radial Basis Functions

The radial basis functions in the hidden layer produce a significant non-zero response only when the input falls within a small, localized region of the input space.

Each hidden unit has its own receptive field in the input space.

An input vector xi that lies in the receptive field of center cj activates cj, and with a proper choice of weights the target output is obtained. The output is given as

y = Σ_{j=1}^{h} φj wj,   φj = φ(‖x − cj‖)

where wj is the weight of the jth center and φ is some radial function.


Contd...

Different radial functions are given as follows.

Gaussian radial function: φ(z) = e^(−z²/2σ²)

Thin plate spline: φ(z) = z² log z

Quadratic: φ(z) = (z² + r²)^(1/2)

Inverse quadratic: φ(z) = 1/(z² + r²)^(1/2)

Here, z = ‖x − cj‖.

The most popular radial function is the Gaussian activation function.
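As a concrete sketch, the hidden-layer response and network output with the Gaussian basis can be computed as follows (the function name and the example centers are illustrative, not from the slides):

```python
import numpy as np

def rbf_forward(x, centers, weights, sigma=1.0):
    # z_j = ||x - c_j||, the distance of the input from each radial center
    z = np.linalg.norm(centers - x, axis=1)
    # Gaussian basis: phi_j = exp(-z_j^2 / (2 sigma^2))
    phi = np.exp(-z**2 / (2.0 * sigma**2))
    # Linear output layer: y = sum_j phi_j w_j
    return phi @ weights

# The response is significant only near a center and decays away from it.
centers = np.array([[0.0, 0.0], [1.0, 1.0]])
weights = np.array([1.0, 1.0])
y_on = rbf_forward(np.array([0.0, 0.0]), centers, weights)   # input on a center
y_off = rbf_forward(np.array([5.0, 5.0]), centers, weights)  # input far away
```

Note how y_off is almost zero: the localized receptive fields give no response far from every center.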


RBFN vs. Multilayer Network

RBF net: it has a single hidden layer. Multilayer net: it has multiple hidden layers.

RBF net: the basic neuron model and the function of the hidden layer differ from those of the output layer. Multilayer net: the computational nodes of all the layers are similar.

RBF net: the hidden layer is nonlinear but the output layer is linear. Multilayer net: all the layers are nonlinear.


Contd...

RBF net: the activation function of a hidden unit computes the Euclidean distance between the input vector and the center of that unit. Multilayer net: the activation function computes the inner product of the input vector and the weight vector of that unit.

RBF net: establishes a local mapping, hence capable of fast learning. Multilayer net: constructs global approximations to the I/O mapping.

RBF net: two-fold learning; both the centers (position and spread) and the weights have to be learned. Multilayer net: only the synaptic weights have to be learned.


Learning in RBFN

Training of an RBFN requires optimal selection of the parameter vectors ci and wi, i = 1, · · · , h.

The two layers are optimized using different techniques and on different time scales.

The following techniques are used to update the weights and centers of an RBFN:

Pseudo-Inverse Technique (offline)

Gradient Descent Learning (online)

Hybrid Learning (online)


Pseudo-Inverse Technique

This is a least-squares problem. Assume fixed radial basis functions, e.g. Gaussian functions.

The centers are chosen randomly. The function is normalized, i.e. for any x, Σi φi = 1.

The standard deviation (width) of the radial function is determined by an ad hoc choice.

The learning steps are as follows:

1. The width is fixed according to the spread of the centers:

φi = e^(−(h/d²)‖x − ci‖²),   i = 1, 2, · · · , h

where h is the number of centers and d is the maximum distance between the chosen centers. Thus σ = d/√(2h).

Page 10: Neural Networks - IITKhome.iitk.ac.in/~lbehera/Files/Lecture5_RBFN.pdf · Neural Networks Radial Basis Function Networks Laxmidhar Behera Department of Electrical Engineering Indian

Contd...

2. From Figure 1, Φ = [φ1, φ2, · · · , φh], w = [w1, w2, · · · , wh]^T, and

Φw = yd,   where yd is the desired output.

3. The required weight vector is computed as

w = (Φ^T Φ)^(−1) Φ^T yd = Φ′ yd

where Φ′ = (Φ^T Φ)^(−1) Φ^T is the pseudo-inverse of Φ.

This is possible only when Φ^T Φ is non-singular. If it is singular, singular value decomposition is used to solve for w.
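A minimal NumPy sketch of this step (function and variable names are ours); `np.linalg.pinv` computes the pseudo-inverse via the SVD, which also covers the singular case mentioned above:

```python
import numpy as np

def train_pseudo_inverse(X, centers, y_d, sigma):
    # Phi[k, i] = exp(-||x_k - c_i||^2 / (2 sigma^2)) for pattern k, center i
    Z = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    Phi = np.exp(-Z**2 / (2.0 * sigma**2))
    # w = pinv(Phi) @ y_d; pinv is SVD-based, so singular Phi^T Phi is handled
    return np.linalg.pinv(Phi) @ y_d

# Toy example: centers placed on the two data points, so Phi is invertible
# and the desired outputs are reproduced exactly.
X = np.array([[0.0, 0.0], [1.0, 1.0]])
y_d = np.array([1.0, 0.0])
w = train_pseudo_inverse(X, X, y_d, sigma=np.sqrt(0.5))
```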


Illustration: EX-NOR problem

The truth table and the RBFN architecture are given below:

x1  x2  yd
0   0   1
0   1   0
1   0   0
1   1   1

[Figure: RBFN for the EX-NOR problem. Inputs x1, x2; two centers c1, c2 with outputs φ1, φ2; weights w1, w2; a bias input of +1 with weight θ; output y.]

The centers are chosen randomly from the 4 input patterns:

c1 = [0 0]^T and c2 = [1 1]^T

φ1 = φ(‖x − c1‖) = e^(−‖x − c1‖²)

Similarly, φ2 = e^(−‖x − c2‖²), where x = [x1 x2]^T.


Contd...

Output: y = w1 φ1 + w2 φ2 + θ

Applying the 4 training patterns one after another:

w1 + w2 e^(−2) + θ = 1
w1 e^(−1) + w2 e^(−1) + θ = 0
w1 e^(−1) + w2 e^(−1) + θ = 0
w1 e^(−2) + w2 + θ = 1

In matrix form:

Φ =
[ 1       0.1353  1
  0.3679  0.3679  1
  0.3679  0.3679  1
  0.1353  1       1 ]

w = [w1 w2 θ]^T,   yd = [1 0 0 1]^T

Using w = Φ′ yd, we get w = [2.5031 2.5031 −1.8418]^T.
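These numbers can be reproduced numerically; a sketch in NumPy (the variable names are ours). The least-squares solution here happens to satisfy all four equations exactly:

```python
import numpy as np

# EX-NOR training set and the two randomly chosen centers
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y_d = np.array([1., 0., 0., 1.])
C = np.array([[0., 0.], [1., 1.]])

# phi_i = exp(-||x - c_i||^2); the third column is the bias input (+1)
Z = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
Phi = np.hstack([np.exp(-Z**2), np.ones((4, 1))])

# w = [w1, w2, theta] via the pseudo-inverse
w = np.linalg.pinv(Phi) @ y_d
y = Phi @ w   # network output on the 4 patterns, reproduces y_d
```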


Gradient Descent Learning

One of the most popular approaches to updating c and w is supervised training with an error-correcting term, achieved by a gradient descent technique. The update rule for center learning is

cij(t + 1) = cij(t) − η1 ∂E/∂cij,   for i = 1 to h, j = 1 to p

The weight update law is

wi(t + 1) = wi(t) − η2 ∂E/∂wi

where the cost function is E = (1/2)(yd − y)².


Contd...

The actual response is

y = Σ_{i=1}^{h} φi wi

The activation function is taken as

φi = e^(−zi²/2σ²)

where zi = ‖x − ci‖ and σ is the width of the center. Differentiating E w.r.t. wi, we get

∂E/∂wi = (∂E/∂y) × (∂y/∂wi) = −(yd − y) φi


Contd...

Differentiating E w.r.t. cij, we get

∂E/∂cij = (∂E/∂y) × (∂y/∂φi) × (∂φi/∂cij) = −(yd − y) × wi × (∂φi/∂zi) × (∂zi/∂cij)

Now,

∂φi/∂zi = −(zi/σ²) φi

and

∂zi/∂cij = ∂/∂cij ( Σj (xj − cij)² )^(1/2) = −(xj − cij)/zi


Contd...

After simplification, the update rule for center learning is:

cij(t + 1) = cij(t) + η1 (yd − y) wi (φi/σ²)(xj − cij)

The update rule for the linear weights is:

wi(t + 1) = wi(t) + η2 (yd − y) φi
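The two update rules above can be sketched as a pattern-by-pattern training loop; this is our illustration (reusing the EX-NOR data with one center per pattern), not the authors' code:

```python
import numpy as np

def train_rbfn_gd(X, Y, C, w, sigma, eta1=0.3, eta2=0.2, epochs=500):
    for _ in range(epochs):
        for x, yd in zip(X, Y):
            z = np.linalg.norm(x - C, axis=1)
            phi = np.exp(-z**2 / (2.0 * sigma**2))
            e = yd - phi @ w                      # error term (yd - y)
            # c_ij <- c_ij + eta1 (yd - y) w_i (phi_i / sigma^2)(x_j - c_ij)
            C += eta1 * e * (w * phi)[:, None] * (x - C) / sigma**2
            # w_i <- w_i + eta2 (yd - y) phi_i
            w += eta2 * e * phi
    return C, w

X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
Y = np.array([1., 0., 0., 1.])
sigma = 0.707
C, w = train_rbfn_gd(X, Y, X.copy(), np.zeros(4), sigma)
z = np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)
y_pred = np.exp(-z**2 / (2.0 * sigma**2)) @ w
```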


Example: System identification

The same surge tank system has been taken for simulation. The system model is given as

h(t + 1) = h(t) + T ( −√(2g h(t)) / √(3h(t) + 1) + u(t) / √(3h(t) + 1) )

where
t : discrete time step
T : sampling time
u(t) : input flow, can be positive or negative
h(t) : liquid level of the tank (output)
g : gravitational acceleration
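For concreteness, one Euler step of the tank model can be coded as below. The grouping of the square roots follows our reading of the typeset equation, and the input profile is an arbitrary illustration, not the data used on the slides:

```python
import math

def tank_step(h, u, T=0.01, g=9.8):
    # h(t+1) = h(t) + T * ( -sqrt(2 g h(t)) + u(t) ) / sqrt(3 h(t) + 1)
    return h + T * (-math.sqrt(2.0 * g * h) + u) / math.sqrt(3.0 * h + 1.0)

# Generate 150 (u, h) samples with T = 0.01 s, as in the slides.
h, data = 1.0, []
for t in range(150):
    u = 5.0 if (t // 50) % 2 == 0 else 1.0   # hypothetical square-wave inflow
    data.append((u, h))
    h = tank_step(h, u)
```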


Data generation

The sampling time is taken as 0.01 s, and 150 data points have been generated using the system equation. The nature of the input u(t) and output h(t) is shown in the following figure.

[Figure: I/O data; input u(t) and output h(t) versus time step.]


Contd...

The system is identified from the input-output data using a radial basis function network.

The network parameters are given below:

Number of inputs : 2 [u(t), h(t)]

Number of outputs : 1 [target: h(t + 1)]

Units in the hidden layer : 30

Number of I/O data : 150

Radial Basis Function : Gaussian

Width of the radial function (σ) : 0.707

Center learning rate (η1) : 0.3

Weight learning rate (η2) : 0.2


Result: Identification

After identification, the root mean square error is found to be < 0.007. The convergence of the mean square error is shown in the following figure.

[Figure: mean square error versus epochs (0 to 1000).]


Model Validation

After identification, the model is validated on a set of 100 input-output data points different from the data set used for training. The result is shown in the following figure: the left plot shows the desired and network outputs, and the right plot shows the corresponding input.

[Figure: left plot, desired output yd and actual output ya versus time step; right plot, input u(t) versus time step.]


Hybrid Learning

In hybrid learning, the radial basis functions relocate theircenters in a self-organized manner while the weights areupdated using supervised learning.

When a pattern is presented to RBFN, either a new centeris grown if the pattern is sufficiently novel or the parametersin both layers are updated using gradient descent.

The test of novelty depends on two criteria:

Is the Euclidean distance between the input patternand the nearest center greater than a threshold δ(t)?

Is the mean square error at the output greater than adesired accuracy?

A new center is allocated only when both criteria are satisfied.

Contd...

Find the center that is closest to x in terms of Euclidean distance. This center is updated as follows:

ci(t + 1) = ci(t) + α(x − ci(t))

Thus the center moves closer to x.

While the centers are updated using unsupervised learning, the weights can be updated using the least mean squares (LMS) or recursive least squares (RLS) algorithm. We present the RLS algorithm here.
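The self-organized center update can be sketched as a nearest-center move (function and variable names are illustrative):

```python
import numpy as np

def update_nearest_center(C, x, alpha=0.5):
    # Find the center closest to x in Euclidean distance ...
    i = int(np.argmin(np.linalg.norm(C - x, axis=1)))
    # ... and move it a fraction alpha of the way toward x.
    C[i] += alpha * (x - C[i])
    return i

C = np.array([[0.0, 0.0], [10.0, 10.0]])
x = np.array([1.0, 1.0])
i = update_nearest_center(C, x)   # moves c1, leaves c2 untouched
```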


Contd...

The ith output of the RBFN can be written as

x̂i = φ^T θi,   i = 1, · · · , n

where φ ∈ R^l is the output vector of the hidden layer and θi ∈ R^l is the connection weight vector from the hidden units to the ith output unit. The weight update law is:

θ̂i(t + 1) = θ̂i(t) + P(t + 1) φ(t) [xi(t + 1) − φ^T(t) θ̂i(t)]

P(t + 1) = P(t) − P(t) φ(t) [1 + φ^T(t) P(t) φ(t)]^(−1) φ^T(t) P(t)

where P(t) ∈ R^(l×l). This algorithm is more accurate and converges faster than the LMS algorithm.
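The recursion above can be sketched for a single output as follows; the noiseless toy data and the initialization of P(0) as a large multiple of the identity are our assumptions (a standard RLS choice, not stated on the slide):

```python
import numpy as np

def rls_step(theta, P, phi, target):
    # P(t+1) = P(t) - P phi [1 + phi^T P phi]^(-1) phi^T P
    denom = 1.0 + phi @ P @ phi
    P_new = P - np.outer(P @ phi, phi @ P) / denom
    # theta(t+1) = theta(t) + P(t+1) phi [target - phi^T theta(t)]
    theta_new = theta + P_new @ phi * (target - phi @ theta)
    return theta_new, P_new

# Recover a fixed weight vector from streaming (phi, target) pairs.
rng = np.random.default_rng(0)
theta_true = np.array([2.0, -1.0, 0.5])
theta, P = np.zeros(3), 1e3 * np.eye(3)
for _ in range(200):
    phi = rng.standard_normal(3)
    theta, P = rls_step(theta, P, phi, phi @ theta_true)
```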


Example: System Identification

The same surge tank model is identified with the same input-output data set. The parameters of the RBFN are the same as those used for the gradient descent technique.

Center training is done using unsupervised K-means clustering with a learning rate of 0.5.

Weight training is done using the gradient descent method with a learning rate of 0.1.

After identification, the root mean square error is found to be < 0.007.


Contd...

The mapping of input data and centers is shown in the following figure. The left plot shows how the centers are initialized, and the right plot shows the spreading of the centers toward the nearest inputs after learning.

[Figure: input data and centers, before training (left) and after training (right).]


Comparison of Results

Scheme                      RMS error   No. of iterations
Back-propagation            0.00822     ~5000
RBFN (Gradient Descent)     0.00825     ~2000
RBFN (Hybrid Learning)      0.00823     ~1000

It is seen from the table that an RBFN learns faster than a simple feedforward network.
