Neural Networks Basics using MATLAB
TRANSCRIPT
NEURAL NETWORKS: Basics using the MATLAB Neural Network Toolbox
Multilayer Perceptron (MLP)
• An MLP consists of an input layer, several hidden layers, and an output layer.
• Node i, also called a neuron, includes a summer and a nonlinear activation function g.
• n_i is the input to the activation function g.
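In standard MLP notation (not written out on the slide itself), node i therefore computes
n_i = sum_j ( w_ij * p_j ) + b_i,   a_i = g(n_i)
where p_j are the inputs to the node, w_ij the connection weights, b_i the bias, and a_i the node output.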
Multilayer Perceptron (MLP)
• The activation function is chosen for mathematical convenience; a hyperbolic tangent (tanh) or a sigmoid function is most commonly used.
• By connecting several such nodes in parallel and in series, an MLP network is formed.
Basics using MATLAB Neural Network Toolbox
• The MATLAB commands used in the procedure are newff (which takes the type of architecture, the layer sizes and activation types, and the training algorithm as arguments), train, and sim.
• newff creates a feed-forward backpropagation network. The MATLAB command newff generates an MLP neural network, which is called net.
• The general form is net = newff(PR, [S1 ... SN], {TF1 ... TFN}, BTF), where PR is an R x 2 matrix of minimum and maximum values for the R input elements, Si are the layer sizes, TFi the transfer functions, and BTF the network training function.
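A minimal end-to-end sketch of this workflow (the data P and T here are illustrative placeholders, not from the slides):
P = 0:0.1:1; T = sin(P);                             % example training data
net = newff([0 1], [5 1], {'tansig','purelin'}, 'traingd');  % 5 hidden tansig nodes, 1 linear output
net = train(net, P, T);                              % train on the data
a = sim(net, P);                                     % simulate the trained network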
Activation (transfer) functions:
• hardlim
• hardlims
• purelin
Activation (transfer) functions:
• satlin
• satlins
• logsig
• tansig
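As a quick way to visualize some of these, the toolbox transfer functions can be evaluated directly on a vector of net inputs (a small illustrative sketch):
n = -5:0.1:5;                                 % range of net inputs
plot(n, hardlim(n), n, purelin(n), n, logsig(n), n, tansig(n));
legend('hardlim','purelin','logsig','tansig'); grid;
xlabel('net input n'); ylabel('output a');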
Training
• To create a network that can handle noisy input vectors, it is best to train the network on both ideal and noisy vectors. To do this, the network is first trained on ideal vectors until it has a low sum-squared error.
• To test the result, the sim command is applied. The output of the MLP network is called a. A small sketch of this idea is given below.
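A hedged sketch of the noisy-training idea, assuming P and T already hold the ideal inputs and targets and net is an already-created network (the noise level 0.1 is an illustrative choice):
Pn = P + 0.1*randn(size(P));   % noisy copies of the ideal inputs
P2 = [P Pn]; T2 = [T T];       % train on ideal and noisy vectors together
net = train(net, P2, T2);
a = sim(net, P);               % test on the ideal inputs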
Basic flow diagram
Example-1
• Consider the humps function in MATLAB. It is given by:
y = 1 ./ ((x-.3).^2 + .01) + 1 ./ ((x-.9).^2 + .04) - 6
but in MATLAB it can be called by humps. Here we would like to see if it is possible to find a neural network to fit the data generated by the humps function on [0, 2]:
• a) Fit a multilayer perceptron network to the data. Try different network sizes and different training algorithms.
• b) Repeat the exercise with radial basis function networks.
solution (a)
• To obtain the data use the following commands:
x = 0:.05:2; y = humps(x);
P = x; T = y;
plot(P,T,'x'); grid;
xlabel('time (s)'); ylabel('output'); title('humps function')
Step-1
[Figure: humps function data over [0, 2]; x-axis: time (s), y-axis: output]
Design the network
Step-2
% DESIGN THE NETWORK
% ==================
% First try a simple one: a feedforward (multilayer perceptron) network
net = newff([0 2], [5,1], {'tansig','purelin'}, 'traingd');
% Here newff defines the feedforward network architecture.
% The first argument [0 2] defines the range of the input and initializes the network parameters.
% The second argument defines the structure of the network. There are two layers.
% 5 is the number of nodes in the first hidden layer,
% 1 is the number of nodes in the output layer.
% Next the activation functions in the layers are defined.
% In the first hidden layer there are 5 tansig functions.
% In the output layer there is 1 linear (purelin) function.
% 'traingd' defines the basic learning scheme: gradient descent. It is a network
% training function that updates weight and bias values according to gradient descent.
12
Design the network
% Define learning parameters
net.trainParam.show = 50; % The result is shown at every 50th iteration (epoch)
net.trainParam.lr = 0.05;     % Learning rate used in some gradient schemes
net.trainParam.epochs = 1000; % Max number of iterations
net.trainParam.goal = 1e-3;   % Error tolerance; stopping criterion
%Train network
net1 = train(net, P, T); % Iterates gradient type of loop
% Resulting network is stored in net1
• The goal is still far away after 1000 iterations (epochs).
Step-3
solution (a)
[Figure: training record over 1000 epochs; Training (blue) vs. Goal (black). Performance is 28.8647, Goal is 0.001]
solution (a)
% Simulate how good a result is achieved: Input is the same input vector P.
% Output is the output of the neural network, which should be compared with the output data T.
a= sim(net1,P);
% Plot result and compare
plot(P,a-T, P,T); grid;
Step-4
[Figure: network error a-T and target data T plotted over [0, 2]]
solution (a)
The fit is quite bad, especially in the beginning. Increase the size of the network: use 20 nodes in the first hidden layer.
net = newff([0 2], [20,1], {'tansig','purelin'}, 'traingd');
Otherwise apply the same algorithm parameters and start the training process.
net.trainParam.show = 50;     % The result is shown at every 50th iteration (epoch)
net.trainParam.lr = 0.05;     % Learning rate used in some gradient schemes
net.trainParam.epochs = 1000; % Max number of iterations
net.trainParam.goal = 1e-3;   % Error tolerance; stopping criterion
%Train network
net1 = train(net, P, T); % Iterates gradient type of loop
Step-5
solution (a)
• The error goal of 0.001 is not reached now either, but the situation has improved significantly.
[Figure: training record over 1000 epochs; Training (blue) vs. Goal (black). Performance is 0.349306, Goal is 0.001]
solution (a)
Step-6
% Simulate how good a result is achieved: Input is the same input vector P.
% Output is the output of the neural network, which should be compared with the output data T.
a= sim(net1,P);
% Plot result and compare
plot(P,a-T, P,T); grid;
[Figure: network error a-T and target data T plotted over [0, 2]]
solution (a)
• Try Levenberg-Marquardt (trainlm). Use also a smaller network: 10 nodes in the first hidden layer.
net = newff([0 2], [10,1], {'tansig','purelin'}, 'trainlm');
%Define parameters
net.trainParam.show = 50;
net.trainParam.lr = 0.05;
net.trainParam.epochs = 1000;
net.trainParam.goal = 1e-3;
%Train network
net1 = train(net, P, T);
Step-7
[Figure: training record over 495 epochs; Training (blue) vs. Goal (black). Performance is 0.00099148, Goal is 0.001]
solution (a)
• Performance is now within the tolerance specification.
%Simulate result
a = sim(net1,P);
%Plot the result and the error
plot(P,a-T,P,T)
xlabel('Time (s)'); ylabel('Output of network and error'); title('Humps function')
Step-8
[Figure: network output and error over [0, 2]; x-axis: Time (s), y-axis: Output of network and error; title 'Humps function']
solution (a)
It is clear that the Levenberg-Marquardt algorithm is significantly faster than, and preferable to, plain gradient-descent backpropagation. Note that depending on the initialization, the algorithm converges more slowly or more quickly.
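To see the effect of initialization for yourself, the network weights can be re-randomized with the toolbox function init before each training run (a small illustrative sketch):
for k = 1:3
    net = init(net);            % new random weights and biases
    net1 = train(net, P, T);    % convergence speed varies from run to run
end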
solution (b)
• RADIAL BASIS FUNCTION NETWORKS
• Here we would like to find a function which fits the 41 data points using a radial basis network. A radial basis network is a network with two layers: a hidden layer of radial basis neurons and an output layer of linear neurons. Here is a typical shape of a radial basis transfer function used by the hidden layer:
p = -3:.1:3;
a = radbas(p);
plot(p,a)
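For reference, the toolbox defines this transfer function as radbas(n) = exp(-n^2), so it peaks at 1 when the net input n is 0 and decays toward 0 as |n| grows.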
[Figure: radial basis transfer function radbas(p) for p in [-3, 3], rising from 0 to 1 and back]
solution (b)
• We can use the function newrb to quickly create a radial basis network, which approximates the function at these data points. Generate data as before:
x = 0:.05:2; y = humps(x);
P = x; T = y;
plot(P,T)
[Figure: humps data T plotted against P over [0, 2]]
Step-1
solution (b)
• The simplest form of the newrb command is:
net1 = newrb(P,T);
Step-2
[Figure: training record over 25 epochs; Training (blue). Performance is 822.585, Goal is 0]
The full form of the command is net = newrb(P,T,GOAL,SPREAD).
solution (b)
• For humps the network training leads to singularity and therefore to difficulties in training. Simulate and plot the result:
a = sim(net1,P);
plot(P,T-a,P,T)
[Figure: network error T-a and target data T plotted over [0, 2]]
solution (b)
The plot shows that the network approximates humps, but the error is quite large. The problem is that the default values of the two parameters of the network are not very good. The defaults are goal (mean squared error goal) = 0.0 and spread (spread of radial basis functions) = 1.0. In our example choose goal = 0.02 and spread = 0.1.
goal = 0.02; spread = 0.1;
net1 = newrb(P,T,goal,spread);
Simulate and plot the result:
a = sim(net1,P);
plot(P,T-a,P,T)
xlabel('Time (s)'); ylabel('Output of network and error');
title('Humps function approximation - radial basis function')
solution (b)
[Figure: network output and error over [0, 2]; x-axis: Time (s), y-axis: Output of network and error; title 'Humps function approximation - radial basis function']
What is the significance of a small value of spread? What about a large one?
• The problem in the first case was too large a spread (default = 1.0), which leads to too sparse a solution: each basis function is so wide that the network cannot resolve fine detail in the data. Conversely, too small a spread makes each basis function very narrow, so many neurons are needed and the network generalizes poorly between the data points.
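A small sketch to see both effects on the humps data (the spread values here are illustrative choices, not from the slides):
goal = 0.02;
net_wide   = newrb(P, T, goal, 1.0);   % spread too large: overly smooth fit
net_narrow = newrb(P, T, goal, 0.01);  % spread too small: spiky, poor generalization
a1 = sim(net_wide, P); a2 = sim(net_narrow, P);
plot(P, T-a1, P, T-a2); grid;
legend('error, spread = 1.0', 'error, spread = 0.01');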
Example-2
Consider a surface described by z = cos(x) sin(y) defined on the square −2 ≤ x ≤ 2, −2 ≤ y ≤ 2.
a) Plot the surface z as a function of x and y. This is a demo function in MATLAB, so you can also find it there.
b) Design a neural network which will fit the data. You should study different alternatives and test the final result by studying the fitting error.
solution
Generate data:
x = -2:0.25:2; y = -2:0.25:2;
z = cos(x)'*sin(y);
Draw the surface (here a grid size of 0.1 has been used):
mesh(x,y,z)
xlabel('x axis'); ylabel('y axis'); zlabel('z axis');
title('surface z = cos(x)sin(y)');
gi = input('Strike any key ...');
Step-1
solution
[Figure: mesh plot of the surface z = cos(x)sin(y); axes: x axis, y axis, z axis]
solution
Store the data in input matrix P and output vector T:
P = [x;y]; T = z;
Use a fairly small number of neurons in the first layer, say 25, and 17 in the output layer (T has 17 rows, so the network needs 17 outputs). Initialize the network:
net = newff([-2 2; -2 2], [25 17], {'tansig' 'purelin'}, 'trainlm');
Apply the Levenberg-Marquardt algorithm:
%Define parameters
net.trainParam.show = 50;
net.trainParam.lr = 0.05;
net.trainParam.epochs = 300;
net.trainParam.goal = 1e-3;
%Train network
net1 = train(net, P, T);
gi = input('Strike any key ...');
Step-2
solution
[Figure: training record over 4 epochs; Training (blue) vs. Goal (black). Performance is 0.000229301, Goal is 0.001]
solution
Simulate the response of the neural network and draw the corresponding surface:
a = sim(net1,P);
mesh(x,y,a)
[Figure: mesh plot of the network output surface]
Step-3
solution
The result looks satisfactory, but a closer examination reveals that in certain areas the approximation is not so good. This can be seen better by drawing the error surface.
% Error surface
mesh(x,y,a-z)
xlabel('x axis'); ylabel('y axis'); zlabel('Error'); title('Error surface')
[Figure: error surface a-z over the square; axes: x axis, y axis, Error; title 'Error surface']
Step-4
Example-3
Consider Bessel functions Jα(t), which are solutions of the differential equation
t²y'' + t y' + (t² − α²) y = 0
Use a backpropagation network to approximate the first-order Bessel function J1 (α = 1) when t ∈ [0, 20].
a) Plot J1(t).
b) Try different network structures for the fitting. Start with a two-layer network.
solution
(a) Plot J1(t). First generate the data. MATLAB has the Bessel functions built in (in newer MATLAB versions, besselj(1,t) is the equivalent call).
t = 0:0.1:20; y = bessel(1,t);
plot(t,y)
grid
xlabel('time in secs'); ylabel('y'); title('First order bessel function');
[Figure: J1(t) for t in [0, 20]; x-axis: time in secs, y-axis: y; title 'First order bessel function']
Step-1
solution
Next try to fit a backpropagation network to the data. Try Levenberg-Marquardt.
P = t; T = y;
%Define network. First try a simple one
net = newff([0 20], [10,1], {'tansig','purelin'}, 'trainlm');
%Define parameters
net.trainParam.show = 50;
net.trainParam.lr = 0.05;
net.trainParam.epochs = 300;
net.trainParam.goal = 1e-3;
%Train network
net1 = train(net, P, T);
[Figure: training record over 3 epochs; Training (blue) vs. Goal (black). Performance is 0.0006501, Goal is 0.001]
solution
% Simulate result
a = sim(net1,P);
%Plot result and compare
plot(P,a,P,a-T)
xlabel('time in secs'); ylabel('Network output and error');
title('First order bessel function'); grid
[Figure: network output and error for t in [0, 20]; x-axis: time in secs, y-axis: Network output and error; title 'First order bessel function']
solution
Since the error is fairly significant, let's reduce it by doubling the number of nodes in the first hidden layer to 20 and decreasing the error tolerance to 10^-4.
P = t; T = y;
%Define network with a larger hidden layer
net = newff([0 20], [20,1], {'tansig','purelin'}, 'trainlm');
%Define parameters
net.trainParam.show = 50;
net.trainParam.lr = 0.05;
net.trainParam.epochs = 300;
net.trainParam.goal = 1e-4;
%Train network
net1 = train(net, P, T);
Step-2
[Figure: training record over 4 epochs; Training (blue) vs. Goal (black). Performance is 5.13247e-005, Goal is 0.0001]
solution
[Figure: network output and error for t in [0, 20]; x-axis: time in secs, y-axis: Network output and error; title 'First order bessel function']
% Simulate result
a = sim(net1,P);
%Plot result and compare
plot(P,a,P,a-T)
xlabel('time in secs'); ylabel('Network output and error');
title('First order bessel function'); grid
solution
P = t; T = y;
%Define network with 40 hidden nodes and a tighter goal
net = newff([0 20], [40,1], {'tansig','purelin'}, 'trainlm');
%Define parameters
net.trainParam.show = 50;
net.trainParam.lr = 0.05;
net.trainParam.epochs = 300;
net.trainParam.goal = 1e-6;
%Train network
net1 = train(net, P, T);
The result is considerably better, although it would still require improvement. This is left as a further exercise to the reader.
[Figure: training record over 104 epochs; Training (blue) vs. Goal (black). Performance is 9.72949e-007, Goal is 1e-006]
Step-3
solution
% Simulate result
a = sim(net1,P);
%Plot result and compare
plot(P,a,P,a-T)
xlabel('time in secs'); ylabel('Network output and error');
title('First order bessel function'); grid
[Figure: network output and error for t in [0, 20]; x-axis: time in secs, y-axis: Network output and error; title 'First order bessel function']
Neural Network Toolbox: Simulink
Block Set
The Neural Network Toolbox provides a set of blocks you can use to build neural networks in Simulink; they can also be used by the function gensim to generate the Simulink version of any network you have created in MATLAB. Bring up the Neural Network Toolbox block set with this command:
neural
The result is a window that contains three blocks. Each of these blocks contains additional blocks.
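As a sketch of the gensim route mentioned above (the network, training data P and T, and the sample time are illustrative assumptions):
net = newff([0 2], [5 1], {'tansig','purelin'});
net = train(net, P, T);    % assumes training data P, T as in the earlier examples
gensim(net, 0.05)          % generate a Simulink model of net with sample time 0.05 s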
Transfer Function Blocks
Double-click on the Transfer Functions block in the Neural window to bring up a window containing several transfer function blocks.
Each of these blocks takes a net input vector and generates a corresponding output vector whose dimensions are the same as those of the input vector.
Net Input Blocks
Double-click on the Net Input Functions block in the Neural window to bring up a window containing two net-input function blocks.
Each of these blocks takes any number of weighted input vectors, weight-layer output vectors, and bias vectors, and returns a net-input vector.
Weight Blocks
Double-click on the Weight Functions block in the Neural window to bring up a window containing three weight function blocks.
Each of these blocks takes a neuron's weight vector and applies it to an input vector (or a layer output vector) to get a weighted input value for a neuron.
It is important to note that the blocks above expect the neuron's weight vector to be defined as a column vector. This is because Simulink signals can be column vectors, but cannot be matrices or row vectors.
It is also important to note that because of this limitation you have to create S weight function blocks (one for each row) to implement a weight matrix going to a layer with S neurons, as in the sketch below.
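A small sketch of preparing those S column vectors from a trained network's weight matrix (net.IW{1,1} is the standard input-weight property of a toolbox network object):
W = net.IW{1,1};            % input weight matrix, S x R
for i = 1:size(W,1)
    w = W(i,:)';            % row i as a column vector, for weight block i
    % ... assign w to the i-th weight function block's parameter
end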
Example-4
Study if it is possible to find a neural network model which produces the same behavior as the Van der Pol equation:
ẍ + (x² − 1)ẋ + x = 0
or in state-space form:
ẋ1 = x1(1 − x2²) − x2
ẋ2 = x1
Use different initial conditions. Apply vector notation.
solution
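The transcript ends here. As a hedged sketch of how training data for this example could be generated (ode45 is standard MATLAB; the time span, step, and initial condition are illustrative choices):
% State-space Van der Pol right-hand side: x(1) = x1, x(2) = x2
vdp = @(t,x) [x(1)*(1 - x(2)^2) - x(2); x(1)];
[t, X] = ode45(vdp, 0:0.1:20, [1; 0]);   % simulate from one initial condition
P = X(1:end-1,:)';  T = X(2:end,:)';     % train a network to predict the next state
plot(t, X); legend('x1','x2'); grid;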