Artificial Neural Network
identification and control
of the inverted pendulum
Tim Callinan
August 2003
Acknowledgements
I would like to thank my supervisor Jennifer Bruton for her help, guidance and support
throughout the project.
Thank you to Conor Maguire for helping me with the inverted pendulum rig and lending me
many of his manuals and books.
Thank you to Anthony Holohan for allowing me to experiment on the inverted pendulum rig.
Declaration
I hereby declare that, except where otherwise indicated, this document is entirely my own
work and has not been submitted in whole or in part to any other university.
Signed:……………………………………………………. Date:………………………
Abstract
This project takes the area of Artificial Neural Networks (ANN) and applies it to the inverted
pendulum control problem. The inverted pendulum is typically used to benchmark new control
techniques, as it is a highly non-linear, unstable system. Neural networks have unique
characteristics, which enable them to control non-linear systems. Feedforward and Recurrent
neural networks are used to model the inverted pendulum. Multi-output online identification
was also researched. A neuro-controller for the inverted pendulum was developed. Traditional
control methods were utilized to develop a control law to stabilize the inverted pendulum. A
feedforward network was trained to mimic the control law. The neuro-control results show that if a
disturbance occurs in the system, the neural network learns to counteract this disturbance.
Finally the knowledge learned in identification and control was applied to the real time
inverted pendulum rig. An online adaptive neural network was developed to model the real
time system.
Table of Contents
1 Introduction
   Outline of the document
2 Inverted Pendulum
3 Artificial Neural Networks
   Advantages of ANNs
   Types of Learning
   Neural network structures
   Multi-layered perceptrons
4 System Identification
   System identification procedure
   Linear identification of the system
   Non-linear identification of the system
   Non-linear identification using neural networks
   Multi-output identification
5 Neural control of the inverted pendulum
   Neural control in simulink
6 Real-time identification and control
7 Conclusions
   Summary
   Scope for future work
8 Bibliography
1 Introduction
The process used in this project is the inverted pendulum system. The inverted pendulum is a
highly nonlinear and open-loop unstable system. This means that standard linear techniques
cannot model the nonlinear dynamics of the system. When the system is simulated the
pendulum falls over quickly. The characteristics of the inverted pendulum make identification
and control more challenging. There are two main aims of the project. The first is to develop
an accurate model of the inverted pendulum system using neural networks. The second aim is
to develop a neural network controller which determines the correct control action to stabilize
the system, but can also learn from experience.
System identification is the procedure that develops models of a dynamic system based on the
input and output signals from the system. The input and output data must show some of the
dynamics of the process. The parameters of the model are adjusted until the output from the
model is similar to the output of the real system. In order to develop an accurate model of the
inverted pendulum, different methods (linear and nonlinear) of identification will be tested.
One of the problems encountered early in the project was collecting experimental data from the
inverted pendulum system: the output data from the unstable system does not reveal enough of
the system dynamics. Feedback controllers are therefore developed to stabilize the system
before identification can take place.
Neural networks have shown great progress in identification of nonlinear systems. There are
certain characteristics in ANN which assist them in identifying complex nonlinear systems.
ANN are made up of many nonlinear elements and this gives them an advantage over linear
techniques in modelling nonlinear systems. ANN are trained by adaptive learning: the network
'learns' how to perform tasks and functions based on the data given for training. The
knowledge learned during training is stored in the synaptic weights. The standard ANN
structures (feedforward and recurrent) are both used to model the inverted pendulum.
The main task of this project is to design a neural network controller which keeps the
pendulum system stabilized. There are three main types of neural control - supervised, direct
inverse and unsupervised.
Supervised learning uses an existing controller or human feedback in training the neural
network. In order to train the neural network to imitate an existing controller a vector of inputs
and control targets from the controller must be collected. With supervised control, a neural
network could be trained to imitate a robust controller. The robust controller operates
correctly provided the process stays around a certain operating point. The neuro-controller operates
similarly to the robust controller but can also adapt if any disturbance occurs in the system.
Direct inverse control does not require an existing controller in training. A neural network is
trained to model the inverse of the process. The neural network is cascaded with the process.
Theoretically if the inverse model is very accurate, the nonlinearities in the ANN will cancel
out the nonlinearities in the process.
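The inverse-model idea can be sketched numerically. The following fragment is an illustration, not code from the project: the cubic process g and its bisection "inverse model" are invented for the example, standing in for a trained neural network inverse.

```python
def g(u):
    # Hypothetical process: a static, monotonically increasing nonlinearity
    return u ** 3 + u

def g_inv(r, lo=-10.0, hi=10.0):
    # "Inverse model" of g, here obtained by bisection rather than by training
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if g(mid) < r:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

reference = 2.5
y = g(g_inv(reference))  # inverse model cascaded with the process
```

If the inverse model is accurate, the cascade reproduces the reference almost exactly, which is the behaviour that direct inverse control relies on.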
Outline of the document
Chapter 2 details the research on the inverted pendulum system. The dynamic system
equations (linear and nonlinear) are derived. The simulink models of the linear and nonlinear
systems are developed. The development of the feedback controllers to stabilize the system is
also discussed. Chapter 3 covers the theory, structure and operation of artificial neural
networks. Chapter 4 covers the whole area of system identification. The procedure of system
identification is discussed first. Linear identification techniques are applied to the linear
system. Nonlinear identification using neural networks is then reported. Chapter 5 details the
development of the neuro-controller. Chapter 6 discusses the real time identification and
control using the inverted pendulum rig. Finally, Chapter 7 provides a summary of the work, a
discussion of the results and the scope for future work.
2 Inverted Pendulum
The inverted pendulum system is a classic control problem that is used in universities around
the world. It is a suitable process to test prototype controllers due to its high non-linearities
and lack of stability. The system consists of an inverted pole hinged on a cart which is free to
move in the x direction. In this chapter, the dynamical equations of the system will be derived,
the model will be developed in simulink and basic controllers will be developed. The aim of
developing an inverted pendulum in simulink is that the developed model will have the same
characteristics as the actual process. It will be possible to test each of the prototype controllers
in the simulink environment. Before the inverted pendulum model can be developed in
simulink, the system dynamical equations will be derived using ‘Lagrange Equations’. [1] The
Lagrangian equations are one of many methods of determining the system equations. Using
this method it is possible to derive dynamical system equations for a complicated mechanical
system such as the inverted pendulum. Figure 1 is a free-body diagram of the pendulum
system.
The Lagrange equations use the kinetic and potential energy in the system to determine the
dynamical equations of the cart-pole system.
M – Mass of the cart
m – mass of the pole
l – length of the pole
f – control force
Fig. 1: Free body diagram of the inverted pendulum system
The kinetic energy of the system is the sum of the kinetic energies of each mass. The kinetic
energy, T_1, of the cart is

T_1 = \frac{1}{2} M \dot{y}_1^2    (Eq. 1)

The pole can move in both the horizontal and vertical directions, so the pole kinetic energy is

T_2 = \frac{1}{2} m (\dot{y}_2^2 + \dot{z}_2^2)    (Eq. 2)

From the free body diagram, y_2 and z_2 and their derivatives are equal to

y_2 = y + l \sin\theta    (Eq. 3)

\dot{y}_2 = \dot{y} + l \dot\theta \cos\theta    (Eq. 4)

z_2 = l \cos\theta    (Eq. 5)

\dot{z}_2 = -l \dot\theta \sin\theta    (Eq. 6)

The total kinetic energy, T, of the system is equal to

T = T_1 + T_2 = \frac{1}{2} M \dot{y}_1^2 + \frac{1}{2} m (\dot{y}_2^2 + \dot{z}_2^2)    (Eq. 7)

Equations 4 and 6 are substituted into equation 7 (with \dot{y}_1 = \dot{y}, the cart velocity) to give equation 8.

T = \frac{1}{2}(M + m)\dot{y}^2 + m l \dot{y} \dot\theta \cos\theta + \frac{1}{2} m l^2 \dot\theta^2    (Eq. 8)

The potential energy, V, of the system is stored in the pendulum, so

V = m g z_2 = m g l \cos\theta    (Eq. 9)

The Lagrangian function is

L = T - V = \frac{1}{2}(M + m)\dot{y}^2 + m l \dot{y} \dot\theta \cos\theta + \frac{1}{2} m l^2 \dot\theta^2 - m g l \cos\theta    (Eq. 10)
The state-space variables of the system are y and θ, so the Lagrange equations are

\frac{d}{dt}\frac{\partial L}{\partial \dot{y}} - \frac{\partial L}{\partial y} = f    (Eq. 11)

\frac{d}{dt}\frac{\partial L}{\partial \dot\theta} - \frac{\partial L}{\partial \theta} = 0    (Eq. 12)

But,

\frac{\partial L}{\partial \dot{y}} = (M + m)\dot{y} + m l \dot\theta \cos\theta    (Eq. 13)

\frac{\partial L}{\partial y} = 0    (Eq. 14)

\frac{\partial L}{\partial \dot\theta} = m l \dot{y} \cos\theta + m l^2 \dot\theta    (Eq. 15)

\frac{\partial L}{\partial \theta} = m g l \sin\theta - m l \dot{y} \dot\theta \sin\theta    (Eq. 16)

The above derivatives (Eq. 13-16) are substituted into the Lagrange equations (Eq. 11-12), and this
results in the non-linear dynamical equations for the inverted pendulum system, shown below.

(M + m)\ddot{y} + m l \ddot\theta \cos\theta - m l \dot\theta^2 \sin\theta = f    (Eq. 17)

m l \ddot{y} \cos\theta + m l^2 \ddot\theta - m g l \sin\theta = 0    (Eq. 18)

Some of the modelling and control techniques used in the project are linear, so these
equations must be linearised. It is possible to linearise them by approximating
\cos\theta \approx 1 and \sin\theta \approx \theta, assuming θ is kept small; the quadratic
terms (such as \dot\theta^2 \sin\theta) are also negligible. Therefore the two linear system
equations are

\ddot{y} = \frac{f}{M} - \frac{m g}{M}\theta    (Eq. 19)

\ddot\theta = -\frac{f}{M l} + \frac{(M + m) g}{M l}\theta    (Eq. 20)
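The linearised equations (Eq. 19-20) can be checked numerically. The short sketch below is illustrative only (it uses the rig values M = 1.2 kg, m = 0.11 kg, l = 0.4 m quoted later in this chapter); it builds the state-space matrix for the states (y, ẏ, θ, θ̇) and confirms that the open-loop system has an unstable pole:

```python
import numpy as np

M, m, l, g = 1.2, 0.11, 0.4, 9.81  # rig values quoted later in this chapter

# States [y, y_dot, theta, theta_dot]; second and fourth rows follow Eq. 19-20
A = np.array([
    [0.0, 1.0, 0.0,                   0.0],
    [0.0, 0.0, -m * g / M,            0.0],
    [0.0, 0.0, 0.0,                   1.0],
    [0.0, 0.0, (M + m) * g / (M * l), 0.0],
])

poles = np.linalg.eigvals(A)
unstable = max(p.real for p in poles)  # equals +sqrt((M+m)g/(Ml))
```

The positive eigenvalue confirms that the upright equilibrium is open-loop unstable, which is why the simulated pendulum falls over.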
At this stage, a set of equations (linear and non-linear) describing the inverted pendulum has
been developed. The next stage is constructing a simulink model of the inverted pendulum
system. There is no fixed procedure for developing simulink models from dynamical state
equations. The diagram below is the linear pendulum model, constructed using integrators,
gain blocks, etc. The model (Fig. 2) is simply a simulink representation of the linear state
equations.
The non-linear pendulum system (Fig. 4) is shown on the next page. The non-linear system,
though more complicated, is developed in a similar manner. Both models are large, so they
are encapsulated in the subsystem blocks shown below (Fig. 3). Both models are set up using
a mask, which makes it possible to change the values of m, l, g, etc. for different simulations.
The mass of the cart, M, is set to 1.2 kg, the mass of the pendulum to 0.11 kg and the length
of the pendulum to 0.4 m. These figures are taken from the real-time inverted pendulum rig.
Fig. 2 : Simulink model of the linear pendulum system
Fig. 3 : Simulink blocks of the pendulum systems
The following simulink diagram is the non-linear pendulum model.
Fig. 4 : Simulink model of the nonlinear pendulum system
Both pendulum models are simulated in simulink. The angle of the pendulum is shown below
(Fig. 5). The simulation shows that the pendulum goes unstable and falls over.
One of the requirements in system identification is the collection of 'information rich'
input/output data. The graph above (Fig. 5) of the pendulum angle does not give us enough
information on the pendulum system; the pendulum falls over too quickly. In order to
adequately model the inverted pendulum it is necessary to stabilize it using a feedback
controller. With a feedback controller in place, the output data will contain more information
describing the process. [2]
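The fall-over seen in Fig. 5 can be reproduced with a few lines of numerical integration. This is an illustrative sketch (simple Euler integration of the nonlinear cart-pole equations, Eq. 17-18; the initial tilt of 0.05 rad is an assumption, not a value from the project):

```python
import numpy as np

M, m, l, g = 1.2, 0.11, 0.4, 9.81
dt, f = 0.001, 0.0                        # no control force applied
y_dot, theta, theta_dot = 0.0, 0.05, 0.0  # small initial tilt from upright
max_angle = 0.0

for _ in range(2000):                     # simulate 2 seconds
    # Solve Eq. 17-18 for the accelerations: mass matrix times [y_dd, th_dd]
    c, s = np.cos(theta), np.sin(theta)
    Mmat = np.array([[M + m,     m * l * c],
                     [m * l * c, m * l ** 2]])
    rhs = np.array([f + m * l * theta_dot ** 2 * s,
                    m * g * l * s])
    y_dd, th_dd = np.linalg.solve(Mmat, rhs)
    y_dot += dt * y_dd
    theta += dt * theta_dot
    theta_dot += dt * th_dd
    max_angle = max(max_angle, abs(theta))
```

Within the two simulated seconds the angle grows far beyond the initial tilt: the pendulum falls over, which is exactly why a stabilizing controller is needed before identification data can be collected.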
Fig. 5: Open loop response of the inverted pendulum (pendulum angle, deg. vs time)
A full-state feedback controller is developed to stabilize the linear pendulum system. The
linear system could have been stabilized using many different methods (PID, etc.). The full-
state feedback controller stabilizes the system by positioning the closed-loop poles in the
stable region. The simulink model with controller is shown below (Fig. 6).
The linear pendulum system is simulated and the angle of the pendulum is shown below
(Fig. 7). The controller keeps the pendulum angle stable, so the pendulum can be simulated
for longer times and the data is of better quality for system identification purposes.
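The pole-positioning step can be sketched numerically. The fragment below is an illustration (the desired pole locations are invented, not the ones used in the project): it computes a full-state feedback gain with Ackermann's formula and verifies that the closed-loop poles land in the stable region:

```python
import numpy as np

M, m, l, g = 1.2, 0.11, 0.4, 9.81
A = np.array([[0.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, -m * g / M, 0.0],
              [0.0, 0.0, 0.0, 1.0],
              [0.0, 0.0, (M + m) * g / (M * l), 0.0]])
B = np.array([[0.0], [1.0 / M], [0.0], [-1.0 / (M * l)]])

# Ackermann's formula: K = [0 0 0 1] * inv(ctrb) * phi_d(A)
ctrb = np.hstack([B, A @ B, A @ A @ B, A @ A @ A @ B])
desired = [-2.0, -2.5, -3.0, -3.5]   # assumed closed-loop pole locations
phi = np.eye(4)
for p in desired:
    phi = phi @ (A - p * np.eye(4))
K = np.array([[0.0, 0.0, 0.0, 1.0]]) @ np.linalg.inv(ctrb) @ phi

closed_poles = np.linalg.eigvals(A - B @ K)
```

With the feedback u = -Kx in place, all closed-loop poles sit in the left half-plane, so the simulated pendulum no longer falls over.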
Fig. 6: Simulink diagram of Linear Pendulum and controller
Fig. 7: Closed loop response of the inverted pendulum with controller (pendulum angle, deg. vs time)
Developing a controller for the non-linear pendulum is more difficult. Linear control
techniques such as PID and full-state feedback were tested but could not control the
non-linear pendulum. A feedback linearisation controller was therefore developed to control
the non-linear pendulum system. Feedback linearisation cancels the non-linearities in the
pendulum system so that the closed-loop system behaves more linearly.
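The cancellation idea can be shown on a reduced model. The sketch below is illustrative only: it uses a hypothetical pole-only model θ̈ = (g/l)·sinθ + u/(ml²), not the cart-pole control law developed in the project, and borrows the gain values k1 = 25, k2 = 10 quoted below.

```python
import numpy as np

m, l, g = 0.11, 0.4, 9.81
k1, k2 = 25.0, 10.0

def control(theta, theta_dot):
    # Cancel the gravity nonlinearity, then impose the linear error dynamics
    #   theta_dd = -k1*theta - k2*theta_dot
    v = -k1 * theta - k2 * theta_dot
    return m * l ** 2 * (v - (g / l) * np.sin(theta))

theta, theta_dot, dt = 0.3, 0.0, 0.001
for _ in range(5000):                       # 5 seconds of Euler integration
    u = control(theta, theta_dot)
    theta_dd = (g / l) * np.sin(theta) + u / (m * l ** 2)
    theta += dt * theta_dot
    theta_dot += dt * theta_dd
```

Because the sinθ term is cancelled exactly, the closed loop behaves like the linear system θ̈ + 10θ̇ + 25θ = 0 and the angle decays to zero.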
The following equations form the control law developed for the inverted pendulum controller.
The first four equations (Eq. 21-24) feed into the main equation (Eq. 25), which calculates the
force, u, required to keep the pendulum stable.
For the simulations, M, m, l and g are set to the values of the pendulum model. The following
numeric values are used: M = 1.2 kg, m = 0.1 kg, l = 0.4 m, g = 9.81 m/s², k1 = 25, k2 = 10,
C1 = 1, C2 = 2.6. Also x_d = 0 m and θ_d = 0 rad, which are the desired position of the cart
and angle of the pendulum respectively. For details on all the parameters see [4]. A simulink
model of the control law was developed and is shown in Figure 8.
h_1 = \frac{3 g}{4 l} \sin\theta    (Eq. 21)

h_2 = \frac{3}{4 l} \cos\theta    (Eq. 22)

f_1 = m l \dot\theta^2 \sin\theta - \frac{3}{8} m g \sin 2\theta    (Eq. 23)

f_2 = M + m\left(1 - \frac{3}{4}\cos^2\theta\right)    (Eq. 24)

u = \frac{f_2}{h_2}\left[ h_1 + k_1(\theta_d - \theta) + k_2 \dot\theta + C_1(x_d - x) + C_2 \dot{x} \right] - f_1    (Eq. 25)
The inputs to this controller are the four output states of the non-linear pendulum model. The
control law calculates the magnitude and direction of the force required to keep the pendulum
stable.
Fig. 8: Simulink model of the nonlinear control law.
The following diagram (Fig. 9) shows the set-up of the non-linear pendulum with the control
law. Figure 10 shows the closed-loop pendulum angle plotted in Matlab. The closed loop
response is stable and shows that the control law is working.
Fig. 9: Simulink diagram of the nonlinear system with control law.
Fig. 10: Closed loop response of the nonlinear pendulum with controller (pendulum angle, deg. (×10⁻³) vs time)
The linear and nonlinear models of the cart-pole system have been developed and simulated,
and the system was found to be open-loop unstable. For accurate system identification the
process must be stable; because of this, standard feedback controllers were developed and
tested. The next chapter discusses the theory and operation of artificial neural networks.
3 Artificial Neural Networks
The science of artificial neural networks is based on the neuron. In order to understand the
structure of artificial networks, the basic elements of the neuron should be understood.
Neurons are the fundamental elements in the central nervous system. The diagram below (Fig.
11) shows the components of a neuron. [5]
A neuron is made up of 3 main parts -dendrites, cell body and axon. The dendrites receive
signals coming from the neighbouring neurons. The dendrites send their signals to the body of
the cell. The cell body contains the nucleus of the neuron. If the sum of the received signals is
greater than a threshold value, the neuron fires by sending an electrical pulse along the axon to
the next neuron. The following model is based on the components of the biological neuron
(Fig. 12). The inputs X0-X3 represent the dendrites. Each input is multiplied by weights W0-
W3. The output of the neuron model, Y is a function, F of the summation of the input signals.
Fig. 11: The diagram shows the basic elements of a neuron
Fig. 12: Diagram of neuron model
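The neuron model of Fig. 12 can be written directly in code. This is a minimal numerical sketch (the input and weight values are invented for the example):

```python
def neuron(x, w, threshold=0.0):
    # Weighted sum of the inputs (the "dendrite" signals) ...
    s = sum(wi * xi for wi, xi in zip(w, x))
    # ... and the neuron "fires" only if the sum exceeds the threshold
    return 1.0 if s > threshold else 0.0

x = [1.0, 0.5, -0.25, 0.0]   # inputs X0-X3
w = [0.8, 0.4, 0.2, -0.6]    # weights W0-W3
y = neuron(x, w)
```

Here the weighted sum is 0.95, which exceeds the threshold, so the neuron fires and the output is 1.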
Advantages of ANNs
1. The main advantage of neural networks is that it is possible to train a network to
perform a particular function by adjusting the values of the connections (weights) between
elements. For example, to train a neuron model to approximate a specific function, the
weights that multiply each input signal would be updated until the output from the neuron
is similar to the function.
2. Neural networks are composed of elements operating in parallel. Parallel processing
allows increased speed of calculation compared to slower sequential processing.
3. Artificial neural networks (ANN) have memory: the memory corresponds to the weights
in the neurons. Neural networks can be trained offline and then transferred into a process
where adaptive learning takes place. In our case, a neural network controller could be
trained to control an inverted pendulum system offline, say in the simulink environment.
After training, the network weights are set and the ANN is placed in a feedback loop with
the actual process. The network will then adapt its weights to improve performance as it
controls the pendulum system.
The main disadvantage of ANNs is that they operate as black boxes: the rules of operation of
a trained network are unknown, and it is not possible to convert the neural structure into
known model structures such as ARMAX, etc. Another disadvantage is the time needed for
training; it can take considerable time to train an ANN for certain functions.
Fig. 13: Diagram shows the parallelism of neural networks
Types of Learning
Neural networks have three main modes of learning - supervised, reinforced and unsupervised
learning. [6] In supervised learning the output from the neural network is compared with a set
of targets, and the error signal is used to update the weights in the neural network. Reinforced
learning is similar to supervised learning, but instead of targets the algorithm is only given a
grade of the ANN's performance. Unsupervised learning updates the weights based on the
input data only; the ANN learns to cluster different input patterns into different classes.
NNeeuurraall nneettwwoorrkk ssttrruuccttuurreess
There are three main types of ANN structure - single-layer feedforward networks, multi-layer
feedforward networks and recurrent networks. [7] The most common type of single-layer
feedforward network is the perceptron. Other types of single-layer networks are based on the
perceptron model. The details of the perceptron are shown below (Fig. 14).
Inputs to the perceptron are individually weighted and then summed. The perceptron computes
the output as a function F of the sum. The activation function, F is needed to introduce non-
linearities into the network. This makes multi-layer networks powerful in representing
nonlinear functions.
Fig. 14: Diagram of the perceptron model
There are three main types of activation function - tan-sigmoid, log-sigmoid and linear. [8]
The choice of activation function affects the performance of an ANN.

(Plots: log-sigmoid, tan-sigmoid and linear activation functions)

The output from the perceptron is

y[k] = f(w^T[k] \, x[k])    (Eq. 26)

The weights are dynamically updated using the back-propagation algorithm. First, the
difference between the target output and the actual output (the error) is calculated:

e[k] = T[k] - y[k]    (Eq. 27)

The errors are back-propagated through the layers and the weight changes are made. The
formula for adjusting the weights is

w[k+1] = w[k] + \mu \, e[k] \, x[k]    (Eq. 28)

Once the weights are adjusted, the feed-forward process is repeated. The weights are adapted
until the error between the target and actual output is low, and the approximation of the
function improves as the error decreases. Single-layer feedforward networks are useful when
the training data is linearly separable. If the data we are trying to model is not linearly
separable, or the function has complex mappings, the simple perceptron will have trouble
modelling the function adequately.
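The update loop of Eqs. 26-28 can be demonstrated with a single linear neuron. The sketch below is an illustration (the target function t = 2·x0 - x1 and the learning rate are invented for the example); with a linear activation the rule reduces to the classic delta (LMS) rule:

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(2)        # initial weights
mu = 0.1               # learning rate

for k in range(2000):
    x = rng.uniform(-1.0, 1.0, size=2)  # input pattern
    y = w @ x                           # Eq. 26, with a linear activation f
    t = 2.0 * x[0] - x[1]               # target output
    e = t - y                           # Eq. 27
    w = w + mu * e * x                  # Eq. 28
```

After training, the weights converge to the coefficients of the target function, w ≈ (2, -1): the error signal has driven the neuron output onto the target.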
Multi-layered perceptrons
Neural networks can have several layers. There are two main types of multi-layer network -
feedforward and recurrent. In feedforward networks signals travel from input to output; there
is no feedback between the layers. The diagram below (Fig. 15) shows a 3-layered
feedforward network.
Increasing the number of neurons in the hidden layer, or adding more hidden layers, allows
the network to deal with more complex functions. Cybenko's theorem states that "a
feedforward neural network with a sufficiently large number of hidden neurons with
continuous and differentiable transfer functions can approximate any continuous function over
a closed interval." [9] The weights in MLPs are updated using backpropagation learning.
[10] There are two passes before the weights are updated.
In the first pass (forward pass) the outputs of all neurons are calculated by multiplying the
input vector by the weights. The error is calculated for each of the output layer neurons.
In the backward pass, the error is passed back through the network layer by layer and the
weights are adjusted according to the gradient descent rule, so that the actual output of the
MLP moves closer to the desired output. A momentum term can be added, which increases
the learning rate while preserving stability.
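The two-pass procedure can be sketched for a small MLP. The code below is illustrative (the network size, learning rate and the target function sin(πx) are all assumptions): a forward pass computes the outputs, then a backward pass propagates the error and applies gradient descent:

```python
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(0.0, 0.5, (8, 1)); b1 = np.zeros((8, 1))  # hidden layer
W2 = rng.normal(0.0, 0.5, (1, 8)); b2 = np.zeros((1, 1))  # output layer
lr = 0.05

X = np.linspace(-1.0, 1.0, 50).reshape(1, -1)
T = np.sin(np.pi * X)                   # target function to approximate

def forward(X):
    H = np.tanh(W1 @ X + b1)            # forward pass: hidden activations
    return H, W2 @ H + b2               # linear output layer

_, Y0 = forward(X)
loss_before = float(np.mean((Y0 - T) ** 2))

for _ in range(5000):
    H, Y = forward(X)
    dY = 2.0 * (Y - T) / X.shape[1]     # output-layer error signal
    # Backward pass: propagate the error layer by layer, then descend
    dW2 = dY @ H.T; db2 = dY.sum(axis=1, keepdims=True)
    dH = (W2.T @ dY) * (1.0 - H ** 2)   # tanh derivative
    dW1 = dH @ X.T; db1 = dH.sum(axis=1, keepdims=True)
    W2 -= lr * dW2; b2 -= lr * db2
    W1 -= lr * dW1; b1 -= lr * db1

_, Y1 = forward(X)
loss_after = float(np.mean((Y1 - T) ** 2))
```

Each backward pass moves the network output closer to the target, so the mean squared error falls as training proceeds.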
Fig. 15: Diagram of a multi-layered perceptron
The second type of multi-layer network is recurrent (Fig. 16). Recurrent networks have at
least one feedback loop: an output of a layer feeds back to a preceding layer. This gives the
network partial memory, because the hidden layer receives data at time t but also from time
t-1, and it makes recurrent networks powerful in approximating functions that depend on
time. [11] The simulink model of the nonlinear inverted pendulum contains many feedback
loops, so the next state of the model depends on previous states. It is expected that, to
accurately model this type of dynamic system, a recurrent neural network with feedback loops
will perform better than a static feedforward network.
Fig. 16: Diagram of a recurrent neural network
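The memory effect of the feedback loop is easy to demonstrate. A minimal recurrent unit (illustrative; the weight values are invented) keeps a hidden state h that depends on the previous time step:

```python
import numpy as np

Wx, Wh = 0.9, 0.5   # input weight and feedback (recurrent) weight

def run(sequence):
    h = 0.0
    for x in sequence:
        h = np.tanh(Wx * x + Wh * h)  # h(t) depends on x(t) AND h(t-1)
    return float(h)

out_a = run([1.0, 0.2])    # history: +1 followed by 0.2
out_b = run([-1.0, 0.2])   # history: -1 followed by the same 0.2
```

The final input is identical in both cases, yet the outputs differ because the hidden state remembers the earlier input; a static feedforward map of the current input alone would give the same output in both cases.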
4 System Identification
System identification is the process of developing a mathematical model of a dynamic system
based on the input and output data from the actual process. [12] This means it is possible to
sample the input and output signals of a system and using this data generate a mathematical
model. An important stage in control system design is the development of a mathematical
model of the system to be controlled. In order to develop a controller, it must be possible to
analyse the system to be controlled and this is done using a mathematical model. Another
advantage of system identification is evident if the process is changed or modified. System
identification allows the real system to be altered without having to calculate the dynamical
equations and model the parameters again.
System identification is concerned with developing models. The diagram below (Fig. 17)
shows the inputs and output of a system.
The mathematical model in this case is the black box; it describes the relationship between the
input and output signals. The inverted pendulum system is a non-linear process, so to model it
adequately, non-linear methods using neural networks must be used. Previous studies in system
identification have demonstrated that neural networks are successful in modelling many non-
linear systems. [13] Before neural networks are investigated for identification, linear
techniques such as auto-regressive with exogenous input (ARX) and auto-regressive moving
average with exogenous input (ARMAX) will be applied to the linear inverted pendulum
model.
Fig. 17: System showing input, disturbance and output signals
System identification procedure
Basically system identification is achieved by adjusting the parameters of the model until the
model output is similar to the output of the real system. Below is a diagram (Fig. 18)
explaining the system identification procedure. [14]
There are three main steps in the system identification procedure.
1. The first step is to generate some experimental input/output data from the process we
are trying to model. In the case of the inverted pendulum system this would be the
input force on the cart and the output pendulum angle.
2. The next step is to choose a model structure to use. For example the following model
structure is the ARX.
3. The parameters A and B will be adjusted until this model output is similar to the output
of the process. In identification, there is no perfect model structure to use. Models can
be developed using engineering intuition or a priori knowledge of the process we are
trying to model.
A \cdot y(t) = B \cdot u(t) + e(t)    (Eq. 29)

Fig. 18: Diagram of the system identification procedure
The best approach to choosing a model structure is to pick a number of different models, test
them all and use the one whose output is closest to the process. The standard linear models
(ARX, ARMAX, etc.) used in system identification were researched. [15] Below is the
diagram for the ARX model and the ARX model equation (Fig. 19). The ARX is a simple
linear difference equation describing the input-output relationship. The input-output
relationship is modelled using a transfer function block B/A. It is assumed that the noise
spectrum and the input-output model have the same characteristic dynamics, which could
contribute some modelling error.
y(t) = \frac{B}{A} u(t) + \frac{1}{A} e(t)    (Eq. 30: ARX equation)

Fig. 19: ARX model (input-output model and noise spectrum model)

The ARMAX model contains an extra C parameter in the noise spectrum model (Fig. 20).
This gives the ARMAX model more accuracy than the ARX. The input-output block and the
noise spectrum block still share the same denominator, A.

y(t) = \frac{B}{A} u(t) + \frac{C}{A} e(t)    (Eq. 31: ARMAX equation)

Fig. 20: ARMAX model
The next type of linear model is the output-error (OE) model (Fig. 21). The main difference
between this model and the previous models is that the output-error model assumes the
disturbances are white noise, so there is no noise spectrum model. The input-output
relationship is defined in the transfer block B/F.

y(t) = \frac{B}{F} u(t) + e(t)    (Eq. 32: OE equation)

Fig. 21: Output-error model

The last model used in linear system identification is the Box-Jenkins (BJ) model (Fig. 22).
This model has separate transfer functions for the input-output relationship and the noise
spectrum, which is an advantage over the ARX and ARMAX models, where the noise model
and the input-output relationship share the same denominator.

y(t) = \frac{B}{F} u(t) + \frac{C}{D} e(t)    (Eq. 33: BJ equation)

Fig. 22: Box-Jenkins model

The next stage in the identification procedure is to estimate the parameters of the model.
Matlab uses the least-squares algorithm to update the model parameters: the algorithm takes
the model structure and the input/output data from the process and estimates the model
parameters. It also generates the residuals, the error between the model and the process. If
the residuals are too high, another model structure could be tried, or the experimental data
might not show the true dynamics of the system being modelled. There are many linear
methods of estimating the parameters of a model, but least squares is the main algorithm used
in system identification. The next section of the report details the linear identification
experiments on the inverted pendulum system.
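The least-squares estimation step can be illustrated on a simulated ARX process. The sketch below is Python rather than Matlab, and the parameter values are invented: a known second-order ARX system is simulated, and its parameters are then recovered from the input/output data.

```python
import numpy as np

# True ARX parameters: y(t) + a1*y(t-1) + a2*y(t-2) = b1*u(t-1)
a1, a2, b1 = -1.5, 0.7, 0.5

rng = np.random.default_rng(2)
u = rng.normal(size=500)                 # excitation input
y = np.zeros(500)
for t in range(2, 500):
    y[t] = -a1 * y[t - 1] - a2 * y[t - 2] + b1 * u[t - 1]

# Each regression row is [-y(t-1), -y(t-2), u(t-1)]; least squares then
# estimates [a1, a2, b1] directly from the data
Phi = np.column_stack([-y[1:-1], -y[:-2], u[1:-1]])
theta, *_ = np.linalg.lstsq(Phi, y[2:], rcond=None)
```

Because the data here is noise-free, the estimate matches the true parameters almost exactly; with measurement noise the same procedure returns the parameters that minimise the squared residuals.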
Linear identification of the system
In order to generate an ARX, ARMAX or Box-Jenkins model of the inverted pendulum
system, the input-output data from the linear system must be collected. The diagram below
(Fig. 23) shows the simulink model with the feedback control system. The noise input is an
excitation signal. It is used to obtain a unique input-output response. Using ‘to workspace’
blocks the data is exported to the Matlab console. The data is split up into estimation and
validation data. Half of the generated data is used to generate the model and half will be used
to test the performance of the model.
The diagram above shows that the pendulum block is a SIMO (single-input multi-output)
system. The linear models developed here are SISO (single-input single-output) systems
which model the pendulum angle θ. The first type of model to be developed is the ARX.
Fig. 23: Linear Pendulum with feedback controller
The following code is an example of ARX estimation in Matlab. The input/output data is split
into estimation and validation data. The arx function uses least squares to estimate the
parameters of the model. The model can be converted into transfer-function format using the
command th2tf (theta to transfer function).

z1 = [y(1:500) u(1:500)];
z2 = [y(501:1000) u(501:1000)];
nn = [5 3 1];
th = arx(z1,nn);
[yh,fit1] = compare(z2,th);
[num,den] = th2tf(th);
sysarx = tf(num,den);

The nn matrix defines the orders and delay of the ARX model:

nn = [na nb nk]
na = number of parameters to be estimated in the denominator
nb = number of parameters in the numerator
nk = time delay in the model

To develop the best ARX model possible, different orders of na and nb will be tested. Note
that ARX221 means na=2, nb=2, nk=1. The following table (Table 1) shows the different
orders of na, nb used and the mean squared error between the model output and the target
output. The compare function is used to compare the model output with the validation data.
The models were tested with two sets of validation data; the second set was generated using a
different initial seed in the input signal. The mean squared error increased only slightly on
the new data, which shows that the generated models can predict the output of the process.
The number of parameters in the models was then increased. It was expected that as the
complexity of the models increased, the mean squared error would decrease. This was not
the case: from the results, the models with the lowest mean squared error are ARX221 and
ARX421.
na nb nk   Data 1   Data 2
1  1  1    0.0053   0.0057
2  2  1    0.0040   0.0054
3  2  1    0.0042   0.0057
3  3  1    0.1105   0.1158
4  2  1    0.0041   0.0052
4  3  1    0.0113   0.0175
4  4  1    0.0104   0.0167
5  2  1    0.0043   0.0058
5  3  1    0.0124   0.0152
5  5  1    0.3143   0.3249

Table 1: Mean squared error of the ARX models on the two validation data sets
The following two plots (Fig. 24, 25) show the model output against the actual process output.
The mean squared error between the two is low, and the models track the shape of
the actual output. These results for the ARX models look good, but the best way to test the
quality of a model is to generate a transfer-function block of the model in Simulink and input a
noise signal with a different seed (Fig. 26).
Fig. 24: ARX 421 model output with validation data 1 (fit 0.0040831). Fig. 25: ARX 421 model output with validation data 2 (fit 0.0051914). (Blue: model output, Black: measured output; vertical axis: pole angle, rad.)
It was expected that the model output would be similar to the actual output, but the above models
completely failed to predict the actual output.
The low error results of the models above were computed using the compare function. The
results from compare indicate that the models are accurate, but when a transfer
function of the models is used in Simulink they go completely unstable. The reason for this is that
the models are developed using closed-loop data. A controller must be used to keep the
pendulum stable, but as a consequence the input/output data contains too much of the
controller dynamics. To adequately model the process, the input and output data must show
the dynamics of the pendulum system itself. This can be achieved by using a de-tuned controller:
one that only just keeps the pendulum stable, so that more of the process dynamics can
be seen. A de-tuned PID controller is developed by adjusting the P, I and D parameters until the
output becomes more oscillatory, closer to instability.
Fig. 26: Testing the linear models using transfer function block
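The effect of de-tuning can be illustrated on a toy problem. In the Python sketch below (the scalar unstable plant x' = x + u and all gains are invented stand-ins, not the pendulum model or the actual PID settings), a discrete PD loop is run twice, once with tight gains and once with the gains scaled down; the de-tuned loop still stabilises the plant but lets the state deviate more, so the logged data shows more of the plant's own dynamics:

```python
import numpy as np

def simulate(kp, kd, steps=200, dt=0.01):
    """PD control of the unstable toy plant x' = x + u, regulating to zero."""
    x, prev_err, xs = 1.0, 0.0, []
    for _ in range(steps):
        err = -x
        u = kp * err + kd * (err - prev_err) / dt   # P and D terms
        prev_err = err
        x = x + dt * (x + u)                        # Euler step of x' = x + u
        xs.append(x)
    return np.array(xs)

tight = simulate(kp=20.0, kd=0.5)
detuned = simulate(kp=4.0, kd=0.1)   # same structure, gains scaled down

# Both loops are stable, but the de-tuned one deviates further from zero
print(np.abs(tight).max(), np.abs(detuned).max())
```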
The diagram below shows the difference between a normal controller and a de-tuned
controller (Fig. 27). Using a de-tuned controller, more of the pendulum dynamics can be seen.
Testing resumed using the de-tuned controller. A new method of online identification in
Simulink was also found: there is a toolbox of system-identification blocks for Simulink. The four
types of models that can be generated are ARX, ARMAX, Box-Jenkins and Output-Error
(Fig. 28). Using this toolbox, the model type, model orders, etc. can be changed easily.
Fig. 27: Closed-loop response with normal and de-tuned controllers (Red: PID, Blue: de-tuned PID; pendulum angle, rad, vs. time)
Fig. 28: Simulink model using the system identification toolbox
The models developed using ARX and ARMAX could not adequately predict the pendulum
system output even using the de-tuned controller. All orders of ARX and ARMAX models were
tested. The plot below is an example output from one of the ARX models (Fig. 29).
Many of the journals on closed-loop identification indicate that the best model structures to use are
Box-Jenkins and Output-Error [16]. ARX and ARMAX both assume that the
noise model and the input-output model share the same characteristic dynamics, which
explains why the ARX transfer functions were unable to model the linear inverted pendulum.
Models were generated using the BJ and OE Simulink blocks, then converted
into transfer-function blocks and simulated (Fig. 30).
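The structural difference between the four model types can be written out explicitly. In the standard polynomial notation (q the shift operator, e(t) white noise, n_k the input delay), the assumed structures are:

```latex
\begin{aligned}
\text{ARX:}   &\quad A(q)\,y(t) = B(q)\,u(t-n_k) + e(t)\\
\text{ARMAX:} &\quad A(q)\,y(t) = B(q)\,u(t-n_k) + C(q)\,e(t)\\
\text{OE:}    &\quad y(t) = \frac{B(q)}{F(q)}\,u(t-n_k) + e(t)\\
\text{BJ:}    &\quad y(t) = \frac{B(q)}{F(q)}\,u(t-n_k) + \frac{C(q)}{D(q)}\,e(t)
\end{aligned}
```

In ARX and ARMAX the noise passes through the same poles A(q) as the input-output dynamics; in OE and BJ the plant poles F(q) and the noise description are independent, which is why these structures cope better with the correlated noise that feedback introduces.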
Fig. 29: ARX model response compared to the real output
Fig. 30: Testing the accuracy of the Box-Jenkins and Output-error models
(Blue: model output, Black: process output; pendulum angle, rad. The fit value of 140322.7567 in Fig. 29 reflects the ARX model diverging.)
The following tables show the results from the Box-Jenkins and Output-Error models. The
BJ/OE orders were changed for each simulation to determine the best model. The model
BJ53331 has the lowest mean squared error. The Output-Error models have a slightly higher
error than the Box-Jenkins models. These results indicate that:
1. When trying to model an unstable system, a feedback controller must be used to keep
the system stable. If the controller is de-tuned, more of the pendulum
dynamics can be seen, which makes the model more accurate.
2. The Box-Jenkins/Output-Error models are the only structures that can adequately model
the pendulum using the closed-loop data.
3. The best way to test the quality of a model is to construct a transfer block of the model
and simulate it using a different initial input seed.
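The mean-squared-error figure used throughout these tables is simple to reproduce. The sketch below (Python; the sine signals are synthetic stand-ins for the process and model outputs) also illustrates point 3: a low MSE alone does not prove a model is useful, which is why the transfer-block simulation is the decisive test.

```python
import numpy as np

def mse(model_out, process_out):
    """Mean squared error between model output and process output."""
    diff = np.asarray(model_out) - np.asarray(process_out)
    return float(np.mean(diff ** 2))

t = np.linspace(0, 10, 500)
process = np.sin(t)                            # stand-in for the measured angle
good_model = np.sin(t) + 0.01 * np.cos(3 * t)  # small structured error
bad_model = np.zeros_like(t)                   # predicts nothing at all

print(mse(good_model, process))   # small
print(mse(bad_model, process))    # large
```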
BJ [nb nc nd nf nk]   MSE
2 1 1 1 0    3.6214e-004
2 1 1 1 1    4.1615e-004
2 1 2 1 0    4.7409e-004
2 1 2 2 0    8.7144e-004
3 1 1 1 0    2.8155e-004
3 1 2 2 0    1.2681e-004
3 2 2 2 0    1.2606e-004
4 3 3 3 1    5.3353e-005
5 3 3 3 1    4.6758e-005
Table 2: Error associated with Box-Jenkins models
OE [nb nf nk]   MSE
2 1 1    0.0014
2 2 1    0.0069
2 2 2    0.0104
3 1 1    0.0151
3 2 1    0.0074
3 2 2    0.0128
4 1 1    0.0039
4 2 1    0.0084
4 2 2    0.0126
5 1 1    0.0089
Table 3: Error associated with Output-error models
The results from the best models developed are shown below.
Fig. 31: Box-Jenkins model [5 3 3 3 1] tested on validation data (Red: actual output, Blue: predicted model output; pendulum angle, rad, vs. time in secs)
Fig. 32: Output-Error model [2 2 1] tested on validation data (Red: actual output, Blue: predicted model output; pendulum angle, rad)
Non-Linear identification of the system
The previous system identification was based on linear models. To achieve a better
approximation of the inverted pendulum system, the non-linear model must be used. Before
neural networks are utilised in identification, linear methods such as Box-Jenkins were applied
to the nonlinear pendulum. The diagram below shows the nonlinear pendulum with the
feedback controller (Fig. 33). The input and output signals from the system are directed to the
BJ algorithm, which develops the model.
It was expected that linear methods such as BJ would not be able to capture the nonlinearities of the
pendulum. However, as the diagram below shows (Fig. 34), the Box-Jenkins model can actually
model the nonlinear system. This is because the process is in a closed loop: the control law
keeps the pendulum stable by removing some of the non-linearities of the pendulum system.
Even using a de-tuned controller, the Box-Jenkins model can capture the dynamics well.
Fig. 33: Linear identification on the nonlinear system
Fig. 34: The Box-Jenkins output vs. the process output (Red: actual output, Blue: predicted model output, with the error in the predicted model; pendulum angle, rad, vs. time in secs)
Non-linear Identification using neural networks
This section discusses the different methods of identifying the pendulum process using neural
networks. The most common method of neural-network identification is called forward
modelling (Fig. 35) [17]. During training both the process and the ANN receive the same input;
the outputs from the ANN and the process are compared, and this error signal is used to update the
weights in the ANN. This is an example of supervised learning: the teacher (the pendulum
system) provides target values for the learner (the neural network).
Two main types of network (feed-forward and Elman) will be used for identification. In order
to provide a set of targets for the network to learn, the Simulink model with feedback control
is used (Fig. 36).
Fig. 35: Neural network forward modelling method
Fig. 36: The non-linear pendulum provides the neural targets
To emphasise the pendulum dynamics, the feedback controller was de-tuned. The controller is
based on a control law which essentially cancels the non-linearities in the pendulum system.
The mass of the cart, M, is set to 0.2 kg inside the control law (rather than its true value). This
makes the controller apply less control force, leaving the pendulum closer to instability, so
more of the pendulum dynamics will be seen at the output (Fig. 38).
Initially single-input single-output networks were developed, the input being the control force
and the output the pole angle. The first type of neural network developed is feed-forward.
Using MATLAB it is possible to develop multi-layer perceptrons (MLP).
The following MATLAB code creates a feed-forward network.

net = newff([-10 10],[4 1],{'tansig' 'purelin'},'trainlm');
net.trainParam.epochs = 400;
net.trainParam.lr = 0.001;
net = train(net,in(1:1000)',tethain(1:1000)');

The newff function allows the user to specify the number of layers, the number of neurons in the
hidden layers and the activation functions used. This network contains 4 neurons in its
hidden layer. The hidden layer uses tan-sigmoid activation functions and the output layer
a linear function, which is the standard set-up of activation functions in an MLP.
Fig. 37: Closed-loop response with the normal controller. Fig. 38: Closed-loop response with the de-tuned controller. (Pole angle, rad, vs. time.)
The type of training used is also set here. These networks use the back-propagation learning
rule to update the weights. The number of epochs for this example is set to 400: during
training, the input vector is passed through the neural network and the weights are
adjusted 400 times. The learning rate of the network is also set. The train function adjusts the
weights of the network so that the output of the network becomes similar to that of the
non-linear pendulum.
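The weight-update mechanics that train performs can be sketched without the toolbox. The following Python sketch (NumPy; the 1-4-1 layout mirrors the network above, but the sine target is an invented stand-in for the pendulum data) trains a tan-sigmoid-hidden, linear-output MLP by plain batch back-propagation and records the MSE falling over the epochs:

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(-1, 1, 200).reshape(-1, 1)
t = 0.5 * np.sin(3 * x)               # invented stand-in target

# 1 input -> 4 tanh hidden neurons -> 1 linear output (as in the text)
W1 = rng.standard_normal((1, 4)) * 0.5; b1 = np.zeros(4)
W2 = rng.standard_normal((4, 1)) * 0.5; b2 = np.zeros(1)
lr = 0.1

errs = []
for epoch in range(800):
    h = np.tanh(x @ W1 + b1)          # forward pass, hidden layer
    y = h @ W2 + b2                   # linear output layer
    e = y - t
    errs.append(float(np.mean(e ** 2)))
    # back-propagate the error through both layers
    gW2 = h.T @ e / len(x);  gb2 = e.mean(axis=0)
    dh = (e @ W2.T) * (1 - h ** 2)    # tanh derivative
    gW1 = x.T @ dh / len(x); gb1 = dh.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

print(errs[0], errs[-1])   # the training MSE decreases over the epochs
```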
The following diagram shows the training of the MLP (Fig. 39). At the start of the training,
the error between the network and the pendulum is high. As the number of epochs increases,
the mean squared error decreases, and once the curve converges very little learning is
taking place. From the training diagram below it is possible to determine the appropriate
number of epochs for training.
Fig. 39: The training error decreases as the neural network 'learns' (training error, log scale, vs. epochs; performance 0.000411685 after 475 epochs)
When the training is finished, the neural network is exported to Simulink using the gensim
command. The diagram below (Fig. 40) shows the neural network in Simulink. Notice that the
neural model and the process receive the same input signal. To adequately test the quality of
the model, the initial seed of the input signal must be changed.
The quality of the neural model is tested by calculating the MSE (mean squared error), which
gives a good indication of the accuracy of the model. The MSE between the model and
the process should be low, but a model could have a low MSE and still fail to predict the
dynamics of the pendulum system, so the outputs from the model and process are also plotted to
compare the dynamics; essentially, we want to see whether the model predicts the movement of
the inverted pendulum. Increasing the number of hidden-layer neurons allows more
complex functions to be modelled. During testing, neural networks with a range of hidden-layer
sizes were simulated. It was expected that as the number of hidden neurons increased,
the model would become more accurate. The following graphs show the process output
plotted against the model outputs (Fig. 41-44). The initial seed is kept the same for all
simulations to show the difference between the models.
Fig. 40: The quality of the model is tested in simulink
Fig. 41: Feed-forward network, 1 hidden layer, 4 hidden neurons. Fig. 42: Feed-forward network, 1 hidden layer, 10 hidden neurons. (Blue: NN output, Red: process output; pendulum angle, rad, vs. time.)
Fig. 43: Feed-forward network, 1 hidden layer, 20 hidden neurons. Fig. 44: Feed-forward network, 1 hidden layer, 50 hidden neurons. (Blue: NN output, Red: process output; pendulum angle, rad, vs. time.)
The feed-forward networks model the process well: the MSE is low and the neural model
predicts the pendulum angle. The results also indicate that increasing the number of hidden
neurons does not improve the MSE between the model and the process.
Type ANN   Neurons in Hidden Layer   Training Epochs   Learning Rate   MSE
FF         50                        500               0.0001          3.2622e-6
FF         20                        500               0.0001          3.4465e-6
FF         10                        500               0.0001          1.1103e-5
FF         4                         500               0.0001          1.54e-5
Most of the research in system identification uses open-loop identification, where increasing the
number of hidden-layer neurons has a direct influence on the accuracy of the model. In
the closed-loop case, it was found that using a de-tuned controller had more of an influence on
the model accuracy than increasing the number of hidden-layer neurons.
The next type of ANN tested is the Elman network, discussed in Chapter 3. Elman networks
have built-in feedback loops which give them dynamic memory, making them more suitable
for predicting dynamic systems. Elman networks are set up in MATLAB similarly to
feed-forward networks: using the command newelm, the number of hidden layers, the number
of hidden-layer neurons and the type of activation functions can be set.
net = newelm([-10 10],[4 1],{'tansig' 'purelin'},'trainlm');
net.trainParam.epochs = 400;
net.trainParam.lr = 0.001;
net = train(net,in(1:1000)',tethain(1:1000)');
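The defining feature of the Elman network, the hidden state fed back through context units, can be sketched in a few lines (Python; the weights are random and untrained, purely to illustrate the recurrence, not a pendulum model):

```python
import numpy as np

class ElmanCell:
    """Minimal Elman recurrence: h(t) = tanh(Wx x(t) + Wh h(t-1) + b)."""
    def __init__(self, n_in, n_hidden, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.Wx = rng.standard_normal((n_hidden, n_in)) * 0.5
        self.Wh = rng.standard_normal((n_hidden, n_hidden)) * 0.5
        self.bh = np.zeros(n_hidden)
        self.Wo = rng.standard_normal((n_out, n_hidden)) * 0.5
        self.h = np.zeros(n_hidden)           # the context units

    def step(self, x):
        # The previous hidden state re-enters here: the dynamic memory
        self.h = np.tanh(self.Wx @ x + self.Wh @ self.h + self.bh)
        return self.Wo @ self.h               # linear output layer

net = ElmanCell(n_in=1, n_hidden=4, n_out=1)
y1 = net.step(np.array([1.0]))
y2 = net.step(np.array([1.0]))
print(y1, y2)   # same input, different outputs: the context carries history
```

A static feed-forward network would return identical outputs for identical inputs; the difference between y1 and y2 is exactly the dynamic memory described above.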
Elman networks with different hidden-layer sizes were tested. The learning rate, training
epochs and set-up of the activation functions were all varied to determine the most accurate
Elman model. The following graphs show the process output plotted against the Elman model
outputs (Fig. 45-47).
Table 4: Setup/Results of the Feedforward networks
Fig. 45: Elman network, 1 hidden layer, 4 hidden neurons (Red: NN output, Blue: process output). Fig. 46: Elman network, 1 hidden layer, 10 hidden neurons (Blue: NN output, Red: process output). (Pendulum angle, rad, vs. time.)
Type ANN   Neurons in Hidden Layer   Training Epochs   Learning Rate   MSE
Elman      20                        500               0.0001          0.4086
Elman      10                        500               0.0001          0.2233
Elman      4                         500               0.0001          1.2804
The results indicate that the Elman neural models were not accurate. The maximum
hidden-layer size that could be trained was approximately 20 neurons: when training Elman
networks with more than 20 neurons in the hidden layer, MATLAB crashes, due to the high
memory requirements of training Elman networks in MATLAB. It was expected that Elman
networks would approximate a dynamic process such as the pendulum system better than the
static feed-forward networks. The poor results of the Elman networks are due to the fact that
the training data comes from a closed-loop system.
Fig. 47: Elman network, 1 hidden layer, 20 hidden neurons (Red: NN output, Blue: process output; pendulum angle, rad, vs. time)
Table 5: Setup/Results of the Elman networks
Multi-output identification
All of the neural models developed so far have been single-input single-output systems,
modelling just the pendulum angle output. A neural network is now developed which models
the 1 input and 4 output states. The diagram below shows the nonlinear pendulum with the
control law (Fig. 48). In this example feed-forward networks are used to model the
multi-output system.
The process and the neural model receive the same input. Instead of having just one
target, the neural network has 4 targets to learn, and is trained by presenting the 4 targets
together at each time interval: an array combines the 4 different targets into one target matrix.
Two sizes of feed-forward network (50 and 100 hidden neurons) are used to model the
multi-output process.
tempP = [];   % start with an empty target matrix
for k = 1:200,
    P = [y(k);ydot(k);tetha(k);tethadot(k)];
    tempP = [tempP P];
end
net = newff([-10 10],[50 4],{'tansig' 'purelin'},'trainlm');
net.trainParam.epochs = 500;
net = train(net,in(1:200)',tempP);
Fig. 48: The nonlinear pendulum/controller is used for training the ANN
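The target assembly in the loop above can be sketched in Python (NumPy; the four state arrays are random stand-ins for the logged signals y, ydot, tetha, tethadot): each column of the target matrix holds the full state at one time step, so a single network learns all four outputs at once.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 200
# Random stand-ins for the four logged state signals
y, ydot, tetha, tethadot = rng.standard_normal((4, N))

# Column k of the target matrix is the full state at step k,
# mirroring P = [y(k);ydot(k);tetha(k);tethadot(k)] in the MATLAB loop
targets = np.vstack([y, ydot, tetha, tethadot])

print(targets.shape)   # (4, 200): 4 targets per time step
```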
The neural network is trained in MATLAB and, when the training is over, the neural network is
generated in Simulink. The quality of the neural model is checked by comparing the 4 outputs
from the neural network to the 4 outputs of the process.
The following diagrams (Fig. 50-53) show the response from the process and model outputs
for each state. The blue signal is the process output. The neural model with 100 neurons in the
hidden layer is consistently more accurate than the neural model with 50 hidden neurons.
The neural models do a good job of modelling two of the output states; however, both neural
networks fail to model the velocity of the cart and the angular velocity of the pendulum
accurately.
Fig. 49: The outputs from the ANN and the process are compared
Fig. 50: Displacement of the cart, m (Blue: real output, Red: neural model 50, Black: neural model 100)
Fig. 51: Velocity of the cart, m/s (Blue: process output, Red: neural model 50, Black: neural model 100)
Fig. 52: Angle of the pendulum, rad (Blue: process output, Red: neural model 50, Black: neural model 100)
Fig. 53: Angular velocity of the pendulum, rad/sec (Blue: process output, Red: neural model 50, Black: neural model 100)
System identification techniques have been applied to the inverted pendulum system. The
results from linear identification indicate that, using closed-loop techniques, it is possible to
generate accurate transfer-block models. Box-Jenkins and Output-Error models were developed
which have similar dynamics to the pendulum process and a low mean squared error.
Neural networks were utilised to model the nonlinear pendulum. Before the nonlinear
pendulum is identified, the process is stabilised using feedback control. Because feedback
control removes some of the non-linearities of the process, de-tuned controllers were used,
which allow the process to exhibit more of its dynamics. This improved the quality of the data
used in the system identification. Different sizes of static feed-forward networks were trained
using the input/output data from the nonlinear pendulum, and the FF networks were generated
in Simulink for testing. The FF networks showed similar dynamics to the pendulum model and
a low MSE.
Recurrent neural networks were also trained to identify the inverted pendulum. Recurrent
networks have built-in feedback loops which should enable them to model dynamic systems
more accurately than static feed-forward networks. In practice, this was not the case: the
results from the Elman networks were not as accurate as those from the feed-forward networks.
Previous research using recurrent networks has concentrated on identification of open-loop
stable systems. When identifying open-loop unstable systems, feedback control must stabilise
the system. Unfortunately this also removes some of the dynamics of the process and makes
accurate identification more difficult. This is one of the main reasons why the results from the
Elman networks are not as accurate.
5 Neural control of the inverted pendulum
The main task of this project is to design a controller which keeps the pendulum system
inverted. There are a few important points to remember when designing a controller for the
inverted pendulum: the inverted pendulum is open-loop unstable, non-linear and a
multi-output system. To show the advantages of using neural control in this project, a
comparison between standard PID control and neuro-control is made.
Nonlinear system: standard linear PID controllers cannot be used for this system because they
cannot map the complex nonlinearities in the pendulum process. ANNs have shown that they
are capable of identifying complex nonlinear systems, so they should be well suited to
generating the complex internal mapping from inputs to control actions.
Multi-output system: the inverted pendulum has four outputs, so in order to have full-state
feedback control four PID controllers would have to be used. Neural networks have a big
advantage here due to their parallel nature: one ANN can be used instead of four PIDs.
Open-loop unstable: the inverted pendulum is open-loop unstable; as soon as the system is
simulated, the pendulum falls over. Neural networks take time to train, so the pendulum system
has to be stabilised somehow before a neural network can be trained.
Before the actual neuro-controller is developed in MATLAB, the main types of neuro-control
are discussed. The five types of neural-network control methods that have been researched are
supervised control, model reference control, direct inverse control, internal model control and
unsupervised control.
Supervised Control:
It is possible to teach a neural network the correct actions by using an existing controller or
human feedback. This type of control is called supervised learning. But why copy an existing
controller that already does the job? Most traditional controllers (feedback linearisation,
rule-based control) are designed around an operating point, which means the controller
operates correctly only while the plant/process stays around that point. Such controllers will
fail if there is any sort of uncertainty or change in the plant. The advantage of neuro-control
is that if an uncertainty in the plant occurs, the ANN can adapt its parameters and continue
controlling the plant where other robust controllers would fail. In supervised control, a teacher
provides correct actions for the neural network to learn (Fig. 54). In offline training the targets
are provided by an existing controller, and the neural network adjusts its weights until the
output from the ANN is similar to that of the controller [18]. When the neural network is
trained, it is placed in the feedback loop. Because the ANN is trained using the existing
controller's targets, it should be able to control the process.
Fig. 54: Supervised learning using an existing controller
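Copying a teacher in this way is ordinary supervised regression on (state, action) pairs. In the Python sketch below the "existing controller" is an invented linear state-feedback law u = -Kx standing in for the report's feedback-linearising control law, and the learner is a least-squares fit rather than a neural network; the training signal, the teacher's own action, is the same in both cases:

```python
import numpy as np

rng = np.random.default_rng(0)
K = np.array([2.0, 3.0, 25.0, 4.0])     # invented teacher gains, 4 states

# Log (state, action) pairs produced by the existing controller
states = rng.standard_normal((500, 4))  # y, ydot, tetha, tethadot samples
actions = states @ (-K)                 # teacher: u = -K x

# Supervised learner: fit states -> teacher actions by least squares
W, *_ = np.linalg.lstsq(states, actions, rcond=None)

test_state = np.array([0.1, 0.0, 0.05, -0.02])
print(test_state @ W, test_state @ (-K))   # learner reproduces the teacher
```

A neural network replaces the linear fit when the teacher is nonlinear, but the structure of the training problem is identical.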
At this stage, there is an ANN which controls the process much like the existing controller. The
real advantage of neuro-control is the ability to adapt online (Fig. 55). An error signal
(desired signal minus real output signal) is calculated and used to adjust the weights online.
If a large disturbance or uncertainty occurs in the process, the large error signal is fed back into
the ANN, which adjusts the weights so that the system remains stable.
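The simplest concrete version of this online adjustment is the LMS (Adaline) rule: at every sample the weights are nudged in proportion to the current error. In the sketch below (Python; the "plant" is just a fixed linear map whose coefficients jump mid-run, an invented stand-in for a disturbance) the error collapses, jumps when the plant changes, then collapses again as the weights re-adapt:

```python
import numpy as np

rng = np.random.default_rng(0)
w = np.zeros(2)                   # adaptive weights
lr = 0.1
true_w = np.array([1.0, -0.5])    # invented plant coefficients
errors = []

for k in range(400):
    if k == 200:
        true_w = np.array([1.5, 0.2])   # the "disturbance": plant changes
    x = rng.standard_normal(2)
    target = true_w @ x
    e = target - w @ x
    w += lr * e * x               # LMS online weight update
    errors.append(abs(e))

# Converged before the change, a burst of error at it, converged again after
print(np.mean(errors[190:200]), max(errors[200:205]), np.mean(errors[-10:]))
```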
Model Reference Control
In the diagram above (Fig. 55) the error signal is generated by subtracting the output signal
from the desired system response. In model reference control the desired closed-loop response
is specified through a stable reference model (Fig. 56) [19]. The control system attempts to
make the process output similar to the reference-model output.
Fig. 55: Adaptive neural control
Fig. 56: Model reference control
Direct Inverse Control:
The next control technique researched is direct inverse control. The advantage of using
inverse control over supervised control is that inverse control does not require an existing
controller for training. Inverse control utilises the inverse of the system model. The diagram
below (Fig. 57) is a simple example of direct inverse control: a neural network is trained to
model the inverse of the process [20]. When the inverse controller is cascaded with the
process, the output of the combined system is equal to the setpoint, because the inverse
nonlinearities in the controller cancel out the nonlinearities in the process. For the
nonlinearities to be effectively cancelled, the inverse model must be very accurate.
Setpoint → Gc(s) → Gp(s) → Output, where the controller is the inverse of the model: Gc(s) = [Gmodel(s)]^(-1)
Inverse modelling is used to generate the inverse of the process. The system output is used as
the input to the network; the ANN output is compared with the training signal (the system
input) and this error signal is used to train the network (Fig. 58). This training method
forces the neural network to represent the inverse of the system.
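For a static, invertible process the idea can be demonstrated end to end. In the Python sketch below the "process" is an invented monotonic nonlinearity y = tanh(u), not the pendulum, and a polynomial regression stands in for the neural network; the learner is trained from process outputs back to process inputs, then cascaded with the process so that output ≈ setpoint:

```python
import numpy as np

def process(u):
    return np.tanh(u)                # invented static, invertible plant

# Inverse modelling: learner inputs are process OUTPUTS,
# learner targets are the process INPUTS that produced them
u_train = np.linspace(-1.5, 1.5, 400)
y_train = process(u_train)
inverse_model = np.poly1d(np.polyfit(y_train, u_train, deg=9))

setpoint = 0.5
u_cmd = inverse_model(setpoint)      # inverse controller computes the input
output = process(u_cmd)              # cascade: inverse model then process
print(setpoint, output)              # output is close to the setpoint
```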
Fig. 57: Direct inverse control
Fig. 58: Inverse modelling of a process
There are certain problems associated with direct inverse control. In the case of the inverted
pendulum, the process may not be invertible: the inverted pendulum is open-loop unstable,
so open-loop training data would not show the dynamics of the system, as the pendulum falls
over quickly. There can also be process-model mismatches, and training an ANN as an inverse
model might leave the model not strictly proper. These problems appear as unknown
disturbances in the system.
Internal model control:
Internal model control (IMC) is based on direct inverse control. The problems associated with
direct inverse control, such as process-model mismatch, are reduced using IMC. The diagram
below shows the set-up of IMC (Fig. 59) [21].
A neural network model is placed in parallel with the real system, and the controller is an
inverse model of the process. The filter makes the system robust to process-model mismatch.
With the IMC scheme, the aim is to eliminate the unknown disturbance affecting the system:
the difference d(s) between the process and the model is determined, and if the ANN model is
a good approximation of the process then d(s) is equal to the unknown disturbance. The signal
d(s) is exactly the information that is missing from the NN model and can be used to improve
the control: it is subtracted from the input setpoint Uin. In theory, using this method it is
possible to achieve perfect control.
Fig. 59: Diagram of Internal model control
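The disturbance-cancelling mechanism can be shown with a deliberately simple static example (Python; the gain of 2 and the disturbance of 0.3 are invented numbers). The internal model is a perfect copy of the plant apart from an additive disturbance; feeding back d = process - model and subtracting it from the setpoint removes the offset:

```python
def plant(u, disturbance):
    return 2.0 * u + disturbance     # invented process: gain 2 plus disturbance

def model(u):
    return 2.0 * u                   # internal model, no disturbance term

def inverse_controller(r):
    return r / 2.0                   # exact inverse of the model

setpoint, disturbance = 1.0, 0.3

# Without the IMC correction the disturbance passes straight to the output
u = inverse_controller(setpoint)
y_plain = plant(u, disturbance)          # 1.3: offset by the disturbance

# IMC: d = process - model recovers the disturbance, subtract it from Uin
d = plant(u, disturbance) - model(u)     # d = 0.3
u = inverse_controller(setpoint - d)
y_imc = plant(u, disturbance)            # 1.0: back on the setpoint
```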
Unsupervised Control:
The previous neural control methods are all trained using a priori knowledge, such as
an explicit teacher providing correct actions. In the unsupervised learning set-up, no existing
controller can be imitated and the ANN has no target to compare to its output. The ANN must
try different actions and determine which of them produce a good outcome, and learning
through long periods with no performance feedback is difficult.
Anderson et al. [22,23] developed an unsupervised controller for an inverted pendulum system.
Modifications to this controller are driven by a failure signal, which occurs when the pole falls
past a certain angle or when the cart reaches the end of the track. A long period of the pole
being inverted can pass before the failure signal occurs, and the controller must then decide
which actions in the sequence contributed to the failure. The graph below shows the results of
the unsupervised controller (Fig. 60).
It takes over 5000 failures until the total time the pendulum stays inverted increases above
1000. The research by Anderson et al. shows that the learning time in unsupervised control is
very high, but the unsupervised ANN can deal with uncertainty and the complexities of
nonlinear control.
Fig. 60: Graph of Unsupervised controller
To determine the type of neural control to be used, the pros and cons of each of the control
methods were weighed up specifically in relation to the pendulum system. To develop a
supervised neural controller for the inverted pendulum, an existing controller is required. A
nonlinear controller has already been developed for the inverted pendulum using feedback
linearisation, and this controller could be used as a teacher. The main disadvantage of this
method is that the neural controller is based on a control law which can only effectively control
the inverted pendulum model developed earlier. If this controller were applied to a more
complex pendulum model (e.g. one including friction), it would fail to keep the pendulum
stable.
Inverse control and internal model control are both based on developing an accurate inverse
model of the inverted pendulum system. The problem with this approach is that the pendulum
is open-loop unstable: to develop an inverse model, the pendulum must be stabilised using a
controller, and when a feedback controller is used the inverse model absorbs some of the
dynamics of the controller. An inverse model developed using a feedback controller would
never be accurate enough to be used in direct inverse control. This type of neural control is
better suited to robotic applications and control of open-loop stable systems.
The unsupervised control of the inverted pendulum developed by Anderson is the only neural
control method that does not require some sort of existing controller for training. The results
using unsupervised learning are promising and show that it is possible for the controller to
'learn' to keep the pendulum upright. Realistically, though, the unsupervised method covered
by Anderson is too complex for the project time frame.
Neural-control in Simulink:
The first type of neural control developed uses supervised learning. There is an existing
feedback controller for the nonlinear pendulum, and a feed-forward neural network will be
trained to imitate it. The neural controller is developed similarly to the identification methods
covered earlier. Below is a diagram (Fig. 61) of the nonlinear pendulum model and the control
law. The four input signals (y, ydot, tetha, tethadot) and the target output (the output from the
controller) are exported to the MATLAB workspace. The following MATLAB code trains the
neural network. The first section of code builds the array that combines the 4 different inputs
into one input matrix. The FF network has 50 neurons in the hidden layer; the activation
functions in the hidden layer are tan-sigmoid and the output layer is a linear function.

tempP = [];   % start with an empty input matrix
for k = 1:500,
    P = [y(k);ydot(k);tetha(k);tethadot(k)];
    tempP = [tempP P];
end
net = newff([-2 2;-2 2;-2 2;-2 2],[50 1],{'tansig','purelin'},'trainlm');
net.trainParam.epochs = 500;
net = train(net,tempP,out(1:500)');
Fig. 61: Supervised learning using the control law.
When the training is finished, the weights are fixed and a Simulink ANN is generated. The
network is placed in the feedback loop in place of the existing controller (Fig. 62).
The plot on the left (Fig. 63) shows the squared error between the neural network and the
original controller. The error between the ANN and the controller is of the order of 10^-7, so
the network is an accurate approximation of the controller. The diagram on the right (Fig. 64)
is a plot of the pendulum angle with the above system.
The ANN above (Fig. 62) was developed using the Neural Network Toolbox. This toolbox
allows the weights to be adjusted in the MATLAB environment, but once a Simulink neural
network is created the weights are fixed and cannot be adjusted, so online learning is not
possible.
Fig. 62: The original controller is replaced by the neural network
Fig. 63: Squared error between the original controller and the ANN, vs. time. Fig. 64: Closed-loop response with neural control (pendulum angle, rad, vs. time)
This is not a problem if the parameters of the pendulum system are fixed and there is no
disturbance to the system. The ANN cannot adapt its weights if a disturbance or uncertainty
occurs. This is shown below (Fig.65). Using the set-up below, the mass of the cart is going to
be changed from 1.2 kg to 1.28 kg midway through the simulation using the switch.
The plot below shows the angle of the pendulum. (Fig. 66) The disturbance to the system is
added at 200 time units (approx). As the disturbance occurs the pendulum goes unstable. This
is because the ANN cannot adapt its weights using an error signal to counteract the
disturbance.
Fig. 65: Introducing a disturbance to the system
Fig. 66: Plot of the pendulum angle (rad)
This problem was solved by using an adaptive neural toolbox, an add-on for simulink [24].
This toolbox allows online neural learning to occur. The block diagram of the toolbox is
shown below (Fig.67). The toolbox contains Adaline and MLP (multi-layered perceptron)
simulink blocks.
All of the blocks have the same interface, so it's possible to try out many different networks
quickly and easily.
There is an interface for each block so that the user can set the network parameters such as
the learning rate, the number of neurons in each layer, etc.
The inputs to each block are:
x: The input vector to the neural network.
e: The error between the real output and the network approximation.
LE: A logic signal that enables or disables the learning.
The outputs of each block are:
Ys: The value of the approximated function.
X: All the “states” of the network, namely the weights and all the
parameters that change during the learning process.
Fig. 67: Simulink toolbox of adaptive neural networks
The first type of adaptive network to be used is the adaline. The adaline is used to approximate
'almost linear' functions. The adaline will be trained offline using the original feedback
controller. The diagram below (Fig.68) shows the simulink set-up of the adaptive offline
learning. The error signal is equal to the output of the controller minus the output of the
network. This signal is fed back into the adaline. The weights in the adaline are updated by
steepest-descent gradient learning, which minimises the squared error between measurements
and estimates. The learning rate of the adaline is set to 0.01 and the sample time to 0.05.
Fig. 68: Adaptive network trained using the original controller
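The offline training loop above can be sketched in Python with NumPy. The block below mirrors the toolbox's x/e/LE interface, but the linear "teacher" controller, its gains, and the block internals are illustrative assumptions, not the toolbox code or the rig's tunings.

```python
import numpy as np

class AdalineBlock:
    """Minimal stand-in for the toolbox block interface: inputs are the
    network input x, the error e, and a learning-enable flag LE; outputs
    are the approximation Ys and the internal state X (the weights)."""
    def __init__(self, n_inputs, learning_rate=0.01):
        self.w = np.zeros(n_inputs)
        self.lr = learning_rate

    def step(self, x, e, LE=True):
        Ys = self.w @ x                 # linear (adaline) output
        if LE:                          # adapt only while learning is enabled
            self.w += self.lr * e * x   # steepest-descent (LMS) update
        return Ys, self.w.copy()

# Offline training as in Fig. 68: the error fed to the block is the
# controller output minus the network output. The feedback gains k are
# hypothetical, standing in for the original control law.
rng = np.random.default_rng(0)
k = np.array([-1.0, -1.6, 18.7, 3.5])
block = AdalineBlock(4, learning_rate=0.01)
for _ in range(20000):
    x = rng.uniform(-1.0, 1.0, 4)
    u = k @ x                                  # teacher controller output
    block.step(x, e=u - block.w @ x)           # error drives the LMS update
```

As the squared error is driven towards zero, the weights converge on the teacher's gains, which is the behaviour seen in Figs. 69 and 70.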
Figure 69 shows the four weight values of the adaline as the training progresses. Figure 70
shows the error between the controller output and the ANN output. As the error goes to zero,
the network weights converge on their final values.
When the error converges to zero, the network is trained. The network can be now placed in
the feedback loop instead of the original controller.
Fig. 69: Plot of the network weights
Fig. 70: Plot of the error signal
The diagram below (Fig.71) is the simulink set-up of the adaptive neural controller. The
weights trained offline are now used as the initial weights of the adaline. The adaline network
has an input error signal equal to the desired pendulum angle minus the actual pendulum
angle. The desired pendulum angle is produced by the stable linear pendulum model block.
This error signal is fed into the adaline, which adapts the weights online. This improves the
performance of the network.
Figure 72 shows the pendulum angle from the simulation. The results indicate that the neural
controller keeps the pendulum angle stable.
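One adaptation step of this scheme can be sketched as follows; this is a simplified steepest-descent update driven by the model-following error, an assumption about the block internals rather than a reproduction of the toolbox code.

```python
import numpy as np

def mrac_step(w, x, theta_ref, theta, lr=0.01):
    """One MRAC-style adaptation step (sketch): the error between the
    reference-model angle and the actual pendulum angle drives a
    steepest-descent update of the adaline weights."""
    e = theta_ref - theta            # model-following error
    u = float(w @ x)                 # adaline control output
    w_new = w + lr * e * x           # online weight update
    return u, w_new

# When the pendulum tracks the reference model exactly (e = 0),
# the weights are left untouched:
w0 = np.array([1.0, 2.0, 3.0, 4.0])
x = np.array([0.1, -0.2, 0.05, 0.0])
u, w1 = mrac_step(w0, x, theta_ref=0.0, theta=0.0)
```

Any mismatch between the reference angle and the measured angle produces a non-zero update, which is how the controller copes with disturbances later in the chapter.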
Fig. 71: Adaptive online neural controller using MRAC
Fig. 72: Pendulum angle (rad)
The neural controller developed above shows that a neural network can be trained offline
using another controller as a teacher. The neural controller can then be placed online, where
it will continuously update its weights. The advantage of using adaptive control can be shown
if a disturbance occurs during operation. Using the set-up below, the mass of the cart is going
to be changed from 1.2 kg to 1.28 kg midway through the simulation using the switch.
Figure 74 shows the pendulum angle during the simulation. The pendulum angle oscillates
until 2000 time units. When the mass is changed, the oscillations increase to 0.1 radians. The
error signal increases, which adjusts the weights, and the pendulum angle returns to its
normal oscillation.
Fig.73: Introducing a disturbance to the system
All of the previous work on adaptive neural control has used the Adaline network. It has been
proven that an MLP has greater accuracy than the Adaline in approximating nonlinear
functions. The reason the adaptive research used the Adaline is the failure of the MLP in
controlling the inverted pendulum. The MLP was trained offline in the same manner as the
Adaline to model the control law. The error between the MLP and the control law was of the
order of 10^-3. The MLP was placed in the feedback loop instead of the existing controller.
During simulations, the inverted pendulum goes unstable every time. Since the MLP was
trained to model the control law, there is no obvious reason why it should fail to control the
inverted pendulum.
Fig. 74: Effect of the disturbance on the pendulum angle (pendulum angle, rad, vs. time)
6 Real-time identification and control
The previous research covered in identification and control has been based on a nonlinear
pendulum model. The dynamics of this model might be similar to the real system, but
implementing identification and control is much more complicated on the real system. This
chapter discusses the practical work on the inverted pendulum rig.
The pendulum rig consists of a simple cart which runs along a track. The cart is restricted to
travelling in the track axis. The position of the cart is controlled by a DC motor and drive belt.
A pole with mass on the end is pivoted on the cart and is free to swing in the same axis. The
outputs from this system are the position of the cart along the track and the angle of the
pendulum. These are both measured using optical encoder sensors. The two output signals are
sent to a control algorithm in matlab via a data acquisition card. The control algorithm
determines a control action to keep the pendulum inverted. A DC signal controls the speed and
direction of the motor, which determines the position of the cart. Figure 75 shows the digital
pendulum system.
Fig. 75: Setup of the digital pendulum system.
At this stage there is a system that measures the position of the cart and the angle of the
pendulum. There is also an interface which makes it possible to control the position of the
cart. The next important part of the pendulum system is the control algorithm in matlab. The
diagram below shows the real time kernel (RTK) in the matlab environment. The RTK is an
encapsulated block which covers all the control tasks. The input to the RTK block is the
desired cart position. The output of the real time task is a vector which contains the pendulum
angle, angular velocity, cart position, cart velocity and the control value for the DC drive.
There is no feedback control loop because the controller is embedded in the RTK. Two PID
controllers are utilized to stabilize the inverted pendulum. The first PID controls the angular
position of the pendulum. The second is used to control the position of the cart. The outputs of
the PID controllers are added to produce the final DC control signal. Figure 77 shows the
structure of the PID controllers.
Fig. 76: Real time task in simulink environment
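The two-loop structure of Fig. 77 can be sketched as follows; the gains, sample time, and sign conventions here are illustrative assumptions, not the rig's actual tunings.

```python
class PID:
    """Textbook discrete PID; dt is the sample time in seconds."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, err):
        self.integral += err * self.dt              # accumulate integral term
        deriv = (err - self.prev_err) / self.dt     # backward-difference derivative
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

# One PID regulates the pendulum angle, the other the cart position;
# their outputs are summed into the single DC drive signal.
angle_pid = PID(kp=20.0, ki=1.0, kd=2.0, dt=0.05)   # hypothetical gains
pos_pid = PID(kp=1.0, ki=0.1, kd=0.5, dt=0.05)      # hypothetical gains

def dc_drive(angle_err, position_err):
    return angle_pid.step(angle_err) + pos_pid.step(position_err)
```

Summing the two loops into one actuator signal is what lets a single DC drive regulate both the angle and the cart position at once.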
Previously in the report it has been stated that it is not possible to control the nonlinear
pendulum using PID control. In this case, it is possible for the RTK to utilize PID control
because the pendulum is placed in the upright position in the linearised region before the
experiment starts. During the experiment the inverted pendulum can sometimes swing past the
linearised region and fall over. If the pendulum falls down during the experiment, it has to be
set upright manually. Figure 78 shows the pendulum angle during the experiment. It is clear that
the PID control stabilizes the inverted pendulum.
Fig. 77: Structure of the PID controller
Fig. 78: Plot of the pendulum angle (rad vs. time)
The next stage is to develop a neural network which identifies the pendulum system online. A
multi-layered perceptron (MLP) is used to model the angle of the pendulum, θ, and the
position of the cart, y. Figure 79 shows the setup of the identification process. The PID
controller is used to stabilize the inverted pendulum; closed loop identification is necessary
for open loop unstable systems. The control signal applied to the pendulum system is also
input to the ANN. The error signals, e_θ and e_y, between the process outputs and the ANN
outputs are backpropagated to adjust the weights of the MLP.
Figure 80 shows the simulink setup of the real time identification. A multi-layered perceptron
with 100 neurons in the hidden layer and a learning rate of 0.05 is utilized. The input into the
MLP is the control signal from the PID controller. The MLP has 2 outputs modelling the
pendulum angle and the cart position.
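The per-sample update can be sketched in Python with NumPy. The network size, learning rate, and the stand-in static "plant" below are illustrative assumptions (the report's MLP has 100 hidden neurons and identifies the real rig); the point is the online backpropagation of the two output errors.

```python
import numpy as np

rng = np.random.default_rng(1)

H = 20                                   # hidden neurons (illustrative)
W1 = rng.normal(0, 0.3, (H, 1)); b1 = np.zeros(H)
W2 = rng.normal(0, 0.3, (2, H)); b2 = np.zeros(2)
lr = 0.05

def plant(u):
    # hypothetical static map from control signal to (angle, position)
    return np.array([0.3 * u, -0.5 * u])

errs = []
for _ in range(5000):
    u = rng.uniform(-1.0, 1.0)           # control signal sample
    target = plant(u)                    # measured (theta, y)
    h = np.tanh(W1[:, 0] * u + b1)       # hidden activations
    yhat = W2 @ h + b2                   # network estimates of (theta, y)
    e = target - yhat                    # error signals e_theta, e_y
    errs.append(float(np.abs(e).sum()))
    dh = (W2.T @ e) * (1.0 - h ** 2)     # error backpropagated to hidden layer
    W2 += lr * np.outer(e, h); b2 += lr * e
    W1[:, 0] += lr * dh * u; b1 += lr * dh
```

Each new sample immediately adjusts the weights, which is what allows the identifier to track the process online rather than in a separate training phase.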
Fig. 79: Setup of the online identification process
Figure 81 shows the plot of the process pendulum angle and the neural network angle. The
blue graph is the inverted pendulum angle and the red is the neural model output. The MLP
shows that it is possible to identify the pendulum angle online.
Fig. 80: Simulink setup of the online identification
Fig. 81: Plot of the pendulum angle (rad vs. time)
Figure 82 shows the plot of the process cart position. The blue graph is the cart position and
the red is the neural model output.
The real time kernel for the inverted pendulum system contains many different types of
controller: PID, a nonlinear control law, etc. It is possible to develop and test prototype
controllers using the external controller function. The external controller is a file containing
the control routine, which is accessed at the interrupt time. The control algorithm (e.g.
neural, fuzzy, adaptive) must be written in C code. To develop a correct external controller,
the input/output architecture and the limits of the signals must be obeyed.
In order to develop an external neural controller for the real time inverted pendulum, it will
have to be written in C. The online neural toolbox used previously in the project contains
multi-layer perceptrons (MLP) which are all written in C. Instead of writing a neural
controller from scratch, an MLP could be adapted to the external controller format.
Before the external controller is developed, a MLP must be trained to control the inverted
pendulum. It is possible to train a neural network to imitate the existing PID controller.
Fig. 82: Plot of the cart position (m vs. time)
Figure 83 shows the training of a neural network. The inputs to the MLP are the pendulum
angle and the cart position. The error signal between the output of the MLP and the PID
control signal is backpropagated to adjust the neural weights.
Figure 84 shows the plot of the real control signal and the output from the MLP. The blue
signal is the PID control and the red is the neural controller output.
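The routine that would be ported to the external controller is just the forward pass of this trained MLP. A Python sketch follows (the real external controller must be written in C, and the weight names and shapes here are assumptions, not the stored values):

```python
import numpy as np

def mlp_control(theta, y, W1, b1, W2, b2):
    """Forward pass of the trained MLP controller: tanh hidden layer,
    linear output. This is the computation that would be rewritten in C
    and called at each interrupt, with the weights taken from the
    offline training."""
    x = np.array([theta, y])           # inputs: pendulum angle, cart position
    h = np.tanh(W1 @ x + b1)           # hidden layer
    return float(W2 @ h + b2)          # control value for the DC drive

# Shape check with placeholder (untrained) weights:
H = 8
u = mlp_control(0.01, -0.02,
                W1=np.zeros((H, 2)), b1=np.zeros(H),
                W2=np.zeros((1, H)), b2=np.zeros(1))
```

Because no training occurs at run time, the C port only needs matrix multiplies and a tanh, which keeps it well within the interrupt-time budget.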
Fig. 83: Training a MLP to imitate the PID controller
Fig. 84: Plot showing the MLP output and PID output (control signal vs. time)
At this stage, there is a trained MLP which imitates the PID controller. The weights in the
MLP are stored. The next step is developing a neural controller in C and adapting it to external
controller format. Unfortunately there was not enough time in the project to implement the
real time neural controller. If a neural controller were implemented in C, the neural weights
from the trained network in Fig. 83 could be transferred to the new network and set as initial
weights. In theory this ANN should be able to control the real time inverted pendulum.
7 Conclusions
Summary
This research has applied artificial neural networks to the identification and control of the
inverted pendulum. Before identification techniques could be tested, a model representing the
inverted pendulum was developed in simulink. Some of the modelling and control techniques
involved in the project are linear so a linearized version of the inverted pendulum was
developed. Open loop identification was initially tested but it was found that the inverted
pendulum is open loop unstable. One of the requirements for accurate identification is
experimental input-output data that shows the dynamics of the system. It was decided that
system identification would be performed in closed-loop so stabilizing feedback controllers
had to be developed for the linear and nonlinear inverted pendulum. A simple full-state
feedback controller stabilized the linear pendulum and a control law was developed to
stabilize the nonlinear pendulum. The closed loop data is stable and the inverted pendulum can
be simulated for longer times so more data can be collected.
Linear identification techniques were applied to the linear pendulum. The aim was to develop
a transfer function block that accurately modelled the inverted pendulum. An accurate model
will have a low MSE in relation to the process and the model will show some of the dynamics
of the process. The four types of model tested are ARX, ARMAX, Box-Jenkins and Output-
Error. It was found that the ARX and ARMAX could not model the inverted pendulum at all.
The Box-Jenkins and Output-Error models had low MSE but did not show any of the
dynamics of the inverted pendulum system. The few journals on closed loop identification
have all indicated that the best linear model structures are Box-Jenkins and Output-Error.
ARX and ARMAX both make assumptions that the noise spectrum models and the input-
output models have the same characteristic dynamics. This explains why the ARX and
ARMAX could not model the linear inverted pendulum. One of the reasons why the Box-
Jenkins and Output-Error models did not show any of the process dynamics is due to the
closed loop identification. Closed loop identification must be used on open-loop unstable
systems such as the inverted pendulum but one of the disadvantages is that feedback
controllers mask some of the dynamics of the system.
A detuned controller was used to control the process. This type of controller keeps the
inverted pendulum barely stable, but more of the process dynamics can be seen. The Box-
Jenkins and Output-Error identification was then repeated with much better results. The main
conclusions from the linear identification are:
1. When trying to model an unstable system a feedback controller must be used to keep the
system stable. If the controller is de-tuned this will allow more of the pendulum dynamics
to be seen. This will make the model more accurate.
2. The Box-Jenkins/Output-Error models are the only structures that can adequately model the
   pendulum using the closed loop data.
3. The best way to test the quality of a model is to construct a transfer block of the model and
simulate it using a different initial input seed.
To achieve a better approximation of the inverted pendulum, the nonlinear system must be
used. The linear identification techniques were applied to the nonlinear pendulum system and
were found to be inadequate in modelling the nonlinear nature of the system. The nonlinear
nature of neural networks gives them an advantage over linear models in the prediction of
non-linear systems. Before the inverted pendulum system is identified, the process is
stabilized using the control law. The control law removes some of the nonlinearities from the
process so a detuned control law was used which allows the process to exhibit more of its
dynamics. This improves the quality of the data used in the system identification.
Initially single-input single-output networks were developed, the input being the control force
and the output the pendulum angle. The first type of neural network to be developed was the
feedforward network. Feedforward networks with a range of hidden layer neurons were tested. The
feedforward networks modelled the inverted pendulum well. The MSE between the process
and the neuron model is low and the model predicts the dynamics of the pendulum angle. In
open-loop identification, increasing the number of hidden layer neurons will have a direct
influence on the accuracy of the model. In the closed loop case, it was found that using a
detuned controller had more of an influence on the model accuracy than increasing the number
of hidden layer neurons.
Recurrent ‘Elman’ networks are the second type of neural networks to be developed. Elman
networks have built in feedback loops which enable them to model dynamic systems such as
the inverted pendulum more accurately than static feedforward networks. Elman networks
with different sizes of hidden layer were tested. The results indicate that Elman networks
were not as accurate as feedforward networks in approximating the inverted pendulum. It was
found that when training the Elman networks to model the inverted pendulum, the training
would get stuck at a local minimum. This affected the accuracy of the models developed. The poor results
of the Elman networks are due to the fact that the training data is from a closed loop system.
The next stage in system identification was to develop a multi-output neural network which
models the four outputs of the inverted pendulum. A feedforward network with 100 hidden
layer neurons was used to model the process. The neural network developed could model the
pendulum angle and the cart position accurately but completely fails to model the velocity of
the cart and angular velocity of the pendulum.
The main task in the project was to design a controller which keeps the pendulum system
inverted. The four main types of neural control (supervised, unsupervised, direct inverse and
internal model control) were researched to determine which control technique would be the
most efficient to implement. The earliest application of neural networks to the inverted
pendulum is by Widrow and Smith [25] and Widrow [26]. They used traditional control
methods to derive a control law to stabilize the linearized system. They then trained a neural
network to mimic the output of the control law. It was decided that supervised control would
be the least complex to implement. It was not possible to develop direct inverse control
because this control method requires that the process to be controlled is already open-loop
stable. The unsupervised control technique developed by Anderson was just too complex for
the project time frame. The first neuro-controller was developed by training a feedforward
network to model the control law. Elman networks were also used here to model the control
law but were not as accurate. When the training was finished, the neural network was exported
into simulink and placed in the feedback loop instead of the existing control law. The neural
network controlled the inverted pendulum similarly to the control law.
An experiment was set-up which creates a disturbance to the process during the simulation.
The neural network lost control of the inverted pendulum because it was unable to adjust its
weights to counteract this disturbance. This problem was solved by using the adaptive neural
toolbox. This toolbox makes it possible for online neural learning to occur. Two types of
neural network were used: the Adaline and the multi-layered perceptron (MLP). The ANN was
trained offline using the control law. The advantage of this type of network is that if a
disturbance occurs during operation, the error signal is fed back into the Adaline, which
adjusts the weights of the network to counteract the disturbance. The Adaline adaptive block
is designed for approximating 'almost linear' functions. It was found that the Adaline could
approximate the control law very accurately.
It was decided to test some of the identification and control techniques on the real time
inverted pendulum rig. The real-time inverted pendulum is also open-loop unstable. The real
time kernel (RTK) uses standard PID controllers to stabilize the system. Online identification
was possible using the adaptive neural toolbox. It was not possible to develop a neural
controller for the real time system but significant progress was made.
Scope for future work
The results from the Elman networks were not as accurate as the feedforward networks. The
dynamic Elman networks should have been more accurate when modelling a dynamic system
such as the inverted pendulum. This could be investigated. When modelling the inverted
pendulum closed loop identification must be used. One of the faults of closed loop
identification is the controller removes some of the dynamics of the process. More research is
needed in developing models from closed loop data.
The neural network controllers developed in the project were all based on the traditional
control law developed. When training an ANN using supervised learning there must be an
existing controller to copy. In order to develop a control law the dynamics of the process must
be known. If it is not possible to develop a control law or the dynamics of the process are not
known then there is no way to train a neural network. A solution to this problem is developing
an unsupervised controller. Unsupervised control does not require an accurate model of the
system dynamics or the system's desired behaviour. The only feedback signal to the controller
is a failure signal when the pendulum falls past a certain angle. The controller must learn
through experience by trying various actions. The work done by Anderson [23] in
unsupervised control gives practical guidelines in developing a controller. The next possible
future research could be on unsupervised control of the inverted pendulum. Supervised control
with neural networks has been done a thousand times now and unsupervised control is a more
difficult but interesting problem.
8 Bibliography
[1] Friedland, Bernard. (1987), Control System design, New York: McGraw-Hill,
pp 30-52
[2] Ljung, L (1987) System Identification-Theory for the user, Prentice Hall
[3] Nechyba and Xu, (1994), “Neural network approach to control system identification
with variable activation functions”, Robotics Institute, Carnegie-Mellon University
[4] Guez, A. and Selinsky, J., "A trainable neuromorphic controller", Journal of Robotic
Systems, Vol 5, No. 4, pp 363-388, 1988.
[5] Davalo, E. and Naïm, P., Neural Networks, Macmillan.
[6] Hunt and Sbarbaro, “Neural Networks for Control System - A Survey”,
Automatica, Vol. 28, 1992, pp. 1083-1112.
[7] Neural Network Toolbox Users Guide, October 1998, The Mathworks Inc.
[8] Pham and Liu, Neural Networks for Identification, Prediction and Control, Springer
[9] Cybenko, G., "Approximation by superpositions of a sigmoidal function", Mathematics
of Control, Signals and Systems, Vol 2, No. 4, pp 303-314, 1989.
[10] Saerens, M. and Soquet, A., "Neural controller based on back-propagation algorithm",
IEE Proceedings-F, Vol. 138, No. 1, pp 55-62, 1991.
[11] Narendra, K.S. and Parthasarathy, K., "Identification and control of dynamical systems
using neural networks", IEEE Transactions on Neural Networks, Vol. 1, No. 1, pp 4-27, 1990.
[12] Ljung, L System Identification-Theory for the user, Prentice Hall
[13] Billings, S.A., “Introduction to nonlinear system analysis and identification”. In K
Godfrey and P Jones, Signal Processing for control, Springer-Verlag, Berlin.
[14] Ljung, L System Identification-Theory for the user, Prentice Hall
[15] Johansson, Rolf – System modelling and identification, Prentice Hall
[16] Snow,W. and Emigholz,K., “Increase model predictive control (MPC) project
efficiency by using a modern identification method.”, ERTC Computing, Paris, France
[17] Hunt and Sbarbaro, “Neural Networks for Control System - A Survey”,
Automatica, Vol. 28, 1992, pp. 1083-1112.
[18] Hagan,M and Demuth,H- Neural Network Design, Boston,PWS, 1996
[19] Marco, P. and Raul, L., "Application of several neurocontrol schemes to a 2 DOF
manipulator".
[20] Magnus Norgaard, Neural Network Design Toolkit,
http://www.iau.dtu.dk/research/control/nnlib/manual.pdf
[21] Hunt and Sbarbaro, “Neural Networks for Control System - A Survey”,
Automatica, Vol. 28, 1992, pp. 1083-1112.
[22] Barto, Sutton and Anderson, “Neuronlike adaptive elements that can solve difficult
learning control problems”, IEEE Trans on Systems, Man and Cybernetics,
Vol SMC-13, pp834-846, Sept-Oct 1983
[23] C.W. Anderson. “Learning to control an inverted pendulum using neural networks”,
IEEE Controls Systems Magazine, 9:31-37, 1989.
[24] Campa, Fravolini, Napolitano- “A library of Adaptive neural networks for control
purposes.” The simulink library can be downloaded from the Mathworks file
exchange website in the ANN section.
http://www.mathworks.com/matlabcentral/fileexchange/
[25] Widrow, B. and Smith,F., “Pattern-Recognising Control Systems,”1963 Computer and
Information Sciences (COINS) Symp. Proc., Washington DC: Spartan, pp288-317,
1964
[26] Widrow, B., “The Original Adaptive Neural Net Broom-Balancer,” Int. Symp. Circuits
and Systems, Vol.5, no.4, pp. 363-388, Aug. 1988.