Optimal Nonlinear Neural Network Controllers for Aircraft
Joint University Program Meeting
October 10, 2001
Nilesh V. Kulkarni
Advisors
Prof. Minh Q. Phan
Dartmouth College, Hanover, NH
Prof. Robert F. Stengel
Princeton University, NJ
Presentation Outline:
- Research goals
- Definition of the problem
- Parametric optimization approach
- Modified approach
- Neural Network implementation
- Linear system implementation
- Nonlinear system implementation
- Conclusions
Research Goals
To come up with a control approach that is:
- Optimal, or approaching optimality in the limit
- Applicable to both linear and nonlinear systems
- Data-based (no need for an explicit analytical model of the system)
- Adaptive, to account for slowly time-varying dynamics and operating conditions
Application to aircraft.
General Problem Statement:
For the system dynamics:
$$x(k+1) = f[x(k), u(k), p(k), k]$$
$$y(k) = h[x(k), u(k)]$$
Find a control law:
$$u(k) = u[x(k)]$$
To maximize a performance index (minimize a cost function):
$$J = \frac{1}{2} \sum_{i=1}^{\infty} \left[ x^T(k+i)\, Q\, x(k+i) + u^T(k+i-1)\, R\, u(k+i-1) \right]$$
Approaches:
Dynamic optimization approaches:
- Calculus of variations approach: Euler-Lagrange equations.
- Dynamic programming: Hamilton-Jacobi-Bellman equation. Specialization to adaptive critic designs.
Static optimization approach:
- Parametric optimization. Cost-to-go approach.
Direct Parametric Optimization Approach:
Methodology:
Parameterize the control law with unknown coefficients and close the loop through the system dynamics:
$$x(k) \;\rightarrow\; u(k) = u(x(k), G) \;\rightarrow\; x(k+1) = f(x(k), u(k))$$
Find the unknown coefficients, ‘G’, that minimize the cost-to-go function
$$V(k, G) = \frac{1}{2} \sum_{i=1}^{r} \left[ x^T(k+i)\, Q\, x(k+i) + u^T(k+i-1)\, R\, u(k+i-1) \right]$$
Disadvantages:
- This approach reduces to solving a static optimization problem that is highly nonlinear even for linear systems.
- It easily gets stuck in spurious local minima, even when the task is to find a linear optimal controller for a linear system.
- The chance of finding a workable optimal controller with such an approach in practice is very limited.
Illustrative Example:
For a simple linear time-invariant system,
$$x(k+1) = A x(k) + B u(k)$$
and
$$u(k) = G x(k)$$
we can write,
$$x(k+i) = (A + BG)^i x(k)$$
And so,
$$V(k, G) = \frac{1}{2} \sum_{i=1}^{r} \left[ x^T(k+i)\, Q\, x(k+i) + u^T(k+i-1)\, R\, u(k+i-1) \right]$$
$$= \frac{1}{2}\, x^T(k) \left\{ \sum_{i=1}^{r} \left[ \left((A+BG)^{i}\right)^T Q\, (A+BG)^{i} + \left((A+BG)^{i-1}\right)^T G^T R\, G\, (A+BG)^{i-1} \right] \right\} x(k)$$
As seen, the cost-to-go function expressed with a single parameter ‘G’ is a highly nonlinear function of that parameter, and several test examples showed it to contain several minima.
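As a rough numerical check of the single-gain example, the following Python sketch evaluates V(k, G) in two ways: by simulating the closed loop, and by summing over powers of (A + BG). It then sweeps a scalar gain to show that V is not a quadratic function of G. The scalar system (a = b = q = R = 1, r = 3), the sampled gains, and the function names are illustrative assumptions, not values from the presentation.

```python
import numpy as np

def cost_rollout(A, B, Q, R, G, x, r):
    """V(k, G) by simulating x(k+1) = A x(k) + B u(k) with u(k) = G x(k)."""
    V = 0.0
    for _ in range(r):
        u = G @ x
        x = A @ x + B @ u
        V += 0.5 * (x @ Q @ x + u @ R @ u)
    return V

def cost_closed_form(A, B, Q, R, G, x, r):
    """V(k, G) from the sum over powers of the closed-loop matrix (A + B G)."""
    Acl = A + B @ G
    M = np.zeros((x.size, x.size))
    P = np.eye(x.size)                      # holds (A + B G)^(i-1)
    for _ in range(r):
        Pn = Acl @ P                        # (A + B G)^i
        M += Pn.T @ Q @ Pn + P.T @ G.T @ R @ G @ P
        P = Pn
    return 0.5 * x @ M @ x

# Hypothetical scalar system, chosen only for illustration.
A = np.array([[1.0]])
B = np.array([[1.0]])
Q = np.array([[1.0]])
R = np.array([[1.0]])
x0 = np.array([1.0])

gains = [-1.0, -0.5, 0.0, 0.5]
V = [cost_closed_form(A, B, Q, R, np.array([[g]]), x0, 3) for g in gains]

# For a quadratic function of g, the third finite difference would be zero.
third_difference = V[3] - 3 * V[2] + 3 * V[1] - V[0]
```

For these assumed values the third difference is nonzero, which is consistent with V(k, G) being a polynomial of degree 2r in a scalar gain rather than a quadratic.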
Modified Approach:
Reformulate the control law:
$$u(k) = u_1[x(k), G_1]$$
$$u(k+1) = u_2[x(k), G_2]$$
$$\vdots$$
$$u(k+r-1) = u_r[x(k), G_r]$$
Set up the cost-to-go function in terms of the ‘G’s’:
$$V(k, G_1, \ldots, G_r) = \frac{1}{2} \sum_{i=1}^{r} \left[ x^T(k+i)\, Q\, x(k+i) + u^T(k+i-1)\, R\, u(k+i-1) \right]$$
‘r’ represents the order of approximation of the cost-to-go function.
Modified Approach…
Find the G’s by imposing the stationarity conditions:
$$\frac{\partial V}{\partial G_1} = 0; \quad \frac{\partial V}{\partial G_2} = 0; \quad \ldots; \quad \frac{\partial V}{\partial G_r} = 0$$
and
$$G_2 = f_2(G_1, \ldots, Q, R)$$
$$\vdots$$
$$G_r = f_r(G_{r-1}, \ldots, Q, R)$$
- Solving the second set of equations is not as easy, and it is even less implementable in terms of a control architecture.
- x(k), the present state of the system, appears as a coefficient in the stationarity conditions.
- Solving the stationarity conditions for multiple x(k)’s therefore provides enough equations to solve for the unknown G’s without solving the second set of conditions.
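The idea of using several values of x(k) can be illustrated for a linear system, where the stationarity condition with respect to the stacked controls is linear in both the gains and x(k). The sketch below is a minimal illustration, not the presentation's implementation. It assumes a hypothetical two-state system, assumes each control is written as u(k+i-1) = G_i x(k), and uses an explicit model to form the condition; it then evaluates the condition at a few sampled states and solves for the stacked gains by least squares.

```python
import numpy as np

def prediction_matrices(A, B, r):
    """Sx, Su with [x(k+1); ...; x(k+r)] = Sx x(k) + Su [u(k); ...; u(k+r-1)]."""
    n, m = B.shape
    Sx = np.vstack([np.linalg.matrix_power(A, i) for i in range(1, r + 1)])
    Su = np.zeros((r * n, r * m))
    for i in range(1, r + 1):              # block row for x(k+i)
        for j in range(1, i + 1):          # block column for u(k+j-1)
            Su[(i - 1) * n:i * n, (j - 1) * m:j * m] = (
                np.linalg.matrix_power(A, i - j) @ B)
    return Sx, Su

# Hypothetical system and weights (assumptions for illustration only).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)
R = np.eye(1)
r = 4

Sx, Su = prediction_matrices(A, B, r)
Qbar = np.kron(np.eye(r), Q)
Rbar = np.kron(np.eye(r), R)
H = Su.T @ Qbar @ Su + Rbar                # second derivative of V w.r.t. controls
F = Su.T @ Qbar @ Sx                       # coupling between controls and x(k)

def stationarity_residual(G, x):
    """dV/dU evaluated at U = G x(k); it vanishes at the optimum."""
    return H @ (G @ x) + F @ x

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 6))            # six sampled states x(k), one per column
# Requiring the residual to vanish at every sample gives H G X = -F X.
target = -np.linalg.solve(H, F @ X)
G_stack = np.linalg.lstsq(X.T, target.T, rcond=None)[0].T   # rows hold G_1 ... G_r
```

Because x(k) enters only as a coefficient, as few samples as there are states (two here, if linearly independent) already determine the gains; extra samples simply over-determine the same linear system.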
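Illustrative Example:
For a simple linear time-invariant system,
$$x(k+1) = A x(k) + B u(k)$$
$$u(k) = G_1 x(k)$$
$$u(k+1) = G_2 x(k)$$
$$\vdots$$
$$u(k+r-1) = G_r x(k)$$
we can write,
$$x(k+i) = \left( A^{i} + A^{i-1} B G_1 + \ldots + A B G_{i-1} + B G_i \right) x(k)$$
And so,
$$V(k) = \frac{1}{2}\, x^T(k) \Big[ (A + B G_1)^T Q\, (A + B G_1) + \ldots + (A^{r} + A^{r-1} B G_1 + \ldots + B G_r)^T Q\, (A^{r} + A^{r-1} B G_1 + \ldots + B G_r) + G_1^T R\, G_1 + \ldots + G_r^T R\, G_r \Big] x(k)$$
As seen, the cost-to-go function now expressed with the parameters ‘G’s’ is a quadratic function of those parameters and therefore has a single minimum.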
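Because V(k) is quadratic in the stacked gains, its minimizer follows from one linear solve. The sketch below is an illustrative check under stated assumptions: the two-state system, the weights, and the function names are hypothetical, and an explicit model is used even though the presentation's approach is data-based. It compares the first block G_1 of the minimizer with the first gain from a finite-horizon Riccati recursion of the same length, a standard result used here only as a cross-check.

```python
import numpy as np

def prediction_matrices(A, B, r):
    """Sx, Su with [x(k+1); ...; x(k+r)] = Sx x(k) + Su [u(k); ...; u(k+r-1)]."""
    n, m = B.shape
    Sx = np.vstack([np.linalg.matrix_power(A, i) for i in range(1, r + 1)])
    Su = np.zeros((r * n, r * m))
    for i in range(1, r + 1):
        for j in range(1, i + 1):
            Su[(i - 1) * n:i * n, (j - 1) * m:j * m] = (
                np.linalg.matrix_power(A, i - j) @ B)
    return Sx, Su

def optimal_stacked_gains(A, B, Q, R, r):
    """Stacked [G_1; ...; G_r] minimizing the quadratic V(k) for every x(k)."""
    Sx, Su = prediction_matrices(A, B, r)
    Qbar = np.kron(np.eye(r), Q)
    Rbar = np.kron(np.eye(r), R)
    H = Su.T @ Qbar @ Su + Rbar            # positive definite when R is
    return -np.linalg.solve(H, Su.T @ Qbar @ Sx)

def riccati_first_gain(A, B, Q, R, r):
    """First gain of the finite-horizon LQ problem with horizon r."""
    W = Q.copy()                           # weight on x(k+r)
    for _ in range(r - 1):
        K = np.linalg.solve(R + B.T @ W @ B, B.T @ W @ A)
        W = Q + A.T @ W @ (A - B @ K)
        W = 0.5 * (W + W.T)                # keep W numerically symmetric
    return -np.linalg.solve(R + B.T @ W @ B, B.T @ W @ A)

# Hypothetical system and weights (assumptions for illustration only).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)
R = np.eye(1)

G_stack = optimal_stacked_gains(A, B, Q, R, r=10)
G1_batch = G_stack[:1]                     # first block row is G_1 (one input here)
G1_riccati = riccati_first_gain(A, B, Q, R, r=10)
```

Only G_1 is applied at time k; the remaining blocks G_2, …, G_r describe the planned future controls as functions of the present state.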
Role of Neural Networks
- For a nonlinear system, the controller is typically nonlinear.
- The cost-to-go function is a nonlinear function.
- Being universal function approximators, Neural Networks present themselves as ideal tools for handling nonlinear systems in the proposed cost-to-go design approach.
- Neural networks present a straightforward approach for making the design adaptive even in the case of a nonlinear system.
Formulation of the Control Architecture: NN Cost-to-go function Approximator
- Parameterize the cost-to-go function using a Neural Network (CGA Neural Network).
- Inputs to the CGA Network: x(k), u(k), …, u(k+r-1).
- Use the analytical model, a computer simulation, or the physical model to generate the future states.
- Use the ‘r’ control values and the ‘r’ future states to get the ideal cost-to-go function estimate:
$$V(k) = \frac{1}{2} \sum_{i=1}^{r} \left[ x^T(k+i)\, Q\, x(k+i) + u^T(k+i-1)\, R\, u(k+i-1) \right]$$
- Use this to train the CGA Neural Network.
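The data-generation step can be sketched as follows. This is a minimal illustration under stated assumptions: the one-step model `step` is a hypothetical linear simulation, and a least-squares fit on quadratic features stands in for the CGA Neural Network, which is exact only because the assumed system is linear. Random states and control sequences are rolled through the model, the ideal cost-to-go is accumulated, and the approximator is fitted to those targets.

```python
import numpy as np

def ideal_cost_to_go(step, x, U, Q, R):
    """Roll a black-box one-step model forward and accumulate V(k)."""
    V = 0.0
    for u in U:                            # U has one row per future control
        x = step(x, u)
        V += 0.5 * (x @ Q @ x + u @ R @ u)
    return V

def quadratic_features(z):
    """All products z_i z_j with i <= j (stand-in for a neural network)."""
    i, j = np.triu_indices(z.size)
    return z[i] * z[j]

# Hypothetical simulation model and weights (assumptions for illustration only).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)
R = np.eye(1)
r = 3

def step(x, u):
    return A @ x + B @ u

rng = np.random.default_rng(0)
inputs, targets = [], []
for _ in range(300):
    x = rng.standard_normal(2)             # random x(k)
    U = rng.standard_normal((r, 1))        # random u(k), ..., u(k+r-1)
    inputs.append(quadratic_features(np.concatenate([x, U.ravel()])))
    targets.append(ideal_cost_to_go(step, x, U, Q, R))

weights = np.linalg.lstsq(np.array(inputs), np.array(targets), rcond=None)[0]

def cga_predict(x, U):
    """Approximate V(k) from x(k) and the r control values."""
    return quadratic_features(np.concatenate([x, U.ravel()])) @ weights
```

For a nonlinear system the same data would be used to train a neural network instead of the quadratic-feature fit; the `step` function could equally be a recorded response of the physical system.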
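CGA Neural Network Training
[Figure: Neural Network Cost-to-go Approximator Training. The inputs x(k), u(k), u(k+1), …, u(k+r-1) feed both the actual system (or simulation model), which gives V(k), and the Neural Net Cost-to-go Approximator, which gives Vnn(k). The error Verr between V(k) and Vnn(k) is used to train the approximator.]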
Formulation of the Control Architecture: NN Controller
- Instead of a single controller structure (G), we need ‘r’ controller structures.
- The outputs of the ‘r’ controller structures generate u(k) through u(k+r-1).
- Parameterize the ‘r’ controller structures using an effective Neural Network.
[Figure: x(k) is the input to the Neural Network Controller, whose outputs are u(k), u(k+1), …, u(k+r-1).]
Neural Network Controller Training
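- The gradient of V(k) with respect to the control inputs u(k), …, u(k+r-1) is calculated using back-propagation through the ‘CGA’ Neural Network.
- These gradients can be further back-propagated through the Neural Network controller to get $\partial V(k)/\partial G_1$ through $\partial V(k)/\partial G_r$.
- The Neural Network controller is trained so that
$$\frac{\partial V(k)}{\partial G_i} \to 0, \quad i = 1, \ldots, r$$
[Figure: x(k) enters the Neural Network Controller, which outputs u(k), u(k+1), …, u(k+r-1). These controls, together with x(k), enter the trained Cost-to-go Neural Network, which outputs V(k). A unit signal (1) applied at the V(k) output is back-propagated to give $\partial V(k)/\partial u_i$.]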
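The two-stage back-propagation can be sketched with simple stand-ins for both networks. In the sketch below, which is an assumption-laden illustration rather than the presentation's code, the trained CGA is replaced by a hypothetical quadratic function V = 0.5 [x; U]^T M [x; U] with a randomly generated positive definite M, and the controller is a linear map U = G x. The gradient of V with respect to the controls is passed through the controller by the chain rule and used for plain gradient descent on G.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 2, 3                                # state size, number of stacked controls
L = rng.standard_normal((n + p, n + p))
M = L @ L.T + np.eye(n + p)                # hypothetical positive definite CGA
Mux, Muu = M[n:, :n], M[n:, n:]

def dV_du(x, U):
    """Gradient of V = 0.5 [x; U]^T M [x; U] with respect to the controls U."""
    return Mux @ x + Muu @ U

X = rng.standard_normal((n, 200))          # random training states x(k), as columns
Sxx = X @ X.T / X.shape[1]
# Step size chosen from the largest curvature so the descent cannot diverge.
step = 1.0 / (np.linalg.eigvalsh(Muu).max() * np.linalg.eigvalsh(Sxx).max())

G = np.zeros((p, n))                       # controller parameters, U = G x
for _ in range(5000):
    grad_U = dV_du(X, G @ X)               # back-propagation through the CGA
    grad_G = grad_U @ X.T / X.shape[1]     # continued through the controller
    G -= step * grad_G

G_stationary = -np.linalg.solve(Muu, Mux)  # gain at which dV/dU = 0 for every x
```

With neural networks in place of the quadratic CGA and the linear controller, `dV_du` and the chain-rule step would be supplied by automatic differentiation or hand-written back-propagation; the CGA weights stay fixed throughout.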
Advantages of the formulation
- The modified parametric optimization simplifies the optimization problem.
- CGA Network training and controller Network training are decoupled.
- Implementation is system independent, so the basic architecture remains the same for linear or nonlinear systems.
- Implementation is data-based. No explicit analytical model is needed.
- Parameterization using Neural Networks makes the control architecture adaptive.
- The order of approximation ‘r’ in the definition of V(k) serves as a tuning parameter.
Implementation for Linear Systems:
Motivation:
- Linear systems provide an easy way to see the details of the implementation of the cost-to-go design.
- They provide a means for comparing the results with existing solutions.
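Optimal Control of Aircraft Lateral Dynamics
Discrete-time model (the minus signs of individual matrix entries are not legible in this transcript, so magnitudes are shown):
$$\begin{bmatrix} \beta(k+1) \\ p(k+1) \\ r(k+1) \\ \phi(k+1) \end{bmatrix} = \begin{bmatrix} 0.9811 & 0 & 0.0099 & 0.0036 \\ 0.0848 & 0.9665 & 0.0075 & 0.002 \\ 0.0690 & 0.0035 & 0.9992 & 0.0001 \\ 0.0004 & 0.0098 & 0 & 1 \end{bmatrix} \begin{bmatrix} \beta(k) \\ p(k) \\ r(k) \\ \phi(k) \end{bmatrix} + \begin{bmatrix} 0 & 0.0529 \\ 0.0504 & 0.0637 \\ 0.0024 & 0.0179 \\ 0.0003 & 0.0003 \end{bmatrix} \begin{bmatrix} \delta_r(k) \\ \delta_a(k) \end{bmatrix}$$
Cost-to-go function:
$$V(k) = \frac{1}{2} \sum_{i=k}^{k+r-1} \left( \begin{bmatrix} \beta(i+1) & p(i+1) & r(i+1) & \phi(i+1) \end{bmatrix} Q \begin{bmatrix} \beta(i+1) \\ p(i+1) \\ r(i+1) \\ \phi(i+1) \end{bmatrix} + \begin{bmatrix} \delta_r(i) & \delta_a(i) \end{bmatrix} R \begin{bmatrix} \delta_r(i) \\ \delta_a(i) \end{bmatrix} \right)$$
$$Q = \mathrm{diag}(10,\ 10,\ 50,\ 500), \qquad R = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
Control law:
$$\begin{bmatrix} \delta_r(k) \\ \delta_a(k) \end{bmatrix} = G \begin{bmatrix} \beta(k) \\ p(k) \\ r(k) \\ \phi(k) \end{bmatrix}$$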
Airplane State Variables:
β - Side slip angle
p - Roll rate
r - Yaw rate
φ - Roll angle
Airplane Input Variables:
δr - Rudder Deflection
δa - Aileron Deflection
Phoenix Hobbico Hobbistar 60™ model
Gain obtained using the new data-based approach (minus signs in the tabulated gains are not legible in this transcript, so magnitudes are shown):

Order of approximation (‘r’)   Evaluated cost (‘J’)   Control gain G
25                             10.1598                [3.7907 3.7868 0.7872 16.6319; 6.5535 0.4424 4.5207 5.1355]
35                             10.0935                [4.2247 3.3616 1.1687 18.7413; 6.9927 0.3411 4.9646 4.8351]
50                             10.0925                [4.3002 3.4077 1.1967 19.3265; 7.0389 0.3313 5.0102 4.8365]

Optimal controller gains calculated using LQR optimal solution with perfect knowledge of the system:
$$G = \begin{bmatrix} -4.3039 & 3.4120 & -1.1945 & 19.3901 \\ -7.0398 & -0.3318 & -5.0124 & -4.8498 \end{bmatrix}$$
Evaluated optimal cost = 10.0925
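The reported costs approach the LQR optimum as ‘r’ grows. The same trend can be reproduced in a model-based setting, where minimizing a cost-to-go of order r for a linear system is a finite-horizon LQ problem whose first gain comes from a backward Riccati recursion. The sketch below is only an illustration of that trend: the two-state system and weights are hypothetical (not the aircraft model), and a very long horizon is used as a stand-in for the steady-state LQR gain.

```python
import numpy as np

def first_gain(A, B, Q, R, r):
    """Gain applied at time k when the order of approximation (horizon) is r."""
    W = Q.copy()                           # weight on x(k+r)
    for _ in range(r - 1):
        K = np.linalg.solve(R + B.T @ W @ B, B.T @ W @ A)
        W = Q + A.T @ W @ (A - B @ K)
        W = 0.5 * (W + W.T)                # keep W numerically symmetric
    return -np.linalg.solve(R + B.T @ W @ B, B.T @ W @ A)

# Hypothetical system and weights (assumptions for illustration only).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.005], [0.1]])
Q = np.eye(2)
R = np.eye(1)

G_lqr = first_gain(A, B, Q, R, 2000)       # long horizon as a stand-in for LQR
distance = {r: float(np.linalg.norm(first_gain(A, B, Q, R, r) - G_lqr))
            for r in (5, 25, 50, 200)}
```

In this setting ‘r’ trades computation against closeness to the infinite-horizon solution, which is one way to read its role as a tuning parameter.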
[Figure: Comparison of the state trajectories using the cost-to-go design and the LQR design. Four panels plotted against time (0 to 3 sec): side slip (rad), roll rate (rad/s), yaw rate (rad/s), and roll angle (rad, axis scale ×10^-3). Curves: state trajectories using the cost-to-go design (r = 35), the cost-to-go design (r = 50), and LQR based optimal control.]
[Figure: Comparison of the control trajectories using the cost-to-go design and the LQR design. Two panels plotted against time (0 to 3 sec): aileron input (rad) and rudder input (rad). Curves: control trajectories using the cost-to-go design (r = 35), the cost-to-go design (r = 50), and LQR based optimal control.]
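Nonlinear Control of Aircraft in an approach configuration
[Figure: Aircraft in an approach configuration, showing h, x, the glide-slope coordinates hglide and xglide, and the velocity V.]
Equations of motion in the wind-axes system:
$$\dot{X} = V \cos\gamma$$
$$\dot{h} = V \sin\gamma$$
$$\dot{V} = \frac{T}{m}\cos\alpha - \frac{D}{m} - g \sin\gamma$$
$$\dot{\gamma} = \frac{T}{mV}\sin\alpha + \frac{L}{mV} - \frac{g}{V}\cos\gamma$$
together with an equation for the thrust rate $\dot{T}$, whose terms are not legible in this transcript.

Nominal Flight Conditions
Vnom (ft/s)   γnom (deg)   αnom (deg)   Tnom (lb)
235           -3           3.6          16800

Aircraft Parameters
M (slugs)   Tmax (lb)   Sref (ft2)   CL0    CLα    CD0      e
4660        42000       1560         1.36   5.04   0.0644   0.067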
Implementation Details
- Equations are written with a change of coordinates while maintaining the nonlinearity:
$$\Delta x(t) = x(t) - x_{nom}(t) = \begin{bmatrix} X(t) - X_{nom} & h(t) - h_{nom} & V(t) - V_{nom} & \gamma(t) - \gamma_{nom} & T(t) - T_{nom} \end{bmatrix}^T$$
$$\Delta u(t) = u(t) - u_{nom}$$
- X and h are transformed through a coordinate transformation,
$$X_{approach} = X \cos\gamma_{nom} + h \sin\gamma_{nom}$$
$$h_{approach} = -X \sin\gamma_{nom} + h \cos\gamma_{nom}$$
so that they now represent perturbations along and perpendicular to the approach slope, and the dynamics of the perturbation along the approach slope can be ignored.
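The coordinate transformation can be sketched as a plane rotation by the nominal flight path angle. The sign convention below (positive h_approach on the upper side of the slope) and the function name are assumptions, since the signs are not recoverable from the slide text; the nominal angle of -3 degrees is taken from the nominal flight conditions.

```python
import numpy as np

def to_approach_frame(X, h, gamma_nom):
    """Rotate (X, h) into coordinates along and perpendicular to the approach slope.

    Assumed convention: the along-slope axis points in the direction
    (cos(gamma_nom), sin(gamma_nom)); the perpendicular axis is 90 degrees
    counter-clockwise from it.
    """
    c, s = np.cos(gamma_nom), np.sin(gamma_nom)
    X_approach = X * c + h * s
    h_approach = -X * s + h * c
    return X_approach, h_approach

gamma_nom = np.radians(-3.0)               # nominal flight path angle from the table
# A point 1000 ft along the nominal slope has no perpendicular offset.
X_on, h_on = 1000.0 * np.cos(gamma_nom), 1000.0 * np.sin(gamma_nom)
X_app, h_app = to_approach_frame(X_on, h_on, gamma_nom)
```

Under this convention only h_approach needs to be regulated to zero to stay on the slope, which is why the along-slope coordinate can be dropped from the state.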
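Implementation Details…
- Equations of motion in terms of the nonlinear perturbation dynamics:
$$\dot{h}_{approach} = (V_{nom} + \Delta V) \sin(\Delta\gamma)$$
$$\Delta\dot{V} = \frac{T_{nom} + \Delta T}{m} \cos(\alpha_{nom} + \Delta\alpha) - \frac{D}{m} - g \sin(\gamma_{nom} + \Delta\gamma)$$
$$\Delta\dot{\gamma} = \frac{T_{nom} + \Delta T}{m (V_{nom} + \Delta V)} \sin(\alpha_{nom} + \Delta\alpha) + \frac{L}{m (V_{nom} + \Delta V)} - \frac{g}{V_{nom} + \Delta V} \cos(\gamma_{nom} + \Delta\gamma)$$
together with the corresponding equation for the thrust perturbation rate, whose terms are not legible in this transcript.
- Equations are discretized with a time step of 0.5 seconds.
- Specification of the cost function:
$$J = \frac{1}{2} \sum_{i=t+1}^{t_f} \left[ x^T(i)\, Q\, x(i) + u^T(i-1)\, R\, u(i-1) \right]$$
$$x = \begin{bmatrix} h_{approach} & \Delta V & \Delta\gamma & \Delta T \end{bmatrix}^T$$
with u the vector of the two control perturbations, and diagonal weighting matrices Q = diag(·) and R = diag(·) whose numerical entries are not legible in this transcript.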
Control Architecture
[Figure: Combined Neural Network having the CGA Network in front of the Controller Network. Signal path from input to output: X(k), controller hidden layer (sigmoid), controller output layer (linear), controller present and future outputs, CGA hidden layer (sigmoid), CGA output layer (linear), V(k).]
Forming a combined Neural Network
- Fix the weights of the CGA part of the Network.
- Training inputs to the network: random values of x(k).
- Train the Network so that it gives the output value of zero for all the random input x(k).
Bringing Structure to the CGA Network
[Figure: A Control Architecture Proposed to Simplify the Neural Network Training Problem. The Controller Network maps x(k) to u(k), u(k+1), …, u(k+r-1). In the CGA Network, Subnet 1, Subnet 2, …, Subnet r produce x(k+1), x(k+2), …, x(k+r) from x(k) and the controls, and a layer with quadratic neurons combines these states and controls into V(k).]
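Implementation of the quadratic layer
[Figure: The inputs $x_1$ and $x_2$ pass through a layer with square neurons (zero bias) to give $q_1 x_1^2$ and $q_2 x_2^2$, which are summed with unit output weights to give $(q_1 x_1^2 + q_2 x_2^2)$.]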
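A quadratic layer of this kind can be sketched in a few lines. The choice of input weights equal to the square roots of the q values is an assumption made here so that the squared neuron outputs equal q_j x_j^2; the slide itself only shows zero biases, square neurons, and unit output weights.

```python
import numpy as np

def quadratic_layer(x, q):
    """Square neurons with zero bias, summed with unit output weights.

    Assumes the input weight of neuron j is sqrt(q_j), so that its output
    is q_j * x_j**2. The result is non-negative whenever every q_j >= 0.
    """
    x = np.asarray(x, dtype=float)
    q = np.asarray(q, dtype=float)
    pre_activation = np.sqrt(q) * x        # input weights sqrt(q_j), bias 0
    return float(np.sum(pre_activation ** 2))

value = quadratic_layer([2.0, -3.0], [1.0, 0.5])   # 1*4 + 0.5*9
```

Because the output is a sum of squares, the cost-to-go estimate built on such a layer cannot go negative regardless of what the preceding subnets produce.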
Advantages of the new structure:
- Guaranteed positive definiteness.
- Replaces the training of a complex function by the training of several simpler functions.
- A good quality control ability.
- Allows for a hybrid architecture.
[Figure: Implementation of the Hybrid CGA Network of order ‘r = 10’, using trained subnets of order 1 through 5. Starting from x(k), the trained Subnets 1 through 5 are chained to predict the future states up to x(k+10) from the control sequence u(k), …, u(k+9), and a quadratic layer of neurons sums the terms $(x^T Q x + u^T R u)$ over the horizon to give V(k).]
[Figure: Internal Structure of the Neural Network Controller showing the separate Controller Subnets. Each subnet takes x(k) as input; the subnet outputs are u(k), …, u(k+i), …, u(k+r-1).]
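Neural Network Controller training using the trained CGA Network
[Figure: Combination Network with the controller Network before the Critic Network. x(k) enters the Controller Network (G1, G2, …, Gr), which outputs u(k), u(k+1), …, u(k+r-1). These controls and x(k) enter the trained CGA Network, which outputs the estimate V(k)est. A unit signal (1) at the output is back-propagated to give $\partial V / \partial u_i$.]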
Cumulative value of V(k) getting minimized with training of the Neural Network Controller Weights
Aircraft response after an initial perturbation with and without control
Curves: open loop dynamics; response with the Optimal Nonlinear Neural Network Controller; response with the LQR (marked ‘O’)
TX 7000-0.1530-2001 )(
Neural Network Controller Outputs
‘r’    ‘J’
5      410.8852
10     168.2999
15     93.6265
20     92.4814
25     88.5529
A comparison of the cost function, J, as a function of the order of approximation, ‘r’, of V(k)
Nonlinear Optimization- Global or Local
Conclusions:
- New Neural Network Control Architecture for optimal control.
- Applicable to both linear and nonlinear systems.
- Data based.
- Systematic training procedure.
- Confirmation on a Nonlinear Aircraft Model.