optimal control and applications - ubgerard/astronet-ii/astronet-ii-becerra.pdf · optimal control...

Optimal Control and Applications

V. M. Becerra - Session 1 - 2nd AstroNet-II Summer School

Optimal Control and ApplicationsSession 1

2nd AstroNet-II Summer School

Victor M. BecerraSchool of Systems Engineering

University of Reading

5-6 June 2013

1 / 85



Contents of Session 1

Introduction to optimal control

A brief on constrained nonlinear optimisation

Introduction to to calculus of variations

The optimal control problem

Optimality conditions

Pontryagin’s minimum principle

Equality path constraints

Singular arcs

Inequality path constraints

2 / 85




Direct collocation methods for optimal control

Local discretisations

Global (pseudospectral) discretisations

Practical aspects

Multi-phase problemsEfficient sparse differentiationScalingEstimating the discretisation accuracyMesh refinement

3 / 85




For the purposes of these lectures, a dynamical system can bedescribed by a system of ordinary differential equations, as follows:

x = f[x(t),u(t), t]

where x : [t0, tf ] :7→ Rnx is the state, u(t) : [t0, tf ] 7→ Rnu is thecontrol, t is time, and f : Rnx ×Rnu ×R 7→ Rnx . Suppose that weare interested in an interval of time such that t ∈ [t0, tf ].

4 / 85




We are also interested in a measure of the performance of thesystem along a trajectory.

We are going to quantify the performance of the system bymeans of a functional J, this is a rule of correspondence thatassigns a value to a function or set of functions.

The functional of interest depends on the state history x(t)and the control history u(t) over the period of interestt ∈ [t0, tf ], the initial time t0 and the final time tf .

J = J[x(·),u(·), t0, tf ]

5 / 85




The performance index J might include, for example, measures of:

control effort

fuel consumption

energy expenditure

time taken to reach a target

6 / 85




What is optimal control?

Optimal control is the process of finding control and state historiesfor a dynamic system over a period of time to minimise ormaximise a performance index.

7 / 85




Optimal control theory has been developed over a number ofcenturies.

It is closely related to the theory of calculus of variationswhich started in the late 1600’s.

Also related with developments in dynamic programming andnonlinear programming after the second world war.

The digital computer has enabled many complex optimalcontrol problems to be solved.

Optimal control is a fundamental tool in space missiondesign, as well as in many other fields.

8 / 85




As a form of motivation, consider thedesign of a zero propellant maneouvrefor the international space station bymeans of control moment gyroscopes(CMGs).

The example work was repored by Bhatt(2007) and Bedrossian and co-workers(2009).

The original 90 and 180 degreemaneouvres were computed using apseudospectral method.

Image courtesy of nasaimages.org.

9 / 85




Implemented on the International Space Station on 5November 2006 and 2 January 2007.

Savings for NASA of around US$1.5m in propellant costs.

The case described below corresponds with a 90 maneovurelasting 7200 seconds and using 3 CMG’s.

10 / 85




The problem is formulated as follows. Findqc(t) = [qc,1(t) qc,2(t) qc,3(t) qc,4]T , t ∈ [t0, tf ] and the scalarparameter γ to minimise,

J = 0.1γ +

∫ tf

t0

||u(t)||2dt

subject to the dynamical equations:

q(t) =1

2T(q)(ω(t)− ωo(q))

ω(t) = J−1 (τd(q)− ω(t)× (Jω(t))− u(t))

h(t) = u(t)− ω(t)× h(t)

where J is a 3× 3 inertia matrix, q = [q1, q2, q3, q4]T is thequaternion vector, ω is the spacecraft angular rate relative to aninertial reference frame and expressed in the body frame, h is themomentum, t0 = 0 s, tf = 7200 s

11 / 85




The path constraints:

||q(t)||22 = 1

||qc(t)||22 = 1

||h(t)||22 ≤ γ||h(t)||22 = h2

max

The parameter bounds

0 ≤ γ ≤ h2max

And the boundary conditions:

q(t0) = q0 ω(t0) = ωo(q0) h(t0) = h0

q(tf ) = qf ω(tf ) = ωo(qf ) h(tf ) = hf

12 / 85




T(q) is given by:

T(q) =

−q2 −q3 −q4

q1 −q4 q3

q4 q1 −q2

−q3 q2 q1

13 / 85




u is the control force, which is given by:

u(t) = J (KP ε(q, qc) + KD ω(ω, qc))

whereε(q,qc) = 2T(qc)Tq

ω(ω, ωc) = ω − ωc

14 / 85




ωo is given by:ωo(q) = nC2(q)

where n is the orbital rotation rate, Cj is the j column of therotation matrix:

C(q) =

1− 2(q23 + q2

4) 2(q2q3 + q1q4) 2(q2q4 − q1q3)2(q2q3 − q1q4) 1− 2(q2

2 + q24) 2(q3q4 + q1q2)

2(q2q4 + q1q3) 2(q3q4 − q1q2) 1− 2(q22 + q2

3)

τd is the disturbance torque, which in this case only incorporatesthe gravity gradient torque τgg , and the aerodynamic torque τa:

τd = τgg + τa = 3n2C3(q)× (JC3(q)) + τa

15 / 85




Image courtesy of nasaimages.org

Video animation of the zero propellant maneouvre using realtelemetry.http://www.youtube.com/watch?v=MIp27Ea9_2I

16 / 85

http://www.youtube.com/watch?v=MIp27Ea9_2I




Numerical results obtained by the author with PSOPT using apseudospectral discretisation method with 60 grid points,neglecting the aerodynamic torque.

-30

-20

-10

0

10

20

0 1000 2000 3000 4000 5000 6000 7000 8000

u (ft

-lbf)

time (s)

Zero Propellant Maneouvre of the ISS: control torques

u1u2u3

3000

4000

5000

6000

7000

8000

9000

10000

0 1000 2000 3000 4000 5000 6000 7000 8000

h (ft

-lbf-s

ec)

time (s)

Zero Propellant Maneouvre of the ISS: momentum norm

hhmax

17 / 85




Ingredients for solving realistic optimal control problems

18 / 85




Indirect methods

Optimality conditions of continuous problems are derivedAttempt to numerically satisfy the optimality conditionsRequires numerical root finding and ODE methods.

Direct methods

Continuous optimal control problem is discretisedNeed to solve a nonlinear programming problem

Later in these lectures, we will concentrate on some directmethods.

19 / 85




Consider a twice continuously differentiable scalar functionf : Rny 7→ R. Then y∗ is called a strong minimiser of f if thereexists δ ∈ R such that

f (y∗) < f (y∗ + ∆y)

for all 0 < ||∆y|| ≤ δ.

Similarly y∗ is called a weak minimiser if it is not strong and thereis δ ∈ R such that f (y∗) ≤ f (y∗ + ∆y) for all 0 < ||∆y|| ≤ δ. Theminimum is said to be local if δ <∞ and global if δ =∞.

20 / 85




The gradient of f (y) with respect to y is a column vectorconstructed as follows:

∇yf (y) =∂f

∂y

T

= f Ty =

∂f∂y1...∂f∂yny

21 / 85




The second derivative of f (y) with respect to y is an ny × nysymmetric matrix known as the Hessian and is expressed asfollows:

∇2yf (y) = fyy =

∂f 2/∂y21 · · · ∂f 2/∂y1∂yny

.... . .

...∂f 2/∂yny∂y1 · · · ∂f 2/∂yny

2

22 / 85




Consider the second order Taylor expansion of f about y∗:

f (y∗ + ∆y) ≈ f (y∗) + fy(y∗)∆y +1

2∆yT fyy(y∗)∆y

A basic result in unconstrained nonlinear programming is thefollowing set of conditions for a local minimum.

Necessary conditions:

fy(y∗) = 0, i.e. y∗ must be a stationary point of f .fyy(y∗) ≥ 0, i.e. the Hessian matrix is positive semi-definite.

Sufficient conditions:

fy(y∗) = 0, i.e. y∗ must be a stationary point of f .fyy(y∗) > 0, i.e. the Hessian matrix is positive definite.

23 / 85




For example, the parabola f (x) = x21 + x2

2 has a global minimum atx∗ = [0, 0]T , where the gradient fx(x∗) = [0, 0] and the Hessian ispositive definite:

fxx(x∗) =

[2 00 2

]

24 / 85




Constrained nonlinear optimisation involves finding a decisionvector y ∈ Rny to minimise:

f (y)

subject tohl ≤ h(y) ≤ hu,

yl ≤ y ≤ yu,

where f : Rny 7→ R is a twice continuously differentiable scalarfunction, and h : Rny 7→ Rnh is a vector function with twicecontinuously differentiable elements hi . Problems such as theabove are often called nonlinear programming (NLP) problems.

25 / 85




A particular case that is simple to analyse occurs when thenonlinear programming problem involves only equality constraints,such that the problem can be expressed as follows. Choose y tominimise

f (y)

subject toc(y) = 0,

where c : Rny 7→ Rnc is a vector function with twice continuouslydifferentiable elements ci , and nc < ny .

26 / 85




Introduce the Lagrangian function:

L[y, λ] = f (y) + λTc(y) (1)

where L : Rny ×Rnc 7→ R is a scalar function and λ ∈ <nc is aLagrange multiplier.

27 / 85




The first derivative of c(y) with respect to y is an nc × ny matrixknown as the Jacobian and is expressed as follows:

∂c(y)

∂y= cy(y) =

∂c1/∂y1 · · · ∂c1/∂yny...

. . ....

∂cnc/∂y1 · · · ∂cnc/∂yny

28 / 85




The first order necessary conditions for a point (y∗, λ) to be a localminimiser are as follows:

Ly[y∗, λ]T = fy(y∗)T + [cy(y∗)]Tλ = 0

Lλ[y∗, λ]T = c(y∗) = 0(2)

Note that equation (2) is a system of ny + nc equations withny + nc unknowns.

29 / 85




For a scalar function f (x) and a single equality constraint h(x), thefigure illustrates the collinearity between the gradient of f and thegradient of h at a stationary point, i.e. ∇x f (x∗) = −λ∇xc(x∗).

30 / 85




For example, considerthe minimisation ofthe function f (x) = x2

1 + x22

subject to thelinear equality constraintc(x) = 2x1 + x2 + 4 = 0.

The first ordernecessary conditions leadto a system of three linearequations whose solutionis the minimiser x∗.

31 / 85




In fairly simple cases, the necessary conditions may help find thesolution. In general, however,

There is no analytic solution

It is necessary to solve the problem numerically

Numerical methods approach the solution iteratively.

32 / 85




If the nonlinear programming problem also includes a set ofinequality constraints, then the resulting problem is as follows.Choose y to minimise

f (y)

subject toc(y) = 0,

q(y) ≤ 0

where q : Rny 7→ Rnq is a twice continuously differentiable vectorfunction.

33 / 85




The Lagrangian function is redefined to include the inequalityconstraints function:

L[y, λ, µ] = f (y) + λTc(y) + µTq(y) (3)

where µ ∈ Rnq is a vector of Karush-Kuhn-Tucker multipliers.

34 / 85



Definition

Active and inactive constraints An inequality constraint qi (y) issaid to be active at y if qi (y) = 0. It is inactive at y if qi (y) < 0

Definition

Regular point Let y∗ satisfy c(y∗) = 0 and q(y∗) ≤ 0, and letIc(y∗) be the index set of active inequality constraints. Then wesay that y∗ is a regular point if the gradient vectors:

∇yci (y∗), ∇yqj(y

∗), 1 ≤ i ≤ nc , j ∈ Ic(y∗)

are linearly independent.

35 / 85




In this case, the first order necessary conditions for a localminimiser are the well known Karush-Kuhn-Tucker (KKT)conditions, which are stated as follows.If y∗ is a regular point and a local minimiser, then:

µ ≥ 0

Ly[y∗, λ, µ] = fy(y∗) + [cy(y∗)]Tλ+ [qy(y∗)]Tµ = 0

Lλ[y∗, λ] = c(y∗) = 0

µTq(y∗) = 0

(4)

36 / 85




Methods for the solution of nonlinear programming problemsare well established.

Well known methods include Sequential QuadraticProgramming (SQP) and Interior Point methods.

A quadratic programming problem is a special type ofnonlinear programming problem where the constraints arelinear and the objective is a quadratic function.

37 / 85




SQP methods involve the solution of a sequence of quadraticprogramming sub-problems, which give, at each iteration, asearch direction which is used to find the next iterate througha line search on the basis of a merit (penalty) function.

Interior point methods involve the iterative enforcement ofthe KKT conditions doing a line search at each major iterationon the basis of a barrier function to find the next iterate.

38 / 85




Current NLP implementations incorporate developments innumerical linear algebra that exploit sparsity in matrices.

It is common to numerically solve NLP problems involvingvariables and constraints in the order of hundreds ofthousands.

Well known large scale NLP solvers methods include SNOPT(SQP) and IPOPT (Interior Point).

39 / 85



A quick introduction to calculus of variations

Definition (Functional)

A functional J is a rule of correspondence that assigns to eachfunction x a unique real number.

For example, Suppose that x is a continuous function of t definedin the interval [t0, tf ] and

J(x) =

∫ tf

t0

x(t) dt

is the real number assigned by the functional J (in this case, J isthe area under the x(t) curve).

40 / 85




The increment of a funcional J is defined as follows:

∆J = J(x + δx)− J(x) (5)

where δx is called the variation of x

41 / 85




The variation of a functional J is defined as follows:

δJ = lim||δx ||→0

∆J = lim||δx ||→0

J(x + δx)− J(x)

Example: Find the variation δJ of the following functional:

J(x) =

∫ 1

0x2(t) dt

Solution:∆J = J(x + δx)− J(x) = . . .

δJ =

∫ 1

0(2xδx) dt

42 / 85




In general, the variation of a differentiable functional of the form:

J =

∫ tf

t0

L(x)dt

can be calculated as follows for fixed t0 and tf :

δJ =

∫ tf

t0

[∂L(x)

∂x

]δx

dt

If t0 and tf are also subject to small variations δt0 and δtf , thenthe variation of J can be calculated as follows:

δJ = L(x(tf ))δtf − L(x(t0))δt0 +

∫ tf

t0

[∂L(x)

∂x

]δx

dt

43 / 85




If x is a vector function of dimension n, then we have, for fixed t0

and tf :

δJ =

∫ tf

t0

[∂L(x)

∂x

]δx

dt =

∫ tf

t0

n∑

i=1

[∂L(x)

∂xi

]δxi

dt

44 / 85




A functional J has a local minimum at x∗ if for all functions x inthe vecinity of x∗ we have:

∆J = J(x)− J(x∗) ≥ 0

For a local maximum, the condition is:

∆J = J(x)− J(x∗) ≤ 0

45 / 85




Important result

If x∗ is a local minimiser or maximiser, then the variation of J iszero:

δJ[x∗, δx ] = 0

46 / 85




A basic problem in calculus of variations is the minimisation ormaximisation of the following integral with fixed t0, tf , x(t0),x(tf ):

J =

∫ tf

t0

L[x(t), x(t), t]dt

for a given function L. To solve this problem, we need to computethe variation of J, for a variation in x . This results in:

δJ =

∫ tf

t0

[∂L∂x− d

dt

(∂L∂x

)]δxdt

47 / 85




Setting δJ = 0 results in the Euler-Lagrange equation:

∂L∂x− d

dt

(∂L∂x

)= 0

This equation needs to be satisfied by the function x∗ whichminimises or maximises J. This type of solution is called anextremal solution.

48 / 85




Example

Find an extremal for the following functional with x(0) = 0,x(π/2) = 1.

J(x) =

∫ π/2

0

[x2(t)− x2(t)

]dt

Solution: the Euler-Lagrange equation gives −2x − 2x = 0 suchthat x + x = 0

The solution of this second order ODE is x(t) = c1 cos t + c2 sin t.Using the boundary conditions gives c0 = 0 and c1 = 1, such thatthe extremal is:

x∗(t) = sin t

49 / 85



A general optimal control problem

Optimal control problem

Determine the state x(t), the control u(t), as well as t0 and tf , tominimise

J = Φ[x(t0), t0, x(tf ), tf ] +

∫ tf

t0

L[x(t),u(t), t]dt

subject toDifferential constraints:

x = f[x,u, t]

Algebraic path constraints

hL ≤ h[x(t), u(t), t] ≤ hUBoundary conditions

ψ[x(t0), t0, x(tf ), tf ] = 0

50 / 85




Important features include:

It is a functional minimisation problem, as J is a functional.

Hence we need to use calculus of variations to derive itsoptimality conditions.

All optimal control problems have constraints.

51 / 85




There are various classes of optimal control problems:

Free and fixed time problems

Free and fixed terminal/initial states

Path constrained and unconstrained problems

. . .

52 / 85




For the sake of simplicity, the first order optimality conditions forthe general optimal control problem will be given ignoring the pathconstraints.

The derivation involves constructing an augmented performanceindex:

Ja = Φ + νTψ +

∫ tf

t0

[L + λT (f − x)

]dt

where ν and λ(t) are Lagrange multipliers for the boundaryconditions and the differential constraints, respectively. λ(t) is alsoknown as the co-state vector. The first variation of Ja, δJa, needsto be calculated. We assume that u(t), x(t), x(t0), x(tf ), t0, tf ,λ(t), λ(t0), λ(tf ), and ν are free to vary.

53 / 85




Define the Hamiltonian function as follows:

H[x(t),u(t), λ(t), t] = L[x(t),u(t), t] + λ(t)T f[x(t),u(t), t]

After some work, the variation δJa can be expressed as follows:

δJa = δΦ + δνTψ + νT δψ + δ

∫ tf

t0

[H − λT x

]dt

where:

δΦ =∂Φ

∂x(t0)δx0 +

∂Φ

∂t0δt0 +

∂Φ

∂x(tf )δxf +

∂Φ

∂tfδtf

δψ =∂ψ

∂x(t0)δx0 +

∂ψ

∂t0δt0 +

∂ψ

∂x(tf )δxf +

∂ψ

∂tfδtf

δx0 = δx(t0) + x(t0)δt0, δxf = δx(tf ) + x(tf )δtf

54 / 85




and

δ

∫ tf

t0

[H − λx]dt = λT (t0)δx0 − λT (tf )δxf

+ H(tf )δtf − H(t0)δt0

+

∫ tf

t0

[(∂H

∂x+ λT

)δx +

(∂H

∂λ+ xT

)δλ+

∂H

∂uδu

]dt

55 / 85




Assuming that initial and terminal times are free, and that thecontrol is unconstrained, setting δJa = 0 results in the followingfirst order necessary optimality conditions.

x =[∂H∂λ

]T= f[x(t),u(t), t] (state dynamics)

λ = −[∂H∂x

]T(costate dynamics)

∂H∂u = 0 (stationarity condition)

56 / 85




In addition we have the following conditions at the boundaries:

ψ[x(t0), t0, x(tf ), tf ] = 0

[λ(t0) +

[∂Φ∂x(t0)

]T+[

∂ψ∂x(t0)

]Tν

]Tδx0 +

[−H(t0) + ∂Φ

∂t0+ νT ∂ψ

∂t0

]δt0 = 0

[−λ(tf ) +

[∂Φ∂x(tf )

]T+[

∂ψ∂x(tf )

]Tν

]Tδxf +

[H(tf ) + ∂Φ

∂tf+ νT ∂ψ

∂tf

]δtf = 0

Note that δx0 depends on δt0 so in the second equation above theterms that multiply the variations of δx0 and δt0 cannot be madezero independently. Something similar to this happens in the thirdequation.

57 / 85




The conditions in boxes are necessary conditions, so solutions(x∗(t), λ∗(t),u∗(t), ν∗) that satisfy them are extremalsolutions.

They are only candidate solutions.

Each extremal solution can be a minimum, a maximum or asaddle.

58 / 85




Note the dynamics of the state - co-state system:

x = ∇λHλ = −∇xH

This is a Hamiltonian system.

Boundary conditions are at the beginning and at the end ofthe interval, so this is a two point boundary value problem.

If H is not an explicit function of time, H = constant.

If the Hamiltonian is not an explicit function of time and tf isfree, then H(t) = 0.

59 / 85



Example: The brachistochrone problem

This problem was solved by Bernoulli in 1696. A mass m moves ina constant force field of magnitude g starting at rest at the origin(x , y) = (x0, y0) at time 0. It is desired to find a path of minimumtime to a specified final point (x1, y1).

60 / 85




The equations of motion are:

x = V cos θ

y = V sin θ

where V (t) =√

2gy . The performance index (we are afterminimum time), can be expressed as follows:

J =

∫ tf

01dt = tf

and the boundary conditions on the states are(x(0), y(0)) = (x0, y0), (x(tf ), y(tf )) = (x1, y1).

61 / 85




In this case, the Hamiltonian function is:

H = 1 + λx(t)V (t) cos θ(t) + λy (t)V (t) sin θ(t)

The costate equations are:

λx = 0

λy = − g

V(λx cos θ + λy sin θ)

and the stationarity condition is:

0 = −λxV sin θ + λyV cos θ =⇒ tan θ =λyλx

62 / 85




We need to satify the boundary conditions:

ψ =

x(0)− x0

y(0)− y0

x(tf )− x1

y(tf )− y1

=

0000

Also since the Hamiltonian is not an explicit function of time andtf is free, we have that H(t) = 0. Using this, together with thestationarity condition we can write:

λx = −cos θ

V

λy = −sin θ

V

63 / 85




After some work with these equations, we can write:

y =y1

cos2 θ(tf )cos2 θ

x = x1 +y1

2 cos2 θ(tf )[2(θ(tf )− θ) + sin 2θ(tf )− sin 2θ]

tf − t =

√2y1

g

θ − θ(tf )

cos θ(tf )

Given the current position (x , y) these three equations can be usedto solve for θ(t) and θ(tf ) and the time to go tf − t.

64 / 85




Using the above results, it is possible to find the states:

y(t) = a(1− cosφ(t))

x(t) = a(φ− sinφ(t))

whereφ = π − 2θ

a =y1

1− cosφ(tf )

The solution describes a cycloid in the x − y plane that passesthrough (x1, y1).

65 / 85




The figures illustrates numerical results with (x(0), y(0)) = (0, 0),(x(tf ), y(tf )) = (2, 2) and g = 9.8.

0

0.5

1

1.5

2

0 0.5 1 1.5 2

y

x

Brachistochrone Problem: x-y plot

y

0.4

0.6

0.8

1

1.2

1.4

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9

cont

rol

time (s)

Brachistochrone Problem: control

u

66 / 85




Assume that the optimal control is u∗, and that the states andco-states are fixed at their optimal trajectories x∗(t), λ∗(t). wehave that for all u close to u∗:

J(u∗) ≤ J(u) =⇒ J(u)− J(u∗) ≥ 0

Then we have that J(u) = J(u∗) + δJ[u∗, δu].

67 / 85




From the variation in the augmented functional δJa, we have inthis case:

δJ[u∗, δu] =

∫ tf

t0

[∂H

∂u

]δudt

68 / 85




But, using a Taylor series we know that:

H[x∗,u∗ + δu, λ∗] = H[x∗,u∗, λ∗] +

[∂H

∂u

]u∗δu + . . .

So that, to first order:

δJ[u∗, δu] =

∫ tf

t0

(H[x∗,u∗ + δu, λ∗]− H[x∗,u∗, λ∗]) dt ≥ 0

69 / 85




This leads to:

H[x∗,u∗ + δu, λ∗]− H[x∗,u∗, λ∗] ≥ 0

for all admissible variations in u. This in turn leads to theminimum principle:

u∗ = arg minu∈U H[x∗,u, λ∗]

where U is the admissible control set.

70 / 85




Assuming that the admissible control set is closed then, if thecontrol lies in the interior of the set, we have that the minimumprinciple implies that:

∂H

∂u= 0

Otherwise, if the optimal control is at the boundary of theadmissible control set, the minimum principle requires that theoptimal control must be chosen to minimise the Hamiltonian:

u∗ = arg minu∈U

H[x∗,u, λ∗]

71 / 85



Example: Minimum time control with constrained input

Consider the following optimal control problem. Chose tf andu(t) ∈ U = [−1, 1], t ∈ [t0, tf ] to minimise:

J = tf =

∫ tf

01dt

subject tox1 = x2

x2 = u

with the boundary conditions:

x1(0) = x10, x1(tf ) = 0x2(0) = x20, x2(tf ) = 0

72 / 85




In this case, the Hamiltonian is:

H = 1 + λ1x2 + λ2u

Calculating ∂H/∂u gives no information on the control, so we needto apply the minimum principle:

u∗ = arg minu∈[−1,1]

H[x∗, u, λ∗] =

+1 if λ2 < 0

undefined if λ2 = 0−1 if λ2 > 0

This form of control is called bang-bang. The control switchesvalue when the co-state λ2 changes sign. The actual form of thesolution depends on the values of the initial states.

73 / 85




The costate equations are:

λ1 = 0 =⇒ λ1(t) = λ10 = constant

λ2 = −λ1 =⇒ λ2(t) = λ10t + λ20

There are four cases:

1 λ2 is always positive.

2 λ2 is always negative.

3 λ2 switches from positive to negative.

4 λ2 switches from negative to positive.

So there will be at most one switch in the solution. The minimumprinciple helps us to understand the nature of the solution.

74 / 85




Let’s assume that we impose equality path constraints of theform:

c[x(t),u(t), t] = 0

This type of constraint may be part of the model of thesystem.

The treatment of path constraints depends on the Jacobianmatrix of the constraint function.

75 / 85




If the matrix ∂c/∂u is full rank, then the system of differentialand algebraic equations formed by the state dynamics and thepath constraints is a DAE of index 1. In this case, theHamiltonian can be replaced by:

H = L + λT f + µTc

which results in new optimality conditions.

76 / 85




If the matrix ∂c/∂u is rank defficient, the index of the DAE ishigher than one.

This may require repeated differentiations of c with respect tou until one of the derivatives of c is of full rank, a procedurecalled rank reduction.

This may be difficult to perform and may be prone tonumerical errors.

77 / 85



Singular arcs

In some optimal control problems, extremal arcs satisfying thestationarity condition ∂H/∂u = 0 occur where the Hessianmatrix Huu is singular.

These are called singular arcs.

Additional tests are required to verify if a singular arc isoptimizing.

78 / 85



Singular arcs

A particular case of practical relevance occurs when theHamiltonian function is linear in at least one of the controlvariables.

In such cases, the control is not determined in terms of thestate and co-state by the stationarity condition ∂H/∂u = 0

Instead, the control is determined by repeatedly differentiating∂H/∂u with respect to time until it is possible to obtain thecontrol by setting the k-th derivative to zero.

79 / 85



Singular arcs

In the case of a single control u, once the control is obtainedby setting the k-th time derivative of ∂H/∂u to zero, thenadditional conditions known as the generalizedLegendre-Clebsch conditions must be checked:

(−1)k∂

∂u

d (2k)

dt(2k)

∂H

∂u

≥ 0, k = 0, 1, 2, . . .

If the above conditions are satisfied, then the singular arc willbe optimal.

The presence of singular arcs may cause difficulties tocomputational optimal control methods to find accuratesolutions if the appropriate conditions are not enforced a priori.

80 / 85




These are constraints of the form:

hL ≤ h[x(t),u(t), t] ≤ hU

with hL,i 6= hU,i .

Inequality constraints may be active (h at one of the limits) orinactive (h within the limits) at any one time.

As a result the time domain t ∈ [t0, tf ] may be partitionedinto constrained and unconstrained arcs.

Different optimality conditions apply during constrained andunconstrained arcs.

81 / 85




Three major complications:

The number of constrained arcs is not necessarily known apriori.

The location of the junction points between constrained andunconstrained arcs is not known a priori.

At the junction points it is possible that both the controls uand the costates λ are discontinuous.

Additional boundary conditions need to be satisfied at thejunction points ( =⇒ multipoint boundary value problem).

82 / 85



Summary

Optimal control theory for continuous time systems is basedon calculus of variations.

Using this approach, we formulate the first order optimalityconditions of the problem.

This leads to a Hamiltonian boundary value problem.

83 / 85



Summary

Even if we use numerical methods, a sound understanding ofcontrol theory is important.

It provides a systematic way of formulating the problem.It gives us ways of analysing the numerical solution.It gives us conditions that we can incorporate into theformulation of the numerical problem.

84 / 85



Further reading

D.E. Kirk (1970). Optimal Control Theory. Dover.

E.K. Chong and S.H. Zak (2008). An Introduction to Optimization.Wiley.

F.L. Lewis and V.L. Syrmos (1995). Optimal Control.Wiley-Interscience.

J.T. Betts (2010). Practical Methods for Optimal Control andEstimation Using Nonlinear Programming. SIAM.

A.E. Bryson and Y.C. Ho (1975). Applied Optimal Control. HalstedPress.

85 / 85



Optimal Control and ApplicationsSession 2

2nd AstroNet-II Summer School

Victor M. BecerraSchool of Systems Engineering

University of Reading

5-6 June 2013

1 / 89




Direct collocation methods for optimal control


Global (pseudospectral) discretisations

Practical aspects

Multi-phase problemsEfficient sparse differentiationScalingEstimating the discretisation accuracyMesh refinement

2 / 89



Direct Collocation methods for optimal control

This session introduces a family of methods for the numericalsolution of optimal control problems known as directcollocation methods.

These methods involve the discretisation of the ODSs of theproblem.

The discretised ODEs becomes a set of algebraic equalityconstraints.

The problem time interval is divided into segments and thedecision variables include the values of controls and statevariables at the grid points.

3 / 89




ODE discretisation can be done, for example, usingtrapezoidal, Hermite-Simpson, or pseudospectralapproximations.

This involves defining a grid of points covering the timeinterval [t0, tf ], t0 = t1 < t2 · · · < tN = tf .

4 / 89




The nonlinear optimization problems that arise from directcollocation methods may be very large.

For instance, Betts (2003) describes a spacecraft low thrusttrajectory design with 211,031 variables and 146,285constraints.

5 / 89




It is interesting that such large NLP problems may be easierto solve than boundary-value problems.

The reason for the relative ease of computation of directcollocation methods is that the associated NLP problem isoften quite sparse, and efficient software exist for its solution.

6 / 89




With direct collocation methods there is no need to explicitlyderive the necessary conditions of the continuous problem.

For a given problem, the region of attraction of the solutionis usually larger with direct methods than with indirectmethods.

Moreover, direct collocation methods do not require an apriori specification of the sequence of constrained arcs inproblems with inequality path constraints.

7 / 89




Then the range of problems that can be solved with directmethods is larger than the range of problems that can besolved via indirect methods.

Direct methods can be used to find a starting guess for anindirect method to improve the accuracy of the solution.

8 / 89




Consider the following optimal control problem:

Problem 1. Find all the free functions and variables (the controlsu(t), t ∈ [t0, tf ], the states x(t), t ∈ [t0, tf ], the parameters vectorp, and the times t0, and tf ), to minimise the followingperformance index,

J = ϕ[x(t0), t0, x(tf ), tf ] +

∫ tf

t0

L[x(t),u(t), t]dt

subject to:x(t) = f[x(t),u(t),p, t], t ∈ [t0, tf ],

9 / 89




Boundary conditions (often called event constraints):

ψL ≤ ψ[x(t0),u(t0), x(tf ),u(tf ),p, t0, tf ] ≤ ψU .

Inequality path constraints:

hL ≤ h[x(t),u(t),p, t] ≤ hU , t ∈ [t0, tf ].

10 / 89




Bound constraints on controls, states and static parameters:

uL ≤ u(t) ≤ uU , t ∈ [t0, tf ],

xL ≤ x(t) ≤ xU , t ∈ [t0, tf ],

pL ≤ p ≤ pU .

The following constraints allow for cases where the initial and/orfinal times are not fixed but are to be chosen as part of thesolution to the problem:

t0 ≤ t0 ≤ t0,

t f ≤ tf ≤ tf ,

tf − t0 ≥ 0.

11 / 89




Consider the continuous domain [t0, tf ] ⊂ R, and break thisdomain using intermediate nodes as follows:

t0 < t1 ≤ · · · < tN = tf

such that the domain has been divided into the N segments[tk , tk+1], j = 0, . . . ,N − 1.

t1 t2 tN = tftN−1t0

x(t0)

u(t0)

x(t1)

u(t1)

x(t2)

u(t2)

x(tN−1)

u(tN−1)

x(tN )

u(tN )

The figure illustrates the idea that the control u and state variablesx are discretized over the resulting grid.

12 / 89




By using direct collocation methods, it is possible to discretiseProblem 1 to obtain a nonlinear programming problem.

Problem 2. Choose y ∈ Rny to minimise:

F (y)

subject to: 0HL

ΨL

≤Z(y)

H(y)Ψ(y)

≤ 0

HU

ΨU

yL ≤ y ≤ yU

13 / 89




where

F is simply a mapping resulting from the evaluation of theperformance index,

Z is the mapping that arises from the discretisation of thedifferential constraints over the gridpoints,

H arises from the evaluation of the path constraints at each ofthe grid points, with associated bounds HL,HU ,

Ψ is simply a mapping resulting from the evaluation of theevent constraints.

14 / 89




The form of Problem 2 is the same regardless of the methodused to discretize the dynamics.

What may change depending on the discretisation methodused is the decision vector y and the differential defectconstraints.

15 / 89




Suppose that the solution of the ODE is approximated by apolynomial x(t) of degree M over each intervalt ∈ [tk , tk+1], k = 0, . . . ,N − 1:

x(t) = a(k)0 + a

(k)1 (t − tk) + · · ·+ a

(k)M (t − tk)M

where the polynomial coefficients

a(k)0 , a

(k)1 , . . . , a

(k)M

are chosen

such that the approximation matches the function at the beginningand at the end of the interval:

x(tk) = x(tk)

x(tk+1) = x(tk+1)

16 / 89




and the time derivative of the approximation matches the timederivative of the function at tk and tk+1:

dx(tk)

dt= f[x(tk),u(tk),p, tk ]

dx(tk+1)

dt= f[x(tk+1),u(tk+1),p, tk+1]

The last two equations are called collocation conditions.

17 / 89




tk

x(tk)

u(tk)

x(tk+1)

tk+1

u(tk+1)

polynomial approximation

true solution

The figure illustrates the collocation concept. The solution of theODE is approximated by a polynomial of degree M over eachinterval, such that the approximation matches at the beginningand the end of the interval.

18 / 89




Many of the local discretisations used in practice for solvingoptimal control problems belong to the class of so-called classicalRunge-Kutta methods. A general K -stage Runge-Kutta methoddiscretises the differential equation as follows:

xk+1 = xk + hk

K∑i=1

bi fki

where

xki = xk + hk

K∑l=1

ail fkl

fki = f[xki ,uki ,p, tki ]

where uki = u(tki ), tki = tk + hkρi , and ρi = (0, 1].

19 / 89



Trapezoidal method

The trapezoidal method is based on a quadratic interpolationpolynomial:

x(t) = a(k)0 + a

(k)1 (t − tk) + a

(k)2 (t − tk)2

Let: fk ≡ f[x(tk),u(tk),p, tk ] and fk+1 ≡ f[x(tk+1),u(tk+1),p, tk+1]

The compressed form of the trapezoidal method is as follows:

ζ(tk) = x(tk+1)− x(tk)− hk2

(fk + fk+1) = 0

where hk = tk+1 − tk , and ζ(tk) = 0 is the differential defectconstraint at node tk associated with the trapezoidal method.Allowing k = 0, . . . ,N − 1, this discretisation generates Nnxdifferential defect constraints.

20 / 89



Trapezoidal method

Note that the trapezoidal method is a 2-stage Runge-Kuttamethod with a11 = a12 = 0, a21 = a22 = 1/2, b1 = b2 = 1/2,ρ1 = 0, ρ2 = 1.

21 / 89



Trapezoidal method

Using direct collocation with the compressed trapezoidaldiscretisation, the decision vector for single phase problems is givenby

y =

vec(U)vec(X)

pt0tf

where vec(U) represents the vectorisation of matrix U, that is anoperator that stacks all columns of the matrix, one below theother, in a single column vector,

22 / 89



Trapezoidal method

The expressions for matrices U and X are as follows:

U = [u(t0) u(t1) . . .u(tN−1)] ∈ Rnu×N

X = [x(t0) x(t1) . . . x(tN)] ∈ Rnx×N+1

23 / 89



Hermite-Simpson method

The Hermite-Simpson method is based on a cubic interpolatingpolynomial:

x(t) = a(k)0 + a

(k)1 (t − tk) + a

(k)2 (t − tk)2 + a

(k)3 (t − tk)3

Let tk = (tk + tk+1)/2, and u(tk) = (u(tk) + u(tk+1))/2.

24 / 89




The compressed Hermite-Simpson discretisation is as follows:

x(tk) =1

2[x(tk) + x(tk+1)] +

hk8

[fk − fk+1]

fk = f[x(tk), u(tk),p, tk ]

ζ(tk) = x(tk+1)− x(tk)− hk6

[fk + 4fk + fk+1

]= 0

where ζ(tk) = 0 is the differential defect constraint at node tkassociated with the Hermite-Simpson method.

25 / 89




Allowing k = 0, . . . ,N − 1, the compressed Hermite-Simpsondiscretisation generates Nnx differential defect constraints.Using direct collocation with the compressed Hermite-Simpsonmethod, the decision vector for single phase problems is given by

y =

vec(U)vec(U)vec(X)

pt0tf

where

U = [u(t0) u(t1) . . .u(tN−1)] ∈ Rnu×N−1

26 / 89




Note that the Hermite-Simpson method is a 3-stage Runge-Kuttamethod witha11 = a12 = a13 = 0, a21 = 5/24, a22 = 1/3, a23 = −1/24,a31 = 1/6, a32 = 2/3, a33 = 1/6, b1 = 1/6, b2 = 2/3, b3 = 1/6,ρ1 = 0, ρ2 = 1/2, ρ3 = 1.

27 / 89



Local convergence

The error in variable zk ≡ z(tk) with respect to a solutionz∗(t) is of order n if there exist h, c > 0 such that

maxk||zk − z∗(tk)|| ≤ chn, ∀h < h

It is well known that the convergence of the trapezoidalmethod for integrating differential equations is of order 2,while the convergence of the Hermite-Simpson method forintegrating differential equations is of order 4.

Under certain conditions, these properties more or lesstranslate to the optimal control case, but detailed analysis isbeyond the scope of these lectures.

28 / 89



Global discretisations: Pseudospectral methods

Pseudospectral methods (PS) were originally developed forthe solution of partial differential equations that arise in fluidmechanics.

However, PS techniques have also been used as computationalmethods for solving optimal control problems.

While Runge-Kutta methods approximate the derivatives of afunction using local information, PS methods are, in contrast,global.

PS methods are the combination of using global polynomialswith orthogonally collocated points.

29 / 89




A function is approximated as a weighted sum of smooth basisfunctions, which are often chosen to be Legendre orChebyshev polynomials.

One of the main appeals of PS methods is their exponential(or spectral) rate of convergence, which is faster than anypolynomial rate.

30 / 89




Another advantage is that with relatively coarse grids it ispossible to achieve good accuracy.

Approximation theory shows that PS methods are well suitedfor approximating smooth functions, integrations, anddifferentiations, all of which are relevant to optimal controlproblems

31 / 89




PS methods have some disadvantages:

The resulting NLP is not as sparse as NLP problems derivedfrom Runge-Kutta discretizations.

This results in less efficient (slower) NLP solutions andpossibly difficulties in finding a solution.

They may not work very well if the solutions of the optimalcontrol problem is not sufficiently smooth, or if theirrepresentation require many grid points.

32 / 89



Approximating a continuous function using Legendrepolynomials

Let LN(τ) denote the Legendre polynomial of order N, which maybe generated from:

LN(τ) =1

2NN!

dN

dτN(τ2 − 1)N

Examples of Legendre polynomials are:

L0(τ) = 1

L1(τ) = τ

L2(τ) =1

2(3τ2 − 1)

L3(τ) =1

2(5τ3 − 3τ)

33 / 89




The figure illustrates the Legendre polynomials LN(τ) forN = 0, . . . , 3.

−1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1

−1

−0.8

−0.6

−0.4

−0.2

0

0.2

0.4

0.6

0.8

1

τ

L N(τ

)

34 / 89




Let τk , k = 0, . . . ,N be the Lagrange-Gauss-Lobatto (LGL) nodes,which are defined as τ0 = −1, τN = 1, and τk , being the roots ofLN(τ) in the interval [−1, 1] for k = 1, 2, . . . ,N − 1.

There are no explicit formulas to compute the roots of LN(τ), butthey can be computed using known numerical algorithms.

35 / 89




For example, for N = 20, the LGL nodes τk , k = 0, . . . , 12 shownin the figure.

−1 −0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6 0.8 1−1

0

1

36 / 89




Given any real-valued function f (τ) : [−1, 1]→ <, it can beapproximated by the Legendre pseudospectral method:

f (τ) ≈ f (τ) =N∑

k=0

f (τk)φk(τ) (1)

where the Lagrange interpolating polynomial φk(τ) is defined by:

φk(τ) =1

N(N + 1)LN(τk)

(τ2 − 1)LN(τ)

τ − τk(2)

It should be noted that φk(τj) = 1 if k = j and φk(τj)=0, if k 6= j ,so that:

f (τk) = f (τk) (3)

37 / 89




Regarding the accuracy and error estimates of the Legendrepseudospectral approximation, it is well known that forsmooth functions f (τ), the rate of convergence of f (τ) tof (τ) is faster than any power of 1/N.

The convergence of the pseudospectral approximations hasbeen analysed by Canuto et al (2006).

38 / 89




The figure shows the degree N interpolation of the functionf (τ) = 1/(1 + τ + 15τ2) in (N + 1) equispaced and LGL points forN = 20. With increasing N, the errors increase exponentially in theequispaced case (this is known as the Runge phenomenon) whereasin the LGL case they decrease exponentially.

−1 −0.5 0 0.5 1−1

−0.5

0

0.5

1

1.5equispaced points

max error = 18.8264

−1 −0.5 0 0.5 1−1

−0.5

0

0.5

1

1.5LGL points

max error = 0.0089648

39 / 89



Differentiation with Legendre polynomials

The derivatives of f (τ) in terms of f (τ) at the LGL points τk canbe obtained by differentiating Eqn. (1). The result can beexpressed as follows:

f (τk) ≈ ˙f (τk) =N∑i=0

Dki f (τi )

where

Dki =

LN(τk )LN(τi )

1τk−τi if k 6= i

−N(N + 1)/4 if k = i = 0N(N + 1)/4 if k = i = N

0 otherwise

(4)

which is known as the differentiation matrix.

40 / 89




For example, this is the Legendre differentiation matrix for N = 5.

D =

7.5000 −10.1414 4.0362 −2.2447 1.3499 −0.50001.7864 0 −2.5234 1.1528 −0.6535 0.2378−0.4850 1.7213 0 −1.7530 0.7864 −0.26970.2697 −0.7864 1.7530 0 −1.7213 0.4850−0.2378 0.6535 −1.1528 2.5234 0 −1.78640.5000 −1.3499 2.2447 −4.0362 10.1414 −7.5000

41 / 89




Legendre differentiation of f (t) = et sin(5t) for N = 10 andN = 20. Note the vertical scales.

−1 −0.5 0 0.5 1−4

−2

0

2f(t), N=10

−1 −0.5 0 0.5 1−0.04

−0.02

0

0.02 error in f’(t), N=10

−1 −0.5 0 0.5 1−4

−2

0

2f(t), N=20

−1 −0.5 0 0.5 1−1

0

1

2x 10−9 error in f’(t), N=20

42 / 89



Numerical integration with Legendre polynomials

Note that if h(τ) is a polynomial of degree ≤ 2N − 1, its integralover τ ∈ [−1, 1] can be exactly computed as follows:∫ 1

−1h(τ)dτ =

N∑k=0

h(τk)wk (5)

where τk , k = 0, . . . ,N are the LGL nodes and the weights wk aregiven by:

wk =2

N(N + 1)

1

[LN(τk)]2, k = 0, . . . ,N. (6)

If L(τ) is a general smooth function, then for a suitable N, itsintegral over τ ∈ [−1, 1] can be approximated as follows:∫ 1

−1L(τ)dτ ≈

N∑k=0

L(τk)wk (7)

43 / 89



Discretising an optimal control problem

Note that PS methods require introducing the transformation:

τ ← 2

tf − t0t − tf + t0

tf − t0,

so that it is possible to write Problem 1 using a new independentvariable τ in the interval [−1, 1]

44 / 89




After applying the above transformation, Problem 1 can beexpressed as follows.

Problem 3. Find all the free functions and variables (the controlsu(τ), τ ∈ [−1, 1], the states x(τ), t ∈ [−1, 1], the parametersvector p, and the times t0, and tf ), to minimise the followingperformance index,

J = ϕ[x(−1), t0, x(1), tf ] +

∫ 1

−1L[x(τ),u(τ), t(τ)]dτ

subject to:

x(τ) =tf − t0

2f[x(τ),u(τ),p, t(τ)], τ ∈ [−1, 1],

45 / 89




Boundary conditions:

ψL ≤ ψ[x(−1),u(−1), x(1),u(1),p, t0, tf ] ≤ ψU .

Inequality path constraints:

hL ≤ h[x(τ),u(τ),p, t(τ)] ≤ hU , τ ∈ [−1, 1].

46 / 89




Bound constraints on controls, states and static parameters:

uL ≤ u(τ) ≤ uU , τ ∈ [−1, 1],

xL ≤ x(τ) ≤ xU , τ ∈ [−1, 1],

pL ≤ p ≤ pU .

And the following constraints:

t0 ≤ t0 ≤ t0,

t f ≤ tf ≤ tf ,

tf − t0 ≥ 0.

47 / 89




In the Legendre pseudospectral approximation of Problem 3, thestate function x(τ), τ ∈ [−1, 1] is approximated by the N-orderLagrange polynomial x(τ) based on interpolation at theLegendre-Gauss-Lobatto (LGL) quadrature nodes, so that:

x(τ) ≈ x(τ) =N∑

k=0

x(τk)φk(τ) (8)

48 / 89




Moreover, the control u(τ), τ ∈ [−1, 1] is similarly approximatedusing an interpolating polynomial:

u(τ) ≈ u(τ) =N∑

k=0

u(τk)φk(τ) (9)

Note that, from (3), x(τk) = x(τk) and u(τk) = u(τk).

49 / 89




The derivative of the state vector is approximated as follows:

x(τk) ≈ ˙x(τk) =N∑i=0

Dki x(τi ), i = 0, . . .N (10)

where Dki is an element of D, which is the (N + 1)× (N + 1) thedifferentiation matrix given by (4).

50 / 89




Define the following nu × (N + 1) matrix to store the trajectoriesof the controls at the LGL nodes:

U =[u(τ0) u(τ1) . . . u(τN)

](11)

51 / 89




Define the following nx × (N + 1) matrices to store, respectively,the trajectories of the states and their approximate derivatives atthe LGL nodes:

X =[x(τ0) x(τ1) . . . x(τN)

](12)

and

X =[

˙x(τ0) ˙x(τ1) . . . ˙x(τN)]

(13)

52 / 89




Now, form the following nx × (N + 1) matrix with the right handside of the differential constraints evaluated at the LGL nodes:

F =tf − t0

2

[f[x(τ0), u(τ0),p, τ0] . . . f[x(τN), u(τN),p, τN ]

](14)

Define the differential defects at the collocation points as thenx(N + 1) vector:

ζ = vec(

X− F)

= vec(

XDT − F)

53 / 89




The decision vector y , which has dimensionny = (nu(N + 1) + nx(N + 1) + np + 2), is constructed as follows:

y =

vec(U)vec(X)

pt0tf

(15)

54 / 89



Multi-phase problems

Some optimal control problems can be convenientlyformulated as having a number of phases.

Phases may be inherent to the problem (for example, aspacecraft drops a section and enters a new phase).

Phases may also be introduced by the analyst to allow forpeculiarities in the solution of the problem.

Many real world problems need a multi-phase approach fortheir solution.

55 / 89




The figure illustrates the notion of multi-phase problems.

Phase 1

Phase 2

Phase 3

Phase 4

t(1)0 t

(1)f t

(2)f

t(2)0 t

(3)0

t(4)0

t(3)f

t(4)f

t

56 / 89




Typically, the formulation for multi-phase problems involve thefollowing elements:

1 a performance index which is the sum of individualperformance indices for each phase;

2 differential equations and path constraints for each phase.

3 phase linkage constraints which relate the variables at theboundaries between the phases;

4 event constraints which the initial and final states of eachphase need to satisfy.

57 / 89



Scaling

The numerical methods employed by NLP solvers are sensitiveto the scaling of variables and constraints.

Scaling may change the convergence rate of an algorithm, thetermination tests, and the numerical conditioning of systemsof equations that need to be solved.

Modern optimal control codes often perform automaticscaling of variables and constraints, while still allowing theuser to manually specify some or all of the scaling factors.

58 / 89



Scaling

A scaled variable or function is equal to the original valuetimes an scaling factor.

Scaling factors for controls, states, static parameters, andtime, are typically computed on the basis of known bounds forthese variables, e.g. for state xj(t)

sx ,j =1

max(|xL,j |, |xU,j |)For finite bounds, the variables are scaled so that the scaledvalues are within a given range, for example [−1, 1].

The scaling factor of each differential defect constraint isoften taken to be equal to the scaling factor of thecorresponding state:

sζ,j ,k = sx ,j

59 / 89



Scaling

Scaling factors for all other constraints are often computed asfollows. The scaling factor for the i-th constraint of the NLPproblem is the reciprocal of the 2-norm of the i-th row of theJacobian of the constraints evaluated at the initial guess.

sH,i = 1/||row(Hy, i)||2This is done so that the rows of the Jacobian of theconstraints have a 2-norm equal to one.

The scaling factor for the objective function can be computedas the reciprocal of the 2-norm of the gradient of the objectivefunction evaluated at the initial guess.

sF = 1/||Fy||2

60 / 89



Efficient sparse differentiation

Modern optimal control software uses sparse finitedifferences or sparse automatic differentiation libraries tocompute the derivatives needed by the NLP solver.

Sparse finite differences exploit the structure of the Jacobianand Hessian matrices associated with an optimal controlproblem

The individual elements of these matrices can be numericallycomputed very efficiently by perturbing groups of variables,as opposed to perturbing single variables.

61 / 89




Automatic differentiation is based on the application of thechain rule to obtain derivatives of a function given as anumerical computer program.

Automatic differentiation exploits the fact that computerprograms execute a sequence of elementary arithmeticoperations or elementary functions.

By applying the chain rule of differentiation repeatedly tothese operations, derivatives of arbitrary order can becomputed automatically and very accurately.

Sparsity can also be exploited with automatic differentiationfor increased efficiency.

62 / 89




It is advisable to use, whenever possible, sparse automaticdifferentiation to compute the required derivatives, as they aremore accurate and faster to compute than numericalderivatives.

When computing numerical derivatives, the structure may beexploited further by separating constant and variable parts ofthe derivative matrices.

The achievable sparsity is dependent on the discretisationmethod employed and on the problem itself.

63 / 89



Measures of accuracy of the discretisation

It is important for optimal control codes implementing directcollocation methods to estimate the discretisation error.

This information can be reported back to the user andemployed as the basis of a mesh refinement scheme.

64 / 89




Define the error in the differential equation as a function of time:

ε(t) = ˙x(t)− f[x(t), u(t),p, t]

where x is an interpolated value of the state vector given the gridpoint values of the state vector, ˙x is an estimate of the derivativeof the state vector given the state vector interpolant, and u is aninterpolated value of the control vector given the grid points valuesof the control vector.

65 / 89




The type of interpolation used depends on the collocation methodemployed. The absolute local error corresponding to state i on aparticular interval t ∈ [tk , tk+1], is defined as follows:

ηi ,k =

∫ tk+1

tk

|εi (t)|dt

where the integral is computed using an accurate quadraturemethod.

66 / 89




The local relative ODE error is defined as:

εk = maxi

ηi ,kwi + 1

where

wi =N

maxk=1

[|xi ,k |, | ˙xi ,k |

]The error sequence εk , or a global measure of the size of the error(such as the maximum of the sequence εk), can be analysed by theuser to assess the quality of the discretisation. This informationmay also be useful to aid the mesh refinement process.

67 / 89



Mesh refinement

The solution of optimal control problems may involve periodsof time where the state and control variables are changingrapidly.

It makes sense to use shorter discretisation intervals in suchperiods, while longer intervals are often sufficient when thevariables are not changing much.

This is the main idea behind automatic mesh refinementmethods.

68 / 89



Mesh refinement

Mesh refinement methods detect when a solution needsshorter intervals in regions of the domain by monitoring theaccuracy of the discretisation.

The solution grid is then refined aiming to improve thediscretisation accuracy in regions of the domain that require it.

The NLP problem is then solved again over the new mesh,using the previous solution as an initial guess (afterinterpolation), and the result is re-evaluated.

69 / 89



Mesh refinement

Typically several mesh refinement iterations are performeduntil a given overall accuracy requirement is satisfied.

With local discretisations, the mesh is typically refined byadding a limited number of nodes to intervals that requiremore accuracy, so effectively sub-dividing existing intervals. Awell known local method was proposed by Betts.

For pseudospectral methods, techniques have been proposedfor segmenting the problem (adding phases) and automaticallycalculating the number of nodes in each segment.

70 / 89



Mesh refinement

The figure illustrates the mesh refinement process.

71 / 89



Example: vehicle launch problem

This problem consists of thelaunch of a space vehicle(Benson, 2004).

The flight of the vehicle canbe divided into four phases,with dry masses ejectedfrom the vehicle at the endof phases 1, 2 and 3.

The final times of phases 1,2 and 3 are fixed, while thefinal time of phase 4 is free.

It is desired to find thecontrol u(t) and the finaltime to maximise the massat the end of phase 4.

Image courtesy of nasaimages.org.72 / 89




The dynamics are given by:

r = v

v = − µ

‖r‖3r +

T

mu +

D

m

m = − T

g0Isp

where r(t) =[x(t) y(t) z(t)

]Tis the position,

v =[vx(t) vy (t) vz(t)

]Tis the Cartesian ECI velocity, T is

the vacuum thrust, m is the mass, g0 is the acceleration of gravityat sea level, Isp is the specific impulse of the engine,

u =[ux uy uz

]Tis the thrust direction, and

D =[Dx Dy Dz

]Tis the drag force.

73 / 89




The vehicle starts on the ground at rest relative to the Earth attime t0, so that the initial conditions are

r(t0) = r0 =[

5605.2 0 3043.4]T

km

v(t0) = v0 =[

0 0.4076 0]T

km/s

m(t0) = m0 = 301454 kg

74 / 89




The terminal constraints define the target transfer orbit, which isdefined in orbital elements as

af = 24361.14 km,ef = 0.7308,if = 28.5 deg,

Ωf = 269.8 deg,ωf = 130.5 deg

75 / 89




The following linkage constraints force the position and velocity tobe continuous and also model the discontinuity in the mass due tothe ejections at the end of phases 1, 2 and 3:

r(p)(tf )− r(p+1)(t0) = 0,v(p)(tf )− v(p+1)(t0) = 0, (p = 1, . . . , 3)

m(p)(tf )−m(p)dry −m(p+1)(t0) = 0

where the superscript (p) represents the phase number.There is also a path constraint associated with this problem:

||u||2 = u2x + u2y + u2z = 1

76 / 89




This problem has been solved using the author’s open sourceoptimal control software PSOPT (see http://www.psopt.org)

Parameter values can be found in the code.

The problem was solved in eight mesh refinement iterationsuntil the maximum relative ODE error εmax was below1× 10−5, starting from a coarse uniform grid and resulting ina finer non-uniform grid.

77 / 89




6400

6450

6500

6550

0 100 200 300 400 500 600 700 800 900 1000

altit

ude

(km

)

time (s)

Multiphase vehicle launch

alt

Figure: Altitude for the vehicle launch problem

78 / 89




1000

2000

3000

4000

5000

6000

7000

8000

9000

10000

0 100 200 300 400 500 600 700 800 900 1000

spee

d (m

/s)

time (s)


speed

Figure: Speed for the vehicle launch problem

79 / 89




-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

0 100 200 300 400 500 600 700 800 900 1000

u (d

imen

sion

less

)

time (s)


u1u2u3

Figure: Controls for the vehicle launch problem

80 / 89



Example: Two burn orbit transfer

The goal of this problem is to compute a trajectory for anspacecraft to go from a standard space shuttle park orbit to ageosynchronous final orbit (Betts, 2010).

The engines operate over two short periods during the mission.

It is desired to compute the timing and duration of the burnperiods, as well as the instantaneous direction of the thrustduring these two periods

The objective is to maximise the final weight of the spacecraft.

The mission then involves four phases: coast, burn, coast andburn.

81 / 89




The problem is formulated as follows. Find

u(t) = [θ(t), φ(t)]T , t ∈ [t(1)f , t

(2)f ] and t ∈ [t

(3)f , t

(4)f ], and the

instants t(1)f , t

(2)f , t

(3)f , t

(4)f such that the following objective

function is minimised:

J = −w(tf )

subject to the dynamic constraints for phases 1 and 3:

y = A(y)∆g + b

the following dynamic constraints for phases 2 and 4:

y = A(y)∆ + b

w = −T/Isp

82 / 89




and the following linkages between phases

y(t(1)f ) = y(t

(2)0 )

y(t(2)f ) = y(t

(3)0 )

y(t(3)f ) = y(t

(4)0 )

t(1)f = t

(2)0

t(2)f = t

(3)0

t(3)f = t

(4)0

w(t(2)f ) = w(t

(4)0 )

where y = [p, f , g , h, k, L,w ]T is the vector of modified equinoctialelements, w is the spacecraft weight, Isp is the specific impulse ofthe engine, T is the maximum thrust.

83 / 89




Expressions for A(y) and b are given in (Betts, 2010). thedisturbing acceleration is ∆ = ∆g + ∆T , where ∆g is thegravitational disturbing acceleration due to the oblatness of Earth,and ∆T is the thurst acceleration.

∆T = QrQv

Ta cos θ cosφTa cos θ sinφ

Ta sin θ

where Ta(t) = g0T/w(t), g0 is a constant, θ is the pitch angleand φ is the yaw angle of the thurst.

84 / 89




Matrix Qv is given by:

Qv =

[v

||v||,

v × r

||v × r||,

v

||v||× v × r

||v × r||

]Matrix Qr is given by:

Qr =[ir iθ ih

]=[

r||r||

(r×v)×r||r×v||||r||

(r×v)||r×v||

]The results shown were obtained with PSOPT using localdiscretisations and 5 mesh refinement iterations until the maximumrelative ODE error εmax was below 1× 10−5.

85 / 89




86 / 89




Two burn transfer trajectory

-45000-40000

-35000-30000

-25000-20000

-15000-10000

-5000 0

5000 10000

x (km)-10000

-5000 0

5000 10000

15000 20000y (km)

-6000-4000-2000

0 2000

z (km)

87 / 89




Mesh refinement statistics

88 / 89



Further reading

V.M. Becerra (2010) PSOPT Optimal Control Solver User Manual.http://www.psopt.org.

Benson, D.A.: A Gauss Pseudospectral Transcription for OptimalControl. Ph.D. thesis, MIT, Department of Aeronautics andAstronautics, Cambridge, Mass. (2004)

J.T. Betts (2010). Practical Methods for Optimal Control andEstimation Using Nonlinear Programming. SIAM.

C. Canuto, M.Y. Hussaini, A. Quarteroni, and T.A. Zang. SpectralMethods: Fundamentals in Single Domains. Springer-Verlag, Berlin,2006.

Elnagar, G., Kazemi, M.A., Razzaghi, M.: “The PseudospectralLegendre Method for Discretizing Optimal Control Problems”. IEEETransactions on Automatic Control 40, 17931796 (1995)

89 / 89

optimal control and applications - ubgerard/astronet-ii/astronet-ii-becerra.pdf · optimal control...

Documents