AIAA Guidance, Navigation, and Control Conference, San Diego, CA, July 29-31, 1996

AIAA-96-3699

A Method for Automatic Costate Calculation

Hans Seywald* and Renjith R. Kumar†
Analytical Mechanics Associates, Inc.

17 Research Drive, Hampton, VA 23666

Abstract

A method for the automatic calculation of costates using only the results obtained from direct optimization techniques is presented. The approach exploits the relation between the time-varying costates and certain sensitivities of the variational cost function, a relation that also exists between the Lagrange multipliers obtained from a direct optimization approach and the sensitivities of the associated NLP cost function. The complete theory for treating free, control constrained, interior-point constrained, and state constrained optimal control problems is presented in this paper. As a numerical example, a state constrained version of the Brachistochrone problem is solved and the results are compared to the optimal solution obtained from Pontryagin's Minimum Principle. The agreement is found to be excellent.

Introduction

The methods of solution for optimal control problems are divided into two major classes, namely direct and indirect methods. Indirect methods are based on Pontryagin's Minimum Principle [1-5] and require the numerical solution of multi-point boundary value problems (MPBVPs) [6-8]. The advantages of these methods lie in their fast convergence in the neighborhood of the optimal solution, even if the cost gradients are very shallow. Furthermore, the optimal solutions are obtained with extremely high precision and subtle properties of the optimal solution can be clearly identified. On the other hand, indirect optimization techniques usually require excellent initial guesses before convergence can be achieved at all. This requirement is especially restrictive as these methods involve Lagrange multipliers whose physical meaning is non-intuitive and provides little help for generating reasonable initial guesses on an ad hoc basis. Furthermore, the switching structure, that is, the sequence in which different control logics become active along the optimal solution, has to be guessed in advance.

Direct optimization techniques rely on restricting the infinite dimensional space of admissible candidate trajectories to a finite dimensional subspace of the original function space [9]. The optimal solution within this subspace is then determined by directly optimizing

*Supervising Engineer, Member AIAA. †Senior Engineer, Member AIAA.

the cost criterion through nonlinear programming [10]. The convergence radius of such methods is usually much larger than that of indirect methods. Furthermore, initial guesses have to be provided only for physically intuitive quantities such as states and possibly controls. Finally, the switching structure of the optimal solution need not be guessed at all.

In an attempt to get the best of both worlds, the present paper introduces a method to approximately calculate the costates associated with optimal control problems, based only on the results obtained from a direct optimization approach. The starting point of this paper is the well-known relation between the time-varying costates and certain sensitivities of the variational cost function [2, 11]. These sensitivities are represented approximately by the cost sensitivities of the direct optimization approach, for which expressions in terms of the Lagrange multipliers obtained from the nonlinear programming (NLP) approach are derived.

Numerically, the method presented here requires only the calculation of a single near-optimal trajectory, which represents a significant improvement over the method presented in [11]. However, the analytical results of this paper are tailored to a specific discretization scheme, namely collocation with a simple Trapezoidal integration rule. The expressions for the costate estimates would have to be rederived if this discretization scheme were changed. For most schemes this task should be straightforward as long as the near-optimal trajectory is calculated as the solution of a single NLP problem. It should be noted, however, that concatenated or sequential approaches such as the ones introduced in [12] and [13] cannot be treated with an approach of the general nature presented here.

Other methods for automatic costate calculation are discussed, for example, in [14, 15, 16]. In [14] the final costate values associated with the prescribed boundary conditions are integrated backwards along the frozen solution obtained through direct optimization. This approach requires the explicit implementation of the costate dynamics. In [15], a "best match" discretized costate function of time associated with the frozen trajectory obtained through direct optimization is obtained by minimizing a least-squares error comprising the transversality conditions and the discretized costate dynamics. The linearity of the costate dynamics and the transversality conditions makes this method noniterative. In [16] a relation between the discrete NLP multipliers and the time-varying Lagrange multipliers is derived and exploited for automatic costate calculation.

Copyright © 1996 by the American Institute of Aeronautics and Astronautics, Inc. All rights reserved.

The present paper is structured as follows: Section I defines a general optimal control problem and summarizes the associated variational optimality conditions. In Section II, the discretized problem formulation is introduced, and relations for certain cost sensitivities are derived in terms of the Kuhn-Tucker multipliers obtained from the NLP solution. In Section III, the results of Section II are used to generate the desired costate estimates. Section IV gives some remarks and relates the present paper to previously obtained results. In Section V, a state constrained version of the Brachistochrone problem is treated as a numerical example and the results are compared to the optimal solution obtained from Pontryagin's Minimum Principle. The agreement is found to be excellent.

I. Optimal Control Problem

In this section of the paper we introduce a somewhat general optimal control problem, and in Section I.2 we present the associated necessary conditions for optimality. This defines the nomenclature for the various constant and time-varying multipliers used for the remainder of the paper.

Note that we do not consider interior-point constraints in the original problem formulation. However, such constraints arise naturally in our treatment of state inequality constraints.

1. Problem Formulation

Let us consider the following optimal control problem stated in Mayer form:

$$\min_{u \in (PWC[t_0, t_f])^m} \phi(x(t_f), t_f) \qquad (1)$$

subject to the conditions

$$\dot{x} = f(x(t), u(t), t), \qquad (2)$$

$$\psi_0(x(t_0), t_0) = 0, \qquad (3)$$

$$\psi_f(x(t_f), t_f) = 0, \qquad (4)$$

$$g_e(x(t), u(t), t) = 0, \qquad (5)$$

$$g_i(x(t), u(t), t) \le 0, \qquad (6)$$

$$h_e(x(t), t) = 0, \qquad (7)$$

$$h_i(x(t), t) \le 0. \qquad (8)$$

Here, $t \in \mathbb{R}$, $x(t) \in \mathbb{R}^n$, and $u(t) \in \mathbb{R}^m$ are time, state vector, and control vector, respectively. The functions $\phi : \mathbb{R}^{n+1} \to \mathbb{R}$, $f : \mathbb{R}^{n+m+1} \to \mathbb{R}^n$, $\psi_0 : \mathbb{R}^{n+1} \to \mathbb{R}^{k_0}$, $k_0 \le n+1$, $\psi_f : \mathbb{R}^{n+1} \to \mathbb{R}^{k_f}$, $k_f \le n$, $g_e : \mathbb{R}^{n+m+1} \to \mathbb{R}^{k_{g_e}}$, $g_i : \mathbb{R}^{n+m+1} \to \mathbb{R}^{k_{g_i}}$, $h_e : \mathbb{R}^{n+1} \to \mathbb{R}^{k_{h_e}}$, and $h_i : \mathbb{R}^{n+1} \to \mathbb{R}^{k_{h_i}}$ are assumed to be sufficiently smooth with respect to their arguments of whatever order is required in this paper. $(PWC[t_0, t_f])^m$ denotes the set of all piecewise continuous functions defined on the interval $[t_0, t_f]$ into $\mathbb{R}^m$. Conditions (2-8) represent the differential equations of the underlying dynamical system, the initial conditions, the final conditions, the control constraints, and the state constraints, respectively.

For the remainder of this paper we assume that an optimal solution to problem (1-8) does exist and that $\partial g/\partial u$ has full rank for all times $t \in [t_0, t_f]$ along the optimal solution. Here, $g$ represents the vector of control constraints (including those arising from active state constraints) that are active at any given time $t$.

2. Necessary Conditions for Optimality

Let us assume that a solution to problem (1-8) exists, and, for simplicity, let us first consider the case where no state constraints (7-8) are active. Then, under certain normality and regularity conditions (see [1-5]), it can be shown that there are constant multiplier vectors $\nu_0 \in \mathbb{R}^{k_0}$, $\nu_f \in \mathbb{R}^{k_f}$ and a time-varying multiplier vector $\lambda(t) \in \mathbb{R}^n$ which is non-zero for all times $t \in [t_0, t_f]$ such that

$$\dot{\lambda}^T = -\frac{\partial H}{\partial x} = -\lambda^T \frac{\partial f}{\partial x} - \mu^T \frac{\partial g}{\partial x}, \qquad (9)$$

$$\lambda^T(t_0) = \nu_0^T \frac{\partial \psi_0}{\partial x}\bigg|_{t = t_0}, \qquad (10)$$

$$\lambda^T(t_f) = \frac{\partial \Phi}{\partial x}\bigg|_{t = t_f}, \qquad (11)$$

$$H(t_f) = -\frac{\partial \Phi}{\partial t}\bigg|_{t = t_f}, \qquad (12)$$

$$\Phi(x(t_f), t_f) = \phi(x(t_f), t_f) - \nu_f^T \psi_f(x(t_f), t_f), \qquad (13)$$

where

$$H(x, \lambda, u, t) = \lambda^T f(x, u, t) + \mu^T g(x, u, t) \qquad (14)$$

denotes the Hamiltonian, and $g$ denotes the vector of active constraint functions. At each instant of time the optimal control $u^*$ satisfies the Pontryagin Minimum Principle, i.e.,

$$u^* = \arg\min_{u} H(x, \lambda, u, t) \quad \text{subject to} \quad g(x, u, t) = 0. \qquad (15)$$

The (time-varying) multiplier vector $\mu$ has the same dimension as $g$ and is obtained from the KKT-conditions applied to the optimization problem (15). As long as no state constraints (7, 8) are active, the Hamiltonian (14) is continuous throughout the time interval, including at times where control constraints become active or inactive.
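As an illustration of how $\mu$ follows from (15) (a sketch added in this rewrite, for the special case in which the active constraint vector $g$ has as many components as the control and $\partial g/\partial u$ is nonsingular), stationarity of the Hamiltonian (14) with respect to $u$ along the constrained arc gives

$$\frac{\partial H}{\partial u} = \lambda^T \frac{\partial f}{\partial u} + \mu^T \frac{\partial g}{\partial u} = 0 \quad \Longrightarrow \quad \mu^T = -\lambda^T \frac{\partial f}{\partial u} \left(\frac{\partial g}{\partial u}\right)^{-1}.$$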

To discuss the optimality conditions associated with state constrained arcs, let us consider the case of a single, scalar state inequality constraint, $h(x, t) \le 0$, and let us assume that in the optimal solution to problem (1-8) this state inequality constraint becomes active in the following form:

$$h(x(t), t) \begin{cases} < 0 & \text{for } t \in [t_0, t_a), \\ = 0 & \text{for } t \in [t_a, t_b], \\ < 0 & \text{for } t \in (t_b, t_f]. \end{cases} \qquad (16)$$

The generalization of the results below to vector-valued state constraints and to other switching structures is then straightforward.

In the variational approach to state constrained optimal control problems it is most customary to transform the active state constraint, say

$$h(x(t), t) = 0 \quad \text{on } t \in [t_a, t_b], \qquad (17)$$

into an equivalent combination of an interior-point constraint and a control constraint. Explicitly, (17) holds if and only if

$$M(x(t_a), t_a) \equiv \left[h(x(t_a), t_a),\; \frac{dh}{dt}(x(t_a), t_a),\; \ldots,\; \frac{d^{q-1}h}{dt^{q-1}}(x(t_a), t_a)\right]^T = 0, \qquad (18)$$

and

$$\frac{d^q h}{dt^q}(x, u, t) = 0 \quad \text{on } t \in [t_a, t_b]. \qquad (19)$$

Here, $q$ is the smallest integer $i$ for which a control $u$ appears explicitly in $d^i h/dt^i$, and is called the order of the state constraint. Note that, along state constrained arcs, the left-hand side of the control constraint (19) becomes a component of the constraint function $g$ introduced in (9).

At the beginning of the state constrained arc, the interior-point constraint (18) causes a discontinuous jump in the multipliers $\lambda(t)$ and in the Hamiltonian $H$, namely

$$\lambda^T(t_a^+) = \lambda^T(t_a^-) - l^T \frac{\partial M(x(t_a), t_a)}{\partial x}, \qquad (20)$$

$$H(t_a^+) = H(t_a^-) + l^T \frac{\partial M(x(t_a), t_a)}{\partial t_a}. \qquad (21)$$

Here, $l = [l_0, \ldots, l_{q-1}]^T$ is a $q$-dimensional vector of constant multipliers that compensates for the $q$ degrees of freedom lost by enforcing the conditions (18).

3. Relation Between Sensitivities and Lagrange Multipliers

It is well-known from [2] that the Lagrange multipliers $\lambda(t)$, loosely speaking, represent the sensitivity of the optimal cost with respect to perturbations in the state vector $x$ at time $t$. A comprehensive study of these sensitivities is given in [11]. In the following, the main results are briefly summarized.

Let $J(x(t), t)$ denote the value of the optimal cost (1) obtained by solving Problem (1-8) with the initial conditions (3) replaced by the condition that the state at some time $t$ be $x(t)$. Then, along arcs where no state constraints are active, we have

$$\lambda^T(t) = \frac{\partial J(x(t), t)}{\partial x(t)}. \qquad (22)$$
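A minimal one-dimensional illustration of (22) (constructed for this rewrite, not taken from the paper): minimize $x(t_f)$ subject to $\dot{x} = u$, $|u| \le 1$, with $t_0$, $t_f$, and $x(t_0)$ fixed. The optimal control is $u \equiv -1$, so the optimal cost from state $x(t)$ at time $t$ is $J(x(t), t) = x(t) - (t_f - t)$, and

$$\lambda(t) = \frac{\partial J(x(t), t)}{\partial x(t)} = 1,$$

which agrees with the variational conditions $\dot{\lambda} = -\partial H/\partial x = 0$ and $\lambda(t_f) = \partial \phi/\partial x = 1$.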

Now assume that a $q$-th order state inequality constraint $h(x, t) \le 0$ becomes active along the time interval $[t_a, t_b]$. Then, following (16-21), the Lagrange multipliers $\lambda(t)$ experience a discontinuous jump at $t_a$, with the height of the jump determined by the constant multiplier vector $l \in \mathbb{R}^q$. $\lambda(t_a^-)$, the value of $\lambda$ just before the jump, satisfies the sensitivity equation (22) as it belongs to an unconstrained part of the trajectory. The constant multiplier $l$ satisfies

$$l^T = \frac{\partial J(x(t_a), t_a, b)}{\partial b}\bigg|_{b = 0}, \qquad (23)$$

where $b$ denotes the $q$-vector $b = [b_0, \ldots, b_{q-1}]^T$, and $J(x(t_a), t_a, b)$ denotes the value of the optimal cost (1) obtained by solving Problem (1-8) with the initial conditions (3) replaced by the condition that the state at time $t_a$ be $x(t_a)$, and with the state constraint $h(x, t) = 0$ on $[t_a, t_b]$ replaced by

$$h(x, t) - \sum_{k=0}^{q-1} b_k \frac{(t - t_a)^k}{k!} = 0 \quad \text{on } [t_a, t_b]. \qquad (24)$$

For times $t_i \in (t_a, t_b)$ in the interior of the state constrained arc, the right-hand side of (22) defines an artificial Lagrange multiplier $\lambda(t_i^-)$, which would exist as a real multiplier only if $t_i$ were the beginning of the constrained arc. The multiplier $\lambda(t_i)$ is related to $\lambda(t_i^-)$ through the jump condition (20), with $t_a$ and $\lambda(t_a^-)$ replaced by $t_i$ and $\lambda(t_i^-)$, respectively, and with the constant multiplier $l \in \mathbb{R}^q$ defined by (23), also with $t_a$ replaced by $t_i$. The constant multiplier vectors $\nu_0$ and $\nu_f$ associated with the initial and final conditions (3) and (4), respectively, satisfy

$$\nu_0^T = \frac{\partial J(c_0, c_f)}{\partial c_0}\bigg|_{c_0 = 0,\, c_f = 0}, \qquad (25)$$

$$\nu_f^T = \frac{\partial J(c_0, c_f)}{\partial c_f}\bigg|_{c_0 = 0,\, c_f = 0}, \qquad (26)$$

respectively. Here, $J(c_0, c_f)$ denotes the value of the optimal cost of problem (1-8) with the boundary conditions (3-4) replaced by

$$\psi_0(x(t_0), t_0) - c_0 = 0, \qquad \psi_f(x(t_f), t_f) - c_f = 0. \qquad (27)$$

The relations (25, 26) have not been explicitly stated in [11], but can be easily derived with the methods presented there.


II. Direct Approach

Solving an optimal control problem on the basis of the necessary conditions summarized in Section I.2 leads to a multi-point boundary value problem (MPBVP). Numerically, this represents a nonlinear zero finding problem. The main unknowns are the lengths of all time intervals involved, the initial values for the states that are not prescribed explicitly through the initial conditions (3), the initial values of the time-varying costates $\lambda(t)$, and the constant multipliers $l$, $\nu_0$, $\nu_f$ introduced in (9-21). Additionally, before a BVP can be set up, the analyst has to correctly guess the temporal sequence in which different control logics become active along the optimal solution (i.e., the optimal switching structure).

In an obvious way, direct approaches to trajectory optimization provide guesses for the optimal switching structure, including the length of each subarc, and guesses for the state histories. However, it is not clear, a priori, how estimates for the costates $\lambda(t)$ and the constant multipliers $l$, $\nu_0$, $\nu_f$ can be obtained.

It is known (see [2, 11]) that the multipliers in question represent sensitivities of the optimal cost with respect to perturbations in certain state initial values. The equivalent sensitivities of the parameter optimal solution are captured in the Karush-Kuhn-Tucker multipliers associated with the prescribed initial states, if only the initial states are prescribed explicitly.

In the following, we will state the discretized form of the original optimal control problem (1-8). Then we will define two auxiliary problems with explicitly prescribed initial states. The important result is that the solution of the auxiliary problems, including all multipliers involved, can be determined analytically from the solution to the original discretized optimal control problem. Thus, cost sensitivities of the parameter optimal solution can be derived in terms of the Karush-Kuhn-Tucker multipliers associated with the NLP solution to the original discretized optimal control problem.

1. Discretized Optimal Control Problem (Problem A)

One of the most successful discretization methods is the collocation approach in which both states and controls are discretized, and the dynamical and state constraints (2, 5, 6, 7, 8) are enforced only at isolated points. Using a Trapezoidal rule to enforce the equations of motion at a single point in between neighboring nodes, this scheme leads to the following nonlinear programming problem:

$$\min_{x_0, \ldots, x_N,\; u_1, \ldots, u_N,\; t_0,\; t_N} \phi(x_N, t_N) \qquad (28)$$

subject to the conditions

$$x_j - x_{j-1} - (t_j - t_{j-1})\, f(\bar{x}_j, u_j, \bar{t}_j) = 0, \quad j = 1, \ldots, N, \qquad (29)$$

$$\psi_0(x_0, t_0) = 0, \qquad (30)$$

$$\psi_f(x_N, t_N) = 0, \qquad (31)$$

$$g_e(\bar{x}_j, u_j, \bar{t}_j) = 0, \quad g_i(\bar{x}_j, u_j, \bar{t}_j) \le 0, \quad j = 1, \ldots, N, \qquad (32)$$

$$h_e(x_j, t_j) = 0, \quad h_i(x_j, t_j) \le 0, \quad j = 0, \ldots, N, \qquad (33)$$

where

$$t_j = t_0 + \tau_j\,(t_N - t_0), \quad j = 0, \ldots, N, \qquad (34)$$

$$\bar{x}_j = \tfrac{1}{2}(x_{j-1} + x_j), \quad \bar{t}_j = \tfrac{1}{2}(t_{j-1} + t_j), \quad j = 1, \ldots, N. \qquad (35)$$

Above, $N$ is a user-chosen integer, and $0 = \tau_0 < \tau_1 < \cdots < \tau_{N-1} < \tau_N = 1$ is a user-chosen subdivision of the unit time interval $[0, 1]$. Clearly, as $N \to \infty$, it can be expected under mild assumptions that the optimal solution $t_0^*$, $t_N^*$, $x_i^*$, $i = 0, \ldots, N$ of the discretized problem (28-35) converges to the optimal solution $t_0^*$, $t_f^*$, $x^*(t_i)$, $i = 0, \ldots, N$ of the continuous problem (1-8).
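The transcription just described is mechanical enough to sketch in a few lines of code. The following Python fragment (an illustration written for this rewrite, not the authors' implementation; the packing of the decision vector, the function names, and the exact placement of the control at the interval midpoints are assumptions) evaluates the dynamics defects of constraint (29) for a given guess:

import numpy as np

def unpack(z, n, m, N):
    # Decision vector assumed packed as z = [t0, tN, x_0..x_N, u_1..u_N].
    t0, tN = z[0], z[1]
    x = z[2:2 + n * (N + 1)].reshape(N + 1, n)
    u = z[2 + n * (N + 1):].reshape(N, m)
    return t0, tN, x, u

def defects(z, f, tau, n, m, N):
    # Residuals of the discretized equations of motion (29); zero at a feasible point.
    t0, tN, x, u = unpack(z, n, m, N)
    t = t0 + tau * (tN - t0)                      # node times, cf. (34)
    zeta = np.zeros((N, n))
    for j in range(1, N + 1):
        dt = t[j] - t[j - 1]
        xm = 0.5 * (x[j - 1] + x[j])              # averaged state, cf. (35)
        tm = 0.5 * (t[j - 1] + t[j])
        zeta[j - 1] = x[j] - x[j - 1] - dt * f(xm, u[j - 1], tm)
    return zeta.ravel()

# Small usage example with the Brachistochrone dynamics of Section V:
def f_brach(x, u, t, v0=1.0, g=1.0):
    v = np.sqrt(v0 ** 2 + 2.0 * g * x[1])
    return np.array([v * np.cos(u[0]), v * np.sin(u[0])])

N, n, m = 4, 2, 1
tau = np.linspace(0.0, 1.0, N + 1)
z = np.concatenate(([0.0, 1.0], np.zeros(n * (N + 1)), np.full(N * m, 0.5)))
print(defects(z, f_brach, tau, n, m, N))          # defect residuals at this guess

These defects, together with the boundary and path constraints (30-33), would be handed to an NLP solver; the multipliers returned by the solver are the quantities post-processed in Section III.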

For the remainder of the paper, let problem (28-33) be denoted as problem A. Following the general formalism of the KKT-conditions (70, 71), we associate problem A with the Lagrangian function

$$L_A = \phi(x_N, t_N) + \sum_{j=1}^{N} \lambda_j^T \left[x_j - x_{j-1} - (t_j - t_{j-1})\, f(\bar{x}_j, u_j, \bar{t}_j)\right] + \pi_0^T \psi_0(x_0, t_0) + \pi_f^T \psi_f(x_N, t_N) + \sum_{j=1}^{N} \sigma_j^T g(\bar{x}_j, u_j, \bar{t}_j) + \sum_{j=0}^{N} \mu_j^T h(x_j, t_j). \qquad (36)$$

In an obvious way, (36) identifies the cost function, the constraints, and the nomenclature used for the multipliers associated with the various constraints (29-33).

The main result of this paper is a method by which costate estimates for the continuous optimal control problem can be determined from the Lagrange multipliers associated with the solution to the discretized optimal control problem. These costate estimates are obtained through some post-processing of the Lagrange multipliers appearing in (36). The basic underlying assumption is, loosely speaking, that the cost sensitivity of the original optimal control problem with respect to perturbations in the initial states can be well approximated by that of the discretized problem.

To develop the explicit post-processing procedures we need to introduce two auxiliary problems.

2. Auxiliary Problem B

In order to determine cost sensitivities with respect to changes in the initial states we consider now a new problem, denoted by $B_0$, that differs from problem A in that the initial conditions (30) are replaced by conditions that explicitly prescribe the initial time and the initial states, namely

$$t_0 - (t_0^A)^* = 0, \qquad x_0 - (x_0^A)^* = 0. \qquad (37)$$

Here, $(t_0^A)^*$, $(x_0^A)^*$ denote the values of the quantities $t_0$, $x_0$ associated with the optimal solution to problem A, respectively. Furthermore, to avoid redundancy or incompatibility with the conditions (37), we assume that none of the constraints (33) is active at node 0. (The latter assumption will be relaxed in the auxiliary problem C, below.) Following the nomenclature introduced in (36), problem $B_0$ is represented by the Lagrangian function

$$L_{B_0} = \phi(x_N, t_N) + \sum_{j=1}^{N} \lambda_j^T \left[x_j - x_{j-1} - (t_j - t_{j-1})\, f(\bar{x}_j, u_j, \bar{t}_j)\right] + \alpha_0\left(t_0 - (t_0^A)^*\right) + \beta_0^T\left(x_0 - (x_0^A)^*\right) + \pi_f^T \psi_f(x_N, t_N) + \sum_{j=1}^{N} \sigma_j^T g(\bar{x}_j, u_j, \bar{t}_j) + \sum_{j=0}^{N} \mu_j^T h(x_j, t_j). \qquad (38)$$

Obviously, by construction of the initial conditions (37), the optimal values of all independent variables $x_i$, $i = 0, \ldots, N$, $u_i$, $i = 1, \ldots, N$, $t_0$, and $t_N$ are identical for problems A and $B_0$. In order to determine the values of the KKT-multipliers associated with the optimal solution to problem $B_0$, we note that the functional form of the expressions $\partial L_{B_0}/\partial x_i$, $i = 1, \ldots, N$, $\partial L_{B_0}/\partial u_i$, $i = 1, \ldots, N$, and $\partial L_{B_0}/\partial t_N$ is identical to the functional form of the expressions $\partial L_A/\partial x_i$, $i = 1, \ldots, N$, $\partial L_A/\partial u_i$, $i = 1, \ldots, N$, and $\partial L_A/\partial t_N$, respectively. From here, it can be quickly shown that all first-order KKT-conditions $\partial L_{B_0}/\partial x_i = 0$, $i = 1, \ldots, N$, $\partial L_{B_0}/\partial u_i = 0$, $i = 1, \ldots, N$, and $\partial L_{B_0}/\partial t_N = 0$ are satisfied if the expressions on the left-hand side are evaluated at $x_i = (x_i^A)^*$, $i = 0, \ldots, N$, $u_i = (u_i^A)^*$, $i = 1, \ldots, N$, and with the multipliers $\lambda_i$, $\sigma_i$, $\mu_i$, and $\pi_f$ set equal to the corresponding multipliers of problem A. The remaining first-order conditions, $\partial L_{B_0}/\partial t_0 = 0$ and $\partial L_{B_0}/\partial x_0 = 0$, can then be used to determine $\alpha_0$ and $\beta_0$, respectively. For $\beta_0$ this yields explicitly

$$\beta_0^T = \lambda_1^T \left[I + \frac{t_1 - t_0}{2}\, \frac{\partial f}{\partial x}\bigg|_{(\bar{x}_1, u_1, \bar{t}_1)}\right] - \frac{1}{2}\,\sigma_1^T \frac{\partial g}{\partial x}\bigg|_{(\bar{x}_1, u_1, \bar{t}_1)}. \qquad (39)$$

In light of equation (72) it is clear that the negative of the multiplier, $-\beta_0$, represents the sensitivity of the optimal cost (28) associated with problem $B_0$ with respect to perturbations in the initial states $x_0$ prescribed in (37). Hence $-\beta_0^T$ represents an approximation to the right-hand side of (22), evaluated at $t = t_0$. Note also that the right-hand side of (39) involves only quantities that are known once an optimal solution to problem A is obtained.
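For the reader's convenience, the step leading to (39) can be spelled out (a sketch under the defect convention adopted in (29) above, and assuming no control constraint is active on the first interval): the condition $\partial L_{B_0}/\partial x_0 = 0$ then involves only the first dynamics defect and the new multiplier $\beta_0$,

$$\frac{\partial L_{B_0}}{\partial x_0} = \lambda_1^T \frac{\partial}{\partial x_0}\left[x_1 - x_0 - (t_1 - t_0)\, f(\bar{x}_1, u_1, \bar{t}_1)\right] + \beta_0^T = -\lambda_1^T \left[I + \frac{t_1 - t_0}{2}\, \frac{\partial f}{\partial x}\bigg|_{(\bar{x}_1, u_1, \bar{t}_1)}\right] + \beta_0^T = 0,$$

which rearranges to (39).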

3. Auxiliary Problem C

In the previous section, an expression was derived for the sensitivity of the optimal cost (28) with respect to perturbations in the state values prescribed at initial time. In this derivation, an important assumption was that none of the state constraints (33) are active at the starting node, node 0.

For the case where this assumption is violated, we now derive an expression for the cost sensitivity $\beta_0$ that is valid an arbitrarily small $\epsilon$-time step before the initial time $t_0$, as long as no state constraints (33) are active at the node introduced preceding node 0. Let us denote this node by the subscript $0-\epsilon$. We arbitrarily introduce the control constraint

$$u = u_{const} \qquad (40)$$

on the interval from node $0-\epsilon$ to node 0. Here, $u_{const}$ is a constant control vector independent of $\epsilon$ chosen such that i) all state constraints are satisfied at $t = t_0$, $x = x_0$, $u = u_{const}$, and ii) $dh/dt > 0$ at $t = t_0$, $x = x_0$, $u = u_{const}$. The discretized equations of motion are enforced through a backward Euler step on the interval from node 0 to $0-\epsilon$, and the initial conditions enforced at node $0-\epsilon$ are

$$t_{0-\epsilon} - \left[(t_0^A)^* - \epsilon\right] = 0, \qquad x_{0-\epsilon} - \left[(x_0^A)^* - \epsilon\, f(x_0, u_{const}, t_0)\right] = 0. \qquad (41)$$

Note that for sufficiently small $\epsilon > 0$, condition ii) above guarantees that

$$h(x_{0-\epsilon}, t_{0-\epsilon}) = h(x_0, t_0) + \frac{\partial h}{\partial x}\,(x_{0-\epsilon} - x_0) + \frac{\partial h}{\partial t}\,(t_{0-\epsilon} - t_0) + O(\epsilon^2) \qquad (42)$$

$$= h(x_0, t_0) - \epsilon\, \frac{dh}{dt}\bigg|_{(x_0, u_{const}, t_0)} + O(\epsilon^2) < 0, \qquad (43)$$

so that the state constraint $h \le 0$ is satisfied with strict inequality at node $0-\epsilon$ for all sufficiently small $\epsilon > 0$. The underlying assumption on the existence of a control $u_{const}$ that satisfies points i) and ii) above at the beginning or in the interior of a state constrained arc is rather mild, as it is tantamount to the assumption that


there is a control that would lead to a violation of the state constraint after node 0.

The so-obtained discretized optimal control problem is represented by the Lagrangian function

$$L_{C_0} = \phi(x_N, t_N) + \sum_{j=1}^{N} \lambda_j^T \left[x_j - x_{j-1} - (t_j - t_{j-1})\, f(\bar{x}_j, u_j, \bar{t}_j)\right] + \pi_f^T \psi_f(x_N, t_N) + \sum_{j=1}^{N} \sigma_j^T g(\bar{x}_j, u_j, \bar{t}_j) + \sum_{j=0}^{N} \mu_j^T h(x_j, t_j) + \lambda_0^T \left[x_0 - x_{0-\epsilon} - \epsilon\, f(x_0, u_0, t_0)\right] + \sigma_0^T \left(u_0 - u_{const}\right) + \alpha_0 \left(t_{0-\epsilon} - (t_0^A)^* + \epsilon\right) + \beta_0^T \left(x_{0-\epsilon} - x_{0-\epsilon}^*\right), \qquad (44)$$

where $x_{0-\epsilon}^*$ denotes the value of $x_{0-\epsilon}$ prescribed in (41).

Using arguments similar to the ones of the previous section, it can be quickly verified that all first-order KKT-conditions $\partial L_{C_0}/\partial x_i = 0$, $i = 1, \ldots, N$, $\partial L_{C_0}/\partial u_i = 0$, $i = 1, \ldots, N$, and $\partial L_{C_0}/\partial t_N = 0$ are satisfied if the expressions on the left-hand side are evaluated at $x_i = (x_i^A)^*$, $i = 0, \ldots, N$, $u_i = (u_i^A)^*$, $i = 1, \ldots, N$, $t_0 = (t_0^A)^*$, and with the multipliers set equal to the corresponding multipliers of problem A. The remaining first-order conditions, $\partial L_{C_0}/\partial x_0 = 0$, $\partial L_{C_0}/\partial u_0 = 0$, $\partial L_{C_0}/\partial t_{0-\epsilon} = 0$, and $\partial L_{C_0}/\partial x_{0-\epsilon} = 0$, can then be used to determine $\lambda_0$, $\sigma_0$, $\alpha_0$, and $\beta_0$. For $\beta_0$ we obtain explicitly

$$\beta_0^T = \left[\lambda_1^T \left(I + \frac{t_1 - t_0}{2}\, \frac{\partial f}{\partial x}\bigg|_{(\bar{x}_1, u_1, \bar{t}_1)}\right) - \frac{1}{2}\,\sigma_1^T \frac{\partial g}{\partial x}\bigg|_{(\bar{x}_1, u_1, \bar{t}_1)} - \mu_0^T \frac{\partial h}{\partial x}\bigg|_{(x_0, t_0)}\right] \left[I - \epsilon\, \frac{\partial f}{\partial x}\bigg|_{(x_0, u_{const}, t_0)}\right]^{-1}. \qquad (45)$$

Above, $I$ denotes the $n \times n$ identity matrix. As $\epsilon$ shrinks to zero, the limit of $-\beta_0$ is well-defined. With the definition $-\beta_0^- = \lim_{\epsilon \to 0}(-\beta_0)$, we obtain

$$-\beta_0^{-T} = -\lambda_1^T \left[I + \frac{t_1 - t_0}{2}\, \frac{\partial f}{\partial x}\bigg|_{(\bar{x}_1, u_1, \bar{t}_1)}\right] + \frac{1}{2}\,\sigma_1^T \frac{\partial g}{\partial x}\bigg|_{(\bar{x}_1, u_1, \bar{t}_1)} + \mu_0^T \frac{\partial h}{\partial x}\bigg|_{(x_0, t_0)}. \qquad (46)$$

In light of equation (72) it is clear that the negative of the multiplier $\beta_0$ represents the left-hand limit of the sensitivity of the optimal cost (28) associated with problem $C_0$ with respect to perturbations in the initial states $x_0$ at time $t_0$. Hence $-\beta_0^-$ represents an approximation to the left-hand limit of the right-hand side of (22), evaluated at $t = t_0$.

Note also that the right-hand side of (46) involves only quantities that are known once an optimal solution to problem A is obtained. Furthermore, equation (46) reduces to equation (39) if $\mu_0 = 0$, i.e., if no state constraints (33) are active at node 0 in the solution to problem A. It is also interesting to observe that the same expression (46) would have been formally obtained in the previous section, had we not imposed the restriction that none of the state constraints (33) may be active at node 0. However, the authors see no justification for taking this approach.

III. Costate Estimates

The goal of this paper is to present a method by which estimates for the constant multipliers $l$, $\nu_0$, $\nu_f$, and the time-varying multipliers $\lambda(t)$ associated with the continuous optimal control problem (1-8) can be constructed from the solution of the discretized optimal control problem (28-33). The general idea is to approximate the cost sensitivities appearing on the right-hand side of (22, 23, 25, 26) through the appropriate cost sensitivities associated with the discretized optimal control problem (28-33).

In the following, estimates for $\lambda(t_i)$, $i \ge 0$, will first be developed under the assumption that no state constraints are active at time $t_i$, using equation (46). If a state constraint is active, then the so-obtained costate estimate, $\lambda(t_i^-)$, represents the value that $\lambda(t)$ would have possessed along an unconstrained arc, just before hitting the state constraint at time $t_i$. The correct multiplier value, $\lambda(t_i^+)$, is then obtained by adding the jump given by (20).

1. Costate Estimate at Initial Time $t_0$

In light of equation (72) it is clear that $-\beta_0$ defined in (45) represents the sensitivity of the optimal cost (28) associated with problem $C_0$ with respect to perturbations in the initial states prescribed at the beginning of an artificially introduced interval of length $\epsilon$ preceding $t_0$, along which no state constraints are active. Hence, in light of equation (22), the expression $-\beta_0^-$ defined in (46) represents an approximation to the left-hand limit of the


time-varying Lagrange multiplier $\lambda(t)$ at $t_0$,§ i.e.,

$$\lambda^T(t_0^-) \approx -\beta_0^{-T} = -\lambda_1^T \left[I + \frac{t_1 - t_0}{2}\, \frac{\partial f}{\partial x}\bigg|_{(\bar{x}_1, u_1, \bar{t}_1)}\right] + \frac{1}{2}\,\sigma_1^T \frac{\partial g}{\partial x}\bigg|_{(\bar{x}_1, u_1, \bar{t}_1)} + \mu_0^T \frac{\partial h}{\partial x}\bigg|_{(x_0, t_0)}. \qquad (47)$$

If no state constraints are active at $t_0$, then $\lambda(t)$ is continuous at $t_0$, and the left-hand side of (47) can be replaced by $\lambda(t_0)^T$. Note that in this case $\mu_0 = 0$ and the right-hand sides of (46, 47) reduce to the right-hand side of (39).

In case at least one of the state constraints (7, 8) is active at $t_0$, the expression (47) has to be interpreted as an estimate for the multiplier $\lambda(t_0)$ "just before the state constraint is hit". The correct multiplier value along the state constrained arc can then be formally obtained from (20) with $t_a$ replaced by $t_0$, if only the constant multiplier vector $l$ is known. A method to estimate $l$ will be discussed in Section III.5.

2. Costate Estimate at Nodal Time $t_i$

The concepts presented in the previous section to estimate $\lambda(t_0)$ can be easily extended to the calculation of $\lambda(t)$ at any of the nodal times $t = t_i$, $i = 1, \ldots, N-1$. The basic idea is to delete the $i$ leading nodes, nodes $0, \ldots, i-1$, in problem A, and to consider $t_i$ as the new initial time. Then, in analogy to problems $B_0$, $C_0$ defined in Sections II.2, II.3, respectively, auxiliary problems $B_i$, $C_i$ in the independent variables $x_i, \ldots, x_N$, $u_{i+1}, \ldots, u_N$, respectively, are defined. The initial conditions and the Lagrangian functions associated with problems $B_i$ / $C_i$ are given by equations (37, 38) / (41, 44) with 0 replaced by $i$ and 1 replaced by $i+1$, everywhere. In complete analogy to the analysis presented in the previous section we then arrive at the costate estimate

$$\lambda^T(t_i^-) \approx -\lambda_{i+1}^T \left[I + \frac{t_{i+1} - t_i}{2}\, \frac{\partial f}{\partial x}\bigg|_{(\bar{x}_{i+1}, u_{i+1}, \bar{t}_{i+1})}\right] + \frac{1}{2}\,\sigma_{i+1}^T \frac{\partial g}{\partial x}\bigg|_{(\bar{x}_{i+1}, u_{i+1}, \bar{t}_{i+1})} + \mu_i^T \frac{\partial h}{\partial x}\bigg|_{(x_i, t_i)}. \qquad (48)$$

If no state constraints are active at $t_i$, then $\lambda(t)$ is continuous at $t_i$, and the left-hand side of (48) can be replaced by $\lambda(t_i)^T$. If at least one of the constraints (7, 8) is active at node $i$, then (48) has to be interpreted again as an estimate for $\lambda(t_i^-)$, i.e., the value of $\lambda(t)$ at the end of an infinitesimally short unconstrained arc preceding $t_i$. The correct multiplier value at $t_i$, the beginning of the state constrained arc, can then be formally obtained from (20) with $t_a$ replaced by $t_i$, if only the constant multiplier vector $l$ is known. A method to estimate $l$ will be discussed in Section III.5.

3. Costate Estimates at Final Time $t_f$

At the final node, $t_N$, the method for estimating $\lambda(t_f)$ as discussed in the previous section breaks down. Note that the quantities $x_{i+1}$, $u_{i+1}$, $t_{i+1}$, $\lambda_{i+1}$, $\sigma_{i+1}$ appearing on the right-hand side of (48) do not exist for $i = N$. An alternative method for estimating $\lambda(t_f)$ can be developed from equation (11) if only an estimate for the constant multiplier $\nu_f \in \mathbb{R}^{k_f}$ is available. Then

$$\lambda^T(t_N) \approx \frac{\partial}{\partial x}\left[\phi - (\nu_f^{estimate})^T \psi_f\right]\bigg|_{(x_N, t_N)}. \qquad (49)$$

An estimate, $\nu_f^{estimate}$, for the constant multiplier vector $\nu_f$ will be presented in the next section. Note that (49) is valid irrespective of whether or not any state constraints are active at the final node $t_N$.

4. Estimates for the Multiplier Vectors $\nu_0$ and $\nu_f$

From (25, 26) we know that the constant multiplier vectors $\nu_0$ and $\nu_f$ represent the sensitivity of the optimal cost associated with the variational solution to perturbations in the right-hand side of the boundary conditions (3) and (4), respectively. Hence, in light of equation (72) it is clear that the negatives of the multiplier vectors $\pi_0$ and $\pi_f$ associated with the optimal solution to the discretized problem, problem A, represent approximations for $\nu_0$ and $\nu_f$, respectively, i.e.,

$$\nu_0 = -\pi_0, \qquad (50)$$

$$\nu_f = -\pi_f. \qquad (51)$$

5. Estimate for the Multiplier Vector $l$

In the present section, we consider only the case of a single, scalar state inequality constraint $h(x, t) \le 0$, and we assume that the optimal variational solution has the switching structure given by (16). Generalizations of the results below to vector-valued state constraints (equality and inequality) and to other switching structures are straightforward. For the optimal solution to the discretized optimal control problem, problem A, we assume that

$$h(x_j, t_j) \begin{cases} < 0 & \text{at nodes } j = 0, \ldots, i_a - 1, \\ = 0 & \text{at nodes } j = i_a, \ldots, i_b, \\ < 0 & \text{at nodes } j = i_b + 1, \ldots, N, \end{cases} \qquad (52)$$

§ Strictly speaking, it is necessary to define a new variational problem on the interval $[t_{0-\epsilon}, t_f]$ such that its finite dimensional discretization is represented by problem $C_0$. This step should be clear and it is omitted in this paper, for conciseness.

with $0 < i_a < i_b < N$.

According to (23), the constant multiplier vector $l \in \mathbb{R}^q$ appearing in (20, 21) represents the sensitivity of the optimal cost associated with the variational solution to problem (1-6, 24) with respect to perturbations in the parameter vector $b$ about its nominal value $b = 0 \in \mathbb{R}^q$. In the following, an estimate for $l$ is developed by replacing the right-hand side of (23) through the appropriate sensitivity of the optimal cost associated with a discretized optimal control problem, namely, problem A with the constraints (33) replaced by

$$h(x_j, t_j) - \sum_{k=0}^{q-1} b_k \frac{(t_j - t_{i_a})^k}{k!} = 0 \quad \text{for } j = i_a, \ldots, i_b. \qquad (53)$$

Clearly, perturbations in the parameter vector $b$ in (53) lead to a well-defined change in the value of $h(x, t)$ at each individual node. In light of equation (72) the sensitivity of the optimal cost with respect to perturbations in the prescribed value of $h(x, t)$ at an individual node number $j$ is given by the negative of the multiplier $\mu_j$, $j = 0, \ldots, N$. Hence, the total sensitivity of the optimal cost with respect to the $k$-th component of $b$ in (53), $k = 0, \ldots, q-1$, is given by

$$\frac{\partial J}{\partial b_k}\bigg|_{b = 0} = -\sum_{j = i_a}^{i_b} \mu_j \frac{(t_j - t_{i_a})^k}{k!}. \qquad (54)$$

By construction, the left-hand side of (54) represents an approximation to the $k$-th component of the right-hand side of (23). Hence we obtain

$$l_k = -\sum_{j = i_a}^{i_b} \mu_j \frac{(t_j - t_{i_a})^k}{k!}, \quad k = 0, \ldots, q-1. \qquad (55)$$

6. Practical Application

Assume an optimal control problem of the general form (1-8) is given, and assume that an optimal solution to the discretized optimal control problem (28-33) has been obtained. In the following, the nomenclature for the time-varying and constant multipliers involved in a variational solution is adopted from Section I.2. The nomenclature for the multipliers associated with the direct approach (28-35) is indicated in the Lagrangian function (36). For simplicity, we consider only a single, scalar state constraint $h(x, t) \le 0$. Let this state constraint be of order $q$, and assume that 1) $h(x, t) = 0$ at all nodes $j$ with $i_a \le j \le i_b$, 2) if $i_a \ne 0$ then $h(x, t) < 0$ at node $i_a - 1$, and 3) if $i_b \ne N$ then $h(x, t) < 0$ at node $i_b + 1$.

At nodes $i$, $i < N$, where the state constraint (33) is not active, an estimate for the transpose of the time-varying Lagrange multiplier $\lambda(t_i)^T$ is given by the right-hand side of (48). At node $i_a$, an estimate for $\lambda(t_{i_a}^-)^T$ is given by the right-hand side of (48) with $i$ replaced by $i_a$. An estimate for $\lambda(t_{i_a}^+)^T$ can be obtained by adding the multiplier jump (20), where the components of the constant multiplier vector $l$ are determined from equation (55). At all following nodes $i$, with $i_a < i \le i_b$, an estimate for $\lambda(t_i^+)^T$ can be obtained by first calculating the artificial quantity $\lambda(t_i^-)^T$ from (48), and then performing the "multiplier jump"

$$\lambda^T(t_i^+) = \lambda^T(t_i^-) - l^T \frac{\partial M(x(t_i), t_i)}{\partial x}, \qquad (56)$$

where the components of $l$ are determined from

$$l_k = -\sum_{j = i}^{i_b} \mu_j \frac{(t_j - t_i)^k}{k!}, \quad k = 0, \ldots, q-1. \qquad (57)$$

At the final node, $t_N$, a costate estimate is obtained from equation (49), irrespective of whether or not any state constraints are active at $t_N$.
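The two-step procedure of this section is simple to mechanize. The following Python sketch (written for this rewrite; the array names and the exact form of the node estimate follow the reconstruction of (48) given above and are therefore assumptions, not the authors' code) computes raw costate estimates at the nodes from the NLP multipliers:

import numpy as np

def costate_estimates(t, lam, sig, mu, dfdx_mid, dgdx_mid, dhdx_node):
    # t         : (N+1,)     node times
    # lam       : (N, n)     multipliers of the dynamics defects, j = 1..N
    # sig       : (N, kg)    multipliers of the control constraints, j = 1..N
    # mu        : (N+1,)     multipliers of the scalar state constraint at the nodes
    # dfdx_mid  : (N, n, n)  df/dx at the interval midpoints
    # dgdx_mid  : (N, kg, n) dg/dx at the interval midpoints
    # dhdx_node : (N+1, n)   dh/dx at the nodes
    N, n = lam.shape
    est = np.zeros((N, n))                 # lambda(t_i^-) estimates, i = 0..N-1
    for i in range(N):
        dt = t[i + 1] - t[i]
        A = np.eye(n) + 0.5 * dt * dfdx_mid[i]
        est[i] = -lam[i] @ A + 0.5 * sig[i] @ dgdx_mid[i] + mu[i] * dhdx_node[i]
        # On the constrained arc, the jump (56) with l from (57) would be
        # added here before using est[i] as lambda(t_i^+).
    return est

The estimate at the final node follows separately from (49), and the jumps (56)-(57) are applied at the nodes where the state constraint is active.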

IV. Remarks and Extensions

1. Interior-Point Constraints

In the original problem formulation (1-8), interior-point constraints were not considered explicitly. Such constraints were introduced only later in (18) through the treatment of state inequality constraints, and the associated necessary conditions for optimality in the variational approach were stated.

Note that the treatment of state inequality constraints through the direct approach presented in Section II.1 did not require the treatment of interior-point constraints, as the state inequality constraints were enforced directly at individual nodes, without transforming them into a combination of interior-point constraints and control constraints. In fact, the treatment of interior-point constraints with our direct optimization approach would have required the introduction of multiple phases, i.e., it would have been necessary to divide the original time interval $[t_0, t_f]$ into subarcs $[t_0, t_s]$ and $[t_s, t_f]$, where the additional parameter $t_s$ denotes the location of the interior point. Obviously, the introduction of interior-point constraints would have complicated the presentation of the results obtained in this paper. Without proof it is stated here, however, that the extension of the methods and results presented in this paper to the case of interior-point constraints is straightforward.

2. Relation to Previously Published Work

The basic idea of this paper is to make use of the physical interpretation of Lagrange multipliers in terms of the sensitivities of the variational cost function with respect to perturbations in the initial states, and to approximate the Lagrange multipliers by the cost sensitivities associated with near-optimal solutions obtained through parameter optimization. The same idea was already used in [11]. However, in [11] the cost sensitivities


associated with the parameter optimal solution were obtained through finite differences, while, in the present paper, the required sensitivities are expressed in terms of the KKT multipliers obtained from the NLP solution. As a consequence, the approach in [11] requires much more CPU time, and the expected precision in the costate estimates should be lower. However, an important advantage of the finite difference approach in [11] is that no new relations between KKT multipliers and cost sensitivities need to be derived if the discretization scheme is changed.

A seemingly very different costate estimation method was presented in [16]. For ease of comparison, let us consider only the case where no state constraints (33) are present. Then the discretization chosen in [16] is identical to the one used in the present paper, even though the nomenclature is somewhat different. Restated in terms of the nomenclature used in the present paper, it was shown in [16] that the NLP multipliers associated with the discretized equations of motion satisfy a discretized version of the continuous Euler-Lagrange equations (9). From this observation, costate estimates at the mid-points, $\bar{t}_{i+1} = (t_{i+1} + t_i)/2$, in between neighboring nodes $t_{i+1}$ and $t_i$ were derived in [16] (equation (58)).

The associated result of the present paper, equation (48), with $\lambda(t_i^-)$ replaced by $\lambda(t_i)$, and with $\mu_i$ set equal to zero to reflect the fact that no state constraints are present, can be rewritten in a corresponding form, equation (59), in which the bracketed term represents an Implicit-Euler integration step of the Euler-Lagrange equations (9) from time $t_i$ to $\bar{t}_{i+1}$. This is the discretization within which the results (58) and (59) are consistent.

V. Numerical Example

We consider the same numerical example as in [11], i.e., a state constrained Brachistochrone problem. In Mayer form, the problem can be stated as follows:

$$\min_{\theta \in PWC[t_0, t_f]} t_f \qquad (60)$$

subject to the equations of motion

$$\dot{x}(t) = v(y)\cos\theta(t), \qquad \dot{y}(t) = v(y)\sin\theta(t), \qquad (61)$$

the boundary conditions

$$x(t_0) = 0, \quad y(t_0) = 0, \quad x(t_f) = x_f \;\text{(prescribed)}, \quad y(t_f) \;\text{free}, \qquad (62)$$

and the state constraint

$$y(t) - x(t)\tan\gamma - h_0 \le 0. \qquad (63)$$

Here, $x$ and $y$ are the state variables and $\theta$ is the only control. The quantity $v$ denotes the velocity and is a short-hand notation for $v = \sqrt{v_0^2 + 2gy}$. The quantities $v_0$, $g$, $\gamma$, $h_0$ are constant. For the numerical calculations we use

$$v_0 = 1, \quad g = 1, \quad \gamma = 20°, \quad h_0 = 0.05. \qquad (64)$$

The state inequality constraint (63) is of first order and the optimal switching structure for problem (60-64) is

free - constrained - free.   (65)
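A quick consistency check of the constrained segment (a short derivation added here, under the formulation above): for $q = 1$, condition (19) requires $dh/dt = 0$ along the boundary arc, i.e.,

$$v(y)\left[\sin\theta - \tan\gamma\,\cos\theta\right] = 0 \quad \Longrightarrow \quad \theta = \gamma,$$

since $v(y) > 0$. The trajectory therefore simply follows the constraint line $y = x\tan\gamma + h_0$ along the middle segment of the switching structure (65).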

A precise treatment of the variational optimality conditions is given in [11]. To generate a finite dimensional approximation to the variational solution in the style of problem A introduced in Section II.1, the time interval is divided into 100 equidistant subintervals by introducing the nodal times

$$t_j = t_0 + \frac{j}{100}\,(t_f - t_0), \quad j = 0, \ldots, 100. \qquad (66)$$

In Figures 1-6 the exact optimal time histories of the states $x$, $y$, the costates $\lambda_x$, $\lambda_y$, the constraint $c = y - x\tan\gamma - h_0$, and the multiplier $\mu$, respectively, are represented as solid lines. Here, $\mu$ is the multiplier function associated with the control constraint obtained from the $q$-th time derivative of the state constraint, $q$ being the order of the state constraint. Superimposed as circles in Figures 1, 2, and 5, respectively, are the results for the states $x$, $y$, and the constraint $c$, obtained with the discretization scheme of Section II.1. A total of $N = 100$ nodes were used, and all nodes were placed equidistantly. In Figures 3 and 4, the circles represent the costate estimates $\lambda_x$ and $\lambda_y$ generated with the method summarized in Section III.6. The agreement with the variational solution is excellent, even across the discontinuous jump of the multipliers at the beginning of the state constrained arc. As discussed in Section III, the calculation of Lagrange multiplier estimates along the state constrained arc is a two-step procedure, which requires an estimate for the height of the multiplier jumps. This height, $l$, varies with the time $t_i$ at which


costate estimates need to be calculated. Along unconstrained arcs all components of $l(t)$ are zero, and it can be shown analytically that the first component, $l_0(t)$, of the so-defined function $l(t)$ is identical to the multiplier function $\mu(t)$. In Figure 6, the circles represent the approximate values for $l_0$ obtained from equation (57). Again, the agreement with the variational solution is excellent.

It is interesting to compare the results obtained here with the results obtained in [11]. In general, it is observed that the costate approximations obtained here are "better". The "noisiness" of the costate estimates obtained in [11] can be attributed to the imprecisions resulting from the finite difference nature of the approach. However, in addition, a very consistent deviation of the costate estimates from the correct variationally obtained solution is observed along the state constrained arc. This deviation represents an additional discretization error stemming from the fact that, along state constrained arcs, cost sensitivities were calculated without introducing an additional node an $\epsilon$-interval before the node of interest (compare auxiliary problems B and C of Section II in the present paper). It is clear, however, that this additional discretization error shrinks to zero as the "fineness" of the discretization grid is increased.

In terms of CPU time, the method presented here is also much superior to the method presented in [11]. In [11], cost sensitivities are calculated through finite differences, which require the calculation of "many" perturbed trajectories. Thus, the numerical procedure for calculating costate estimates typically takes long compared to calculating the reference trajectory. In comparison, the CPU time requirement for the method presented in this paper is negligible even compared to the effort involved in calculating a single trajectory.

However, a word of caution is in order for numerical implementation. The main results of this paper, equations (47), (48), and (57), change dramatically if the problem discretization is changed. This should be expected intuitively, for example, if the Trapezoidal integration step (29) is replaced by a third-order Simpson integration step. But even minute changes, such as multiplication of condition (29) by $-1$, would make it necessary to replace $\lambda_{i+1}$ by $-\lambda_{i+1}$ in (48). Further sign changes are required in (47), (48), and (57) if the NLP code used as optimization engine is a maximizer instead of a minimizer, or if the underlying augmented cost function (71) is defined by subtracting the constraint terms instead of adding them.

Summary

A method was introduced in this paper for the automatic calculation of costates using only results obtained from a collocation-type direct optimization approach. The class of problems addressed in this paper is fairly general and includes problems with state constraints of arbitrary finite order. As a starting point the known relations between Lagrange multipliers and certain cost sensitivities were used. Then the cost sensitivities of the variational solution were approximated by the cost sensitivities of the discretized solution. As a result, costate estimates at the nodal points were obtained in terms of the Kuhn-Tucker multipliers associated with the nonlinear programming solution. The obtained results were also shown to be consistent with results obtained previously by other researchers. As a numerical example, a state constrained version of the Brachistochrone problem was solved and the results were compared to the variational solution. The agreement was found to be excellent. The CPU time requirement for the costate estimation step is negligible. Even though the derivations in this paper are fairly general, the obtained results are highly customized to a specific discretization scheme and to a certain format of the problem formulation.

Acknowledgements

This work was supported by NASA Langley Research Center under Contract Number NAS1-20405.

Appendix: The Karush-Kuhn-Tucker Conditions

In the present section we consider a generic nonlinear programming problem of the form

$$\min_{y} f(y) \qquad (67)$$

subject to

$$a_i(y) = b_i, \quad i = 1, \ldots, m_e, \qquad (68)$$

$$a_i(y) \le b_i, \quad i = m_e + 1, \ldots, m_e + m_i, \qquad (69)$$

where $f : \mathbb{R}^n \to \mathbb{R}$, $a : \mathbb{R}^n \to \mathbb{R}^{m_e + m_i}$, and $b \in \mathbb{R}^{m_e + m_i}$ is an arbitrary but fixed vector of constants. Note that the discretized optimal control problem (28-35) is of this general form.

Under the assumption (normality) that the gradients of the active constraints $a$ are linearly independent, the solution to problem (67-69) satisfies the Karush-Kuhn-Tucker conditions, namely

$$\begin{aligned} \text{I:}\quad & \frac{\partial L}{\partial y} = 0, \\ \text{II:}\quad & a_i(y) = b_i, \quad i = 1, \ldots, m_e, \\ \text{III:}\quad & a_i(y) \le b_i, \quad i = m_e + 1, \ldots, m_e + m_i, \\ \text{IV:}\quad & \alpha_i \ge 0 \text{ if } a_i(y) = b_i, \quad \alpha_i = 0 \text{ if } a_i(y) < b_i, \quad i = m_e + 1, \ldots, m_e + m_i, \\ \text{V:}\quad & \Delta y^T \frac{\partial^2 L}{\partial y^2}\, \Delta y \ge 0 \quad \text{for all } \Delta y \in \mathbb{R}^n \text{ satisfying } \frac{\partial a}{\partial y}\, \Delta y = 0. \end{aligned} \qquad (70)$$

Above,

$$L(y, \alpha) = f(y) + \alpha^T \left(a(y) - b\right) \qquad (71)$$

denotes the Lagrangian, and the vector function $a$ consists of the left-hand sides of (68) and the left-hand sides of the active components of (69).


The Lagrange multipliers $\alpha$ can be interpreted as sensitivities of the optimal cost value $f^*$ with respect to changes in the constraints (68, 69). More precisely, let $y^*(b)$ denote the optimal solution of problem (67-69) as a function of the parameter vector $b$. Then, with the help of (70 I) we find

$$\frac{\partial f(y^*(b))}{\partial b} = \frac{\partial f}{\partial y}\,\frac{\partial y^*}{\partial b} = -\alpha^T \frac{\partial a}{\partial y}\,\frac{\partial y^*}{\partial b} = -\alpha^T \frac{\partial a(y^*(b))}{\partial b} = -\alpha^T \frac{\partial b}{\partial b} = -\alpha^T. \qquad (72)$$

Note that these relations can only be guaranteed if the normality condition stated at the beginning of the appendix is satisfied. This normality condition may be violated either due to incompatibility or through (local) redundancy of certain constraint components. In case of incompatibility, no solution exists. In case of redundancy, the optimal solution may still exist, and if it exists, it is guaranteed to satisfy the KKT conditions (70). However, the multiplier vector $\alpha$ is no longer determined uniquely (there may even be solutions that violate the sign condition (70 IV)), and the sensitivity equation (72) is no longer applicable.
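The sensitivity relation (72) is easy to verify numerically on a small example. The following Python fragment (an illustration added in this rewrite; the quadratic program and all names are chosen here only for demonstration) solves min 1/2 y'y subject to A y = b in closed form and checks that the finite-difference gradient of the optimal cost with respect to b equals -alpha:

import numpy as np

def solve_qp(A, b):
    # Stationarity of L = 1/2 y'y + alpha'(A y - b): y + A' alpha = 0, A y = b.
    alpha = np.linalg.solve(A @ A.T, -b)
    y = -A.T @ alpha
    return y, alpha

A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, -1.0]])
b = np.array([1.0, 0.5])

y, alpha = solve_qp(A, b)
f_star = 0.5 * y @ y

eps = 1e-6
grad_fd = np.zeros_like(b)
for i in range(len(b)):
    bp = b.copy()
    bp[i] += eps
    yp, _ = solve_qp(A, bp)
    grad_fd[i] = (0.5 * yp @ yp - f_star) / eps

print(grad_fd)    # finite-difference d f*/d b
print(-alpha)     # agrees with (72) up to O(eps)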

References

1. Pontryagin, L.S., et al., "The Mathematical Theory of Optimal Processes," Interscience, New York, NY, 1962.

2. Bryson, A.E., and Ho, Y.C., "Applied Optimal Control," Hemisphere Publishing Corporation, New York, 1975.

3. Lee, E.B., and Markus, L., "Foundations of Optimal Control Theory," Robert E. Krieger Publishing Company, Malabar, Florida, 1986.

4. Leitmann, G., "The Calculus of Variations and Optimal Control," Plenum Press, New York, 1981.

5. Neustadt, L.W., "A Theory of Necessary Conditions," Princeton University Press, Princeton, New Jersey, 1976.

6. Roberts, S.M., and Shipman, J.S., "Two-Point Boundary Value Problems: Shooting Methods," American Elsevier Publishing Company, New York, 1972.

7. Roberts, S.M., and Shipman, J.S., communicated by A. Miele, "Multipoint Solution of Two-Point Boundary Value Problems," Journal of Optimization Theory and Applications, Vol. 6, No. 4, pp. 301-318, 1971.

8. Stoer, J., and Bulirsch, R., "Introduction to Numerical Analysis," English translation by R. Bartels, W. Gautschi, and C. Witzgall, Springer-Verlag, New York, NY, 1980.

9. Wouk, A., "A Course of Applied Functional Analysis," John Wiley & Sons, 1979.

10. Gill, P.E., Murray, W., and Wright, M.H., "Practical Optimization," Academic Press, 1981.

11. Seywald, H., and Kumar, R.R., "A Finite Difference Based Scheme for Automatic Costate Calculation," AIAA Paper 94-3583, Proceedings of the AIAA Guidance, Navigation and Control Conference, Scottsdale, AZ, August 1-3, 1994.

12. Seywald, H., and Kumar, R.R., "Concatenated Approach to Trajectory Optimization (CATO)," Proceedings of the European Control Conference ECC 95, Rome, Italy, September 5-8, 1995, pp. 2100-2105.

13. Kumar, R.R., and Seywald, H., "Robust On-Board Near-Optimal Guidance Using Differential Inclusions," Proceedings of the European Control Conference ECC 95, Rome, Italy, September 5-8, 1995, pp. 3148-3153.

14. Enright, P.J., and Conway, B.A., "Discrete Approximations to Optimal Trajectories Using Direct Transcription and Nonlinear Programming," AIAA Paper 90-2963-CP, 1990.

15. Lawton, J.A., and Martell, C.A., "Adjoint Variable Solutions via an Auxiliary Optimization Problem," Journal of Guidance, Control, and Dynamics, Vol. 18, No. 6, November/December 1995.

16. Stryk, O. von, "Numerical Solution of Optimal Control Problems by Direct Collocation," in: Bulirsch, R., Miele, A., Stoer, J., and Well, K.H. (eds.), "Optimal Control," International Series of Numerical Mathematics, Vol. 111, Birkhäuser Verlag, Basel, 1993.


[Figures 1-6: time histories for the state constrained Brachistochrone. Solid lines: shooting (variational) solution; circles: direct solution (101 nodes) or calculated estimates.]

Figure 1: State x vs time t
Figure 2: State y vs time t
Figure 3: Costate λx vs time t
Figure 4: Costate λy vs time t
Figure 5: Constraint c vs time t
Figure 6: Multiplier μ vs time t