
Internal report on the statistical properties of the LQG cost function¹

Hildo Bijl², Jan-Willem van Wingerden², Michel Verhaegen²

December 10, 2014

¹This research is supported by the Dutch Technology Foundation STW, which is part of the Netherlands Organisation for Scientific Research (NWO), and which is partly funded by the Ministry of Economic Affairs.

²Delft Center for Systems and Control, Delft University of Technology, 2628 CD Delft, the Netherlands, {h.j.bijl,j.w.vanwingerden,m.verhaegen}@tudelft.nl.


Abstract

Linear Quadratic Gaussian (LQG) systems are relatively well understood in the literature. Methods to minimize the expected cost are readily available. Less is known about the statistical properties of the resulting cost function. This report summarizes the known theorems on LQG systems and expands on them by providing a set of analytic expressions for the mean and variance of the LQG cost function. Both the discounted and the non-discounted cost function are considered, as well as the finite-time and the infinite-time cost function.

The expressions for the mean and variance of the cost function are derived using two different methods. On the one hand, expressions are derived using solutions to Lyapunov equations, and an extensive body of theory related to Lyapunov solutions is presented. On the other hand, expressions are derived using only matrix exponentials. Both methods turn out to have their advantages and disadvantages. The derived expressions are verified using a numerical simulation, confirming their accuracy.


Contents

1 Introduction
  1.1 The problem setting
  1.2 Literature overview
  1.3 Overview of this report

2 Linear Quadratic Gaussian control
  2.1 Basic theories of LQ control without noise
    2.1.1 Solution of the LQ paradigm
    2.1.2 The discounted cost solution
  2.2 Basic theories of LQG control with process noise
    2.2.1 A system with process noise
    2.2.2 Steady-state distribution
  2.3 Basic theories of LQG control with measurement noise
    2.3.1 Adding measurement noise to the system equations
    2.3.2 Constructing an observer when the state isn't known
    2.3.3 Applying the observer
    2.3.4 An alternative way of writing the combined system

3 Statistical properties of the cost
  3.1 The problem setting
  3.2 Using Lyapunov solutions to find E[J] and V[J]
    3.2.1 Terminology and notation
    3.2.2 The mean of the cost function
    3.2.3 The variance of the cost function
  3.3 Using matrix exponentials to find E[J_T] and V[J_T]
    3.3.1 New expressions for E[J_T] and V[J_T]
    3.3.2 Comparing the two methods
  3.4 Applying the derived equations
    3.4.1 A simulation verifying the derived equations
    3.4.2 A simulation applying the derived equations

4 Conclusions

A Mathematical theorems
  A.1 Solutions to Lyapunov equations
    A.1.1 Basic properties of Lyapunov solutions
    A.1.2 Combinations of Lyapunov solutions
  A.2 Using matrix exponentials to solve integrals
    A.2.1 Integrals within matrix exponentials
    A.2.2 Using matrix exponentials to solve Lyapunov equations
  A.3 Power forms of Gaussian random variables
  A.4 Other mathematical proofs

B Theorems related to LQG systems
  B.1 Controlling LQG systems
    B.1.1 Basic properties of the LQG paradigm
    B.1.2 Properties of a system with noise
  B.2 Using Lyapunov solutions to find E[J] and V[J]
    B.2.1 The mean of the cost function
    B.2.2 The variance of the cost function
  B.3 Using matrix exponentials to find E[J_T] and V[J_T]


Chapter 1

Introduction

This internal report is a supporting document for a paper to be published in the IEEE Transactions on Automatic Control. Whereas the paper shortens many proofs, sometimes omitting them entirely, this report contains all the details. The proofs of the theorems are all worked out in small steps, making this report accessible to graduate-level students as well.

1.1 The problem setting

In this report, we examine the statistical properties of the Linear Quadratic Gaussian (LQG) cost function

J_T = \int_0^T e^{2\alpha t} x^T(t) Q x(t) \, dt, (1.1)

in which x(t) satisfies

\dot{x}(t) = A x(t) + v. (1.2)

If you are not familiar with this notation, we will discuss it in depth later. Our aim in this report is to examine the distribution of J_T. In particular, we will derive expressions for the mean E[J_T] and the variance V[J_T] of the cost. We do this for various values of α, for various types of matrices A, and for both finite and infinite T.

1.2 Literature overview

In the literature (see for instance [AM90], [SP05], [BKM08] or [Å70]) the LQG problem is studied relatively well. The method to minimize the expected value E[J_T] of the cost function is well known. Often, an expression for the expected value E[J_T] is given as well, but only for a specific case. To the best of our knowledge, there is no paper or book which gives an expression for E[J_T] with (1) a randomly distributed initial state, (2) added process noise and (3) a discounted cost function. At the same time, there is also no expression for the variance V[J_T] of the cost function. In fact, very little is known about the distribution of the cost J_T itself.

The cost function J_T is usually calculated by integrating over x^T(t) Q x(t), in which x(t) is a non-zero-mean Gaussian process. This makes J_T a summation of quadratic forms of Gaussian variables. There have been many articles on such quantities, some as old as [Ric44], which quite generally considered statistical properties of random noise. More specific work is done in [Sch70], which derives expressions for the average power of zero-mean Gaussian random processes.

Others, such as [SL71], try to approximate the cumulants of the distribution using techniques like Fourier approximations, Karhunen-Loève expansions and neural networks. More recently, [MP92] specifically discusses quadratic forms of Gaussian variables, but it does not offer any analytic expressions for the PDF of the cost distribution of LQG systems. In fact, the resulting distribution is a generalized noncentral χ² distribution, which does not have a known PDF, although the cumulative distribution function can be approximated numerically, as is done in [Dav80].


Nevertheless, there are control methods related to cost variances. An example is Minimal Cost Variance (MCV) control, in which the mean cost E[J] is fixed and the cost variance V[J] (or alternatively the cost cumulant) is then minimized. This is studied in for instance [KAW14], as well as [WG02]. The latter paper also gives an expression for the mean cost E[J], but not for the cost variance. A good overview of such methods is given in [WSM08].

An entirely different method is Variance Constrained LQG (VCLQG), which is LQG control with the addition of certain upper bounds on the variance of the state and/or the input. This method has been studied in [CS99] and [CH08].

1.3 Overview of this report

In this report we derive expressions for the mean E[J_T] and the variance V[J_T] of the LQG cost. First, in chapter 2, we examine the basic theories behind LQG control. These theories are already well known in the literature. Then, in chapter 3, we go into depth on the statistical properties of the LQG cost function. This is where the equations for E[J_T] and V[J_T] are presented, for various cases.

The proofs of the mathematical theorems in this report do not appear in the main body. For clarity and brevity, they have been relegated to the appendices. In appendix A we find proofs related to mathematical subjects like Lyapunov equations, matrix exponentials and power forms of Gaussian random variables. The theorems more closely related to LQG systems, like the ones giving expressions for E[J_T] and V[J_T], can be found in appendix B.


Chapter 2

Linear Quadratic Gaussian control

The Linear Quadratic Gaussian (LQG) control paradigm is well known and well studied, mainly because its optimal solution has a rather elegant form and can be calculated analytically. In this chapter, we look at the theory behind it, and we derive some expressions concerning the cost function.

We start in section 2.1 by examining the case without noise. Then we expand the problem step by step, adding process noise in section 2.2 and measurement noise in section 2.3. All details about the cost distribution are dealt with in the next chapter.

2.1 Basic theories of LQ control without noise

In the Linear Quadratic (LQ) paradigm, we have a linear system with a quadratic cost function. The linear system has the form

\dot{x}(t) = A x(t) + B u(t), (2.1)

where x(t) is the state and u(t) is the input. Also, A is the system matrix and B is the input matrix. Our goal is to find a control law which minimizes the infinite-time quadratic cost function

J = \int_0^\infty \left( x^T Q x + u^T R u \right) dt, (2.2)

where we have suppressed the time dependence of x and u for ease of notation. Furthermore, Q and R are symmetric weight matrices, with Q ≥ 0 positive semi-definite and R > 0 positive definite. They are chosen to achieve a certain kind of behavior.

2.1.1 Solution of the LQ paradigm

Assuming that the system is stabilizable in the first place, the optimal control law is a linear state feedback control law u = −Fx. (See theorem B.1 for an intuitive proof.) The corresponding feedback matrix F is given (theorem B.2) by

F = R^{-1} B^T \bar{X}, (2.3)

where \bar{X} is the solution of the algebraic Riccati equation

A^T \bar{X} + \bar{X} A + Q - \bar{X} B R^{-1} B^T \bar{X} = 0. (2.4)

This solution can for instance be found through the Matlab command are. It should be noted here that state feedback is not always possible, since the state is not always known. Later, however, we will present methods of working around this.
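As an illustration of this computation, below is a minimal Python sketch using scipy's continuous-time Riccati solver in place of the Matlab command are. The system matrices here are arbitrary placeholders, not taken from this report.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical example system; any stabilizable (A, B) pair works.
A = np.array([[0.0, 1.0],
              [-1.0, -0.5]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)  # state weight, Q >= 0
R = np.eye(1)  # input weight, R > 0

# Solve the algebraic Riccati equation (2.4) for Xbar.
Xbar = solve_continuous_are(A, B, Q, R)

# Feedback matrix from equation (2.3); the control law is u = -F x.
F = np.linalg.solve(R, B.T @ Xbar)
```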


2.1.2 The discounted cost solution

Instead of the cost function defined previously, we can also define a discounted cost function. This cost function weighs costs incurred in the near future more heavily than costs incurred in the far future. A common way to define this discounted cost function is through

J = \int_0^\infty e^{2\alpha t} \left( x^T Q x + u^T R u \right) dt. (2.5)

Alternatively, we can replace the weighting factor e^{2αt} by γ^t, where we call α < 0 the discount exponent and 0 < γ < 1 the discount rate.

The solution to this problem is actually quite elegant. If we replace the system matrix A by A_α ≡ A + αI (with the '≡' sign meaning 'equals per definition'), then this problem reduces to our earlier problem. (See theorem B.3.) As a result, the optimal control law is given by u = −Fx, where now F is given by

F = R^{-1} B^T \bar{X}_\alpha, (2.6)

with \bar{X}_\alpha the solution to

A_\alpha^T \bar{X}_\alpha + \bar{X}_\alpha A_\alpha + Q - \bar{X}_\alpha B R^{-1} B^T \bar{X}_\alpha = 0. (2.7)

By discounting a cost function like this, we can potentially also deal with non-stabilizable systems. Previously, J only had a finite value when the system was stabilizable, that is, when it was possible to get the largest eigenvalue λ_max(A − BF) of A − BF below zero. Now things are slightly different: J has a finite value if it is possible to get λ_max(A + αI − BF) below zero, or equivalently if it is possible to get λ_max(A − BF) below −α. After all, by adding αI to a matrix, we increase all eigenvalues of that matrix by α.

Alternatively, we might also choose a positive value for α. By doing so, J only has a finite value when A − BF has all eigenvalues below −α. So, by choosing α > 0, we give the system a prescribed degree of stability. In this report, we consider all possible values of α. When a theorem only holds for specific values of α, we mention this explicitly.
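The reduction above is a one-line change to the previous sketch: shift the system matrix before solving the same Riccati equation. A minimal continuation (the value of α is an arbitrary assumption for illustration):

```python
# Discounted LQ via the shift A_alpha = A + alpha*I (theorem B.3, equation (2.7)).
alpha = -0.2  # arbitrary discount exponent for illustration
A_alpha = A + alpha * np.eye(A.shape[0])
Xbar_alpha = solve_continuous_are(A_alpha, B, Q, R)
F_alpha = np.linalg.solve(R, B.T @ Xbar_alpha)  # equation (2.6), with u = -F_alpha x
```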

2.2 Basic theories of LQG control with process noise

Previously we have looked at a linear system without any noise. Now we add Gaussian process noise. Since this also turns the state into a Gaussian random variable, we call it the Linear Quadratic Gaussian (LQG) paradigm.

2.2.1 A system with process noise

Consider a linear system with process noise v. So,

\dot{x} = A x + B u + v, (2.8)

where v is white noise with intensity V. That is, E[v(t)] = 0 and E[v(t_1) v^T(t_2)] = V \delta(t_1 - t_2), where δ is the Dirac delta function, defined as

\delta(t) = \begin{cases} 0 & \text{if } t \neq 0 \\ \infty & \text{if } t = 0 \end{cases} \quad \text{and} \quad \int_{-\infty}^{\infty} \delta(t) \, dt = 1. (2.9)

The concept of true 'white noise' is physically impossible in real life, just like a real-life system is never truly linear, but we make these approximations so we can conveniently analyze the system.

The question we can now ask ourselves is: is the control law of the previous section also optimal for this problem? Or do we need another control law?

The answer is that the control law is still optimal. The reason is that it is the optimal law for reducing any disturbance in x back to zero. The process noise v will create disturbances – nothing can be done about that – but once these disturbances are present, using u = −Fx is still the best way to get rid of them as quickly as possible.

The conclusion is: the presence of process noise does not directly affect our controller design.


2.2.2 Steady-state distribution

When process noise v is present, it will be impossible to reduce the state x to zero. The noise just keeps disturbing the state, and it is impossible to predict what exactly the state will turn out to be. However, we can say something about the distribution of the state.

Consider the system without input

\dot{x} = A x + v. (2.10)

(We don't consider the input u for simplicity, but in case there is an input and some control law u = −Fx, then replacing A by A − BF will solve that problem.) The state now progresses as

x(t) = e^{At} x_0 + \int_0^t e^{A(t-s)} v(s) \, ds. (2.11)

Note that the influence of v(s) on the (later) state x(t) is damped for a time t − s by the system dynamics; hence the above equation. According to theorem B.4, the expected value of the state becomes

E[x(t)] = e^{At} E[x_0] = e^{At} \mu_0, (2.12)

where we have defined μ_0 = E[x_0]. So in the limit t → ∞, the mean state converges to zero (assuming A is stable). We can derive a similar expression (again see theorem B.4) for the variance. It equals

\Sigma(t) \equiv E[x(t) x^T(t)] = e^{At} \left( \Sigma_0 - X^V \right) e^{A^T t} + X^V, (2.13)

where we have defined Σ(t) as shown above and Σ_0 as Σ(0), and where X^V can be found by solving the Lyapunov equation

A X^V + X^V A^T + V = 0. (2.14)

(This can be done through the Matlab command lyap.) Again we can look at the limit t → ∞. For a stable system, the first term vanishes and we are left with the steady-state state variance X^V.
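As a sketch of how these two quantities can be evaluated numerically (scipy's Lyapunov solver standing in for the Matlab command lyap; the matrices below are arbitrary assumed values):

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov

A = np.array([[-1.0, 0.5],
              [0.0, -2.0]])   # arbitrary stable example matrix
V = np.eye(2)                 # process noise intensity
Sigma0 = np.diag([2.0, 1.0])  # initial second moment E[x0 x0^T]

# Steady-state variance X^V from the Lyapunov equation (2.14):
# A X^V + X^V A^T + V = 0, i.e. pass -V as the constant term.
XV = solve_continuous_lyapunov(A, -V)

def Sigma(t):
    """Second moment Sigma(t) of the state, equation (2.13)."""
    eAt = expm(A * t)
    return eAt @ (Sigma0 - XV) @ eAt.T + XV
```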

2.3 Basic theories of LQG control with measurement noise

Previously we added process noise to our system, but we still assumed full knowledge of the state x. But what happens when we only have noisy measurements of the state?

2.3.1 Adding measurement noise to the system equations

When we add measurement noise to our system, the system equations look like

\dot{x} = A x + B u + v, (2.15)
y = C x + w, (2.16)

where C is the output matrix and w is the measurement noise. The noise w is assumed to be white noise with intensity W.

2.3.2 Constructing an observer when the state isn't known

To solve the problem that we cannot directly measure the state, we construct an observer. This observer makes use of a state estimate \hat{x}. It is initialized at what we expect the state x to be: \hat{x}_0 = E[x_0]. Then we update \hat{x} through

\dot{\hat{x}} = A \hat{x} + B u + K (y - C \hat{x}). (2.17)

The last term in this equation serves to adjust \hat{x} toward x in case measurements indicate there is a discrepancy. The amount by which it is adjusted depends on the observer gain matrix K. But what should it be?


What we want to do is minimize the state estimation error e = \hat{x} − x. This error behaves like

\dot{e} = \dot{\hat{x}} - \dot{x} (2.18)
       = A \hat{x} + B u + K (y - C \hat{x}) - A x - B u - v
       = A (\hat{x} - x) + K (C x + w - C \hat{x}) - v
       = (A - KC) e + K w - v.

Assuming A − KC is stable, we can apply theorem B.4 to this. We see that the error e, as t → ∞, converges to a zero mean and a steady-state error variance E. According to theorem B.5, the minimum steady-state error variance can be found by solving

A E + E A^T + V - E C^T W^{-1} C E = 0. (2.19)

The corresponding optimal observer gain K is

K = E C^T W^{-1}. (2.20)

This is the observer gain which reduces the error e as quickly as possible, given all the noise present in our system.
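Equation (2.19) is the dual of the control Riccati equation (2.4), so the same solver can be reused by transposing the system matrices. A minimal sketch with assumed placeholder matrices:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Hypothetical system, output and noise intensity matrices.
A = np.array([[0.0, 1.0],
              [-1.0, -0.5]])
C = np.array([[1.0, 0.0]])
V = np.eye(2)        # process noise intensity
W = 0.1 * np.eye(1)  # measurement noise intensity

# Equation (2.19) is the control ARE with A -> A^T and B -> C^T.
E = solve_continuous_are(A.T, C.T, V, W)

# Optimal observer gain K = E C^T W^{-1}, equation (2.20).
K = E @ C.T @ np.linalg.inv(W)
```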

2.3.3 Applying the observer

Now that we have an observer, we can control the system again. Theorem B.7 argues that, to do so, we should simply pretend that the state estimate \hat{x} is the real state x and apply the control input u = -F \hat{x}. This is known as the separation principle.

When we do, we actually get two system equations, one for x and one for \hat{x}. They are

\dot{x} = A x - B F \hat{x} + v, (2.21)
\dot{\hat{x}} = A \hat{x} - B F \hat{x} + K (C x + w - C \hat{x}). (2.22)

We can put all this in matrix form according to

\begin{bmatrix} \dot{x} \\ \dot{\hat{x}} \end{bmatrix} = \begin{bmatrix} A & -BF \\ KC & A - BF - KC \end{bmatrix} \begin{bmatrix} x \\ \hat{x} \end{bmatrix} + \begin{bmatrix} v \\ Kw \end{bmatrix}. (2.23)

It is interesting to note here that, if we write \dot{\bar{x}} = \bar{A} \bar{x} + \bar{v}, with

\bar{x} = \begin{bmatrix} x \\ \hat{x} \end{bmatrix}, (2.24)

\bar{A} = \begin{bmatrix} A & -BF \\ KC & A - BF - KC \end{bmatrix}, (2.25)

\bar{V} = \begin{bmatrix} V & 0 \\ 0 & K W K^T \end{bmatrix}, (2.26)

then we have reduced our system to the same form as system (2.10), which didn't have any measurement noise. In fact, due to the observer, the measurement noise has become part of the process noise. So later on, when further investigating systems, we don't even have to separately investigate measurement noise. We can simply examine systems of the form \dot{x} = A x + v.
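A short sketch of how the augmented matrices \bar{A} and \bar{V} can be assembled (the matrices and gains below are hypothetical placeholders; in practice they come from the design steps above):

```python
import numpy as np

# Hypothetical placeholder data; F and K would come from the earlier sketches.
A = np.array([[0.0, 1.0], [-1.0, -0.5]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
V, W = np.eye(2), 0.1 * np.eye(1)
F = np.array([[1.0, 1.0]])    # state feedback gain, u = -F xhat
K = np.array([[1.0], [1.0]])  # observer gain

n = A.shape[0]
# Augmented dynamics and noise intensity, equations (2.25) and (2.26).
Abar = np.block([[A,     -B @ F],
                 [K @ C, A - B @ F - K @ C]])
Vbar = np.block([[V,                np.zeros((n, n))],
                 [np.zeros((n, n)), K @ W @ K.T]])
```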

2.3.4 An alternative way of writing the combined system

Instead of writing the system using x and \hat{x} in the state, we can also write it using x and e in the state. This time, the system equations are given by

\dot{x} = (A - BF) x - BF e + v, (2.27)
\dot{e} = (A - KC) e + K w - v. (2.28)


If we put this in matrix form, we get

\begin{bmatrix} \dot{x} \\ \dot{e} \end{bmatrix} = \begin{bmatrix} A - BF & -BF \\ 0 & A - KC \end{bmatrix} \begin{bmatrix} x \\ e \end{bmatrix} + \begin{bmatrix} v \\ Kw - v \end{bmatrix}. (2.29)

This may in some cases be advantageous, because x does not affect e. So it is possible to first solve for e and only then solve for x, instead of solving for two quantities simultaneously.


Chapter 3

Statistical properties of the cost

The results of the previous chapter are well known in the literature and described in most textbooks on LQ regulators. Something which is studied far less intensively is the outcome of the cost function J. This is why it deserves a closer look.

In section 3.1 we state the exact problem that we aim to solve. In section 3.2 we then solve this problem using Lyapunov equation solutions. In section 3.3 we derive similar expressions using only matrix exponentials. Finally, in section 3.4 we apply the equations to verify their accuracy.

3.1 The problem setting

In section 1.1 we already briefly examined the problem setting: we want to examine the LQG cost. When doing so, we only consider systems of the form

\dot{x} = A x + v. (3.1)

After all, when using a control law of the form u = −Fx, and (optionally) when using an observer, the system can always be reduced back to this form. Furthermore, we consider two types of cost functions: the infinite-time cost J and the finite-time cost J_T, respectively defined as

J = \int_0^\infty e^{2\alpha t} x^T(t) Q x(t) \, dt, (3.2)

J_T = \int_0^T e^{2\alpha t} x^T(t) Q x(t) \, dt. (3.3)

We don't consider any input u in the cost, because if there is a control law of the form u = −Fx, then our original cost function collapses to this form, as shown by

J = \int_0^\infty e^{2\alpha t} \left( x^T Q x + u^T R u \right) dt = \int_0^\infty e^{2\alpha t} x^T \left( Q + F^T R F \right) x \, dt = \int_0^\infty e^{2\alpha t} x^T \tilde{Q} x \, dt. (3.4)

We assume that v is white noise with intensity V and that the initial state x_0 is a Gaussian random variable satisfying

\mu_0 = E[x_0] \quad \text{and} \quad \Sigma_0 = E[x_0 x_0^T]. (3.5)

Note that Σ_0 is not the initial variance of x_0. Instead, if we denote the initial variance of x_0 by X_0, so that x_0 ∼ N(μ_0, X_0), then we have

X_0 \equiv V[x_0] \equiv E[(x_0 - \mu_0)(x_0 - \mu_0)^T] = E[x_0 x_0^T] - \mu_0 \mu_0^T = \Sigma_0 - \mu_0 \mu_0^T. (3.6)

Furthermore, we assume that V > 0 and X_0 > 0, and that V, X_0 and Q are symmetric. We are now interested in the distribution of J and J_T. In section 1.2 it was already mentioned that this distribution is a generalized noncentral χ² distribution, which does not have a known probability density function. As a result, we concern ourselves with finding expressions for the means E[J] and E[J_T] as well as the variances V[J] and V[J_T].


3.2 Using Lyapunov solutions to find E[J] and V[J]

It is possible to analyze the cost distribution using solutions to Lyapunov equations. The advantage is that this gives reasonably simple expressions. The downside is that there are a few cases in which this method doesn't work.

3.2.1 Terminology and notation

Before we examine the expressions for E[J], E[J_T], V[J] and V[J_T], it is wise to first discuss terminology and notation.

Terminology

We start with terminology. We call a matrix A stable (or Hurwitz) if its eigenvalues λ_1, ..., λ_n all have real parts lower than zero. In addition, we call a matrix Sylvester if it does not have two (possibly equal) eigenvalues λ_i and λ_j which satisfy λ_i + λ_j = 0. A stable matrix is always Sylvester, but the opposite doesn't always hold. (See theorem A.3.)

Notation for Lyapunov solutions

Next we discuss notation. We have already seen the notation A_{kα} ≡ A + kαI. We now define X^Q_{kα} to be the solution of the Lyapunov equation

A_{k\alpha} X^Q_{k\alpha} + X^Q_{k\alpha} A_{k\alpha}^T + Q = 0. (3.7)

So the superscript refers to the constant matrix of the Lyapunov equation, while the subscript refers to what is added to the matrix A. Similarly, we define \bar{X}^Q_{k\alpha} to be the solution of

A_{k\alpha}^T \bar{X}^Q_{k\alpha} + \bar{X}^Q_{k\alpha} A_{k\alpha} + Q = 0. (3.8)

Note that now the first matrix of the equation is A_{kα}^T instead of A_{kα}. Because this results in a different solution to the Lyapunov equation, we use the bar notation to indicate the difference between the two solutions.

It is interesting to know that there isn't always a unique solution for X^Q_{kα}. There is a unique solution if and only if A_{kα} is Sylvester. (See theorem A.1.) This solution is symmetric if and only if Q is symmetric. (See theorem A.2.)

We can expand the notation for X^Q_{kα}. We formally define X^Q_{kα}(t_1, t_2), for any matrix A, as

X^Q_{k\alpha}(t_1, t_2) = \int_{t_1}^{t_2} e^{A_{k\alpha} t} Q e^{A_{k\alpha}^T t} \, dt. (3.9)

In practice, whenever A_{kα} is Sylvester, we find X^Q_{kα}(t_1, t_2) through

X^Q_{k\alpha}(t_1, t_2) = e^{A_{k\alpha} t_1} X^Q_{k\alpha} e^{A_{k\alpha}^T t_1} - e^{A_{k\alpha} t_2} X^Q_{k\alpha} e^{A_{k\alpha}^T t_2}. (3.10)

A similar definition exists for \bar{X}^Q_{kα}(t_1, t_2), except that we replace A_{kα} by its transpose A_{kα}^T.

In practice, we often have t_1 = 0. Because of that, we shorten the notation to X^Q_{kα}(0, t) ≡ X^Q_{kα}(t). It is important to keep in mind that Σ(t) has nothing to do with this notation: Σ(t) is still defined through equation (2.13).

Notation for matrix exponentials

The last notation we discuss concerns an integral with matrix exponentials. We define

X^Q_{k_1\alpha, k_2\alpha}(T) = \int_0^T e^{A_{k_1\alpha}(T-t)} Q e^{A_{k_2\alpha} t} \, dt. (3.11)

To find X^Q_{k_1\alpha, k_2\alpha}(T), we can make use (see theorem A.15) of

X^Q_{k_1\alpha, k_2\alpha}(T) = \begin{bmatrix} I & 0 \end{bmatrix} \exp\left( \begin{bmatrix} A_{k_1\alpha} & Q \\ 0 & A_{k_2\alpha} \end{bmatrix} T \right) \begin{bmatrix} 0 \\ I \end{bmatrix}. (3.12)
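This block-exponential identity is straightforward to implement. A minimal sketch (the helper name X_expint is ours, not the report's):

```python
import numpy as np
from scipy.linalg import expm

def X_expint(A1, Q, A2, T):
    """Evaluate int_0^T e^{A1 (T-t)} Q e^{A2 t} dt via equation (3.12):
    the upper-right block of a block-triangular matrix exponential."""
    n = A1.shape[0]
    M = np.block([[A1, Q],
                  [np.zeros((n, n)), A2]])
    return expm(M * T)[:n, n:]
```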


3.2.2 The mean of the cost function

Now that notation is out of the way, we can look at the mean of the cost function.

The infinite-time cost J

Let's consider the expected value E[J] of the infinite-time cost. We can distinguish several cases.

• If α ≥ 0, then E[J] = ∞. The reason is that the noise constantly disturbs the system. As such, x(t) will never converge to zero. An infinite integral over e^{2αt} x^T Q x will therefore be infinite. (An exception occurs if V = 0.)

• If α < 0 and A_α is unstable, then E[J] = ∞. Note that in this case A is unstable too. In fact, A is so unstable that future values of x(t) diverge faster than the discount factor e^{2αt} converges to zero. (An exception occurs if X_0 is set up in such a way that the initial state x_0 is guaranteed to be in the stable eigenspace of A_α.)

• If α < 0 and A_α is stable, then E[J] can be found (see theorem B.8) through

E[J] = \mathrm{tr}\left( \left( \Sigma_0 - \frac{V}{2\alpha} \right) \bar{X}^Q_\alpha \right). (3.13)

Note that it is not necessary for A to be stable. As long as the largest (positive) eigenvalue of A is smaller than −α, the above equation holds.

Let's take a closer look at the mean cost (3.13). It is interesting that there is a clear division between costs due to the initial state and costs due to noise. The trade-off between them is determined by α. For a small time window (strongly negative α) the initial state is more important, while for a large time window (α close to zero) the noise is more important.

Another interesting fact is that in equation (3.13) it doesn't matter whether we start in a deterministic initial state x_0 ∼ N(μ_0, 0) or in a random initial state x_0 ∼ N(0, μ_0 μ_0^T). The expected value E[J] of the cost is the same. (Later we will see that the variance V[J] does differ.)

The finite-time cost J_T

Next, we look at the expected value E[J_T] of the finite-time cost. Because we are dealing with finite time, the cost J_T is always finite. However, we can only calculate it using Lyapunov solutions if both A and A_α are Sylvester. (If this is not the case, we have to resort to the more complicated methods of section 3.3.) Again, there are various cases.

• If α ≠ 0, then the expected cost E[J_T] can be found (see theorem B.9) using

E[J_T] = \mathrm{tr}\left( \left( \Sigma_0 - e^{2\alpha T} \Sigma(T) + \left( 1 - e^{2\alpha T} \right) \left( -\frac{V}{2\alpha} \right) \right) \bar{X}^Q_\alpha \right), (3.14)

where Σ(T) can be found using equation (2.13).

• If α = 0, then the expected cost E[J_T] equals (see theorem B.10)

E[J_T] = \mathrm{tr}\left( \left( \Sigma_0 - \Sigma(T) + T V \right) \bar{X}^Q \right). (3.15)

It is interesting to note that, in the limit α → 0, equation (3.14) reduces to equation (3.15). So in practice we don't even need equation (3.15): we can just take a very small value of α and use equation (3.14) instead. A numerical sketch of this computation is given below.
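As a sketch of how equation (3.14) can be evaluated with standard solvers (the function name and solver choice are our assumptions, not part of the report):

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov

def mean_cost_finite(A, Q, V, Sigma0, alpha, T):
    """E[J_T] from equation (3.14); assumes alpha != 0 and A, A_alpha Sylvester."""
    n = A.shape[0]
    A_alpha = A + alpha * np.eye(n)
    # Xbar^Q_alpha solves A_alpha^T X + X A_alpha + Q = 0, equation (3.8).
    XbarQ_alpha = solve_continuous_lyapunov(A_alpha.T, -Q)
    # X^V solves A X + X A^T + V = 0; Sigma(T) then follows from (2.13).
    XV = solve_continuous_lyapunov(A, -V)
    eAT = expm(A * T)
    SigmaT = eAT @ (Sigma0 - XV) @ eAT.T + XV
    e2aT = np.exp(2 * alpha * T)
    W = Sigma0 - e2aT * SigmaT + (1 - e2aT) * (-V / (2 * alpha))
    return np.trace(W @ XbarQ_alpha)
```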

3.2.3 The variance of the cost function

Now that we know the expected value of the cost for various cases, we examine the variance of the cost.


The infinite-time cost J

We start with the infinite-time cost J. Previously we saw that it is only finite when α < 0 and A_α is stable. In this case, we find (see theorem B.11) that the variance V[J] equals

V[J] = 2\,\mathrm{tr}\left( \Sigma_0 \bar{X}^Q_\alpha \Sigma_0 \bar{X}^Q_\alpha \right) - 2 \left( \mu_0^T \bar{X}^Q_\alpha \mu_0 \right)^2 - 2\,\mathrm{tr}\left( 4\alpha \left( X^{\Sigma_0}_{2\alpha} - X^V_{2\alpha} \right) \bar{X}^Q_\alpha \left( -\frac{V}{2\alpha} \right) \bar{X}^Q_\alpha \right). (3.16)

Again some interesting results can be seen here. First, we note that the first two terms describe the costs due to the initial state. If the initial state is known deterministically, and hence Σ_0 = μ_0 μ_0^T, then these two terms cancel out.

The influence of the noise on V[J] is described by the last term. Interestingly enough, there is also some interaction between the initial state x_0 and the noise. This is described by the term X^{Σ_0}_{2α}. If the initial state were certain to be zero, then this term would vanish too, and we would indeed be left with a quite simple expression for the cost variance V[J]. Alternatively, if we start in the steady-state distribution with μ_0 = 0 and Σ_0 = X^V (assuming that A is stable and such a distribution exists), then we can also find (using theorem A.7) that V[J] reduces to

V[J] = 2\,\mathrm{tr}\left( \left( X^V - \frac{V}{2\alpha} \right) \bar{X}^Q_\alpha X^V \bar{X}^Q_\alpha \right). (3.17)

The finite-time cost J_T

Next, we consider the variance of the finite-time cost J_T. Just like for the mean, there are two cases.

• If α ≠ 0, then the cost variance V[J_T] can be found (see theorem B.12) using

V[J_T] = 2\,\mathrm{tr}\left( \Sigma_0 \bar{X}^Q_\alpha(T) \, \Sigma_0 \bar{X}^Q_\alpha(T) \right) - 2 \left( \mu_0^T \bar{X}^Q_\alpha(T) \, \mu_0 \right)^2 (3.18)
       + 4\,\mathrm{tr}\left( \left( \frac{X^{\Sigma_0}_{2\alpha}(T) - X^V_{2\alpha}(T)}{4\alpha} + e^{4\alpha T} X^{X^V}(T) \right) \bar{X}^Q_\alpha V \bar{X}^Q_\alpha \right)
       + 2\,\mathrm{tr}\left( e^{4\alpha T} \, \Sigma(T) \, \bar{X}^Q_\alpha \, \Sigma(T) \, \bar{X}^Q_\alpha \right)
       - 2\,\mathrm{tr}\left( \Sigma_0 e^{A_\alpha^T T} \bar{X}^Q_\alpha e^{A_\alpha T} \, \Sigma_0 e^{A_\alpha^T T} \bar{X}^Q_\alpha e^{A_\alpha T} \right)
       + 8\,\mathrm{tr}\left( \left( X^V - \Sigma_0 \right) e^{A_\alpha^T T} \bar{X}^Q_\alpha X^V X^{\bar{X}^Q_\alpha}_{\alpha,3\alpha}(T) \right)
       - 8\,\mathrm{tr}\left( e^{4\alpha T} X^V X^{\bar{X}^Q_\alpha}_{-\alpha}(T) \, V \bar{X}^Q_\alpha \right).

The requirements for using this equation are that A_{−α}, A, A_α and A_{2α} are Sylvester.

• If α = 0, then the cost variance V[J_T] equals (see theorem B.13)

V[J_T] = 2\,\mathrm{tr}\left( \Sigma_0 \bar{X}^Q(T) \, \Sigma_0 \bar{X}^Q(T) \right) - 2 \left( \mu_0^T \bar{X}^Q(T) \, \mu_0 \right)^2 (3.19)
       + 4\,\mathrm{tr}\left( \left( X^{\Sigma_0}(T) - X^{X^V}(T) + T X^V \right) \bar{X}^Q V \bar{X}^Q \right)
       + 2\,\mathrm{tr}\left( \Sigma(T) \, \bar{X}^Q \, \Sigma(T) \, \bar{X}^Q \right)
       - 2\,\mathrm{tr}\left( \Sigma_0 e^{A^T T} \bar{X}^Q e^{A T} \, \Sigma_0 e^{A^T T} \bar{X}^Q e^{A T} \right)
       + 8\,\mathrm{tr}\left( \left( X^V - \Sigma_0 \right) e^{A^T T} \bar{X}^Q X^V \bar{X}^Q \right) - 8\,\mathrm{tr}\left( X^V X^{\bar{X}^Q}(T) \, V \bar{X}^Q \right).

The requirement for this theorem is a bit simpler: only A needs to be Sylvester.

Let's study the above equations a bit more closely. If α < 0 and A_α is stable, then equation (3.18) turns into equation (3.16) as T → ∞. To see how this happens, keep in mind that e^{4αT} → 0, e^{A_α T} → 0 and X^Q_{kα}(T) → X^Q_{kα} as T → ∞. A similar thing happens when we consider the limit α → 0, although now equation (3.18) turns into equation (3.19).

3.3 Using matrix exponentials to find E[J_T] and V[J_T]

With the equations of the previous section, we can easily find E[J], V[J], E[J_T] and V[J_T] whenever the cost is finite. The equations almost always work. It is only when A_{−α}, A, A_α or A_{2α} is not Sylvester that there is a problem. In this section, we consider a method which circumvents that problem and gives us equations which work for any matrix A.


3.3.1 New expressions for E[J_T] and V[J_T]

In our new method for finding E[J_T] and V[J_T], we first define the matrix

C = \begin{bmatrix} -A_{2\alpha}^T & Q & 0 & 0 & 0 \\ 0 & A & V & 0 & 0 \\ 0 & 0 & -A^T & Q & 0 \\ 0 & 0 & 0 & A_{2\alpha} & V \\ 0 & 0 & 0 & 0 & -A_{-2\alpha}^T \end{bmatrix}. (3.20)

We then calculate e^{CT} and write it as

e^{CT} = \begin{bmatrix} C^e_{11} & C^e_{12} & C^e_{13} & C^e_{14} & C^e_{15} \\ C^e_{21} & C^e_{22} & C^e_{23} & C^e_{24} & C^e_{25} \\ C^e_{31} & C^e_{32} & C^e_{33} & C^e_{34} & C^e_{35} \\ C^e_{41} & C^e_{42} & C^e_{43} & C^e_{44} & C^e_{45} \\ C^e_{51} & C^e_{52} & C^e_{53} & C^e_{54} & C^e_{55} \end{bmatrix}. (3.21)

Note that each block C^e_{ij} depends on T, but for notational simplicity we have omitted this dependency.

Once we have e^{CT}, we can find E[J_T] quite quickly (see theorem B.14) by using

E[J_T] = \mathrm{tr}\left( (C^e_{44})^T \left( C^e_{12} \Sigma_0 + C^e_{13} \right) \right). (3.22)

Similarly, we can find V[J_T] (see theorem B.15) through

V[J_T] = 2\,\mathrm{tr}\left( \left( (C^e_{44})^T \left( C^e_{12} \Sigma_0 + C^e_{13} \right) \right)^2 - 2 (C^e_{44})^T \left( C^e_{14} \Sigma_0 + C^e_{15} \right) \right) - 2 \left( \mu_0^T (C^e_{44})^T C^e_{12} \mu_0 \right)^2. (3.23)

This is a very quick and efficient way of calculating E[J_T] and V[J_T].
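The recipe (3.20)-(3.23) translates directly into code. A minimal sketch (the names are ours; μ_0 is taken as a one-dimensional vector):

```python
import numpy as np
from scipy.linalg import expm

def cost_moments_expm(A, Q, V, mu0, Sigma0, alpha, T):
    """E[J_T] and V[J_T] via equations (3.20)-(3.23); works for any A and alpha."""
    n = A.shape[0]
    I, Z = np.eye(n), np.zeros((n, n))
    A2a = A + 2 * alpha * I
    Am2a = A - 2 * alpha * I
    # Block matrix C from equation (3.20).
    Cm = np.block([
        [-A2a.T, Q, Z, Z, Z],
        [Z, A, V, Z, Z],
        [Z, Z, -A.T, Q, Z],
        [Z, Z, Z, A2a, V],
        [Z, Z, Z, Z, -Am2a.T],
    ])
    E = expm(Cm * T)
    blk = lambda i, j: E[(i - 1) * n:i * n, (j - 1) * n:j * n]  # block C^e_ij
    C12, C13, C14, C15, C44 = (blk(1, 2), blk(1, 3), blk(1, 4),
                               blk(1, 5), blk(4, 4))
    M = C44.T @ (C12 @ Sigma0 + C13)
    EJ = np.trace(M)                                             # equation (3.22)
    VJ = (2 * np.trace(M @ M - 2 * C44.T @ (C14 @ Sigma0 + C15))
          - 2 * float(mu0 @ C44.T @ C12 @ mu0) ** 2)             # equation (3.23)
    return EJ, VJ
```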

3.3.2 Comparing the two methods

We have now seen two methods of finding E[J_T] and V[J_T]: using Lyapunov solutions and using only matrix exponentials. The second method, using matrix exponentials, works for any value of A and α, which is a very significant advantage. Another advantage is that the expressions are shorter, making the calculations easier.

On the flip side, the main advantage of using Lyapunov solutions is that the resulting equations also work for T → ∞. Furthermore, using Lyapunov solutions results in equations with a more intuitive meaning. After all, X^V actually has a physical meaning (it is the steady-state state variance), while a parameter like C^e_{12}(T) is a more abstract mathematical construct.

Other than those reasons, the choice between Lyapunov equations and matrix exponentials can also depend on the numerical properties of both methods. In [WAG14] a comparison is made between using matrix exponentials and Lyapunov solutions to find X^Q(T). That paper argues that for relatively small time steps T, the matrix exponential method has better numerical accuracy, while for relatively large time steps, using Lyapunov solutions can offer numerical advantages. These same conclusions are likely to hold for finding E[J_T] and V[J_T] as well, but to what degree that is the case is an interesting topic for future research.

3.4 Applying the derived equations

Now that we can calculate the mean and the variance of J and J_T for all values of α and A, it is time to check whether these equations give the right results, as well as to examine how they can be applied.


3.4.1 A simulation verifying the derived equations

We have set up a two-state LQG system \dot{x} = A x + v, in which A was chosen as

A = \begin{bmatrix} 1/4 & 1/2 \\ -1/7 & -1/3 \end{bmatrix}, (3.24)

because it has interesting eigenvalues at 0.075 and −0.16. Subsequently, the distribution of the initial state has been chosen to satisfy

\mu_0 \equiv E[x_0] = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, (3.25)

\Sigma_0 \equiv E[x_0 x_0^T] = \begin{bmatrix} 5 & 0 \\ 0 & 3 \end{bmatrix}, (3.26)

for no other reason than an unhealthy affinity with Fibonacci numbers. The simulation time was set at T = 3, as cost function weight we used Q = I, and finally the noise intensity V was set at

V = \begin{bmatrix} 3 & -2 \\ -2 & 8 \end{bmatrix} (3.27)

to let the noise and the initial state contribute roughly equally to the cost.

For this system, we have run n = 90 000 numerical simulations. The mean and standard deviation of the cost J_T, for different values of α, are shown in table 3.1. This table also compares these experimental results with the theoretical results given by equations (3.14), (3.15), (3.18) and (3.19). Alternatively, we could of course have used the matrix exponential equations (3.22) and (3.23), but these give (as expected) exactly the same results, barring minor numerical differences.

Table 3.1: The mean and standard deviation of the cost J_T after 90 000 experiments. Experimental data is compared to theoretical predictions by equations (3.14), (3.15), (3.18) and (3.19).

α                          0.05    0      −0.05   −0.1
Experimental E[J_T]        97.5    80.6   67.0    56.0
Theoretical  E[J_T]        97.6    80.5   66.9    55.9
Experimental √V[J_T]       96.4    78.8   64.9    53.7
Theoretical  √V[J_T]       94.6    77.3   63.5    52.6

In table 3.1 we can see that the experimental and the theoretical results are nearly identical. There are minor differences, but these can all be explained by statistical variations. As a result, we can conclude that the derived equations are very likely correct.

It is also interesting to note that the mean and the standard deviation of J_T are of the same order of magnitude. This may seem strange, because for positive (semi-)definite Q and R the cost J_T cannot be negative. The reason can be seen by examining the probability density function of J_T, which is shown in Figure 3.1. Most of the time, the cost J_T is lower than the mean E[J_T]. However, when the cost is higher than this mean, it may be a lot higher, increasing the standard deviation of the distribution.
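For reference, below is a heavily simplified Monte Carlo sketch of such a verification, using an Euler-Maruyama discretization of \dot{x} = Ax + v and the system data (3.24)-(3.27). The step size and the (much smaller) number of runs are our own choices, not those of the report.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[1/4, 1/2], [-1/7, -1/3]])
mu0 = np.array([1.0, -1.0])
X0 = np.diag([5.0, 3.0]) - np.outer(mu0, mu0)  # initial variance, equation (3.6)
V = np.array([[3.0, -2.0], [-2.0, 8.0]])
Q = np.eye(2)
alpha, T, dt, n_runs = -0.05, 3.0, 0.01, 2000  # far fewer runs than the 90 000 used here

Lx, Lv = np.linalg.cholesky(X0), np.linalg.cholesky(V)
costs = np.empty(n_runs)
for k in range(n_runs):
    x = mu0 + Lx @ rng.standard_normal(2)
    J = 0.0
    for i in range(int(T / dt)):
        J += np.exp(2 * alpha * i * dt) * (x @ Q @ x) * dt
        # Euler-Maruyama step; white noise enters with a sqrt(dt) scaling.
        x = x + A @ x * dt + Lv @ rng.standard_normal(2) * np.sqrt(dt)
    costs[k] = J
print(costs.mean(), costs.std())
```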

3.4.2 A simulation applying the derived equations

In the literature, one almost always uses the controller which minimizes the expected value of the cost, irrespective of the variance of the cost. However, if the goal is to keep the cost below a certain threshold, this is not always the best approach.

Consider the following system

\dot{x} = \begin{bmatrix} 1 & 0 \\ 1/20 & 1 \end{bmatrix} x + \begin{bmatrix} 1 \\ 0 \end{bmatrix} u + v, (3.28)



Figure 3.1: Probability density function of the cost J_T for various values of α (α = 0.05, 0, −0.05, −0.1). The vertical lines denote the means. This plot was generated using a histogram of the cost of 90 000 experiments performed with the system described in the text.

where it is only the first state that can be controlled directly. Furthermore, the coupling between the first and the second state is very weak. This is troublesome, as the noise v with intensity V = I does excite this second state, and with Q = I there will be penalties. Fixing the problem can be expensive though, because R = I. However, to prevent the cost from becoming very high, we use a strong discount, α = −0.8.

As control law we use a linear controller u = −Fx. The optimal control matrix, minimizing E[J] at E[J(F_opt)] = 154.4, is given by F_opt = [1.6 9.9]. However, we can also minimize V[J] using a basic gradient descent method. This gives the minimum-variance control matrix F_mv = [4.4 30.0] with mean cost E[J(F_mv)] = 187.5. This mean cost is significantly larger than E[J(F_opt)], making it seem as if F_mv is a significantly worse control matrix. This is also illustrated in Figure 3.2, which shows that using F_mv (at f = 1) results in a significantly higher mean cost than using F_opt (at f = 0).

However, now suppose that we do not care so much about the mean cost. All we want is to reduce the probability that the cost J exceeds a certain threshold \bar{J}. That is, we aim to minimize p(J > \bar{J}), where we use \bar{J} = 1500, which is roughly ten times the mean. Using 250 000 numerical simulations, with T = 20 s and dt = 0.01 s, we have found that

p(J(F_{opt}) > \bar{J}) ≈ 0.091%, (3.29)
p(J(F_{mv}) > \bar{J}) ≈ 0.059%. (3.30)

That is, the expectation-optimal control law yields roughly one and a half times as many threshold-violating cases as the minimum-variance control law, which is a significantly worse result. As such, we see that in some cases there are definite advantages to minimizing the variance V[J] instead of the mean E[J]. Alternatively, it is of course also possible to minimize some other function of the mean E[J] and variance V[J].



Figure 3.2: The cost mean E[J] and standard deviation √V[J] with respect to the control matrix F = (1 − f) F_opt + f F_mv, for varying f, when applying the control law u = −Fx in system (3.28). It is clear that minimizing the mean cost does not give the same control law as minimizing the cost variance.


Chapter 4

Conclusions

In this report, we have examined the infinite-time cost J and the finite-time cost J_T. Both quantities have generalized noncentral χ² distributions without a known probability density function. To cope with this, we have derived expressions for both the means and the variances of these two quantities. We have done so using two different methods: using Lyapunov solutions and using matrix exponentials.

The infinite-time cost J only has a finite value when α < 0 and A_α is stable. For this case, the method using matrix exponentials doesn't work. Instead, using Lyapunov solutions, we have found that

E[J] = \mathrm{tr}\left( \left( \Sigma_0 - \frac{V}{2\alpha} \right) \bar{X}^Q_\alpha \right), (4.1)

V[J] = 2\,\mathrm{tr}\left( \left( X^V - \frac{V}{2\alpha} \right) \bar{X}^Q_\alpha X^V \bar{X}^Q_\alpha \right), (4.2)

where the variance expression (4.2) is the special case of starting in the steady-state distribution (μ_0 = 0 and Σ_0 = X^V); the general expression is equation (3.16).

The finite-time cost J_T is always finite. For this cost, we can consider various cases. If α ≠ 0, then

E[J_T] = \mathrm{tr}\left( \left( \Sigma_0 - e^{2\alpha T} \Sigma(T) + \left( 1 - e^{2\alpha T} \right) \left( -\frac{V}{2\alpha} \right) \right) \bar{X}^Q_\alpha \right), (4.3)

V[J_T] = 2\,\mathrm{tr}\left( \Sigma_0 \bar{X}^Q_\alpha(T) \, \Sigma_0 \bar{X}^Q_\alpha(T) \right) - 2 \left( \mu_0^T \bar{X}^Q_\alpha(T) \, \mu_0 \right)^2 (4.4)
       + 4\,\mathrm{tr}\left( \left( \frac{X^{\Sigma_0}_{2\alpha}(T) - X^V_{2\alpha}(T)}{4\alpha} + e^{4\alpha T} X^{X^V}(T) \right) \bar{X}^Q_\alpha V \bar{X}^Q_\alpha \right)
       + 2\,\mathrm{tr}\left( e^{4\alpha T} \, \Sigma(T) \, \bar{X}^Q_\alpha \, \Sigma(T) \, \bar{X}^Q_\alpha \right)
       - 2\,\mathrm{tr}\left( \Sigma_0 e^{A_\alpha^T T} \bar{X}^Q_\alpha e^{A_\alpha T} \, \Sigma_0 e^{A_\alpha^T T} \bar{X}^Q_\alpha e^{A_\alpha T} \right)
       + 8\,\mathrm{tr}\left( \left( X^V - \Sigma_0 \right) e^{A_\alpha^T T} \bar{X}^Q_\alpha X^V X^{\bar{X}^Q_\alpha}_{\alpha,3\alpha}(T) \right)
       - 8\,\mathrm{tr}\left( e^{4\alpha T} X^V X^{\bar{X}^Q_\alpha}_{-\alpha}(T) \, V \bar{X}^Q_\alpha \right).

For the first equation, we require A and A_α to be Sylvester. For the second, we additionally require A_{−α} and A_{2α} to be Sylvester. On the other hand, if α = 0, then we have

E[J_T] = \mathrm{tr}\left( \left( \Sigma_0 - \Sigma(T) + T V \right) \bar{X}^Q \right), (4.5)

V[J_T] = 2\,\mathrm{tr}\left( \Sigma_0 \bar{X}^Q(T) \, \Sigma_0 \bar{X}^Q(T) \right) - 2 \left( \mu_0^T \bar{X}^Q(T) \, \mu_0 \right)^2 (4.6)
       + 4\,\mathrm{tr}\left( \left( X^{\Sigma_0}(T) - X^{X^V}(T) + T X^V \right) \bar{X}^Q V \bar{X}^Q \right)
       + 2\,\mathrm{tr}\left( \Sigma(T) \, \bar{X}^Q \, \Sigma(T) \, \bar{X}^Q \right)
       - 2\,\mathrm{tr}\left( \Sigma_0 e^{A^T T} \bar{X}^Q e^{A T} \, \Sigma_0 e^{A^T T} \bar{X}^Q e^{A T} \right)
       + 8\,\mathrm{tr}\left( \left( X^V - \Sigma_0 \right) e^{A^T T} \bar{X}^Q X^V \bar{X}^Q \right) - 8\,\mathrm{tr}\left( X^V X^{\bar{X}^Q}(T) \, V \bar{X}^Q \right),

where we now only require A to be Sylvester.

If the requirements on A cause problems, then we can use matrix exponentials instead of Lyapunov solutions. If we define C as

C = \begin{bmatrix} -A_{2\alpha}^T & Q & 0 & 0 & 0 \\ 0 & A & V & 0 & 0 \\ 0 & 0 & -A^T & Q & 0 \\ 0 & 0 & 0 & A_{2\alpha} & V \\ 0 & 0 & 0 & 0 & -A_{-2\alpha}^T \end{bmatrix}, (4.7)

and subsequently calculate e^{CT} and write it as

e^{CT} = \begin{bmatrix} C^e_{11}(T) & C^e_{12}(T) & C^e_{13}(T) & C^e_{14}(T) & C^e_{15}(T) \\ C^e_{21}(T) & C^e_{22}(T) & C^e_{23}(T) & C^e_{24}(T) & C^e_{25}(T) \\ C^e_{31}(T) & C^e_{32}(T) & C^e_{33}(T) & C^e_{34}(T) & C^e_{35}(T) \\ C^e_{41}(T) & C^e_{42}(T) & C^e_{43}(T) & C^e_{44}(T) & C^e_{45}(T) \\ C^e_{51}(T) & C^e_{52}(T) & C^e_{53}(T) & C^e_{54}(T) & C^e_{55}(T) \end{bmatrix}, (4.8)

then we can also find E[J_T] and V[J_T] through

E[J_T] = \mathrm{tr}\left( (C^e_{44}(T))^T \left( C^e_{12}(T) \Sigma_0 + C^e_{13}(T) \right) \right), (4.9)

V[J_T] = 2\,\mathrm{tr}\left( \left( (C^e_{44}(T))^T C^e_{12}(T) \Sigma_0 + (C^e_{44}(T))^T C^e_{13}(T) \right)^2 - 2 (C^e_{44}(T))^T \left( C^e_{14}(T) \Sigma_0 + C^e_{15}(T) \right) \right) - 2 \left( \mu_0^T (C^e_{44}(T))^T C^e_{12}(T) \mu_0 \right)^2. (4.10)

These equations work for any matrix A and any value of α. They just don't work when T → ∞. As a result, it is sometimes more convenient to use Lyapunov solutions, while at other times it is more useful to apply matrix exponentials.


Appendix A

Mathematical theorems

In this appendix, we list various mathematical theorems on a variety of subjects. We start by examining solutions of Lyapunov equations in section A.1, continue with matrix exponential methods in section A.2 and power forms of Gaussian random variables in section A.3, and bundle all other proofs in section A.4.

A.1 Solutions to Lyapunov equations

In this section we look at solutions to various Lyapunov equations, what properties they have and how they are related. We use the terminology and notation described in subsection 3.2.1, so make sure you are familiar with it.

A.1.1 Basic properties of Lyapunov solutions

Let's consider a Lyapunov equation. When does it have a solution? And when is that solution unique?

Theorem A.1. There is a unique solution X^Q (or equivalently \bar{X}^Q) to the Lyapunov equation

A X^Q + X^Q A^T + Q = 0 (A.1)

if and only if A is Sylvester.

Proof. A more general variant of this theorem concerns the Sylvester equation

A X + X B = Q. (A.2)

It is known in the literature (see for instance [BS72] or [Ant05]) that this equation has a unique solution if and only if A and −B do not have any common eigenvalues. To use this in our situation, we substitute A^T for B. So the Lyapunov equation has a unique solution if and only if A and −A^T do not have any common eigenvalues.

But when is this the case? Let's denote the eigenvalues of A by λ_1, ..., λ_n. The eigenvalues of −A^T then equal −λ_1, ..., −λ_n. After all, transposing a (square) matrix does not alter its eigenvalues, and if λ is an eigenvalue of A, then −λ is an eigenvalue of −A. We hence see that A and −A^T have a common eigenvalue if and only if A has two eigenvalues λ_i and λ_j (with possibly i = j) which satisfy λ_i = −λ_j. This is (per definition) the case if and only if A is not Sylvester. So we can conclude that the Lyapunov equation has a unique solution if and only if A is Sylvester.
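This eigenvalue characterization gives a simple numerical test for the Sylvester property. A minimal sketch (the tolerance is an arbitrary assumption):

```python
import numpy as np

def is_sylvester(A, tol=1e-9):
    """True if no two (possibly equal) eigenvalues of A sum to zero."""
    lam = np.linalg.eigvals(A)
    return bool(np.all(np.abs(lam[:, None] + lam[None, :]) > tol))
```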

It is interesting to note that the existence of a unique solution does not depend on Q. However, when there is no unique solution (for instance when A = 0), the value of Q does determine whether there are zero or infinitely many solutions.

We continue by figuring out some more properties of the Lyapunov equation solution. First we consider the influence of symmetry.

Theorem A.2. Assume that A is Sylvester. The Lyapunov solution X^Q of equation (A.1) is symmetric if and only if Q is symmetric.


Proof. We know that X^Q is the unique solution to

A X^Q + X^Q A^T + Q = 0. (A.3)

If we transpose this equation, we get

A (X^Q)^T + (X^Q)^T A^T + Q^T = 0. (A.4)

If Q is symmetric, the above two equations are identical. Because they have a unique solution, we must have X^Q = (X^Q)^T, and hence X^Q must be symmetric. This proves one side of the theorem.

To prove the other side, we subtract the above equations from each other. This results in

A \left( X^Q - (X^Q)^T \right) + \left( X^Q - (X^Q)^T \right) A^T + \left( Q - Q^T \right) = 0. (A.5)

If Q is not symmetric, then Q − Q^T is nonzero. As a result, the unique solution X^Q − (X^Q)^T of the above Lyapunov equation must also be nonzero; after all, the zero solution clearly doesn't satisfy it. Hence we must have X^Q ≠ (X^Q)^T. In other words, if Q is not symmetric, then X^Q is not symmetric. This proves the second side of the theorem.

We could also ask ourselves what happens when we put some extra requirements on A. For instance, what happens when A is stable?

Theorem A.3. Assume that A is a stable (Hurwitz) matrix. Then A is Sylvester, the Lyapunov equation (A.1) has a unique solution X^Q, and this solution equals the infinite integral

X^Q = \int_0^\infty e^{At} Q e^{A^T t} \, dt. (A.6)

Proof. The first claim is that A is Sylvester. This follows directly from the assumption that A is stable. After all, for a stable matrix A, all eigenvalues λ_1, ..., λ_n have negative real parts. This means that for every two eigenvalues λ_i and λ_j, the sum λ_i + λ_j also has a negative real part. So there cannot be two eigenvalues λ_i and λ_j satisfying λ_i + λ_j = 0, which means A is Sylvester.

Because A is Sylvester, theorem A.1 immediately tells us that the solution X^Q exists and is unique.

Now we only need to prove that X^Q equals the infinite integral. Because A is stable, we know that

\lim_{t \to \infty} e^{At} = 0. (A.7)

As a result, we can write Q as

Q = -\left[ e^{At} Q e^{A^T t} \right]_0^\infty (A.8)
  = -\int_0^\infty \frac{d}{dt} \left( e^{At} Q e^{A^T t} \right) dt
  = -\int_0^\infty \left( A e^{At} Q e^{A^T t} + e^{At} Q e^{A^T t} A^T \right) dt
  = -A \left( \int_0^\infty e^{At} Q e^{A^T t} \, dt \right) - \left( \int_0^\infty e^{At} Q e^{A^T t} \, dt \right) A^T.

We see that the equation above is a Lyapunov equation, with the quantity within brackets as the solution. Because the solution exists, is unique and equals X^Q, the quantity within brackets must equal X^Q. So we indeed have

X^Q = \int_0^\infty e^{At} Q e^{A^T t} \, dt. (A.9)


It may be interesting to note that this theorem also holds for \bar{X}^Q, except that now

\bar{X}^Q = \int_0^\infty e^{A^T t} Q e^{A t} \, dt. (A.10)

In fact, all theorems concerning X^Q also hold for \bar{X}^Q if we replace A by A^T.

At this point you may be wondering: what if A is not stable? In that case the above integral does not have a finite value, even though there might still be a unique finite solution for X^Q. As a result, X^Q then does not equal the integral. To solve this problem, we can change the bounds of the integral to something finite. We wind up with X^Q(t_1, t_2), which is formally defined as

X^Q(t_1, t_2) = \int_{t_1}^{t_2} e^{At} Q e^{A^T t} \, dt. (A.11)

This definition also works for non-Sylvester matrices A. However, for Sylvester matrices A it has some extra properties.

Theorem A.4. Assume that A is Sylvester. In this case X^Q(t_1, t_2) can either be found by solving the Lyapunov equation

A X^Q(t_1, t_2) + X^Q(t_1, t_2) A^T + e^{At_1} Q e^{A^T t_1} - e^{At_2} Q e^{A^T t_2} = 0, (A.12)

or by first finding X^Q and then using

X^Q(t_1, t_2) = e^{At_1} X^Q e^{A^T t_1} - e^{At_2} X^Q e^{A^T t_2}. (A.13)

Proof. This theorem consists of two parts. To prove the first part, we consider the quantity e^{At_1} Q e^{A^T t_1} − e^{At_2} Q e^{A^T t_2}. It equals

e^{At_1} Q e^{A^T t_1} - e^{At_2} Q e^{A^T t_2} = -\left[ e^{At} Q e^{A^T t} \right]_{t_1}^{t_2} (A.14)
= -\int_{t_1}^{t_2} \frac{d}{dt} \left( e^{At} Q e^{A^T t} \right) dt
= -A \left( \int_{t_1}^{t_2} e^{At} Q e^{A^T t} \, dt \right) - \left( \int_{t_1}^{t_2} e^{At} Q e^{A^T t} \, dt \right) A^T
= -A X^Q(t_1, t_2) - X^Q(t_1, t_2) A^T.

This shows that X^Q(t_1, t_2) indeed satisfies Lyapunov equation (A.12), proving the first part of the theorem.

This shows that XQ(t1, t2) indeed satisfies Lyapunov equation (A.12), proving the first part of thetheorem.

To prove the second part too, we make use of the expression AXQ +XQAT +Q = 0 and of thematrix property eAtA = AeAt. This allows us to find

eAt1QeAT t1 − eAt2QeA

T t2 =− eAt1(AXQ +XQAT )eAT t1 + eAt2(AXQ +XQAT )eA

T t2 (A.15)

=− eAt1AXQeAT t1 − eAt1XQAT eA

T t1 + eAt2AXQeAT t2 + eAt2XQAT eA

T t2

=−AeAt1XQeAT t1 − eAt1XQeA

T t1AT +AeAt2XQeAT t2 + eAt2XQeA

T t2AT

=−A(eAt1XQeAT t1 − eAt2XQeA

T t2)− (eAt1XQeAT t1 − eAt2XQeA

T t2)AT .

The above expression is actually Lyapunov equation (A.12), in which the part between brackets isreplaced by XQ(t1, t2). Because A is Sylvester, the Lyapunov equation has a unique solution, andhence we have

XQ(t1, t2) = eAt1XQeAT t1 − eAt2XQeA

T t2 . (A.16)

This also proves the second part of the theorem.

You might have noticed something odd about our definitions of X^Q and X^Q(t_1, t_2). For Sylvester matrices A, we have defined X^Q as the solution to a Lyapunov equation, which in special cases (A being stable) also happens to equal an integral over matrix exponentials. However, for any matrix A we have defined X^Q(t_1, t_2) as an integral over matrix exponentials, which in special cases (A being Sylvester) also happens to equal the solution to a Lyapunov equation. By doing so, we have made the definitions as broadly applicable as possible.

In further derivations it often happens that the lower limit t_1 of X^Q(t_1, t_2) equals zero. To simplify notation, we then write X^Q(t) ≡ X^Q(0, t). It is also interesting to note that X^Q(∞) = X^Q when A is stable. However, you should be careful with this notation, because it does not hold when A is not stable!

A.1.2 Combinations of Lyapunov solutions

There is also a variant of the Lyapunov integral in which an extra time factor is introduced. In that case the result is perhaps unexpected: it is the solution of a Lyapunov equation which itself has a Lyapunov solution as its constant term.

Theorem A.5. Assume A is stable. Let X^{X^Q} be defined as the solution to

A X^{X^Q} + X^{X^Q} A^T + X^Q = 0. (A.17)

Then we have

X^{X^Q} = \int_0^\infty t \, e^{At} Q e^{A^T t} \, dt. (A.18)

Proof. From theorem A.3 we find that we can write

X^{X^Q} = \int_0^\infty e^{At} X^Q e^{A^T t} \, dt = \int_0^\infty e^{At} \left( \int_0^\infty e^{A\tau} Q e^{A^T \tau} \, d\tau \right) e^{A^T t} \, dt. (A.19)

We can pull the integral signs outside, merging the integrands. This gives us

X^{X^Q} = \int_0^\infty \int_0^\infty e^{A(t+\tau)} Q e^{A^T (t+\tau)} \, d\tau \, dt. (A.20)

If we replace τ by s − t, taking into account the integration limits, then we get

X^{X^Q} = \int_0^\infty \int_t^\infty e^{As} Q e^{A^T s} \, ds \, dt. (A.21)

We can interchange the integrals, keeping the integration area the same. Doing so results in

X^{X^Q} = \int_0^\infty \int_0^s e^{As} Q e^{A^T s} \, dt \, ds. (A.22)

The inner integral is now with respect to t, but there is no t in the integrand. As a result, we can write

X^{X^Q} = \int_0^\infty \left( \int_0^s 1 \, dt \right) e^{As} Q e^{A^T s} \, ds = \int_0^\infty s \, e^{As} Q e^{A^T s} \, ds. (A.23)

This equals what we wanted to prove.

When Lyapunov solutions appear inside a trace function, it is sometimes possible to transform one into another.

Theorem A.6. Assume that A is Sylvester. For matrices F and G satisfying AF = FA and A^T G = G A^T, and for any Q and V, we have

\mathrm{tr}\left( Q F X^V G \right) = \mathrm{tr}\left( \bar{X}^Q F V G \right). (A.24)


Proof. We can prove the above by rewriting one expression into the other. That is,

\mathrm{tr}\left( Q F X^V G \right) = \mathrm{tr}\left( \left( -A^T \bar{X}^Q - \bar{X}^Q A \right) F X^V G \right) (A.25)
= \mathrm{tr}\left( -A^T \bar{X}^Q F X^V G - \bar{X}^Q A F X^V G \right)
= \mathrm{tr}\left( -G \bar{X}^Q F X^V A^T - G \bar{X}^Q F A X^V \right)
= \mathrm{tr}\left( G \bar{X}^Q F \left( -X^V A^T - A X^V \right) \right)
= \mathrm{tr}\left( \bar{X}^Q F V G \right).

This proves the statement. Do note that we have used the cyclic property of the trace function at various points in the above derivation; the theorem doesn't hold without the trace function.

When applying this theorem, typical values for F are I, e^{At} and e^{A_α t}, while often G = F^T.

Next to turning one Lyapunov solution into another, we can sometimes also turn a difference of Lyapunov solutions into a Lyapunov solution.

Theorem A.7. Assume that both A and A_α are Sylvester. For X^Q, X^Q_α, X^{X^Q}_α and X^{X^Q_α} we have

X^{X^Q}_\alpha = X^{X^Q_\alpha} = \frac{X^Q_\alpha - X^Q}{2\alpha}. (A.26)

Proof. Per definition, we have

(A + \alpha I) X^V_\alpha + X^V_\alpha (A + \alpha I)^T + V = 0, (A.27)
A X^V + X^V A^T + V = 0. (A.28)

By subtracting the two equations, and by using A_α = A + αI, we can get either of two results:

A \left( X^V_\alpha - X^V \right) + \left( X^V_\alpha - X^V \right) A^T + 2\alpha X^V_\alpha = 0, (A.29)
A_\alpha \left( X^V_\alpha - X^V \right) + \left( X^V_\alpha - X^V \right) A_\alpha^T + 2\alpha X^V = 0. (A.30)

Next, we divide the above equations by 2α, resulting in

A \left( \frac{X^V_\alpha - X^V}{2\alpha} \right) + \left( \frac{X^V_\alpha - X^V}{2\alpha} \right) A^T + X^V_\alpha = 0, (A.31)
A_\alpha \left( \frac{X^V_\alpha - X^V}{2\alpha} \right) + \left( \frac{X^V_\alpha - X^V}{2\alpha} \right) A_\alpha^T + X^V = 0. (A.32)

These are Lyapunov equations with respect to the term in brackets. Because A and A_α are Sylvester, each has a unique solution, which equals

X^{X^V_\alpha} = \frac{X^V_\alpha - X^V}{2\alpha} = X^{X^V}_\alpha. (A.33)

We have seen earlier that an integral over eAtQeAT t, which has two matrix exponentials, can

be solved by calculating XQ. Occasionally we get integrals with four matrix exponentials. How dowe solve those?

Theorem A.8. Assume that A is stable, and that P_1, P_2 and Q are symmetric. Then we have

\mathrm{tr}\left( \int_0^\infty P_1 e^{At} Q e^{A^T t} P_2 e^{At} X^Q e^{A^T t} \, dt \right) = \frac{1}{2} \mathrm{tr}\left( P_1 X^Q P_2 X^Q \right). (A.34)

Proof. We start with the right-hand side of the expression. It equals

\frac{1}{2} \mathrm{tr}\left( P_1 X^Q P_2 X^Q \right) = \frac{1}{2} \mathrm{tr}\left( P_1 \left( \int_0^\infty e^{At} Q e^{A^T t} \, dt \right) P_2 \left( \int_0^\infty e^{As} Q e^{A^T s} \, ds \right) \right) (A.35)
= \frac{1}{2} \mathrm{tr}\left( \int_0^\infty \int_0^\infty P_1 e^{At} Q e^{A^T t} P_2 e^{As} Q e^{A^T s} \, ds \, dt \right).

The above integrand is symmetric in t and s: if we interchange t and s, it has exactly the same value. To see why, transpose whatever is inside the trace function. This is allowed, since the trace of a matrix equals the trace of its transpose. After transposing, you will find exactly the same expression. Also note that X^Q is symmetric, due to theorem A.2 and the assumption that Q is symmetric.

Because the integrand is symmetric in t and s, we don't have to integrate over all values of t ≥ 0 and s ≥ 0. If we integrate over t ≥ 0 and s ≥ t, then we get exactly half of what we would otherwise have gotten. Hence, we can rewrite the above to

\frac{1}{2} \mathrm{tr}\left( P_1 X^Q P_2 X^Q \right) = \mathrm{tr}\left( \int_0^\infty \int_t^\infty P_1 e^{At} Q e^{A^T t} P_2 e^{As} Q e^{A^T s} \, ds \, dt \right). (A.36)

If we work this out further, substituting s by τ + t, we get

\frac{1}{2} \mathrm{tr}\left( P_1 X^Q P_2 X^Q \right) = \mathrm{tr}\left( \int_0^\infty P_1 e^{At} Q e^{A^T t} P_2 \left( \int_t^\infty e^{As} Q e^{A^T s} \, ds \right) dt \right) (A.37)
= \mathrm{tr}\left( \int_0^\infty P_1 e^{At} Q e^{A^T t} P_2 \left( e^{At} \int_0^\infty e^{A\tau} Q e^{A^T \tau} \, d\tau \, e^{A^T t} \right) dt \right)
= \mathrm{tr}\left( \int_0^\infty P_1 e^{At} Q e^{A^T t} P_2 e^{At} X^Q e^{A^T t} \, dt \right),

and this is what we wanted to prove.

We have a similar theorem in case the integral does not run to infinity, but to a finite time T .

Theorem A.9. Assume that A is Sylvester, and that P1, P2 and Q are symmetric. Then we have

tr

(∫ T

0

P1eAtQeA

T tP2eAtXQeA

T t dt

)=

1

2tr(P1X

QP2XQ)−1

2tr(P1e

ATXQeATTP2e

ATXQeATT).

(A.38)

Proof. We cannot use the results of theorem A.8, because now we have assumed A to be Sylvesterinstead of stable. Instead, we will make repeated use of theorem A.4. First, we will use

XQ =

∫ T

0

eAsQeAT s ds+ eATXQeA

TT . (A.39)

If we apply this on the term tr(P1X

QP2XQ)and expand the brackets, then we find that

1

2tr(P1X

QP2XQ)

(A.40)

=1

2tr

(P1

(∫ T

0

eAsQeAT s ds+ eATXQeA

TT

)P2

(∫ T

0

eAsQeAT s ds+ eATXQeA

TT

))

=1

2tr

(∫ T

0

∫ T

0

P1eAs1QeA

T s1P2eAs2QeA

T s2 ds1 ds2

)+

1

2tr

(∫ T

0

P1eAsQeA

T sP2eATXQeA

TT ds

)

+1

2tr

(∫ T

0

P1eATXQeA

TTP2eAsQeA

T s ds

)+

1

2tr(P1e

ATXQeATTP2e

ATXQeATT).

In the result, the second and third term are in fact equal. Whatever is in the trace function issimply transposed. So we can merge them into one term and get rid of the factor 1

2 . The integrandof the first term is symmetric in s1 and s2, so we can get rid of the factor 1

2 by simply integratingover half of the integration area. The result will be

1

2tr(P1X

QP2XQ)

= tr

(∫ T

0

∫ T

s2

P1eAs1QeA

T s1P2eAs2QeA

T s2 ds1 ds2

)(A.41)

+ tr

(∫ T

0

P1eATXQeA

TTP2eAsQeA

T s ds

)+

1

2tr(P1e

ATXQeATTP2e

ATXQeATT).

25

Page 28: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Theorem A.4 now also tells us that∫ T

s2

eAs1QeAT s1 ds1 = eAs2XQeA

T s2 − eATXQeATT . (A.42)

If we apply this, we find that

1

2tr(P1X

QP2XQ)

= tr

(∫ T

0

P1

(eAs2XQeA

T s2 − eATXQeATT)P2e

As2QeAT s2 ds2

)(A.43)

+ tr

(∫ T

0

P1eATXQeA

TTP2eAsQeA

T s ds

)+

1

2tr(P1e

ATXQeATTP2e

ATXQeATT).

Expanding the brackets will result in some terms cancelling out. We then wind up with

1

2tr(P1X

QP2XQ)

= tr

(∫ T

0

P1eAsQeA

T sP2eAsXQeA

T s ds

)+

1

2tr(P1e

ATXQeATTP2e

ATXQeATT),

(A.44)and this is the equation which we needed to prove.

A.2 Using matrix exponentials to solve integralsSo far we’ve been solving integrals of matrix exponentials using Lyapunov equations. In the workof [vL78], there are methods which solve such integrals using matrix exponentials. In this sectionwe’ll first take a look at which integrals can be solved. Then we will apply them to the sort ofintegrals we want to solve.

A.2.1 Integrals within matrix exponentialsThe theorems in this subsection come directly from [vL78]. However, they are worked out in smallersteps here, to make everything easier to follow.

Theorem A.10. If we define

C =

[A1 B1

0 A2

], (A.45)

and write eCt aseCT ≡

[Ce11(t) Ce12(t)Ce21(t) Ce22(t)

], (A.46)

then we have

Ce21(t) =0, (A.47)

Ce22(t) =eA2t, (A.48)

Ce11(t) =eA1t, (A.49)

Ce12(t) =

∫ t

0

eA1(t−s)B1eA2s ds. (A.50)

Proof. The key to proving this is using the relation ddte

Ct = CeCt. That is[Ce11(t) Ce12(t)

Ce21(t) Ce22(t)

]=

[A1 B1

0 A2

] [Ce11(t) Ce12(t)Ce21(t) Ce22(t)

]. (A.51)

We can now use the above relation to find several matrix differential equations. Starting with thebottom row and working upward, we have

Ce21(t) =A2Ce21(t), (A.52)

Ce22(t) =A2Ce22(t), (A.53)

Ce11(t) =A1Ce11(t) +B1C

e21(t), (A.54)

Ce12(t) =A1Ce12(t) +B1C

e22(t). (A.55)

26

Page 29: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

To solve these differential equations, we do need initial conditions. To obtain them, we should notethat eCt takes the value of I when t = 0. As a result, Ce11(0) = I, Ce12(0) = 0, Ce21(0) = 0 andCe22(0) = I. Using this, we can solve the differential equations.

To do so, we use the method of the integrating factor. That is, for the first differential equation(for Ce21(t)) we multiply both sides of the equation by e−A2t. This turns the differential equationinto

e−A2tCe21(t)− e−A2tA2Ce21(t) = 0. (A.56)

By doing this, we can write the term on the left as a derivative. That is,

d

dt

(e−A2tCe21(t)

)= 0. (A.57)

Next, we integrate the above expression from 0 to t. However, we’re already using t in the aboveexpression, so in there we change t to s. The result is

[e−A2sCe21(s)

]t0

=

∫ t

0

0 ds. (A.58)

The integral on the right is zero. Hence we find that

e−A2tCe21(t)− e−A20Ce21(0) = 0. (A.59)

We have e−A20 = I, but Ce21(0) = 0, so the second term drops out. If we also note that e−A2t isnonsingular (a matrix exponential is always nonsingular) we find that

Ce21(t) = 0. (A.60)

That proves the first part of the theorem. Solving the other differential equations goes in anidentical way, so we’ll speed up the process a bit. For Ce22(t) we have

d

dt

(e−A2tCe22(t)

)=0, (A.61)[

e−A2sCe22(s)]t0

=

∫ t

0

0 ds,

e−A2tCe22(t)− e−A20Ce22(0) =0,

Ce22(t) =eA2t,

where we have used Ce22(0) = I. For Ce11(t) we similarly find that

d

dt

(e−A1tCe11(t)

)=e−A1tB1C

e21(t), (A.62)

e−A1tCe11(t)− e−A10Ce11(0) =0,

Ce11(t) =eA1t,

where we have used our earlier result of Ce21(t) = 0. Finally there’s Ce12(t). Now things don’tconveniently turn out to be zero on the right-hand side. In this case we get

d

dt

(e−A1tCe12(t)

)=e−A1tB1C

e22(t), (A.63)[

e−A1sCe12(s)]t0

=

∫ t

0

e−A1sB1Ce22(s) ds,

e−A1tCe12(t)− e−A10Ce12(0) =

∫ t

0

e−A1sB1eA2s ds,

Ce12(t) =

∫ t

0

eA1(t−s)B1eA2s ds.

This completes the proof.

27

Page 30: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

We generally do not use the above theorem to find eCt. After all, calculating matrix exponen-tials is relatively easy. Instead, we calculate eCt and use the result to find the difficult integral∫ t

0eA1(t−s)B1e

A2s ds.We can also expand the matrix C, granting us more complicated integrals. This is done in the

next few theorems.

Theorem A.11. If we define

C =

A1 B1 C1

0 A2 B2

0 0 A3

, (A.64)

and write eCt as in theorem A.10, then Ce11(t), Ce12(t), Ce21(t) and Ce22(t) are equal to the resultsof theorem A.10. Furthermore, we have

Ce31(t) =0, (A.65)Ce32(t) =0, (A.66)

Ce33(t) =eA3t, (A.67)

Ce23(t) =

∫ t

0

eA2(t−s)B2eA3s ds, (A.68)

Ce13(t) =

∫ t

0

eA1(t−s)B1eA2(s−r)B2e

A3r dr ds+

∫ t

0

eA1(t−s)C1eA3s. (A.69)

Proof. The proof for this is pretty much identical to the proof of theorem A.10. We use ddte

Ct =CeCt, which tells us thatCe11(t) Ce12(t) Ce13(t)

Ce21(t) Ce22(t) Ce23(t)

Ce31(t) Ce32(t) Ce33(t)

=

A1 B1 C1

0 A2 B2

0 0 A3

Ce11(t) Ce12(t) Ce13(t)Ce21(t) Ce22(t) Ce23(t)Ce31(t) Ce32(t) Ce33(t)

. (A.70)

We can solve for eCt by going bottom-up. For the bottom row we have

Ce31(t) =A3Ce31(t), (A.71)

Ce32(t) =A3Ce32(t), (A.72)

Ce33(t) =A3Ce33(t), (A.73)

which can be solved in the way we’ve seen in the proof of theorem A.10. We find that Ce31(t) =Ce32(t) = 0 and Ce33(t) = eA3t.

For the second row, we have

Ce21(t) =A2Ce21(t) +B2C

e31(t), (A.74)

Ce22(t) =A2Ce22(t) +B2C

e32(t), (A.75)

Ce23(t) =A2Ce23(t) +B2C

e33(t). (A.76)

Using previous results, we find that Ce21(t) = 0 and Ce22(t) = eA2t. For Ce23(t) we get, similarly toequation (A.63),

Ce23(t) =

∫ t

0

eA2(t−s)B2eA3s ds. (A.77)

Then there’s the top row. Now we have

Ce11(t) =A1Ce11(t) +B1C

e21(t) + C1C

e31(t), (A.78)

Ce12(t) =A1Ce12(t) +B1C

e22(t) + C1C

e32(t), (A.79)

Ce13(t) =A1Ce13(t) +B1C

e23(t) + C1C

e33(t). (A.80)

28

Page 31: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Using previous results, we immediately find that Ce11(t) = eA1t. Next, identically to equa-tion (A.63), we have

Ce12(t) =

∫ t

0

eA1(t−s)B1eA2s ds. (A.81)

Finally there’s the toughest term to deal with, which is Ce13(t). When applying the method of theintegrating factor, we find that

d

dt

(e−A1tCe13(t)

)= e−A1tB1C

e23(t) + e−A1tC1C

e33(t). (A.82)

Substituting for Ce23(t) and Ce33(t) will give us

d

dt

(e−A1tCe13(t)

)= e−A1tB1

∫ t

0

eA2(t−s)B2eA3s ds+ e−A1tC1e

A3t. (A.83)

Next, we will integrate from 0 to t. However, we already have a t in the above expression, so inthe above we will rename t to s. Except that we already have an s too, so we will rename s to r.In this way, all parameter names are ‘moved down’ in the alphabet. Doing so will result in

[e−A1sCe13(s)

]t0

=

∫ t

0

(e−A1sB1

∫ s

0

eA2(s−r)B2eA3r dr + e−A1sC1e

A3s

)ds. (A.84)

Working out the result will give us

Ce13(t) =

∫ t

0

∫ s

0

eA1(t−s)B1eA2(s−r)B2e

A3r dr ds+

∫ t

0

eA1(t−s)C1eA3s ds. (A.85)

This completes the proof.

We’ve got the result for a 3× 3 matrix now. We can expand this to a 4× 4 matrix.

Theorem A.12. If we define

C =

A1 B1 C1 D1

0 A2 B2 C2

0 0 A3 B3

0 0 0 A4

, (A.86)

and write eCt as in theorem A.10, then the results of Ce11(t) up to Ce33(t) can be found in theo-rems A.10 and A.11. For the rest, we have

Ce41(t) =0, (A.87)Ce42(t) =0, (A.88)Ce43(t) =0, (A.89)

Ce44(t) =eA4t, (A.90)

Ce34(t) =

∫ t

0

eA3(t−s)B3eA4s ds, (A.91)

Ce24(t) =

∫ t

0

eA2(t−s)B2eA3(s−r)B3e

A4r dr ds+

∫ t

0

eA2(t−s)C2eA4s, (A.92)

Ce14(t) =

∫ t

0

∫ s

0

∫ r

0

eA1(t−s)B1eA2(s−r)B2e

A3(r−q)B3eA4q dq dr ds+

∫ t

0

eA1(t−s)D1eA4s ds (A.93)

+

∫ t

0

∫ s

0

eA1(t−s)B1eA2(s−r)C2e

A4r dr ds+

∫ t

0

∫ s

0

eA1(t−s)C1eA3(s−r)B3e

A4r dr ds.

Proof. The proof for almost all terms is identical as to what was done in theorems A.10 and A.10,so we will not discuss it again. The only term which is somewhat new is Ce14(t), so we will considerthe proof of that.

29

Page 32: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Our starting point will be the differential equation

Ce14(t) = A1Ce14(t) +B1C

e24(t) + C1C

e34(t) +D1C

e44(t). (A.94)

Using the method of the integrating factor and working out the left hand side will give us

Ce14(t) = eA1t

∫ t

0

(B1Ce24(s) + C1C

e34(s) +D1C

e44(s)) ds. (A.95)

If we insert earlier results for Ce24(s), Ce34(s) and Ce44(s), and if we again shift all parameters downin the alphabet, we find our final result

Ce14(t) =

∫ t

0

∫ s

0

∫ r

0

eA1(t−s)B1eA2(s−r)B2e

A3(r−q)B3eA4q dq dr ds+

∫ t

0

eA1(t−s)D1eA4s ds (A.96)

+

∫ t

0

∫ s

0

eA1(t−s)B1eA2(s−r)C2e

A4r dr ds+

∫ t

0

∫ s

0

eA1(t−s)C1eA3(s−r)B3e

A4r dr ds.

The order of the matrices in the integrals of the above theorem might seem haphazard. However,they are most certainly not. There is an ‘intuitive’ way of looking at the above equations, whichmake them more easy to recognize.

To see how it works, consider the matrix C as

C =

A1 B1 C1 D1

0 A2 B2 C2

0 0 A3 B3

0 0 0 A4

. (A.97)

To find Ce14(t), we will start our ‘walk’ through the matrix at A1. From there, we may ‘jump’ toany matrix on the right. For instance, we may jump to B1. From there, we must jump downward,back to the diagonal. For our case, that’s A2. From there there process continues. For instance,we can jump to C2. From there, we have to jump down to A4. Once we arrive at A4, we’re done.

Next, we will set up an expression for our walk. For that, we start with eA1t. Then we look ateach set of jumps which we did away from and back to the diagonal. If we jumped away from Aito some matrix Xi (where X can be B, C, D and so on) and then downward to Aj , then we shouldadd e−AisXie

Ajs to our expression, or use r if we’ve already used s, and so on. Furthermore, weshould add an integral. Its limits should run from 0 to the previous parameter (like s, r) which weadded. If we do this for our walk, then we have

Result of walk:∫ t

0

∫ s

0

eA1t(e−A1sB1e

A2s) (e−A2rC2e

A4r)ds dr. (A.98)

If we set up an expression for every possible walk like this through our matrix, and add up allthe results, then we arrive at the expression for Ce14(t). A similar trick also works for any elementCeij(t), except now we need to start our walk at Ai and end it at Aj .

Theorem A.13. If we define

C =

A1 B1 C1 D1 E1

0 A2 B2 C2 D2

0 0 A3 B3 C3

0 0 0 A4 B4

0 0 0 0 A5

, (A.99)

30

Page 33: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

and write eCt as in theorem A.10, then we have

Ce15(t) =

∫ t

0

∫ s

0

∫ r

0

∫ q

0

eA1(t−s)B1eA2(s−r)B2e

A3(r−q)B3eA4(q−p)B4e

A5p dp dq dr ds (A.100)

+

∫ t

0

∫ s

0

∫ r

0

eA1(t−s)B1eA2(s−r)B2e

A3(r−q)C3eA5q dq dr ds

+

∫ t

0

∫ s

0

∫ r

0

eA1(t−s)B1eA2(s−r)C2e

A4(r−q)B4eA5q dq dr ds

+

∫ t

0

∫ s

0

∫ r

0

eA1(t−s)C1eA3(s−r)B3e

A4(r−q)B4eA5q dq dr ds

+

∫ t

0

∫ s

0

eA1(t−s)B1eA2(s−r)D2e

A5r dr ds+

∫ t

0

∫ s

0

eA1(t−s)D1eA4(s−r)B4e

A5r dr ds

+

∫ t

0

∫ s

0

eA1(t−s)C1eA3(s−r)C3e

A5r dr ds+

∫ t

0

eA1(t−s)E1eA5s ds.

Proof. The proof for this is done identically as the proofs of the previous three theorems (A.10,A.11 and A.12). After applying the method of the integrating factor, we will find

Ce15(t) = eA1t

∫ t

0

(B1Ce25(s) + C1C

e35(s) +D1C

e45(s) + E1C

e55(s)) ds. (A.101)

If we work this out, keeping track of all the integrals, we will arrive at the expression for Ce15(t)which we need to prove.

In this report, we will generally set all C, D and E matrices to zero. As a result, all matrixterms (yes, even Ce15(t)) will consist of only one term. That term may contain multiple integrals,but it’s still an expression which will fit on one line.

A.2.2 Using matrix exponentials to solve Lyapunov equationsUsually Lyapunov equations and related integrals are solved using techniques like matrix inversions.However, it is also possible to use matrix exponentials to find solutions. For this subsection it’simportant to recall the notation described in subsection 3.2.1, and in particular definition (3.11).

We start with solving a Lyapunov equation using matrix exponentials.

Theorem A.14. For any k1 and k2 satisfying k1 + k2 = 2k, and for any matrix A, we have

XQkα(t1, t2) = eAk1αt2

[I 0

]exp

([−Ak1α Q

0 ATk2α

](t2 − t1)

)[0I

]eA

Tk2α

t1 . (A.102)

Proof. From theorem A.10 we know that∫ t

0

eA1(t−s)B1eA2s ds =

[I 0

]exp

([A1 B1

0 A2

]t

)[0I

]. (A.103)

If we use A1 = −Ak1α, B1 = Q, A2 = ATk2α and t = t2 − t1, then we get∫ t2−t1

0

eAk1α(s−(t2−t1))QeATk2α

s ds =[I 0

]exp

([−Ak1α Q

0 ATk2α

](t2 − t1)

)[0I

]. (A.104)

If we now substitute s by s− t1, updating integral limits accordingly, and subsequently pull termsnot depending on s out of the integral, we directly find that

e−Ak1αt2(∫ t2

t1

eAk1αsQeATk2α

s ds

)e−A

Tk2α

t1 =[I 0

]exp

([−Ak1α Q

0 ATk2α

](t2 − t1)

)[0I

].

(A.105)

31

Page 34: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

The part within brackets now equals XQkα(t1, t2), with k = k1+k2

2 . As a result, we have

XQkα(t1, t2) = eAk1αt2

[I 0

]exp

([−Ak1α Q

0 ATk2α

](t2 − t1)

)[0I

]eA

Tk2α

t1 , (A.106)

which completes this proof.

It’s interesting to note that the above theorem does not require A or any other matrix to beSylvester. It works for any matrix A.

Now that we know how to find XQkα(t1, t2) for each matrix A, we look at how to find XQ

k1α,k2α(t).

Theorem A.15. When XQk1α,k2α

(T ) is defined as in equation (3.11), then we have

XQk1α,k2α

(T ) =[I 0

]exp

([Ak1α Q

0 Ak2α

]T

)[0I

]. (A.107)

Proof. From theorem A.10 we know that∫ t

0

eA1(t−s)B1eA2s ds =

[I 0

]exp

([A1 B1

0 A2

]t

)[0I

]. (A.108)

If we use A1 = Ak1α, B1 = Q and A2 = Ak2α, and replace t by T and s by t, then we get∫ T

0

eAk1α(T−t)QeAk2αt dt =[I 0

]exp

([Ak1α Q

0 ATk2α

]T

)[0I

]. (A.109)

The left hand side of the expression equals the definition of XQk1α,k2α

(T ), so this proves the theorem.

So now we have efficient methods of finding XQkα(t1, t2) and XQ

kα(T ). These will come in handyin the proofs of theorems in appendix B.

A.3 Power forms of Gaussian random variablesIn this section we’ll look at the expectations of various power forms of Gaussian random variables.For instance, what is E[xTPx]?

Theorem A.16. Consider a Gaussian random variable x with mean µ, covariance matrix X andsquared expectation Σ. That is,

X = E[(x− µ)(x− µ)T ] = E[xxT ]− µµT = Σ− µµT . (A.110)

If P is a symmetric positive definite matrix, then we have

E[xTPx] = tr(ΣP ). (A.111)

Proof. We can prove this in a brief way using the trace operator. Taking the trace operator of ascalar is allowed, as the trace of a scalar is still the same scalar, and it allows us to rotate the orderof the elements within the trace operator. So,

E[xTPx] = tr(E[xTPx]) (A.112)

= tr(E[xxTP ])

= tr(E[xxT ]P )

= tr(ΣP ).

This concludes our proof.

32

Page 35: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Theorem A.17. For the same conditions as in theorem A.16, for symmetric positive definitematrices P and Q, we have

E[xTPxxTQx] = tr(ΣP )tr(ΣQ) + 2tr(ΣPΣQ)− 2µTPµµTQµ. (A.113)

Proof. We’re going to start halfway into our proof, by bringing in a theorem from [Ken81], appendixF. Here, the theorem is stated and proven that, for a zero mean process y = x−µ, with covariancematrix Y = X, we have

E[yTPyyTQy] = tr(Y P )tr(Y Q) + 2tr(Y PY Q). (A.114)

However, we want to know what the above expression is for x. So we write

E[xTPxxTQx] = E[(y + µ)TP (y + µ)(y + µ)TQ(y + µ)]. (A.115)

If we work out brackets, we’ll get 16 terms. However, y is a zero-mean Gaussian, so any termwhich has either one or three times ‘y’ in it will have an expectation of zero. This causes 8 termsto drop out.

For the remaining 8 terms, we can find out that some of them are equal. For instance,(yTPµ)(yTQµ) equals (yTPµ)(µTQy). Here, we have transposed the right half, which is al-lowed because it’s a scalar. We could have also transposed the left half, finding something elsewhich is equal.

As a result, we can write

E[xTPxxTQx] =E[yTPyyTQy + yTPyµTQµ+ µTPµyTQy + µTPµµTQµ (A.116)

+ 2µTPyyTQµ+ 2yTPµµTQy].

Now we will introduce the trace operator. We can take the trace operator of an entire term(which is a scalar), like ‘tr(yTPyµTQµ)’, or only of half of a term (which is also a scalar), like‘tr(yTPy)tr(µTQµ)’. As a possible result, we could get

E[xTPxxTQx] =E[yTPyyTQy + tr(yTPy)tr(µTQµ) + tr(µTPµ)tr(yTQy) (A.117)

+ tr(µTPµ)tr(µTQµ) + 2tr(µTPyyTQµ) + 2tr(yTPµµTQy)].

Next, we’re going to use the result of equation (A.114). Simultaneously, we’re also going to workout the expectation operator, using E[yyT ] = Y . We then get

E[xTPxxTQx] =tr(Y P )tr(Y Q) + tr(Y P )tr(µµTQ) + tr(µµTP )tr(Y Q) + tr(µµTP )tr(µµTQ)

+ 2tr(Y PY Q) + 2tr(µµTPY Q) + 2tr(Y PµµTQ)]. (A.118)

If you look closely, you can already see some structure appearing in the above equation. The nextstep is to bring terms between brackets. (Remember that the trace operator is a linear operator.)Doing so will give us

E[xTPxxTQx] = tr((Y+µµT )P )tr((Y+µµT )Q)+2tr((Y+µµT )P (Y+µµT )Q)−2tr(µµTPµµTQ)].(A.119)

We know that Σ = X +µµT = Y +µµT . This allows us to simplify the above equation, giving usour final result,

E[xTPxxTQx] = tr(ΣP )tr(ΣQ) + 2tr(ΣPΣQ)− 2µTPµµTQµ. (A.120)

Note that we have also rewritten the last term, turning it back into a scalar and subsequentlydropping the trace operator.

A.4 Other mathematical proofsWe’ve lumped some other mathematical proofs in this section. The following theorem containsdetails on completing the squares, showing how Lyapunov and Riccati equations are related.

33

Page 36: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Theorem A.18. The Lyapunov equation

(A−BF )T X + X(A−BF ) + (Q+ FTRF ) = 0 (A.121)

can be rewritten, by completing the squares with respect to F , to

(F −R−1BT X)TR(F −R−1BT X) +AT X + XA+Q− XBR−1BT X = 0. (A.122)

Proof. We want to rewrite the first equation to a quadratic form

(F − T1)TT2(F − T1) + T3 = 0, (A.123)

where T1, T2 and T3 are terms which we still need to find, but none of them have F in them. Tofind T1, T2 and T3, we expand the above to

FTT2F − FTT2T1 − TT1 T2F + TT1 T2T1 + T3 = 0. (A.124)

Now we will compare equations (A.121) and (A.124). By looking at all the terms that have twoF ’s in them, we can immediately see that T2 = R. Then, by comparing all terms that have onlyone F in them, we find that

TT1 T2F = XBF. (A.125)

Although there might be multiple values of T1 which satisfy this equation, we are sure that one ofthem equals

T1 = (XBT−12 )T = R−1BT X. (A.126)

Remember that both R and X are assumed to be symmetric, while B is not.Finally, by looking at all terms that are so far unaccounted for, we find that

TT1 T2T1 + T3 = AT X + XA+Q. (A.127)

This implies that

T3 = AT X + XA+Q− TT1 T2T1 = AT X + XA+Q− XBR−1RR−1BT X. (A.128)

Now that we have T1, T2 and T3, we can plug them into equation (A.123). The result will be

(F −R−1BT X)TR(F −R−1BT X) +AT X + XA+Q− XBR−1BT X = 0. (A.129)

34

Page 37: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Appendix B

Theorems related to LQG systems

In this chapter, several proofs related to the LQG paradigm are bundled. In literature the solution(i.e., the optimal controller) for the LQG paradigm is known. (See for instance [AM90], [SP05]or [BKM08].) In section B.1, we’ll prove an already known result: that the offered solution is indeedoptimal. Something that’s a lot less well documented in literature is the statistical properties ofthe cost function. In section B.2 we examine these properties using Lyapunov equation solutions.Finally in section B.3 we derive expressions for these properties using only matrix exponentials.

B.1 Controlling LQG systemsIn this section we will examine how to optimally control LQG systems. We’ll start with some the-orems for LQ systems without any noise in subsection B.1.1, only to add noise in subsection B.1.2.The proofs for these theorems are presented step by step, in a somewhat intuitive fashion. Thegoal is to have the proofs make sense intuitively. Sometimes, to keep things comprehensible, formalparts of the proof are not noted, but we instead refer to other literature where these formal partsare treated.

B.1.1 Basic properties of the LQG paradigmFirst we will prove that there is an optimal linear control law in the LQ paradigm, and that theresulting cost function is quadratic.

Theorem B.1. Consider a linear system

x = Ax+Bu. (B.1)

Assume that there is a control law u = π(x) which can stabilize the system. Then there exists atleast one optimal control law u = π∗(x) which minimizes the quadratic cost function

J =

∫ ∞0

(xTQx+ uTRu) dt, (B.2)

with Q and R symmetric positive definite matrices. Furthermore, of all the optimal control laws,there is always at least one which is linear in x; so of the form π∗(x) = −Fx. The resultingoptimal cost function J∗ is quadratic in the initial state x0; so of the form J∗(x0) = xT0 Xx0.

Proof. The first part of the theorem claims that for a stabilizable system there is an optimalcontrol law. The reason for this is that, for a stabilizing control law π(x), the cost J(x0) is finitebut positive. As such, it must have a minimum, and the corresponding control law π∗(x) resultingin this minimum is the optimal control law. Of course it may happen that there are multiplecontrol laws resulting in the same optimal cost function J∗(x0).

To prove that the cost is quadratic in the initial state, we are going to do a thought experiment.This thought experiment actually consists of three simulation runs, which are visualized in the leftpart of figure B.1.

35

Page 38: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Figure B.1: A graphical illustration of the thought experiment which proves the cost function isquadratic in the initial state.

1. Suppose that we know some optimal control law u = π∗(x), which is not necessarily linear.For our first simulation run, we put the system in some initial state x0 and run this control lawπ∗. We denote the resulting state progression by x1(t), with x1(0) = x0. We also rememberexactly which input u1(t) we applied at each time t. At the end of our experiment, we haveaccumulated the (optimal) cost J1 = J∗(x0).

2. For our second experiment, we’re going to scale the previous experiment. That is, we’re goingto start in an initial state x2(0) = kx0, with k a nonzero number. We then apply the controlinput u2(t) = ku1(t). Now something interesting happens. Because the system is linear, wewill have x2(t) = kx1(t) for all future times t. In other words, everything is k times as large!As a result, we know that the cost J2 which we accumulate will equal k2J1 = k2J(x0).

3. For our third experiment, we again start in x3(0) = kx0, yet this time we simply apply ouroptimal control law u = π∗(x). The resulting cost will necessarily be optimal and will equalJ3 = J∗(kx0).

Now compare experiments 2 and 3. Both experiments had the same initial state, and in exper-iment 3 the cost was optimal. This means that we must have J2 ≥ J3, or

k2J∗(x0) ≥ J∗(kx0). (B.3)

Next, we can do another set of three experiments, but now with the set-up as shown in the rightpart of figure B.1. If we do, then we would also find that

1

k2J∗(kx0) ≥ J∗(x0). (B.4)

If you combine the above two equations, then you find that equality must hold. That is,

k2J∗(x0) = J∗(kx0). (B.5)

From this we can conclude that, when the initial state x0 becomes k times as large, then the optimalcost J∗ becomes k2 as large. In other words, the cost function is quadratic in x0. (Technically, forthe multivariate case, there’s an extra condition which must be met. For details on this, see [AM90],section 2.3.)

In addition, we have seen that when we take linear combinations of an optimal control law,we still wind up with an optimal control law. We can use this to show that when there is anoptimal control law, there is also at least one optimal control law which is linear. To do that, wecan actually take an optimal control law π∗(x) and construct a linear optimal control law from it.First, we define the notation

e1 =

100...

, e2 =

010...

, . . . . (B.6)

36

Page 39: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

We then define the matrix F as

F = −[π∗(e1) π∗(e2) · · · π∗(en)

], (B.7)

and use the linear control law π(x) = −Fx. Because linear combinations of optimal control lawsalso result in optimal control laws, this control law must be optimal. And because it’s also linear,it happens to prove the final part of our theorem.

Now that we know that our control law has to be of the form u = −Fx, it’s time to find theoptimal F . (That is, without already knowing some other optimal control law.)

Theorem B.2. For a linear system x = Ax + Bu with feedback law u = −Fx, the optimalfeedback matrix F is given by

F = R−1BT X, (B.8)

where X is the solution to the Algebraic Riccati Equation

AT X + XA+Q− XBR−1BT X. (B.9)

(The same assumptions hold as in theorem B.1.)

Proof. Subject to the feedback law, the system behaves as x = (A−BF )x. The solution for x(t)will hence be

x(t) = e(A−BF )tx0. (B.10)

The cost function now becomes

J =

∫ ∞0

xT (t)(Q+ FTRF )x(t) dt (B.11)

= xT0

∫ ∞0

e(A−BF )T t(Q+ FTRF )e(A−BF )t dtx0

= xT0 Xx0.

Using theorem A.3, and the assumption that A is stabilizable, we can find that X satisfies

(A−BF )T X + X(A−BF ) + (Q+ FTRF ) = 0. (B.12)

By completing the squares with respect to F (see theorem A.18 for details on how to do this) wecan rewrite the above to

(F −R−1BT X)TR(F −R−1BT X) +AT X + XA+Q− XBR−1BT X = 0. (B.13)

We can even complete the squares with respect to X (temporarily ignoring the leftmost term). Ifwe define Z = BR−1BT , then we get

(F −R−1BT X)TR(F −R−1BT X) + (Q+ATZ−1A)− (X − Z−1A)TZ(X − Z−1A) = 0. (B.14)

We now want to choose F such that the solution for X is as small as possible. To figure outwhich value for F that should be, we look at the three terms in the above equation. Because R ispositive definite, the first term is positive definite or positive semi-definite. Similarly, because Q ispositive semi-definite and Z is positive definite, the second term is positive definite. The last termis negative definite, due to the minus sign.

So we see that, the bigger the first two terms are, the bigger the last term (and hence the biggerX) must be to compensate. This means that we want to choose F such that the first term is assmall as possible. Luckily, we can choose F such that the first term becomes zero. This happensif F = R−1BT X.

The remaining equation for X is

AT X + XA+Q− XBR−1BT X = 0. (B.15)

We use this equation to solve for the best possible value which X might have. Then, we use thatvalue of X to find which value of F we should choose to achieve it.

37

Page 40: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Now that we know the optimal control law for the regular LQ paradigm, we will derive theoptimal control law for the discounted LQ paradigm.

Theorem B.3. For a linear system (subject to the same assumptions as previously) the optimalcontrol law minimizing the discounted cost function

J =

∫ ∞0

e2αt(xTQx+ uTRu) dt (B.16)

is given by u = −Fx, with F = R−1BT Xα and Xα the solution to

ATαXα + XαAα +Q− XαBR−1BT Xα = 0, (B.17)

where Aα = A+ αI.

Proof. Similarly to what was done in equation (B.11), we can find that the discounted cost is

J =

∫ ∞0

e2αtxT (t)(Q+ FTRF )x(t) dt (B.18)

= xT0

∫ ∞0

e2αte(A−BF )T t(Q+ FTRF )e(A−BF )t dtx0

= xT0

∫ ∞0

e(A+αI−BF )T t(Q+ FTRF )e(A+αI−BF )t dtx0.

Now we can see that this equation is identical to what we had in equation (B.11), except that Ais replaced by Aα. As a result, the outcome is also the same, except with A replaced by Aα.

B.1.2 Properties of a system with noiseNext, we can add some noise to the system. If we do, one question we can ask ourselves is, whatwill the state distribution become? For this first theorem, the notation of section 2.2 is used.

Theorem B.4. For a system x = Ax + v, where A is stable and v is white noise with intensityV , the state distribution x(t) has mean

E[x(t)] = eAtE[x0] (B.19)

and expected squared value

Σ(t) ≡ E[x(t)xT (t)] = eAt(Σ0 −XV )eAT t +XV , (B.20)

where the steady-state state variance XV can be found by solving the Lyapunov equation

AXV +XVAT + V = 0. (B.21)

Proof. For the mean, we can derive that

E[x(t)] = E

[eAtx0 +

∫ t

0

eA(t−s)v(s) ds

](B.22)

= eAtE[x0] +

∫ t

0

eA(t−s)E[v(s)] ds

= eAtE[x0].

For the expected squared value, we similarly have

E[x(t)xT (t)] = E

[(eAtx0 +

∫ t

0

eA(t−s)v(s) ds

)(eAtx0 +

∫ t

0

eA(t−s)v(s) ds

)T](B.23)

= eAtE[x0xT0 ]eA

T t +

∫ t

0

∫ t

0

eA(t−s1)E[v(s1)vT (s2)]eAT (t−s2) ds2 ds1

= eAtΣ0eAT t +

∫ t

0

eA(t−s)V eAT (t−s) ds.

38

Page 41: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Note that, in the second line, we have used E[x0vT (s)] = 0. Because v is white noise, it is not

correlated to the initial state. We have also used the property of the δ-function that it only has avalue at one point. As a result, we only needed to consider the point s1 = s2 = s in the integral,which caused the inner integral to drop out (i.e., become one).

Next, using theorem A.4, we can rewrite the above to

E[x(t)xT (t)] = eAtΣ0eAT t +XV − eAtXV eA

T t, (B.24)

where XV can be found by solving the Lyapunov equation

AXV +XVAT + V = 0. (B.25)

The above can then be rewritten to

E[x(t)xT (t)] = eAt(Σ0 −XV )eAT t +XV , (B.26)

which is what we wanted to prove.

When looking at equation (B.20), it’s interesting to see that for a stable system, as t→∞, theleft term disappears and Σ(t) converges to steady-state variance XV .

Next to process noise, we can also add measurement noise, as is described in section 2.3. Inthis case, we need an observer of the form (2.17) to find the state x. But what should the observergain matrix be?

Theorem B.5. Suppose that for the system of equation (2.15) and (2.16) we use a state estimatewhich is updated through equation (2.17). Then the state estimation error e = x− x behaves like

e = (A−KC)e+Kw − v. (B.27)

If we also assume that there is a matrix K for which A−KC is stable (the system is detectable),then the minimum possible steady-state error variance E equals the solution of the algebraic Riccatiequation

AE + EAT + V − ECTW−1CE = 0, (B.28)

where the corresponding optimal observer gain matrix K equals

K = ECTW−1. (B.29)

Proof. First we will prove equation (B.27). This is done by noting that e equals

e = ˙x− x (B.30)= Ax+Bu+K(y − Cx)−Ax−Bu− v= A(x− x) +K(Cx+w − x)− v= (A−KC)e+Kw − v.

As a result, for any K which stabilizes A−KC, we can find the steady-state error variance E fromtheorem B.4. It is the solution of

(A−KC)E + E(A−KC)T + (V +KWKT ) = 0. (B.31)

By completing the squares with respect to W (see theorem A.18 for the procedure) we can rewritethis to

(K − ECTW−1)W (K − ECTW−1)T +AE + EAT + V − ECTW−1CE = 0. (B.32)

Through a similar argument as was made in the proof of theorem B.2, minimizing E comes downto choosing

K = ECTW−1. (B.33)

The remaining equation for E is

AE + EAT + V − ECTW−1CE = 0, (B.34)

which is what we wanted to show.

39

Page 42: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

The prediction x of the observer has some interesting properties, as shown by the followingtheorem.

Theorem B.6. Assume that the initial state estimate x0 is chosen such that E[e0xT0 ] = 0 and

E[e0eT0 ] = E, with E the steady-state error variance. Then, if the optimal observer gain matrix

K (equation B.29, subject to the assumptions of theorem th:OptimalObserverGain) is used, we willhave

E[e(t)xT (t)] = 0 (B.35)

for all subsequent times t. In other words, the error e(t) is always orthogonal to the estimated statex(t).

Proof. We are going to prove this by simply calculating E[e(t)xT (t)]. This will result in a lotof equations involving difficult integrals. But if we do all the book-keeping correctly, we shouldeventually arrive at our final result.

We know that e(t) and x(t) satisfy

e =(A−KC)e+ (Kw − v), (B.36)˙x =(A−BF )x+Kw −KCe. (B.37)

The solutions for e(t) and x(t) now equal

e(t) =e(A−KC)te0 +

∫ t

0

e(A−KC)(t−s)(Kw(s)− v(s)) ds = e1(t) + e2(t), (B.38)

x(t) =e(A−BF )tx0 +

∫ t

0

e(A−BF )(t−s1)(Kw(s1)−KC

(e(A−KC)s1e0 (B.39)

+

∫ s1

0

e(A−KC)(s1−s2)(Kw(s2)− v(s2)) ds2

))ds1 = x1(t) + x2(t) + x3(t) + x4(t).

We use the notation e1(t) and e2(t) to designate the individual terms of e(t), and similarly forx(t) which has four terms. This will ease our notation. We now can write

E[e(t)xT (t)] = E[e1(t)xT1 (t) + e2(t)xT1 (t) + e1(t)xT2 (t) + . . .+ e2(t)xT4 (t)], (B.40)

where there are eight terms in total. Most of the terms are quite easy to deal with. For instance,

T1,1 ≡ E[e1(t)xT1 (t)] = E[e(A−KC)te0x

T0 e

(A−BF )T t]

= 0, (B.41)

which follows from our assumptions. Furthermore, because v and w are white noise, which are notcorrelated with each other, nor with e0 or x0, we have

T2,1 ≡ E[e2(t)xT1 (t)] =0, (B.42)

T1,2 ≡ E[e1(t)xT2 (t)] =0, (B.43)

T2,3 ≡ E[e2(t)xT3 (t)] =0, (B.44)

T1,4 ≡ E[e1(t)xT4 (t)] =0. (B.45)

That leaves us with three terms. The first of these equals

T2,2 =E

[(∫ t

0

e(A−KC)(t−s)(Kw(s)− v(s)) ds

)(∫ t

0

e(A−BF )(t−s)Kw(s) ds

)T](B.46)

=E

[(∫ t

0

e(A−KC)(t−s)(Kw(s)− v(s)) ds

)(∫ t

0

wT (s)KT e(A−BF )T (t−s) ds

)]=

∫ t

0

e(A−KC)sKWKT e(A−BF )T s ds.

It is possible to solve this integral through a Lyapunov equation, but later on we will find that wedon’t have to.

40

Page 43: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Our next term equals

T1,3 =E

[(e(A−KC)te0

)(∫ t

0

e(A−BF )(t−s)(−KC)e(A−KC)se0 ds

)T](B.47)

=E

[(e(A−KC)te0

)(∫ t

0

eT0 e(A−KC)T s(−KC)T e(A−BF )T (t−s) ds

)]=e(A−KC)tE[e0e

T0 ]

(∫ t

0

e(A−KC)T s(−KC)T e(A−BF )T (t−s) ds

).

Though not fully solved, we will also leave T1,3 in this form for now and turn towards T2,4. Thisfinal term will be the most difficult to solve. It equals

T2,4 =E

[(∫ t

0

e(A−KC)(t−s)(Kw(s)− v(s))

)(B.48)(∫ t

0

e(A−BF )(t−s1)(−KC)

∫ s1

0

e(A−KC)(s1−s2)(Kw(s2)− v(s2)) ds2 ds1

)T]

=

∫ t

0

∫ s1

0

e(A−KC)(t−s2)(KWKT + V )e(A−KC)T (s1−s2)(−KC)T e(A−BF )T (t−s1) ds2 ds1.

Note that, in the above equation, we have worked out the expectation operator. The expectationE[v(s)vT (s2)] only had a value when s = s2, and similarly for w(s), which allowed us to get rid ofone integral.

Next, we will apply a change of variable to the above equation. We are going to substitutes2 = s1 − t2. When doing this substitution, we should of course do some checks. First of all,ds2 = −dt2, so there should appear a minus sign before the integral. Secondly, we should checkthe bounds. The lower bound of s2 was 0, so the lower bound of t2 will be s1. Similarly, the upperbound of s2 was s1, so the upper bound of t2 will be 0.

However, bounds from s1 (a positive value) to 0 are rather inconvenient, so we interchangethem. When interchanging the bounds of an integral, we introduce another minus sign though,which is good, because it cancels out the first minus sign. In the end, we wind up with

T2,4 =

∫ t

0

∫ s1

0

e(A−KC)(t−s1+t2)(KWKT + V )e(A−KC)T t2(−KC)T e(A−BF )T (t−s1) dt2 ds1 (B.49)

=

∫ t

0

e(A−KC)(t−s1)

(∫ s1

0

e(A−KC)t2(KWKT + V )e(A−KC)T t2 dt2

)(−KC)T e(A−BF )T (t−s1) ds1.

The part within brackets can be solved analytically. In fact, we know that the steady-state errorvariance E is the unique solution to the Lyapunov equation (B.31). Applying theorem A.4 willnow directly tell us that∫ s1

0

e(A−KC)t2(KWKT + V )e(A−KC)T t2 dt2 = E − e(A−KC)s1Ee(A−KC)T s1 . (B.50)

This turns T2,4 into

T2,4 =

∫ t

0

e(A−KC)(t−s1)(E − e(A−KC)s1Ee(A−KC)T s1

)(−KC)T e(A−BF )T (t−s1) ds1

=

∫ t

0

e(A−KC)(t−s1)E(−KC)T e(A−BF )T (t−s1) ds1 (B.51)

− e(A−KC)tE

∫ t

0

e(A−KC)T s1(−KC)T e(A−BF )T (t−s1) ds1.

Great. We’re nearly there! It’s time to put the terms together. If you look carefully, you’ll seethat T2,2 has a strong similarity to the first term of T2,4, while T1,3 has a strong similarity to the

41

Page 44: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

second term of T2,4. If we use this, we can write our final result as

E[e(t)xT (t)] =

∫ t

0

e(A−KC)s(KWKT − ECTKT )e(A−BF )T s ds (B.52)

+ e(A−KC)t(E[e0eT0 ]− E)

∫ t

0

e(A−KC)T s1(−KC)T e(A−BF )T (t−s1) ds1.

And here we can see something interesting. First of all, due to the definition of the optimal observergain matrix K (equation B.29), we will have KWKT = ECTKT and the first part turns out tobe zero. Secondly, because of our assumption that E[e0e

T0 ] = E, the second part will become

zero. As a result, the whole term equals zero, irrespective of t, which is exactly what we wantedto prove.

What the above theorem basically says is that, if we set our initial state estimate x(0) inthe right way, and use the optimal observer gain matrix, then our estimate x(t) will always beuncorrelated with the error e(t). So there cannot be a case in which the estimate x(t) given by ourobserver can be improved further, unless we get additional knowledge not contained in the systemoutput.

This idea also gives rise to the following theorem, which shows that by using the predictions ofthe observer, our controller can get optimal performance.

Theorem B.7. Consider a linear system with an observer. The optimal way to control this systemis to control the state estimate x as optimally as possible, while simultaneously reducing the errore as much as possible. This is subject to the assumptions of the previous theorems.

Proof. Let’s consider the expected cost. It equals

E[J ] =E

[∫ ∞0

e2αt(xT (t)Qx(t) + uT (t)Ru) dt

](B.53)

=E

[∫ ∞0

e2αt((x(t)− e(t))TQ(x(t)− e(t)) + uT (t)Ru) dt

]=E

[∫ ∞0

e2αt(xT (t)Qx(t)− 2xT (t)Qe(t) + eT (t)Qe(t) + uT (t)Ru) dt

].

From theorem B.6 we know that the second term satisfies

E[xT (t)Qe(t)] = tr(E[xT (t)Qe(t)]

)= tr

(E[e(t)xT (t)]Q

)= 0. (B.54)

(Note that we are allowed to take the trace operator of a scalar. And then, within the traceoperator, we are allowed to rotate elements. And because the trace operator is a linear operator,like the expectation operator, we may interchange the two.) So the above expression allows us toget rid of a term in our expression. We remain with

E[J ] = E

[∫ ∞0

e2αt(xT (t)Qx(t) + uT (t)Ru)

]+ E

[∫ ∞0

e2αteT (t)Qe(t) dt

]. (B.55)

Minimizing the first term comes down to optimally controlling the state estimate x. This is whatour control law u = −F x does. Minimizing the second term comes down to getting the estimationerror e as small as possible, which is exactly what our observer is doing.

So now we have shown that the well-known method of controlling an LQG system, using anobserver and using the observer estimate in a linear control law, is indeed optimal. It cannot beimproved in any way, apart from changing any of the system matrices. It’s not a result which isnew. Everything so far in this appendix has been derived multiple times before in literature. Buthaving all these proofs together, deriving everything from the ground up, is something which isnot seen as much.

42

Page 45: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

B.2 Using Lyapunov solutions to find E[J ] and V[J ]Now that we know exactly how to control a (possibly noisy) system, it’s time to examine its perfor-mance. What kind of cost can we expect to get? In this section we’ll examine that using solutionsof Lyapunov equations. We will heavily make use of the notation described in subsection 3.2.1, somake sure you know what stuff like XQ

α and XXV means.

B.2.1 The mean of the cost functionFirst we will examine the mean of the cost function. In particular, we’ll start with the expectedvalue of the infinite-time cost J .

Theorem B.8. Consider the system x = Ax+ v. Assume that α < 0 and that Aα is stable. Theexpected value E[J ] of the infinite-time cost J of equation (3.2) is then given by

E[J ] = tr((

Σ0 −V

)XQα

). (B.56)

Proof. To prove this theorem, we’re going to work out the equations step by step. An easy idea,but a lot of work to execute. Our starting point is the expression for the state

x(t) = eAtx0 +

∫ t

0

eA(t−s)v(s) ds = xa(t) + xb(t). (B.57)

Note that we have split up x(t) into two parts: xa(t) and xb(t). This will make notation latereasier.

What we are going to do, is plug x(t) into the expression for the cost

E[J ] = E

[∫ ∞0

e2αtxT (t)Qx(t) dt

]. (B.58)

Working out the result will get us several terms

E[J ] = E

[∫ ∞0

e2αt(xTa (t)Qxa(t) + 2xTa (t)Qxb(t) + xTb (t)Qxb(t)

)dt

]. (B.59)

Because the expectation operator is a linear operator, we may pull it inside the integral and takethe expectation of each individual term. This gives us

E[J ] =

∫ ∞0

e2αt(E[xTa (t)Qxa(t)

]+ 2E

[xTa (t)Qxb(t)

]+ E

[xTb (t)Qxb(t)

])dt. (B.60)

Keep in mind that v is independent from x0. This means that xa(t) is also independent fromxb(t). And with v (and hence also xb) being zero-mean, the middle term reduces to zero. Thatleaves us with two terms. For simplicity, let’s call them T1 and T2, respectively.

First we will find T1. It equals

T1 =

∫ ∞0

e2αtE[xTa (t)Qxa(t)

]dt = E

[∫ ∞0

e2αtxT0 eAT tQeAtx0 dt

]. (B.61)

Because this is a scalar, we may take the trace of it. (The trace of a matrix is the sum ofthe diagonals of a matrix.) The trace operator is a linear operator, so we may freely move itinside/outside of the integral and interchange it with the expectation operator. In addition, bytaking the trace of the above expression, we may also rotate the order of multiplication. That is,

43

Page 46: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

we may use the rule tr(cABC) = tr(cCAB) = tr(cBCA). This then tells us that

T1 =tr(E

[∫ ∞0

e2αtxT0 eAT tQeAtx0 dt

])(B.62)

=

∫ ∞0

e2αtE[tr(x0x

T0 e

AT tQeAt)]

dt

=tr(∫ ∞

0

e2αtE[x0x

T0

]eA

T tQeAt dt

)=tr

(Σ0

∫ ∞0

e2αteAT tQeAt dt

)=tr

(Σ0

∫ ∞0

e(A+αI)T tQe(A+αI)t dt

)=tr

(Σ0X

).

In the last part, we have used theorem A.3 to work out the infinite integral. Note that at thispoint we’ve had to use our assumption that Aα is stable.

Now that we have solved for T1, we will solve for T2. The steps here are mostly the same, butthere are a few extras. First we have to reduce three integrals back to two,

T2 =E

[∫ ∞0

e2αt

(∫ t

0

v(s)eAT (t−s) ds

)Q

(∫ t

0

eA(t−s)v(s) ds

)dt

](B.63)

=tr(E

[∫ ∞0

∫ t

0

∫ t

0

e2αtv(s1)eAT (t−s1)QeA(t−s2)v(s2) ds1 ds2 dt

])=tr

(∫ ∞0

∫ t

0

∫ t

0

e2αtE [v(s2)v(s1)] eAT (t−s1)QeA(t−s2) ds1 ds2 dt

)=tr

(∫ ∞0

∫ t

0

∫ t

0

e2αtV δ(s2 − s1)eAT (t−s1)QeA(t−s2) ds1 ds2 dt

)=tr

(∫ ∞0

∫ t

0

e2αtV eAT (t−s)QeA(t−s) ds dt

).

Next, we apply a change-of-variable, where we replace s by t − τ . While s ranges from 0 to t,τ ranges from t to 0, so we need to change the limits of the integration. Also, we need to applyds = −dτ , which adds a minus sign. This results in

T2 =tr(∫ ∞

0

∫ 0

t

−e2αtV eAT τQeAτ dτ dt

)(B.64)

=tr(∫ ∞

0

∫ t

0

e2αtV eAT τQeAτ dτ dt

).

Note that interchanging the limits of integration will result in an extra minus sign for the integrand.The next step is to interchange the integrals themselves. The reason here is that the second

integral runs up to time t, which makes it difficult to evaluate. It would be easier if it ran to infinity.When interchanging the integrals, we should note that we integrate over the region bounded by0 ≤ τ ≤ t. The alternative way to integrate over this region is through

T2 = tr(∫ ∞

0

∫ ∞τ

e2αtV eAT τQeAτ dt dτ

). (B.65)

44

Page 47: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Working this out further will give us

T2 =tr(∫ ∞

0

(∫ ∞τ

e2αt dt

)V eA

T τQeAτ dτ

)(B.66)

=tr(V

∫ ∞0

(∫ ∞τ

e2αt dt

)eA

T τQeAτ dτ

)=tr

(−V2α

∫ ∞0

e2ατeAT τQeAτ dτ

)=tr

(−V2α

XQα

).

By adding up T1 and T2, we find our final expression for E[J ]. It equals

E[J ] = tr((

Σ0 −V

)XQα

). (B.67)

The above theorem gives the expected infinite-time cost J for α < 0 and Aα stable. The reasonfor this is that, in any other case, the cost J is usually infinite. If Aα is instead unstable, then thesystem state x(t) will diverge faster than e2αt, while if α > 0, then the continuous noise will causethe cost to go to infinity.

Now let’s examine the mean of the finite-time cost JT . This cost is always finite. As a result,there are this time two cases. We can either have α 6= 0 or α = 0.

Theorem B.9. Consider the system x = Ax+v. Assume that α 6= 0 and that A and Aα are bothSylvester. The expected value E[JT ] of the finite-time cost JT of equation (3.3) then equals

E[JT ] = tr((

Σ0 − e2αTΣ(T ) +(1− e2αT

)(−V2α

))XQα

). (B.68)

Proof. This theorem actually makes sense. We expect the finite-time cost at time 0 to equal theinfinite-time cost at time 0 minus the cost we expect to have incurred after time T . Hence, weexpect that

E[JT ] = tr((

Σ0 +

(−V2α

))XQα

)− e2αT tr

((Σ(T ) +

(−V2α

))XQα

). (B.69)

However, this doesn’t really constitute as a proof. It may make use of theorem B.8, but thattheorem only holds for α < 0 and stable Aα, while the requirements here are less restrictive.As a result, to prove this theorem, we have to work out the expression for E[JT ] and do all themathematics involved, just like we did for theorem B.8.

If we do, we will start with

E[JT ] =E

[∫ T

0

e2αtxT (t)Qx(t) dt

](B.70)

=E

[∫ T

0

e2αt(xTa (t)Qxa(t) + xTb (t)Qxb(t)) dt

].

(Don’t confuse the transpose here with the time T .) In the above expression we again have twoterms. Let’s call them T1 and T2 again, just like in the proof of theorem B.8. For T1 we know that

T1 =E

[∫ T

0

e2αtxT0 eAT tQeAtx0 dt

](B.71)

=tr

(E[x0x

T0 ]

∫ T

0

e2αteAT tQeAt dt

)=tr

(Σ0X

Qα (T )

)=tr

(Σ0(XQ

α − eATαT XQ

α eAαT )

).

45

Page 48: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Note that we have applied theorem A.4. For T2 we will find that

T2 =E

[∫ T

0

e2αt

(∫ t

0

v(s)eAT (t−s) ds

)Q

(∫ t

0

eA(t−s)v(s) ds

)dt

](B.72)

=tr

(∫ T

0

∫ t

0

e2αtV eAT (t−s)QeA(t−s) ds dt

)

=tr

(∫ T

0

∫ t

0

e2αtV eAT sQeAs ds dt

)

=tr

(∫ T

0

∫ T

s

e2αtV eAT sQeAs dt ds

)

=tr

(∫ T

0

(∫ T

s

e2αt dt

)V eA

T sQeAs ds

).

Everything is as usual so far, but now the limits, which don’t run to infinity, will start to make adifference. We continue through

T2 =tr

(∫ T

0

(e2αT − e2αs

)V eA

T sQeAs ds

)(B.73)

=tr

((−V2α

)(∫ T

0

e2αseAT sQeAs ds− e2αT

∫ T

0

eAT sQeAs ds

))

=tr((−V2α

)(XQα (T )− e2αT XQ(T )

))=tr

((−V2α

)(XQα − eA

TαT XQ

α eAαT − e2αT (XQ − eA

TT XQeAT )))

=tr((−V2α

)((XQ

α − e2αT XQ) + eATαT (XQ − XQ

α )eAαT))

.

The result for E[JT ] will be

E[JT ] =tr((

Σ0 −V

)XQα (T ) + e2αT V

2αXQ(T )

)(B.74)

=tr(

Σ0(XQα − eA

TαT XQ

α eAαT ) +

(−V2α

)((XQ

α − e2αT XQ) + eATαT (XQ − XQ

α )eAαT))

,

which is a fine expression, but nowhere near where we want to end up. To still get the rightexpression, we have to take a closer look at the rightmost term, involving (XQ − XQ

α ). By firstusing theorem A.7 and then theorem A.6 (with F = GT = eAαT ), we can write

tr((−V2α

)eA

TαT (XQ − XQ

α )eAαT)

=tr(V eA

TαT

XQα − XQ

2αeAαT

)(B.75)

=tr(V eA

TαT XXQα eAαT

)(B.76)

=tr(XV eA

TαT XQ

α eAαT

)=tr

(e2αT eATXV eA

TT XQα

).

We also take a closer look to the second rightmost term, involving (XQα − e2αT XQ). If we again

46

Page 49: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

use theorem A.7 followed by theorem A.6 (with F = GT = I), then

tr((−V2α

)(XQα − e2αT XQ

))=tr

((−V2α

)((1− e2αT

)XQα + e2αT

(XQα − XQ

)))(B.77)

=tr((−V2α

)(1− e2αT

)XQα − e2αTV XXQα

)=tr

((−V2α

)(1− e2αT

)XQα − e2αTXV XQ

α

).

By reordering the terms quite a bit more, we can derive that E[JT ] equals

E[JT ] = tr((

Σ0 − e2αT(eAT (Σ0 −XV )eA

TT +XV)

+(1− e2αT

)(−V2α

))XQα

). (B.78)

This is very interesting, because now we can apply theorem B.4. This theorem says that x(t)satisfies

Σ(t) ≡ E[x(t)xT (t)] = eAt(Σ0 −XV )eAT t +XV . (B.79)

Making use of this, we can write

E[JT ] = tr((

Σ0 − e2αTΣ(T ) +(1− e2αT

)(−V2α

))XQα

), (B.80)

which proves what we set out to prove.It’s interesting to note that in this proof we have not used theorem A.3. In other words, we

did not need to assume that any of the matrices was stable. Instead, we’ve applied theorem A.4,which only has as requirement that the respective matrix is Sylvester.

It’s interesting to note what happens with equation B.68 when T →∞. If α < 0, then e2αT → 0.As a result, equation (B.68) immediately reduces to equation (B.56), as can be expected.

The above theorem treated the case with α 6= 0. If α = 0, and there is no cost weigh-ing/discounting, we should instead use the next theorem.

Theorem B.10. Consider the system x = Ax + v. Assume that α = 0 and that A is Sylvester.The expected value E[JT ] of the finite-time cost JT of equation (3.3) is then given by

E[JT ] = tr((Σ0 − Σ(T ) + TV ) XQ

). (B.81)

Proof. First we should note that, because α = 0, we have A = Aα and similarly XQ = XQα . This

simplifies our equations a bit. To find E[JT ], we now use a similar set-up as the previous twotheorems. That is, we find T1 and T2. Finding T1 is actually done in an identical way as was donein theorem B.9. The result is

T1 = tr(Σ0X

Q(T )). (B.82)

For T2 our starting point is

T2 = tr

(∫ T

0

∫ t

0

V eAT sQeAs ds dt

), (B.83)

which comes directly out of equation (B.72). By using theorem A.4, we can turn this into

T2 = tr

(∫ T

0

∫ t

0

V eAT sQeAs ds dt

)(B.84)

= tr

(∫ T

0

V(XQ − eA

T tXQeAt)dt

)= tr

(TV XQ − V XXQ(T )

).

47

Page 50: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Next, we want to do something about XXQ(T ). So we expand it, apply theorem A.6 twice andthen simplify the result. That is,

T2 = tr(TV XQ −

(V XXQ − V eA

TT XXQeAT))

(B.85)

= tr(TV XQ −

(XV XQ −XV eA

TT XQeAT))

= tr(TV XQ −XV XQ(T )

).

If we now add up T1 and T2, and apply theorem B.4 to include Σ(T ), we find that

E[JT ] =tr(Σ0X

Q(T )−XV XQ(T ) + TV XQ)

(B.86)

=tr((

Σ0 −XV) (XQ − eA

TT XQeAT)

+ TV XQ)

=tr(

Σ0XQ −XV XQ − eAT

(Σ0 −XV

)eA

TT XQ + TV XQ)

=tr((Σ0 − Σ(T ) + TV ) XQ

).

This is the equation which we wanted to prove.

Let’s compare equations (B.81) and (B.68). In particular, let’s examine what happens whenα→ 0. Using l’Hôpital’s rule, we can find that

limα→0

1− ekαT

kα= limα→0

ddα

(1− ekαT

)ddα (kα)

= limα→0

−kTekαT

k= −T. (B.87)

If we consider equation (B.68) as α → 0, and use the above result, then we find that it directlyturns into equation (B.81). Again, this is something which we would expect.

B.2.2 The variance of the cost functionNow we know how to find the expected cost for almost all possible cases. The next step? Thevariance of the cost. Again we start with the infinite-time cost J .

Theorem B.11. Consider the system x = Ax+v. Assume that α < 0 and that Aα is stable. Thevariance V[J ] of the infinite-time cost J of equation (3.2) is then given by

V[J ] = 2tr(Σ0XQα Σ0X

Qα )− 2µT0 X

Qα µ0µ

T0 X

Qα µ0 + 4tr

((XΣ0

2α −XV

)XQα V X

). (B.88)

Proof. The basic method is the same the proof of theorem B.8. We will simply do a lot of mathsto find the variance V[J ]. In fact, we’re not going to find V[J ] directly, but instead make use of

V[J ] = E[(J − E[J ])2] = E[J2]− 2E[J ]E[J ] + E[J ]2 = E[J2]− E[J ]2. (B.89)

So we will find E[J2] first and then subtract E[J ]2 to find V[J ].If we make use of the notation xa(t) and xb(t), like in theorem B.8, we can write

E[J2] = E

[(∫ ∞0

e2αt(xa(t) + xb(t))TQ(xa(t) + xb(t)) dt

)2]

(B.90)

= E

[∫ ∞0

∫ ∞0

e2α(t1+t2)(xa(t1) + xb(t1))TQ(xa(t1) + xb(t1))(xa(t2) + xb(t2))TQ(xa(t2) + xb(t2)) dt2 dt1

].

If we work out the brackets, we will get 16 terms. Luckily, any term with either one or three timesxb in it (and hence one or three times v in it) will have an expectation of zero, because v is azero-mean Gaussian random variable. That cancels out half of the terms.

We remain with the other half. With some clever rearranging of terms, we can find that someof them are equal. After all, if we consider

(xTa (t1)Qxb(t1))(xTa (t2)Qxb(t2)), (B.91)

48

Page 51: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

we see that this is the product of two scalars. So we may freely transpose one (or both) of thesescalars. The above is hence equal to

(xTa (t1)Qxb(t1))(xTb (t2)Qxa(t2)), (B.92)

and similarly if we (also) transpose the left term.In the end, we can conclude that V[J ] consists of four terms. We call them T1 through T4 and

define them as

T1 =E

[∫ ∞0

∫ ∞0

e2α(t1+t2)xTa (t1)Qxa(t1)xTa (t2)Qxa(t2) dt2 dt1

], (B.93)

T2 =2E

[∫ ∞0

∫ ∞0

e2α(t1+t2)xTa (t1)Qxa(t1)xTb (t2)Qxb(t2) dt2 dt1

], (B.94)

T3 =4E

[∫ ∞0

∫ ∞0

e2α(t1+t2)xTa (t1)Qxb(t1)xTb (t2)Qxa(t2) dt2 dt1

], (B.95)

T4 =E

[∫ ∞0

∫ ∞0

e2α(t1+t2)xTb (t1)Qxb(t1)xTb (t2)Qxb(t2) dt2 dt1

]. (B.96)

One by one, we will work out their expressions.Starting off with T1, we have

T1 =E

[∫ ∞0

∫ ∞0

e2α(t1+t2)xT0 eAT t1QeAt1x0x

T0 e

AT t2QeAt2x0 dt2 dt1

](B.97)

=E

[xT0

(∫ ∞0

e2αt1eAT t1QeAt1 dt1

)x0x

T0

(∫ ∞0

e2αt2eAT t2QeAt2 dt2

)x0

]=E[xT0 X

Qα x0x

T0 X

Qα x0],

where we have used theorem A.3. If we then use theorem A.17, we see that this term equals

T1 = tr(Σ0XQα )2 + 2tr(Σ0X

Qα Σ0X

Qα )− 2µT0 X

Qα µ0µ

T0 X

Qα µ0. (B.98)

With the first term down, let’s continue with the second. It equals

T2 =2E

[∫ ∞0

∫ ∞0

e2α(t1+t2)xTa (t1)Qxa(t1)xTb (t2)Qxb(t2) dt2 dt1

](B.99)

=2E

[(∫ ∞0

e2αt1xTa (t1)Qxa(t1) dt1

)(∫ ∞0

e2αt2xTb (t2)Qxb(t2) dt2

)]=2E

[∫ ∞0

e2αt1xTa (t1)Qxa(t1) dt1

]E

[∫ ∞0

e2αt2xTb (t2)Qxb(t2) dt2

].

Note that we are allowed to split up the expectation operator here, because xa and xb are inde-pendent. Now, if we look closely at the two terms above, then we can see that they are exactlyequal to the two terms we had when solving for E[J ]. When solving for E[J ] we had to add themtogether though, while now we need to multiply them. As a result we get

T2 = 2tr(Σ0X

)tr(−V2α

XQα

). (B.100)

Next is the third term. Now things are getting more difficult. We are going to start with rearrangingthe equation a bit. In particular,

T3 =4tr(E

[∫ ∞0

∫ ∞0

e2α(t1+t2)xTa (t1)Qxb(t1)xTb (t2)Qxa(t2) dt2 dt1

])(B.101)

=4tr(∫ ∞

0

∫ ∞0

e2α(t1+t2)E[xa(t2)xTa (t1)Qxb(t1)xTb (t2)Q

]dt2 dt1

)=4tr

(∫ ∞0

∫ ∞0

e2α(t1+t2)E[xa(t2)xTa (t1)

]QE

[xb(t1)xTb (t2)

]Qdt2 dt1

).

49

Page 52: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Before we continue, let’s take a look at the terms E[xa(t2)xTa (t1)

]and E

[xb(t1)xTb (t2)

]individ-

ually. Working out the first of these two terms gives us

E[xa(t2)xTa (t1)

]= E

[eAt2x0x

T0 e

AT t1]

= eAt2Σ0eAT t1 . (B.102)

The second is a bit harder to tackle. It equals

E[xb(t1)xTb (t2)

]=E

[(∫ t1

0

eA(t1−s1)v(s1) ds1

)(∫ t2

0

vT (s2)eAT (t2−s2) ds2

)](B.103)

=

∫ t1

0

∫ t2

0

eA(t1−s1)E[v(s1)vT (s2)

]eA

T (t2−s2) ds2 ds1.

The above integral only has a value when s1 = s2. But we always have that s1 ≤ t1 and s2 ≤ t2.So one way to write the above integral is like

E[xb(t1)xTb (t2)

]=

∫ min(t1,t2)

0

eA(t1−s)V eAT (t2−s) ds. (B.104)

Having a minimum-function in the limits of the integral will make things rather difficult to solve,but there are tricks to get rid of it. We’ll see one very soon. First we assemble everything into theexpression for T3 though. This gives us

T3 = 4tr

(∫ ∞0

∫ ∞0

∫ min(t1,t2)

0

e2α(t1+t2)eAt2Σ0eAT t1QeA(t1−s)V eA

T (t2−s)Qds dt2 dt1

). (B.105)

Now we will get rid of the minimum-function in the integral limits. To do so, we will look closely atwhat area we’re integrating over. This area is described by 0 ≤ s, s ≤ t1 and s ≤ t2. These threebounds together are sufficient to describe our integration area (or volume) in the three-dimensionalspace of s, t1 and t2.

We are allowed to freely change the order of integration, but only if this does not change ourintegration area. Keeping this in mind, we can rewrite the expression for T3 to

T3 = 4tr(∫ ∞

0

∫ ∞s

∫ ∞s

e2α(t1+t2)eAt2Σ0eAT t1QeA(t1−s)V eA

T (t2−s)Qdt2 dt1 ds

). (B.106)

The upper limits of all these integrals are infinity. This is good, as it makes it easier to solve them.However, we would like all lower limits to be zero as well. To accomplish this, we will substitutet1 by τ1 + s and t2 by τ2 + s. The result is

T3 = 4tr(∫ ∞

0

∫ ∞0

∫ ∞0

e2α(τ1+τ2+2s)eA(τ2+s)Σ0eAT (τ1+s)QeAτ1V eA

T τ2Qdτ2 dτ1 ds

). (B.107)

Now it also happens to be possible to split up the integrals. Doing so will give us

T3 = 4tr((∫ ∞

0

e4αseAsΣ0eAT s ds

)(∫ ∞0

e2ατ1eAT τ1QeAτ1 dτ1

)V

(∫ ∞0

e2ατ2eAT τ2QeAτ2 dτ2

)).

(B.108)The result is

T3 = 4tr(XΣ0

2α XQα V X

), (B.109)

where XΣ02α is the solution to the Lyapunov equation

(A+ 2α)XΣ02α +XΣ0

2α (A+ 2α)T + Σ0 = 0. (B.110)

Note that XΣ02α has a unique solution, because A2α is stable. This in turn follows from the assump-

tions that Aα is stable and α < 0.Now that we have T3, there is only one term left, which is the most difficult one. We can

remember from equation (B.96) that

T4 = E

[∫ ∞0

∫ ∞0

e2α(t1+t2)xTb (t1)Qxb(t1)xTb (t2)Qxb(t2) dt2 dt1

]. (B.111)

50

Page 53: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

We already have two integrals and each of the xb terms will give us an additional one. The resultis a staggering set of six integrals,

T4 =

∫ ∞0

∫ ∞0

∫ t1

0

∫ t1

0

∫ t2

0

∫ t2

0

e2α(t1+t2)T4,inner dτ2 ds2 dτ1 ds1 dt2 dt1, (B.112)

where the integrand (that is, the part inside of all the integrals) equals

T4,inner = E[vT (s1)eA

T (t1−s1)QeA(t1−τ1)v(τ1)vT (s2)eAT (t2−s2)QeA(t2−τ2)v(τ2)

]. (B.113)

But how can we work out the expectation operator? There are basically three cases where theabove quantity has a (non-zero) value. It is (case A) when s1 = τ1 and s2 = τ2, or (case B) whens1 = τ2 and s2 = τ1, or (case C) when s1 = s2 and τ1 = τ2. For each of these different cases, weneed to work out the integrals.

Luckily, cases B and C result in the same value. To see why, assume that case B holds, transposethe second half of T4,inner (which is a scalar) and you’ll get case C. So, if we count case B twice,we may ignore case C.

The result is that, for case A, we have

T4,inner,A = tr(E[v(τ1)vT (s1)

]eA

T (t1−s1)QeA(t1−τ1))tr(E[v(τ2)vT (s2)

]eA

T (t2−s2)QeA(t2−τ2)),

(B.114)while for case B we get

T4,inner,B = tr(E[v(τ2)vT (s1)

]eA

T (t1−s1)QeA(t1−τ1)E[v(τ1)vT (s2)

]eA

T (t2−s2)QeA(t2−τ2)).

(B.115)If we now define T4 = T4,A + T4,B , counting case B twice, then

T4,A =

∫ ∞0

∫ ∞0

∫ t1

0

∫ t2

0

e2α(t1+t2)tr(V eA

T (t1−s1)QeA(t1−s1))tr(V eA

T (t2−s2)QeA(t2−s2))ds2 ds1 dt2 dt1,

(B.116)

T4,B = 2

∫ ∞0

∫ ∞0

∫ mi

0

∫ mi

0

e2α(t1+t2)tr(V eA

T (t1−s1)QeA(t1−s2)V eAT (t2−s2)QeA(t2−s1)

)ds2 ds1 dt2 dt1,

(B.117)where mi is short for min(t1, t2). But why does it appear? Well, in case A we could simply combines1 with τ1, which both ranged from 0 to t1. Similarly, we could combine s2 with τ2, which bothranged from 0 to t2. However, in case B we had to combine s1 with τ2, and while s1 ranged from0 to t1, τ2 ranged from 0 to t2. As a result, they could only have the same value when both weresmaller (or equal) to min(t1, t2). And a similar thing can be said about s2 and τ1.

Okay. We have reduced the term T4 to two equations of four integrals. Let’s start with the‘easy’ one of the two, which is T4,A. The reason that it’s relatively easy, is that we can split it up.That is,

T4,A =

(∫ ∞0

∫ t

0

e2αttr(V eA

T (t−s)QeA(t−s))ds dt

)2

. (B.118)

We can interchange the integrals, noting that our integration area is defined by 0 ≤ s ≤ t. Thisresults in

T4,A =

(∫ ∞0

∫ ∞s

e2αttr(V eA

T (t−s)QeA(t−s))dt ds

)2

. (B.119)

If we then also replace t by τ + s, to get all lower integration limits to zero, we get

T4,A =

(∫ ∞0

∫ ∞0

e2α(τ+s)tr(V eA

T τQeAτ)dτ ds

)2

. (B.120)

This is a good time to split up the integrals. If we do, we find

T4,A = tr(V

(∫ ∞0

e2αs ds

)(∫ ∞0

e2ατeAT τQeAτ dτ

))2

. (B.121)

51

Page 54: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Solving the integrals gives us our final solution for T4,A, being

T4,A = tr(−V2α

XQα

)2

. (B.122)

Good. Now let’s find T4,B , which will be a bit tougher. First we will change the order of integration.To be precise, we’re going to move t2 and t1 from the inside to the outside, while we bring s2 ands1 from the outside to the inside. Our integration area is defined by s1 ≥ 0, s1 ≤ t1, s1 ≤ t2,s2 ≥ 0, s2 ≤ t1 and s2 ≤ t2. This means that we can also write our set of integrals as

T4,B = 2

∫ ∞0

∫ ∞0

∫ ∞ma

∫ ∞ma

e2α(t1+t2)tr(V eA

T (t1−s1)QeA(t1−s2)V eAT (t2−s2)QeA(t2−s1)

)dt2 dt1 ds2 ds1,

(B.123)where ma is short for max(s1, s2). Now we have all upper integral limits set to infinity. However,we still have a maximum-function in the lower limits of our integral. To get rid of it, we’re goingto apply another trick.

We’re going to split our integration area up into two parts. One part has s1 ≤ s2 and the otherpart has s1 ≥ s2. If we integrate over both areas, we will get T4,B . The claim now is that, if weonly integrate over one of these areas, we get exactly half of T4,B .

The reason is symmetry. That is, if we interchange s1 and s2 in the integrand of any of theprevious expressions for T4,B , we get exactly the same. So if we integrate over the area withs1 ≤ s2, we should get exactly the same as if we integrate over the area with s2 ≥ s1. Now keepin mind that, if we integrate over both areas, we get T4,B . So if we integrate only over one of theareas, we should get exactly half of T4,B .

As a result, we can simply integrate over the reduced area with s1 ≤ s2 and multiply theoutcome by 2 to get T4,B . And because we now know (or have assumed) that s1 ≤ s2, we alsoknow that max(s1, s2) = s2. This causes our expression to turn into

T4,B = 4

∫ ∞0

∫ ∞s1

∫ ∞s2

∫ ∞s2

e2α(t1+t2)tr(V eA

T (t1−s1)QeA(t1−s2)V eAT (t2−s2)QeA(t2−s1)

)dt2 dt1 ds2 ds1.

(B.124)Next, we’re going to set the lower limits of our integrals to zero. If we start with the two innerintegrals, replacing t1 by τ1 + s2 and t2 by τ2 + s2, we get

T4,B = 4

∫ ∞0

∫ ∞s1

∫ ∞0

∫ ∞0

e2α(τ1+τ2+2s2)tr(V eA

T (τ1+s2−s1)QeAτ1V eAT τ2QeA(τ2+s2−s1)

)dτ2 dτ1 ds2 ds1.

(B.125)If we do the same with the third integral, replacing s2 by s2 + s1, we wind up with

T4,B = 4

∫ ∞0

∫ ∞0

∫ ∞0

∫ ∞0

e2α(τ1+τ2+2s2+2s1)tr(V eA

T (τ1+s2)QeAτ1V eAT τ2QeA(τ2+s2)

)dτ2 dτ1 ds2 ds1.

(B.126)Now it’s time to split up the integral. We get as result

T4,B =4tr((∫ ∞

0

e4αs1 ds1

)(∫ ∞0

e4αs2eAs2V eAT s2 ds2

)(B.127)(∫ ∞

0

e2ατ1eAT τ1QeAτ1 dτ1

)V

(∫ ∞0

e2ατ2eAT τ2QeAτ2 dτ2

)).

By solving each of the individual integrals, we get our final result for T4,B , which is

T4,B = 4tr(−1

4αXV

2αXQα V X

). (B.128)

Perfect! Now we know enough to find E[J2]. We simply have to put all individual terms together.

52

Page 55: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

So,

E[J2] =T1 + T2 + T3 + T4,A + T4,B (B.129)

=tr(Σ0XQα )2 + 2tr(Σ0X

Qα Σ0X

Qα )− 2µT0 X

Qα µ0µ

T0 X

Qα µ0 + 2tr

(Σ0X

)tr(−V2α

XQα

)+ 4tr

(XΣ0

2α XQα V X

)+ tr

(−V2α

XQα

)2

+ 4tr(−1

4αXV

2αXQα V X

).

But this doesn’t equal V[J ] yet. After all, V[J ] = E[J2] − E[J ]2. If we subtract E[J ]2 from theabove (using the result of theorem B.8), we see that terms 1, 4 and 6 drop out. If we then alsomerge terms 5 and 7 into a single term, we remain with

V[J ] = 2tr(Σ0XQα Σ0X

Qα )− 2µT0 X

Qα µ0µ

T0 X

Qα µ0 + 4tr

((XΣ0

2α −XV

)XQα V X

). (B.130)

This is the result which we wanted to find. Interestingly enough, we can also note that we canrewrite XΣ0

2α −XV2α4α to

XΣ02α −

XV2α

4α= XΣ0

2α −XV4α2α = X

Σ0− V4α

2α , (B.131)

but having too many terms in the superscript would just cause confusion, so we’ll stick with ourearlier way of writing V[J ].

It’s interesting to look at what equation (B.88) reduces to in certain special cases. For example,if we know the initial state x0 deterministically, then Σ0 = µ0µ

T0 and the first two terms cancel

each other out.Something else happens when we set our initial state distribution equal to the steady-state

distribution. That is, we choose µ0 = 0 and Σ0 = XV (assuming that A is stable and such adistribution exists). In this case the term XΣ0

2α −XV2α4α turns (through theorem A.7) into

XXV

2α −XV

4α=XV

2α −XV

4α− XV

4α= −X

V

4α. (B.132)

This means that the variance V[J ] of the cost for this case reduces to

V[J ] = 2tr((

XV − V

)XQαX

V XQα

), (B.133)

which is another short and elegant expression.Next we look at the variance of the finite-time cost JT . The procedure for finding it is mostly

identical to finding the variance of J , except there are a lot more intricate details. The result is apretty lengthy proof, so enjoy.

Theorem B.12. Consider the system x = Ax+ v. Assume that α 6= 0 and that A−α, A, Aα andA2α are Sylvester. The variance V[JT ] of the finite-time cost JT of equation (3.3) is then given by

V[JT ] =2tr(Σ0XQα (T )Σ0X

Qα (T ))− 2µT0 X

Qα (T )µ0µ

T0 X

Qα (T )µ0 (B.134)

+ 4tr((

XΣ02α (T )− XV

2α(T )

4α+e4αTXV (T )

)XQα V X

)+ 2tr

(e4αTΣ(T )XQ

α Σ(T )XQα

)− 2tr

(Σ0e

ATαT XQα e

AαTΣ0eATαT XQ

α eAαT

)+ 8tr

((XV − Σ0)eA

TαT XQ

α XV XQαα,3α

)− 8tr

(e4αTXV X

XQα−α (T )V XQ

α

).

Proof. The proof of this theorem will be very similar to that of theorem B.11. We will split E[J2T ]

up into four terms, being T1, T2, T3 and T4, and solve each of them individually. Then we addthem together and subtract E[JT ]2 to find V[JT ]. This proof will be a lot longer though, since wewill run into many complications along the way.

53

Page 56: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

The expression for T1 is the same as earlier, except we need to adjust the bounds to T wherethey used to be ∞. So,

T1 =E

[∫ T

0

∫ T

0

e2α(t1+t2)xTa (t1)Qxa(t1)xTa (t2)Qxa(t2) dt2 dt1

](B.135)

=E

[∫ T

0

∫ T

0

e2α(t1+t2)xT0 eAT t1QeAt1x0x

T0 e

AT t2QeAt2x0 dt2 dt1

]

=E

[xT0

(∫ T

0

e2αt1eAT t1QeAt1 dt1

)x0x

T0

(∫ T

0

e2αt2eAT t2QeAt2 dt2

)x0

]=E[xT0 X

Qα (T )x0x

T0 X

Qα (T )x0].

We can again solve this using theorem A.17. The result will be

T1 = tr(Σ0XQα (T ))2 + 2tr(Σ0X

Qα (T )Σ0X

Qα (T ))− 2µT0 X

Qα (T )µ0µ

T0 X

Qα (T )µ0. (B.136)

The next step is calculating T2. In a similar way as previously, we will get

T2 =2E

[∫ T

0

∫ T

0

e2α(t1+t2)xTa (t1)Qxa(t1)xTb (t2)Qxb(t2) dt2 dt1

](B.137)

=2E

[(∫ T

0

e2αt1xTa (t1)Qxa(t1) dt1

)(∫ T

0

e2αt2xTb (t2)Qxb(t2) dt2

)]

=2E

[∫ T

0

e2αt1xTa (t1)Qxa(t1) dt1

]E

[∫ T

0

e2αt2xTb (t2)Qxb(t2) dt2

].

We have found the two parts in brackets already. We did so when finding an expression for E[JT ]at theorem B.9. That is,

E

[∫ T

0

e2αt1xTa (t1)Qxa(t1) dt1

]=tr

(Σ0X

Qα (T )

), (B.138)

E

[∫ T

0

e2αt2xTb (t2)Qxb(t2) dt2

]=tr

((−V2α

)(XQα (T )− e2αT XQ(T )

)). (B.139)

For E[JT ] we had to add up these two parts. Now we need to multiply them. So we have

T2 = 2tr(Σ0X

Qα (T )

)tr((−V2α

)(XQα (T )− e2αT XQ(T )

)). (B.140)

Next is the third term, and now things are getting rough, so prepare yourself. We are startingat an adjusted version of equation (B.105), where the integration limits now run up to T . That is,

T3 = 4tr

(∫ T

0

∫ T

0

∫ min(t1,t2)

0

e2α(t1+t2)eAt2Σ0eAT t1QeA(t1−s)V eA

T (t2−s)Qds dt2 dt1

). (B.141)

By changing the order of integration, we get

T3 = 4tr

(∫ T

0

∫ T

s

∫ T

s

e2α(t1+t2)eAt2Σ0eAT t1QeA(t1−s)V eA

T (t2−s)Qdt2 dt1 ds

). (B.142)

Just like last time s is in the integration limits. Previously we applied the trick of changing thelower integration bound to get it out, allowing us to solve each integral individually. We can trythat trick again, but now we will find

T3 = 4tr

(∫ T

0

∫ T−s

0

∫ T−s

0

e2α(τ1+τ2+2s)eA(τ2+s)Σ0eAT (τ1+s)QeAτ1V eA

T τ2Qdτ2 dτ1 ds

).

(B.143)

54

Page 57: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

This time it is not possible to split up all integrals. Instead, we can only reduce this to

T3 = 4tr

(∫ T

0

e4αseAsΣ0eAT s

(∫ T−s

0

e2ατ1eAT τ1QeAτ1 dτ1

)V

(∫ T−s

0

e2ατ2eAT τ2QeAτ2 dτ2

)ds

).

(B.144)Making use of theorem A.4, we can resolve the inner integrals and find

T3 = 4tr

(∫ T

0

e4αseAsΣ0eAT s

(XQα − eA

Tα (T−s)XQ

α eAα(T−s)

)V(XQα − eA

Tα (T−s)XQ

α eAα(T−s)

)ds

).

(B.145)Here we see, if we work out the brackets, that there are actually four parts to T3. (More terms.Great...) If we indicate the part of XQ

α through a subscript ‘a’ and the part of eATα (T−s)XQ

α eAα(T−s)

through ‘b’, then we have

T3aa =4tr

(∫ T

0

e4αseAsΣ0eAT sXQ

α V XQα ds

)(B.146)

=4tr(XΣ0

2α (T )XQα V X

),

T3bb =4tr

(∫ T

0

e4αseAsΣ0eAT seA

Tα (T−s)XQ

α eAα(T−s)V eA

Tα (T−s)XQ

α eAα(T−s) ds

)(B.147)

=4tr

(∫ T

0

e2αseAαTΣ0eATαT XQ

α eAα(T−s)V eA

Tα (T−s)XQ

α ds

)

=4tr

(eAαTΣ0e

ATαT XQα

(∫ T

0

e2α(T−τ)eAατV eATατ dτ

)XQα

)

=4tr

(eA2αTΣ0e

AT2αT XQα

(∫ T

0

eAτV eAT τ dτ

)XQα

)=4tr

(eA2αTΣ0e

AT2αT XQαX

V (T )XQα

).

So far so good. We have found T3aa and T3bb . Next, we need to find T3ab and T3ba , but there’sgood and bad news. The good news is that they’re equal. (To see why, write out T3ab , transposewhatever is in the trace function, and you will get T3ba .) The bad news is that finding T3ba (orT3ab) cannot be done using Lyapunov solutions.

To see why, we write out T3ba . It equals

T3ba =− 4tr

(∫ T

0

e4αseAsΣ0eAT seA

Tα (T−s)XQ

α eAα(T−s)V XQ

α ds

)(B.148)

=− 4tr

(∫ T

0

e2αsΣ0eATαT XQ

α eAα(T−s)V XQ

α eAαs ds

)

=− 4tr

(Σ0e

ATαT XQα

(∫ T

0

eAα(T−s)V XQα e

A3αs ds

)).

We can solve this integral by applying theorem A.15 with k1 = 1, k2 = 3 and Q substituted forV XQ

α . That is, we have a term XV XQαα,3α which equals

XV XQαα,3α =

∫ T

0

eAα(T−s)V XQα e

A3αs ds (B.149)

=[I 0

]exp

([Aα V XQ

α

0 A3α

]T

)[0I

].

55

Page 58: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Using this new term, we can write

T3ba = −4tr(

Σ0eATαT XQ

α XV XQαα,3α

). (B.150)

And that was the last part of T3 which we needed to find.The next step is to find T4, which is again one step harder. Most of the process is identical to

finding T4 in the proof of theorem B.11. That is, we again split up T4 into a part T4,A and T4,B .Here, T4,A equals

T4,A =

∫ T

0

∫ T

0

∫ t1

0

∫ t2

0

e2α(t1+t2)tr(V eA

T (t1−s1)QeA(t1−s1))tr(V eA

T (t2−s2)QeA(t2−s2))ds2 ds1 dt2 dt1,

(B.151)where we have taken equation (B.116) and adjusted the integration bounds. Just like previously,we can split this integral up, like

T4,A = tr

(∫ T

0

∫ t

0

e2αtV eAT (t−s)QeA(t−s) ds dt

)2

. (B.152)

We have seen the integral above before. In fact, we saw it when finding E[JT ]. (See equation (B.72)in theorem B.9.) Using the same result, we have

T4,A = tr((−V2α

)(XQα (T )− e2αT XQ(T )

))2

. (B.153)

The next step is finding T4,B . Adjusting the integration limits in equation (B.117), we find that

T4,B = 2

∫ T

0

∫ T

0

∫ mi

0

∫ mi

0

e2α(t1+t2)tr(V eA

T (t1−s1)QeA(t1−s2)V eAT (t2−s2)QeA(t2−s1)

)ds2 ds1 dt2 dt1,

(B.154)where mi is again short for min(t1, t2). Changing the order of integration leads to

T4,B = 2

∫ T

0

∫ T

0

∫ T

ma

∫ T

mae2α(t1+t2)tr

(V eA

T (t1−s1)QeA(t1−s2)V eAT (t2−s2)QeA(t2−s1)

)dt2 dt1 ds2 ds1,

(B.155)where ma is again short for max(s1, s2). Again assuming that s1 ≤ s2 and integrating over half ofthe integration area will result in

T4,B = 4

∫ T

0

∫ T

s1

∫ T

s2

∫ T

s2

e2α(t1+t2)tr(V eA

T (t1−s1)QeA(t1−s2)V eAT (t2−s2)QeA(t2−s1)

)dt2 dt1 ds2 ds1.

(B.156)If we update the integration limits of the third and fourth integral, we get

T4,B = 4

∫ T

0

∫ T

s1

∫ T−s2

0

∫ T−s2

0

e2α(t1+t2+2s2)tr(V eA

T (t1+s2−s1)QeAt1V eAT t2QeA(t2+s2−s1)

)dt2 dt1 ds2 ds1.

(B.157)We could now also update the limits of the second integral, but that would be unwise. The reasonis that then we’d have both s1 and s2 in the integration limits, which will make it very hard tosplit up the integral. Instead, we will interchange the outer two integrals. So,

T4,B = 4

∫ T

0

∫ s2

0

∫ T−s2

0

∫ T−s2

0

e2α(t1+t2+2s2)tr(V eA

T (t1+s2−s1)QeAt1V eAT t2QeA(t2+s2−s1)

)dt2 dt1 ds1 ds2.

(B.158)At this point we can start separating integrals. If we then (in the second step) also substitute s1

56

Page 59: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

by s2 − s1, we find that

T4,B =4tr

(∫ T

0

e4αs2

(∫ s2

0

eA(s2−s1)V eAT (s2−s1) ds1

)(B.159)(∫ T−s2

0

e2αt1eAT t1QeAt1 dt1

)V

(∫ T−s2

0

e2αt2eAT t2QeAt2 dt2

)ds2

)

=4tr

(∫ T

0

e4αs2

(∫ s2

0

eAs1V eAT s1 ds1

)(∫ T−s2

0

e2αt1eAT t1QeAt1 dt1

)V

(∫ T−s2

0

e2αt2eAT t2QeAt2 dt2

)ds2

).

We can solve the three inner integrals, which leaves us with only one integral,

T4,B =4tr

(∫ T

0

e4αs2(XV − eAs2XV eA

T s2)

(B.160)(XQα − eA

Tα (T−s2)XQ

α eAα(T−s2)

)V(XQα − eA

Tα (T−s2)XQ

α eAα(T−s2)

)ds2

).

We’re almost there! Because if we work out brackets there are only ... oh ... eight terms still tosolve. Let’s denote these terms again like T4,Baaa , T4,Baab , etcetera, and solve them one by one,just like we did for T3. First we find T4,Baaa to be equal to

T4,Baaa =4tr

(∫ T

0

e4αs2XV XQα V X

Qα ds2

)(B.161)

=4tr((

e4αT − 1

)XV XQ

α V XQα

).

Next, T4,Bbaa equals

T4,Bbaa =− 4tr

(∫ T

0

e4αs2eAs2XV eAT s2XQ

α V XQα ds2

)(B.162)

=− 4tr(XXV

2α (T )XQα V X

).

The terms T4,Baba and T4,Baab are equal (whatever is in the trace function is merely transposed)and are given by

T4,Baba =− 4tr

(∫ T

0

e4αs2XV eATα (T−s2)XQ

α eAα(T−s2)V XQ

α ds2

)(B.163)

=− 4tr

(XV

(∫ T

0

e4α(T−s2)eATαs2XQ

α eAαs2 ds2

)V XQ

α

)

=− 4tr

(e4αTXV

(∫ T

0

eAT−αs2XQ

α eA−αs2 ds2

)V XQ

α

)=− 4tr

(e4αTXV X

XQα−α (T )V XQ

α

).

57

Page 60: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Next, we calculate T4,Bbab which happens to be equal to T4,Bbba . We have

T4,Bbba =4tr

(∫ T

0

e4αs2eAs2XV eAT s2eA

Tα (T−s2)XQ

α eAα(T−s2)V XQ

α ds2

)(B.164)

=4tr

(∫ T

0

e2αs2XV eATαT XQ

α eAα(T−s2)V XQ

α eAαs2 ds2

)

=4tr

(XV eA

TαT XQ

α

(∫ T

0

eAα(T−s2)V XQα e

A3αs2 ds2

))=4tr

(XV eA

TαT XQ

α XV XQαα,3α

).

Note that we have used the same method of calculation as we used when finding T3ba .We continue with T4,Bbbb . It equals

T4,Bbbb =− 4tr

(∫ T

0

e4αs2eAs2XV eAT s2eA

Tα (T−s2)XQ

α eAα(T−s2)V eA

Tα (T−s2)XQ

α eAα(T−s2) ds2

)

=− 4tr

(∫ T

0

e2αs2eAαTXV eATαT XQ

α eAα(T−s2)V eA

Tα (T−s2)XQ

α ds2

)(B.165)

=− 4tr

(eAαTXV eA

TαT XQ

α

(∫ T

0

e2αs2eAα(T−s2)V eATα (T−s2) ds2

)XQα

)

=− 4tr

(eAαTXV eA

TαT XQ

α

(∫ T

0

e2αT eAs2V eAT s2 ds2

)XQα

)=− 4tr

(eA2αTXV eA

T2αT XQ

αXV (T )XQ

α

).

Finally, there is T4,Babb . It equals

T4,Babb =4tr

(∫ T

0

e4αs2XV eATα (T−s2)XQ

α eAα(T−s2)V eA

Tα (T−s2)XQ

α eAα(T−s2) ds2

)(B.166)

=4tr

(∫ T

0

e4α(T−s2)XV eATαs2XQ

α eAαs2V eA

Tαs2XQ

α eAαs2 ds2

)

=4tr

(e4αT

∫ T

0

XV eAT s2XQ

α eAs2V eA

T s2XQα e

As2 ds2

).

To solve this, we need to apply theorem A.9. The result will be

T4,Babb =2tr(e4αTXV XQ

αXV XQ

α

)− 2tr

(e4αTXV eA

TT XQα e

ATXV eATT XQ

α eAT)

(B.167)

=2tr(e4αTXV XQ

αXV XQ

α

)− 2tr

(XV eA

TαT XQ

α eAαTXV eA

TαT XQ

α eAαT

).

That’s the last term we needed to solve! Now comes the arduous task of assembling E[J2T ].

58

Page 61: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

When assembling E[J2T ], it’s convenient to list all the terms we have acquired so far. So,

T1 =tr(Σ0XQα (T ))2 + 2tr(Σ0X

Qα (T )Σ0X

Qα (T ))− 2µT0 X

Qα (T )µ0µ

T0 X

Qα (T )µ0,

T2 =2tr(Σ0X

Qα (T )

)tr((−V2α

)(XQα (T )− e2αT XQ(T )

)),

T3aa =4tr(XΣ0

2α (T )XQα V X

),

T3bb =4tr(eA2αTΣ0e

AT2αT XQαX

V (T )XQα

),

T3ba + T3ab =− 8tr(

Σ0eATαT XQ

α XV XQαα,3α

),

T4,A =tr((−V2α

)(XQα (T )− e2αT XQ(T )

))2

,

T4,Baaa =4tr((

e4αT − 1

)XV XQ

α V XQα

),

T4,Bbaa =− 4tr(XXV

2α (T )XQα V X

),

T4,Baba + T4,Baab =− 8tr(e4αTXV X

XQα−α (T )V XQ

α

),

T4,Bbba + T4,Bbab =8tr(XV eA

TαT XQ

α XV XQαα,3α

),

T4,Bbbb =− 4tr(eA2αTXV eA

T2αT XQ

αXV (T )XQ

α

),

T4,Babb =2tr(e4αTXV XQ

αXV XQ

α

)− 2tr

(XV eA

TαT XQ

α eAαTXV eA

TαT XQ

α eAαT

).

However, we don’t want to find E[JT ]2 but V[JT ]. To find it, we need to subtract E[JT ]2. If wedo, we can expect some terms to cancel again. And in fact, they do, if we write E[JT ]2 in the rightway. From equation (B.74) we find that we can write E[JT ]2 as

E[JT ] =tr((

Σ0 −V

)XQα (T ) + e2αT V

2αXQ(T )

)2

(B.168)

=

(tr(Σ0X

Qα (T )

)+ tr

((−V2α

)(XQα (T )− e2αT XQ(T )

)))2

.

From the above, we can see some interesting things. If we work out the brackets for E[JT ]2, we willget three parts. One part (being tr

(Σ0X

Qα (T )

)2) equals the first part of T1, another part equals T2

and the last part equals T4,A. So we remain with V[JT ] = T1 + T3 + T4,B where for T1 we shouldignore its first part.

We’re almost able to write down an ‘elegant’ expression for V[JT ]. We only need to see whichterms we can conveniently combine with each other. We can combine terms T3ba and T4,Bbba since

they have a similar form. In fact, they’re the only terms with XV XQαα,3α . So when assembling V[JT ]

we need to addTo add: 8tr

((XV − Σ0)eA

TαT XQ

α XV XQαα,3α

). (B.169)

It is also possible to combine the terms T4,Bbbb and T4,Babb . To see how, consider

2tr(e4αTXV (T )XQ

αXV (T )XQ

α

)(B.170)

= 2tr(e4αT (XV − eATXV eA

TT )XQα (XV − eATXV eA

TT )XQα

)= 2tr

(e4αTXV XQ

αXV XQ

α

)+ 2tr

(e4αT eATXV eA

TT XQα e

ATXV eATT XQ

α

)− 4tr

(e4αT eATXV eA

TT XQαX

V XQα

)= T4,Babb + 4tr

(e4αT eATXV eA

TT XQα e

ATXV eATT XQ

α

)− 4tr

(e4αT eATXV eA

TT XQαX

V XQα

)= T4,Babb + T4,Bbbb .

59

Page 62: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

But there’s more. We can even add T3bb into this. If we use Σ(T ) = XV (T ) + eATΣ0eATT , then

we can find that

2tr(e4αTΣ(T )XQ

α Σ(T )XQα

)(B.171)

= 2tr(e4αT (XV (T ) + eATΣ0e

ATT )XQα (XV (T ) + eATΣ0e

ATT )XQα

)= 2tr

(e4αTXV (T )XQ

αXV (T )XQ

α

)+ 2tr

(e4αT eATΣ0e

ATT XQα e

ATΣ0eATT XQ

α

)+ 4tr

(e4αT eATΣ0e

ATT XQαX

V (T )XQα

)= T4,Bbbb + T4,Babb + 2tr

(e4αT eATΣ0e

ATT XQα e

ATΣ0eATT XQ

α

)+ T3bb . (B.172)

The term that’s still unresolved in the above equation actually occurs in the expansion of T1, butusing that fact will not result in any elegant expression, so we’ll just ignore that. All we rememberis that, to find V[JT ], we need to add

To add: 2tr(e4αTΣ(T )XQ

α Σ(T )XQα

)− 2tr

(e4αT eATΣ0e

ATT XQα e

ATΣ0eATT XQ

α

). (B.173)

Next, we’re going to examine T4,Bbaa . In particular, we’re going to examine XXV

2α (T ). Withsome help of theorem A.7, we can find that it equals

XXV

2α (T ) =XXV

2α − eA2αTXXV

2α eAT2αT (B.174)

=1

((XV

2α −XV )− eA2αT (XV2α −XV )eA

T2αT)

=1

(XV

2α(T )−XV + eA2αTXV eAT2αT).

This turns T4,Bbaa into

T4,Bbaa = −4tr(XV

2α(T )

4αXQα V X

)+ 4tr

(XV − eA2αTXV eA

T2αT

4αXQα V X

). (B.175)

There are two terms in the above expression. Combining the first term with T3aa results in

4tr((

XΣ02α (T )− XV

2α(T )

)XQα V X

). (B.176)

When we look at the expression for V[J ] (equation (B.88)) then this term will look familiar. Infact, it’s one of the few terms which doesn’t reduce to zero as T →∞.

Of course T4,Bbaa has a second term as well. Combining that with T4,Baaa will give us

4tr(e4αTXV (T )

4αXQα V X

). (B.177)

Then, by merging all the results involving T4,Bbaa , T3aa and T4,Baaa , we get

To add: 4tr((

XΣ02α (T )− XV

2α(T )

4α+e4αTXV (T )

)XQα V X

). (B.178)

Finally we need to add T4,Baba + T4,Baab and T1 (without the first term, which was in E[JT ]2).We cannot really put these terms in an elegant form any more than they already are. So, if weadd everything together, slightly adjusting the ordering to make it all similar to the expression forV[J ], we find our final expression,

V[JT ] =2tr(Σ0XQα (T )Σ0X

Qα (T ))− 2µT0 X

Qα (T )µ0µ

T0 X

Qα (T )µ0 (B.179)

+ 4tr((

XΣ02α (T )− XV

2α(T )

4α+e4αTXV (T )

)XQα V X

)+ 2tr

(e4αTΣ(T )XQ

α Σ(T )XQα

)− 2tr

(Σ0e

ATαT XQα e

AαTΣ0eATαT XQ

α eAαT

)+ 8tr

((XV − Σ0)eA

TαT XQ

α XV XQαα,3α

)− 8tr

(e4αTXV X

XQα−α (T )V XQ

α

).

This is what we wanted to find, so we’re done.

60

Page 63: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Things get interesting when we start comparing equation (B.134) to equation (B.88). Whathappens when T → ∞? If Aα is stable and α < 0, then all exponentials e4αT and eAαT go tozero. As a result, XQ

α (T ) goes to XQα . Because A does not have to be stable, XV (T ) may go to

infinity. However, the exponential e4αT goes to zero faster, so e4αTXV (T ) still goes to zero. Thesame thing holds for e4αT X

XQα−α (T ). So we see that, under these circumstances, equation (B.134)

indeed goes to equation (B.88) as T →∞.As a final theorem in this section, we look at the variance of JT when α = 0.

Theorem B.13. Consider the system x = Ax + v. Assume that α = 0 and that A is Sylvester.The variance V[JT ] of the finite-time cost JT of equation (3.3) is then given by

V[JT ] =2tr(Σ0XQ(T )Σ0X

Q(T ))− 2µT0 XQ(T )µ0µ

T0 X

Q(T )µ0 (B.180)

+ 4tr((XΣ0(T )−XXV (T ) + TXV

)XQV XQ

)+ 2tr

(Σ(T )XQΣ(T )XQ

)− 2tr

(Σ0e

ATT XQeATΣ0eATT XQeAT

)+ 8tr

((XV − Σ0)eA

TT XQXV XQ)− 8tr

(XV XXQ(T )V XQ

).

Proof. The proof of this theorem is nearly identical to that of theorem B.12. There are a few keydifferences though. We’ll walk through the proof step by step, considering each of the terms andhighlighting the differences.

The first term T1 remains unchanged. It’s still the same as equation (B.136), but with α = 0,and equals

T1 = tr(Σ0XQ(T ))2 + 2tr(Σ0X

Q(T )Σ0XQ(T ))− 2µT0 X

Q(T )µ0µT0 X

Q(T )µ0. (B.181)

With T2 there is a difference. Up to equation (B.137) things are the same. That is,

T2 = 2E

[∫ T

0

e2αt1xTa (t1)Qxa(t1) dt1

]E

[∫ T

0

e2αt2xTb (t2)Qxb(t2) dt2

]. (B.182)

We have found the two terms before for α = 0 in the proof of theorem B.10. (See equations (B.82)and (B.85).) If we use those results, we find that

T2 = tr(Σ0X

Q(T ))tr(TV XQ −XV XQ(T )

). (B.183)

For T3 there is no real problem when α = 0. Everything is still derived in exactly the same way,so identically to equations (B.146), (B.147) and (B.150) we have

T3aa =4tr(XΣ0(T )XQV XQ

), (B.184)

T3bb =4tr(eATΣ0e

ATT XQXV (T )XQ), (B.185)

T3ba = T3ab =− 4tr(

Σ0eATT XQXV XQ

). (B.186)

At T4 some complications arise again. First there is T4,A. Its derivation is the same as before upto equation (B.152). If we set α = 0 there, we get

T4,A = tr

(∫ T

0

∫ t

0

V eAT (t−s)QeA(t−s) ds dt

)2

. (B.187)

We have already worked out this integral before, in equations (B.84) and (B.85). Using thoseresults, we have

T4,A = tr(TV XQ −XV XQ(T )

)2. (B.188)

Next is T4,B . At T4,Baaa we find another small difference. Setting α = 0 in equation (B.161) andworking out the integral will result in

T4,Baaa = 4tr(TXV XQ

α V XQα

). (B.189)

61

Page 64: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

The rest of the terms do not have any further difficulties. They all equal the respective terms inthe previous proof when α is set to zero. Retrieving the results from equations (B.162) through(B.167), we get

T4,Bbaa =− 4tr(XXV (T )XQV XQ

), (B.190)

T4,Baba + T4,Baab =− 8tr(XV XXQ(T )V XQ

), (B.191)

T4,Bbba + T4,Bbab =8tr(XV eA

TT XQXV XQ), (B.192)

T4,Bbbb =− 4tr(eATXV eA

TT XQXV (T )XQ), (B.193)

T4,Babb =2tr(XV XQXV XQ

)− 2tr

(XV eA

TT XQeATXV eATT XQeAT

). (B.194)

Next, it’s time to merge terms together. When α = 0, the first line of equation (B.86) tells us thatE[JT ]2 can be written as

E[JT ]2 = tr((

Σ0XQ(T )

)+(TV XQ −XV XQ(T )

))2. (B.195)

This again means that we’ll lose the first part of T1, as well as T2 and T4,A. Merging all the otherterms together then takes place in an identical way as in theorem B.12. If we do that, then wearrive at our final result

V[JT ] =2tr(Σ0XQ(T )Σ0X

Q(T ))− 2µT0 XQ(T )µ0µ

T0 X

Q(T )µ0 (B.196)

+ 4tr((XΣ0(T )−XXV (T ) + TXV

)XQV XQ

)+ 2tr

(Σ(T )XQΣ(T )XQ

)− 2tr

(Σ0e

ATT XQeATΣ0eATT XQeAT

)+ 8tr

((XV − Σ0)eA

TT XQXV XQ)− 8tr

(XV XXQ(T )V XQ

),

which completes the proof.

After the above proof, we pretty much have the expectation that equation (B.134) turns intoequation (B.180) when α → 0. This is true, but seeing how that happens is not directly clearbecause of the second line. To see how exactly To see how exactly equation (B.134) turns intoequation (B.180), we should substitute

−XV2α(T )

4α+e4αTXV (T )

4α=− XV

2α − eA2αTXV2αe

AT2αT

4α+e4αTXV − eA2αTXV eA

T2αT

4α(B.197)

=− XV2α −XV

4α+ eA2αT

XV2α −XV

4αeA

T2αT − 1− e4αT

4αXV

=−XXV

2α + eA2αTXXV

2α eAT2αT − 1− e4αT

4αXV

=−XXV

2α (T )− 1− e4αT

4αXV

in equation (B.134). (Note the use of theorem A.7.) If we then again use l’Hôpital’s rule and applyequation (B.87), we indeed see that equation (B.134) turns into equation (B.180) as α→ 0.

B.3 Using matrix exponentials to find E[JT ] and V[JT ]Previously we have assumed that matrices like A, Aα and such were Sylvester. If we let go of thatassumption, then we need some a whole different method of solving our integrals. These methodsare discussed in appendix section A.2. Using these theorems, we will start with a new expressionfor E[JT ]. A new expression for V[JT ] will follow directly afterward.

62

Page 65: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Theorem B.14. Consider the system x = Ax+ v. Define C as

C =

−AT2α Q 00 A V0 0 −AT

(B.198)

and write eCt as

eCt =

Ce11(t) Ce12(t) Ce13(t)Ce21(t) Ce22(t) Ce23(t)Ce31(t) Ce32(t) Ce33(t)

. (B.199)

Then we haveE[JT ] = tr

(eA

T2αT (Ce12(T )Σ0 + Ce13(T ))

). (B.200)

Proof. To prove this, we will split up the expression for E[JT ] in two terms again, just like we didearlier in the proof of theorem B.10. For T1 we can recycle equation (B.71). That is,

T1 = tr(Σ0X

Qα (T )

), (B.201)

where XQα (T ) can be found through theorem A.14. In fact, if we take t1 = 0, t2 = T , k1 = 2 and

k2 = 0, and replace A by AT because we want to find XQα (T ) instead of XQ

α (T ), then we have

T1 = tr(

Σ0eAT2αT

[I 0

]exp

([−AT2α Q

0 A

])[0I

]). (B.202)

That solves for T1. For T2 we have to examine the integral of equation (B.72). That is,

T2 = tr

(∫ T

0

∫ t

0

e2αtV eAT (t−s)QeA(t−s) ds dt

). (B.203)

If A or Aα is not Sylvester, we cannot use Lyapunov solutions to solve this matrix. Instead, wecan rewrite it to

T2 = tr

(eA

T2αT

∫ T

0

∫ t

0

e(−AT2α)(T−t)QeA(t−s)V e(−AT )s ds dt

). (B.204)

If we subsequently replace t by s and T by t, then we can directly apply theorem A.11 to find that

T2 = tr

eAT2αT [I 0 0]

exp

−AT2α Q 00 A V0 0 −AT

T0

0I

. (B.205)

This equation is a bit long though. So to shorten it, we define C as

C =

−AT2α Q 00 A V0 0 −AT

(B.206)

and write eCt as

eCt =

Ce11(t) Ce12(t) Ce13(t)Ce21(t) Ce22(t) Ce23(t)Ce31(t) Ce32(t) Ce33(t)

. (B.207)

Now T2 can be written very briefly as

T2 = tr(eA

T2αTCe13(T )

). (B.208)

The nice thing is that we can also use this result to rewrite T1. (This follows from theorem A.11.)To be precise, we have

T1 = tr(

Σ0eAT2αTCe12(T )

). (B.209)

63

Page 66: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Now, to find E[JT ], we need to add up T1 and T2. Making convenient use of the cyclic property ofthe trace function, we find that

E[JT ] = tr(eA

T2αT (Ce12(T )Σ0 + Ce13(T ))

). (B.210)

This is what we wanted to show.

It’s interesting to note that this theorem holds for any matrix A and any discount exponent α,including non-Sylvester matrices A and zero values for α.

Now let’s turn our attention to the variance V[JT ]. We can find an expression for that in asimilar way.

Theorem B.15. Consider the system x = Ax+ v. Define C as

C =

−AT2α Q 0 0 0

0 A V 0 00 0 −AT Q 00 0 0 A2α V0 0 0 0 −AT−2α

(B.211)

and write eCt as is done in theorem B.14. Then we have

V[JT ] =2tr(((Ce44(T ))TCe12(T )Σ0 + (Ce44(T ))TCe13(T ))2 − 2(Ce44(T ))T (Ce14(T )Σ0 + Ce15(T ))

)− 2(µT0 (Ce44(T ))TCe12(T )µ0)2. (B.212)

Proof. We will set up this proof in a similar way as previously in theorem B.12. That is, we willfind terms T1, T2, T3, T4,A and T4,B . What we will subsequently do is express these terms in Ceij(T )parameters, using theorem A.13.

Let’s start with T1. From equation (B.136) we know that

T1 = tr(Σ0XQα (T ))2 + 2tr(Σ0X

Qα (T )Σ0X

Qα (T ))− 2µT0 X

Qα (T )µ0µ

T0 X

Qα (T )µ0. (B.213)

We’ve seen earlier, in the proof of theorem B.14, that XQα (T ) = eA

T2αTCe12(T ). If we also use the

fact that Ce44(T ) = eA2αT , then we have

T1 = tr(Σ0(Ce44(T ))TCe12(T )

)2+ 2tr

((Σ0(Ce44(T ))TCe12(T ))2

)− 2(µT0 (Ce44(T ))TCe12(T )µ0)2.

(B.214)Next, we find T2. From equation (B.137) we have

T2 = 2E

[∫ T

0

e2αt1xTa (t1)Qxa(t1) dt1

]E

[∫ T

0

e2αt2xTb (t2)Qxb(t2) dt2

]. (B.215)

We’ve found both terms in the proof of theorem B.14. Using those while replacing eAT2αT by

(Ce44(T ))T will give us

T2 = 2tr(Σ0(Ce44(T ))TCe12(T )

)tr((Ce44(T ))TCe13(T )

). (B.216)

The next step is finding T3. Now things get a bit tougher. We’re going to start with equa-tion (B.141), which said that

T3 = 4tr

(∫ T

0

∫ T

0

∫ min(t1,t2)

0

e2α(t1+t2)eAt2Σ0eAT t1QeA(t1−s)V eA

T (t2−s)Qds dt2 dt1

). (B.217)

The integration bounds are given by 0 ≤ s ≤ t1 ≤ T and 0 ≤ s ≤ t2 ≤ T . If we interchange theinner and the middle integral, keeping the same integration bounds, we find that

T3 = 4tr

(∫ T

0

∫ t1

0

∫ T

s

e2α(t1+t2)eAt2Σ0eAT t1QeA(t1−s)V eA

T (t2−s)Qdt2 ds dt1

). (B.218)

64

Page 67: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

The integration limits of the inner integral are now somewhat inconvenient. If we are to use anexpression from theorem A.13, we want to have a lower bound of 0 and an upper bound of s. Itcan be shown that mere subtitutions can never put it in that form without messing up the otherintegrals. Instead, we’ll split the integral up into two integrals through

T3 = 4tr

(∫ T

0

∫ t1

0

∫ T

0

. . . dt2 ds dt1 −∫ T

0

∫ t1

0

∫ s

0

. . . dt2 ds dt1

). (B.219)

We denote the above two parts of T3 as T3,A and T3,B . So we have

T3,A =4tr

(∫ T

0

∫ t1

0

∫ T

0

e2α(t1+t2)eAt2Σ0eAT t1QeA(t1−s)V eA

T (t2−s)Qdt2 ds dt1

)(B.220)

=4tr

(Σ0

(∫ T

0

∫ t1

0

e2αt1eAT t1QeA(t1−s)V e−A

T s ds dt1

)(∫ T

0

e2αt2eAT t2QeAt2 dt2

))

=4tr

(Σ0

(eA

T2αT

∫ T

0

∫ t1

0

e(−AT2α)(T−t1)QeA(t1−s)V e(−AT )s ds dt1

)(eA

T2αTCe12(T )

))=4tr

(Σ0

((Ce44(T ))TCe13(T )

) ((Ce44(T ))TCe12(T )

)).

Note that we make repeated use of theorem A.13 to replace integrals by elements of eCT . AfterT3,A, there’s of course also T3,B . It equals

T3,B =− 4tr

(∫ T

0

∫ t1

0

∫ s

0

e2α(t1+t2)eAt2Σ0eAT t1QeA(t1−s)V eA

T (t2−s)Qdt2 ds dt1

)(B.221)

=− 4tr

(Σ0e

AT2αT

∫ T

0

∫ t1

0

∫ s

0

e(−AT2α)(T−t1)QeA(t1−s)V e(−AT )(s−t2)QeA2αt2 dt2 ds dt1

)=− 4tr

(Σ0(Ce44(T ))TCe14(T )

). (B.222)

As a result, the full expression for T3 becomes

T3 = 4tr(Σ0(Ce44(T ))T

(Ce13(T )(Ce44(T ))TCe12(T )− Ce14(T )

)). (B.223)

Next, we look into T4. This will again be a step harder than the previous term. For T4,A

starting point is equation (B.152), which states that

T4,A = tr

(∫ T

0

∫ t

0

e2αtV eAT (t−s)QeA(t−s) ds dt

)2

. (B.224)

We’ve already solved this using matrix exponentials when solving equation (B.203). The resultwill be

T4,A = tr((Ce44(T ))TCe13(T )

)2. (B.225)

For T4,B our starting point is equation (B.154), which claimed that

T4,B = 2

∫ T

0

∫ T

0

∫ mi

0

∫ mi

0

e2α(t1+t2)tr(V eA

T (t1−s1)QeA(t1−s2)V eAT (t2−s2)QeA(t2−s1)

)ds2 ds1 dt2 dt1,

(B.226)where mi is still short for min(t1, t2). We want to get the integral in the form of the integral forCe15(T ) (see equation (A.100)). It’s not possible to obtain this using only substitutions. Instead,we will interchange integrals, occasionally splitting them up. If we put the fourth integral (theinner one, with respect to s2) on position two, we get

T4,B = 2

∫ T

0

∫ t1

0

∫ T

s2

∫ min(t1,t2)

0

. . . ds1 dt2 ds2 dt1. (B.227)

65

Page 68: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

We can now split this integral up according to

T4,B = 2

∫ T

0

∫ t1

0

∫ T

0

∫ min(t1,t2)

0

. . . ds1 dt2 ds2 dt1 − 2

∫ T

0

∫ t1

0

∫ s2

0

∫ min(t1,t2)

0

. . . ds1 dt2 ds2 dt1.

(B.228)In the integral on the right-hand side we have t2 ≤ s2 ≤ t1 and hence min(t1, t2) can be reduced tot2. For the left-hand side this trick doesn’t work. Instead, we’ll rearrange the order of the integralson the left-hand side. The integration bounds are given by 0 ≤ s2 ≤ t1 ≤ T , 0 ≤ s1 ≤ t2 ≤ T ands1 ≤ t1. Taking this into account, we can find that

T4,B = 2

∫ T

0

∫ t2

0

∫ T

s1

∫ t1

0

. . . ds2 dt1 ds1 dt2 − 2

∫ T

0

∫ t1

0

∫ s2

0

∫ t2

0

. . . ds1 dt2 ds2 dt1. (B.229)

We now apply the same trick of splitting up the integral. This results in

T4,B =2

∫ T

0

∫ t2

0

∫ T

0

∫ t1

0

. . . ds2 dt1 ds1 dt2 − 2

∫ T

0

∫ t2

0

∫ s1

0

∫ t1

0

. . . ds2 dt1 ds1 dt2 (B.230)

− 2

∫ T

0

∫ t1

0

∫ s2

0

∫ t2

0

. . . ds1 dt2 ds2 dt1.

If we examine the second and third set of integrals, we notice that they’re exactly the same, exceptthat t1 and t2 are interchanged, as well as s1 and s2. If we make these same substitutions in theintegrand, and use the cyclic property of the trace function, we find that the two integrals areequal. Hence,

T4,B = 2

∫ T

0

∫ t2

0

∫ T

0

∫ t1

0

. . . ds2 dt1 ds1 dt2 − 4

∫ T

0

∫ t1

0

∫ s2

0

∫ t2

0

. . . ds1 dt2 ds2 dt1. (B.231)

This means that T4,B consists of two parts. The first part, which we ‘conveniently’ call T4,B,A (thisnotation is kind of getting out of hand, isn’t it?), equals

T4,B,A = 2

∫ T

0

∫ t2

0

∫ T

0

∫ t1

0

e2α(t1+t2)tr(V eA

T (t1−s1)QeA(t1−s2)V eAT (t2−s2)QeA(t2−s1)

)ds2 dt1 ds1 dt2.

(B.232)We can split this integral up according to

T4,B,A =2tr

(∫ T

0

∫ t

0

e2αteAT tQeA(t−s)V e−A

T s ds dt

)2 (B.233)

=2tr

(eAT2αT ∫ T

0

∫ t

0

e(−AT2α)(T−t)QeA(t−s)V e(−AT )s ds dt

)2

=2tr((

(Ce44(T ))TCe13(T ))2)

.

That solves for the first part. Note that the term within the trace function is squared and not theresult of the trace function itself, which would result in something entirely different. For T4,B,B wehave

T4,B,B = −4

∫ T

0

∫ t1

0

∫ s2

0

∫ t2

0

e2α(t1+t2)tr(V eA

T (t1−s1)QeA(t1−s2)V eAT (t2−s2)QeA(t2−s1)

)ds1 dt2 ds2 dt1.

(B.234)We can rewrite this to

−4tr

(eA

T2αT

∫ T

0

∫ t1

0

∫ s2

0

∫ t2

0

e(−AT2α)(T−t1)QeA(t1−s2)V e(−AT )(s2−t2)QeA(t2−s1)V e(−A−2α)T s1 ds1 dt2 ds2 dt1

).

(B.235)

66

Page 69: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

By applying theorem A.13 we then arrive at the beautifully short expression

T4,B,B = −4tr((Ce44(T ))TCe15(T )

). (B.236)

As a result, we have

T4,B = 2tr((

(Ce44(T ))TCe13(T ))2 − 2

((Ce44(T ))TCe15(T )

)). (B.237)

This is a lot shorter than what we had in earlier theorems!The next step is to assemble everything and subsequently subtract E[JT ]2. First we’ll make an

overview of the terms we’ve found. For simplicity of writing, we will get rid of the (T ) addition.So we write (Ce44(T ))T simply as (Ce44)T and similarly for other terms. We then have

T1 =tr(Σ0(Ce44)TCe12

)2+ 2tr

((Σ0(Ce44)TCe12)2

)− 2(µT0 (Ce44)TCe12µ0)2,

T2 =2tr(Σ0(Ce44)TCe12

)tr((Ce44)TCe13

),

T3 =4tr(Σ0(Ce44)T

(Ce13(Ce44)TCe12 − Ce14

)),

T4,A =tr((Ce44)TCe13

)2,

T4,B =2tr((

(Ce44)TCe13

)2 − 2((Ce44)TCe15

)).

So this is what we need to add up. However, we also ought to subtract E[JT ]2. From theorem B.14we know that

E[JT ]2 =tr(eA

T2αT (Ce12Σ0 + Ce13)

)2

(B.238)

=tr(Σ0(Ce44)TCe12

)2+ 2tr

(Σ0(Ce44)TCe12

)tr((Ce44)TCe13

)+ tr

((Ce44)TCe13

)2.

Just as in previous calculations of V[JT ], we see that T4,A, T2 and the first part of T1 drop out.We hence remain with

V[JT ] =2tr((Σ0(Ce44)TCe12)2

)− 2(µT0 (Ce44)TCe12µ0)2 + 4tr

(Σ0(Ce44)T

(Ce13(Ce44)TCe12 − Ce14

))+ 2tr

(((Ce44)TCe13

)2 − 2((Ce44)TCe15

)). (B.239)

By combining terms within brackets, we can rewrite the above to

V[JT ] = 2tr(((Ce44)TCe12Σ0 + (Ce44)TCe13)2 − 2(Ce44)T (Ce14Σ0 + Ce15)

)− 2(µT0 (Ce44)TCe12µ0)2.

(B.240)Amazing! It’s an equation for V[JT ] which fits on one line. It’s also equal to what we wanted toprove, so we’re done.

With the above theorem, we’ve derived a relatively easy way of finding V[JT ] for any matrixA and any value of α. And once we’ve calculated eCT to find V[JT ], we can even use the result todirectly find E[JT ] through

E[JT ] = tr((Ce44(T ))T (Ce12(T )Σ0 + Ce13(T ))

). (B.241)

The main downside is that this method of using matrix exponentials doesn’t work for T → ∞.For that case, we still need Lyapunov solutions. Otherwise, the necessity of Lyapunov solutions isgone.

67

Page 70: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

Bibliography

[AM90] Brian D. O. Anderson and John B. Moore. Optimal Control: Linear Quadratic Methods.Prentice Hall, 1990.

[Ant05] Athanasios C. Antoulas. Approximation of Large-Scale Dynamical Systems. Society forIndustrial and Applied Mathematics (SIAM), 2005.

[BKM08] Okko H. Bosgra, Huibert Kwakernaak, and Gjerrit Meinsma. Design Methods for ControlSystems. Dutch Institute of Systems and Control (DISC), 2008.

[BS72] R.H. Bartels and G.W. Stewart. Solution of the matrix equation AX +XB = C. Commu-nications of the ACM, 15:820–826, 1972.

[CH08] Richard Conway and Roberto Horowitz. A quasi-Newton algorithm for LQG control designwith variance constraints. In Proceedings of the Dynamic Systems and Control Conference,Ann Arbor, Michigan, USA, 2008.

[CS99] Emmanual G. Collins and Majura F. Selekwa. Fuzzy quadratic weights for variance con-strained LQG design. In Proceedings of the 38th IEEE Conference on Decision & Control,Phoenix, Arizona, USA, 1999.

[Dav80] Robbert B. Davies. Algorithm AS 155: The distribution of a linear combination of χ2

random variables. Applied Statistics, 29:323–333, 1980.

[KAW14] Bei Kang, Chukwuemeka Aduba, and Chang-Hee Won. Statistical control for perfor-mance shaping using cost cumulants. IEEE Transactions on Automatic Control, 59:249–255,2014.

[Ken81] David A. Kendrick. Stochastic Control for Economic Models. McGraw-Hill, 1981.

[MP92] Arakaparampil M. Mathai and Serge B. Provost. Quadratic Forms in Random Variables.Taylor & Francis, 1992.

[Å70] Karl J. Åström. Introduction to Stochastic Control Theory. Academic Press, 1970.

[Ric44] Stephen O. Rice. Mathematical analysis of random noise. Bell System Technical Journal,23:282–332, 1944.

[Sch70] Morton I. Schwartz. Distribution of the time-average power of a Gaussian process. IEEETransactions on Information Theory, 16:17–26, 1970.

[SL71] Michael K. Sain and Stanley R. Liberty. Performance-measure densities for a class of LQGcontrol systems. IEEE Transactions on Automatic Control, 16:431–439, 1971.

[SP05] Sigurd Skogestad and Ian Postlethwaite. Multivariable Feedback Control: Analysis andDesign. John Wiley & Sons, 2005.

[vL78] Charles F. van Loan. Computing integrals involving the matrix exponential. IEEE Trans-actions on Automatic Control, 23:395–404, 1978.

68

Page 71: jwvanwingerden/documents/LQGInternalReport.pdfChapter 1 Introduction ThisinternalreportisasupportingdocumentforapapertobepublishedintheIEEETransactions …

[WAG14] Niklas Wahlström, Patrix Axelsson, and Fredrik Gustafsson. Discretizing stochastic dy-namical systems using Lyapunov equations. In Proceedings of the 19th IFAC World Congress,2014.

[WG02] Chang-Hee Won and Kodikara Thanuja Gunaratne. Performance study of LQG, MCV,and risk-sensitive control methods for satellite structure control. In Proceedings of the Amer-ican Control Conference, Anchorage, Alaska, USA, 2002.

[WSM08] Chang-Hee Won, Cheryl B. Schrader, and Anthony N. Michel. Advances in Statisti-cal Control, Algebraic Systems Theory, and Dynamic Systems Characteristics. BirkhäuserBoston, 2008.

69