Automatic Control of State Space Systems - Pantelis Sopasakis


  • 8/3/2019 Automatic Control of State Space Systems - Pantelis Sopasakis


    The Realm of Linear State Space Control Systems

    Draft Version

    Pantelis Sopasakis

Dipl. Chem. Eng., MSc Appl. Math.,

    National Technical University of Athens

    November 5, 2011


    Preface

Usually, the starting point for the study of Automatic Control is the Laplace space of the complex variable $s \in \mathbb{C}$, sometimes referred to as the Complex Frequency Domain, or the Z-transform for digital systems. Using the Laplace transform, the differential equations that accrue from the various principles of mechanics or chemistry become algebraic equations which can be easily manipulated. However, certain assumptions render this approach rather restrictive. Firstly, the Laplace transform cannot be applied to arbitrary functions, and in some cases it is cumbersome to invert. Moreover, the requirement for zero initial conditions yields a rather stiff theory.

The State Space approach copes directly with the differential equations of the system and exhibits the following advantages:

1. Single Input Single Output (SISO) and Multiple Input Multiple Output (MIMO) systems are treated using the same framework.

2. New concepts such as Controllability and Observability are defined and studied methodologically. The state space representation is more appropriate for the theoretical study of the underlying system.

3. It offers better insight into the system's structure, as its states are explicitly studied, in contrast to a single-input single-output black box.

4. Analytical solutions are often available. In general, numerical solution algorithms (e.g. the Runge-Kutta method) can be applied.

    5. The state-space formulation is the basis for the study of nonlinear systems.

The results outlined in these notes apply mainly to linear time-invariant systems. The author has endeavoured to provide a concise and rigorous, yet understandable, presentation of some of the most interesting results in linear dynamical systems theory.

Finally, the State Space representation of dynamical systems is the basis for the modern theory of Model Predictive Control and the ubiquitous Lyapunov Theory of Stability.

Please cite these notes as follows: Pantelis Sopasakis, The Realm of Linear State Space Control Systems, 2011. Available online at http://users.ntua.gr/chvng/en (Accessed on: November 5, 2011).


"Nobody knows why, but the only theories which work are the mathematical ones."

(Michael Holt, in Mathematics in Art.)


    Contents

1 Introduction 4

2 Classes of State Space Models 6
2.1 Linear Time Invariant Models 6
2.2 Linear Time Variant Models 7
2.3 Input Affine Nonlinear Models 7
2.4 Bilinear Models 7

3 Coordinates Transformations 8

4 Realizations of LTI systems 12
4.1 Equivalence of LTI systems 12
4.2 Diagonal Realization 15
4.2.1 Diagonalizability Criteria 16
4.2.2 Only real eigenvalues 18
4.2.3 Complex Diagonalization 19
4.2.4 Real Diagonalization with complex eigenvalues 20
4.2.5 Why is diagonalization useful? 22
4.3 Jordan Canonical Form 24
4.3.1 The Jordan Decomposition 24
4.3.2 Utility of the Jordan form 27
4.4 Canonical Controllable Form 29
4.5 Canonical Observable Form 32

5 Trajectories of LTI systems 33
5.1 General Properties of Solutions 33
5.2 The state transition matrix 40
5.3 Responses of LTI systems 41
5.3.1 The Impulse Response 41
5.3.2 The Step Response 42

6 Controllability 44
6.1 Controllability of LTI systems 44
6.2 Controllability and Equivalence 48
6.3 Advanced topics 48
6.3.1 Controllability under feedback 56
6.3.2 Connection to the Diagonal Realization 57
6.3.3 Controllability Gramian 57


6.4 Controllability and Cyclicity 60
6.4.1 Introduction to cyclicity 60
6.4.2 From cyclicity to controllability 65

7 Stability 68
7.1 Definitions 68
7.2 Some preliminary results 71
7.3 Pole Placement 73
7.3.1 Pole Placement for single input systems 73
7.3.2 Pole Placement for systems with multiple inputs 79
7.4 Stabilizability 81
7.5 Lyapunov Theory 83
7.5.1 Direct Lyapunov Theory 83


    Chapter 1

    Introduction

The Laplace transform and the use of transfer functions offer great flexibility regarding the study of a system's dynamics and stability. However, the requirement for the initial state to be the origin proves to be an important drawback. The state space approach provides a more generic framework for the study of dynamical systems. Although in what follows we will stick to the study of linear systems, state space models can be used for the representation and study of nonlinear systems and systems that are not Laplace-transformable. The system dynamics evolve through the state variables:

$$x(t) = [x_1(t)\ x_2(t)\ \cdots\ x_n(t)]^\top \in \mathbb{R}^n$$

which come to the outer world of the system as output variables:

$$y(t) = [y_1(t)\ y_2(t)\ \cdots\ y_p(t)]^\top = h(x(t), u(t)) \in \mathbb{R}^p$$

where $h : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^p$ is a function that defines the state-to-output behaviour of the system. Finally, let $u : \mathbb{R} \ni t \mapsto u(t) \in \mathbb{R}^m$ be the input variable to the system, which we assume to be a controlled variable. In what follows we will not consider any disturbances.

A state space model consists of two main parts which describe the input-to-state dynamics and the state-to-output relations of the underlying system. In particular these are:

1. The state equations: They describe the dynamic relationship between the input variable $u$ and the state vector $x$. Usually these are ordinary differential equations of the form $\dot{x}(t) = f(t, x(t), u(t))$ (time-variant system) or $\dot{x}(t) = f(x(t), u(t))$ (time-invariant system). In these notes we will study only time-invariant systems where the vector field $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$ is a linear function of the form $f(x, u) = Ax + Bu$, where $A$ and $B$ are matrices of proper dimensions.

2. The output equations: They describe the state-to-output behaviour of the system through relations of the form $y(t) = h(x(t), u(t))$.

Overall, a state space dynamical system admits the following general realization:

$$\dot{x}(t) = f(x(t), u(t)) \quad (1.1a)$$

$$y(t) = h(x(t), u(t)) \quad (1.1b)$$


Hereinafter we shall use the notation $\Sigma[x, u, y]$ to refer to a system with state variable $x$, input variable $u$ and output variable $y$. The notion of an equilibrium point applies to state space models as follows:

Definition 1 (Equilibrium point). A point $(\bar{x}, \bar{u}) \in \mathbb{R}^n \times \mathbb{R}^m$ is called an equilibrium point for (1.1a) if $f(\bar{x}, \bar{u}) = 0$.


    Chapter 2

Classes of State Space Models

First we consider the most general form of state space systems, which is nonlinear and time-variant. We shall refer to this class of systems as G. These have the following structure:

$$\dot{x}(t) = f(t, x(t), u(t)) \quad (2.1a)$$

$$y(t) = h(t, x(t), u(t)) \quad (2.1b)$$

In what follows we use the symbols $n$, $m$ and $p$ respectively to refer to the dimensions of the state, the input and the output vectors; usually $m \le n$ and $p \le n$. Little can be inferred from this general representation consisting of (2.1a) and (2.1b) without any further assumptions. In most cases, state space models have a more specific structure which allows us to derive particular results. Dominantly, we have the following classes of systems according to the form of the vector fields $f$ and $h$.

    2.1 Linear Time Invariant Models

    These are models in which the vector field f has the special form:

$$f(t, x, u) = Ax(t) + Bu(t) \quad (2.2)$$

where $A \in M_n(\mathbb{R})$ is a square matrix and $B \in M_{n \times m}(\mathbb{R})$. The state-to-output relation is described by:

$$h(t, x, u) = Cx(t) + Du(t) \quad (2.3)$$

where $C \in M_{p \times n}(\mathbb{R})$ and $D \in M_{p \times m}(\mathbb{R})$. It is quite common that $D = 0$, in which case Linear Time-Invariant (LTI) models appear even simpler. An LTI system is represented as:

$$\Sigma_{\mathrm{LTI}} = (A, B, C, D) \quad (2.4)$$


Here is an example of an LTI system with $n = 2$ and $m = p = 1$:

$$\dot{x}(t) = \begin{bmatrix} 2 & 3 \\ 1 & 1 \end{bmatrix} x(t) + \begin{bmatrix} 1 \\ 0 \end{bmatrix} u(t) \quad (2.5)$$

$$y(t) = \begin{bmatrix} 1 & 0 \end{bmatrix} x(t) = x_1(t) \quad (2.6)$$

For LTI systems, according to Definition 1, the pair $(\bar{x}, \bar{u})$ is an equilibrium point if $A\bar{x} + B\bar{u} = 0$. If we choose $\bar{x} = 0$ then the requirement becomes $B\bar{u} = 0$, or equivalently $\bar{u} \in \ker B$.
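The equilibrium condition $A\bar{x} + B\bar{u} = 0$ can be illustrated numerically. The sketch below uses the matrices of (2.5) as read from the source (the signs there are hard to make out, so treat them as illustrative): for a chosen input it solves for the corresponding equilibrium state.

```python
import numpy as np

# Illustrative matrices (a reading of (2.5); treat as an example, not gospel).
A = np.array([[2.0, 3.0],
              [1.0, 1.0]])
B = np.array([[1.0],
              [0.0]])

# (x, u) is an equilibrium point iff A x + B u = 0.
# For a fixed input u, solve A x = -B u for the equilibrium state x.
u_eq = np.array([4.0])
x_eq = np.linalg.solve(A, -B @ u_eq)

# Verify the equilibrium condition.
residual = A @ x_eq + B @ u_eq
print(np.allclose(residual, 0.0))  # True
```

Since this $A$ is invertible, every input value admits exactly one equilibrium state; when $A$ is singular the equation $A\bar{x} = -B\bar{u}$ may have no solution or infinitely many.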

    2.2 Linear Time Variant Models

Linear Time-Variant (LTV) models form a more general class than the above described LTI systems, in which the coefficients $A$, $B$, $C$, $D$ can vary with time. LTV models are represented as follows:

$$\dot{x}(t) = f(t, x(t), u(t)) = A(t)x(t) + B(t)u(t) \quad (2.7)$$

$$y(t) = h(t, x(t), u(t)) = C(t)x(t) + D(t)u(t) \quad (2.8)$$

Fruitful results arise under the assumption that these matrices are periodic functions of $t$. LTV systems will be referred to hereinafter as $\Sigma_{\mathrm{LTV}}$.

    2.3 Input Affine Nonlinear Models

In this case the function $f$ has the following form:

$$f(t, x, u) = \varphi(x) + \sum_{i=1}^{m} g_i(x)\, u_i(t) = \varphi(x) + g(x)\, u \quad (2.9)$$

where $\varphi : \mathbb{R}^n \to \mathbb{R}^n$ and $g_i : \mathbb{R}^n \to \mathbb{R}^n$ are smooth functions. These models describe a wide range of physical systems; however, in the most general case and without any further assumptions the derived results are only local, holding in a neighbourhood of an equilibrium point.

    2.4 Bilinear Models

Bilinear models form a subclass of input affine models particularly useful for the modelling of chemical processes. Additionally, they possess a simpler structure than that of input affine models, therefore one can derive richer results. Bilinear models have the form:

$$\dot{x}(t) = f(t, x(t), u(t)) = Ax(t) + Bu(t) + x(t)\,S\,u(t) \quad (2.10)$$

$$y(t) = h(t, x(t), u(t)) = Cx(t) + Du(t) \quad (2.11)$$

where $S \in M_{n \times m}(\mathbb{R})$.


    Chapter 3

Coordinates Transformations

A coordinate transformation is a technique used to simplify the structure of a given model and can be applied to any of the categories of chapter 2. A coordinate transformation, or change of coordinates, is defined as follows:

Definition 2 (Coordinates transformation). A function $\Phi : S \to S$, where $S \subseteq \mathbb{R}^n$ is an open set, is called a coordinates transformation on $S$ (or a local diffeomorphism on $S$) if the following hold:

1. $\Phi$ is surjective, i.e. for all $y \in S$ there is an $s \in S$ such that $\Phi(s) = y$.

2. $\Phi$ is invertible, i.e. there is a function $\Phi^{-1}$ such that $\Phi^{-1}(\Phi(x)) = x$ for all $x \in S$.

3. The functions $\Phi$ and $\Phi^{-1}$ are infinitely many times differentiable.

Additionally, if $S = \mathbb{R}^n$, $\Phi$ is said to be a global change of coordinates.

Given a system $\Sigma[x, u]$ with state variable $x$ and input variable $u$ and with state equation:

$$\dot{x}(t) = f(x, u) \quad (3.1)$$

and a change of coordinates

$$z = \Phi(x) \quad (3.2)$$

we have

$$\dot{z}(t) = \frac{d}{dt}\Phi(x(t)) = \frac{d\Phi}{dx}(x(t))\,\dot{x}(t) \quad (3.3)$$

$$= \frac{d\Phi}{dx}(x(t))\,f(x, u) \quad (3.4)$$

But since $\Phi$ is invertible (see Definition 2), from (3.2) we have that $x = \Phi^{-1}(z)$. Hence:

$$\dot{z}(t) = \frac{d\Phi}{dx}(\Phi^{-1}(z))\,f(\Phi^{-1}(z), u) \equiv \tilde{f}(z, u) \quad (3.5)$$

    8

  • 8/3/2019 Automatic Control of State Space Systems - Pantelis Sopasakis

    11/93

The output of the transformed system will be:

$$y(t) = h(x, u) = h(\Phi^{-1}(z), u) \quad (3.6)$$

Example 1. Consider an LTI system $\Sigma_{\mathrm{LTI}} = (A, B, C, D)$ and a linear coordinates transformation $z = \Phi(x) = Tx$, where $T$ is a nonsingular (i.e. invertible) matrix. [It is easy for the reader to verify that $\Phi$ is indeed a coordinates transformation.] Since $T$ is invertible, one has that:

$$x = T^{-1}z \quad (3.7)$$

Substitution into the differential equation of the system yields:

$$\dot{x} = Ax + Bu \quad (3.8)$$

$$\Rightarrow T^{-1}\dot{z} = AT^{-1}z + Bu \quad (3.9)$$

$$\Rightarrow \dot{z} = TAT^{-1}z + TBu \quad (3.10)$$

In the same way, the state-to-output relation in the new coordinates will be:

$$y = Cx + Du \quad (3.11)$$

$$= CT^{-1}z + Du \quad (3.12)$$

So the transformed system is again an LTI system $\tilde{\Sigma}_{\mathrm{LTI}}[z, u, y]$ where:

$$\tilde{\Sigma}_{\mathrm{LTI}}[z, u, y] = (TAT^{-1}, TB, CT^{-1}, D) \quad (3.13)$$
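The algebra of Example 1 can be sketched numerically: if $z = Tx$ then $\dot{z} = T\dot{x}$, and this must agree with the dynamics of the transformed pair $(TAT^{-1}, TB)$ at every state and input. The matrices below are random placeholders, not taken from the notes.

```python
import numpy as np

# Sketch of (3.8)-(3.10): z = Tx implies z' = T A T^{-1} z + T B u.
rng = np.random.default_rng(1)
n, m = 3, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
T = np.eye(n) + 0.1 * rng.standard_normal((n, n))  # invertible perturbation of I

At, Bt = T @ A @ np.linalg.inv(T), T @ B  # transformed pair, as in (3.13)

x = rng.standard_normal(n)  # arbitrary state
u = rng.standard_normal(m)  # arbitrary input
z = T @ x

xdot = A @ x + B @ u        # dynamics in x-coordinates
zdot = At @ z + Bt @ u      # dynamics in z-coordinates

print(np.allclose(zdot, T @ xdot))  # True: the two descriptions agree
```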

We leave it to the reader as an exercise to verify that the mapping:

$$z = \Phi(x) = \|x\|^2 x \quad (3.14)$$

is a coordinates transformation over $\mathbb{R}^n$. Show that:

$$\frac{d\Phi}{dx}(x) = \|x\|^2 I + 2xx^\top \quad (3.15)$$

Show that the mapping is invertible. Calculate the inverse mapping $\Phi^{-1}(z)$; for that purpose note that, taking norms on both sides of (3.14), we have that $\|z\| = \|x\|^3$. Finally, apply (3.14) to an LTI system using (3.5) and (3.6).

Definition 3. Two square matrices $A$ and $\tilde{A}$ are called similar if there is an invertible matrix $T$ such that $\tilde{A} = TAT^{-1}$.

Proposition 1. Let $A, \tilde{A} \in M_n(\mathbb{R})$ be two similar matrices. Then they have the same eigenvalues.

Proof. Since $A$ and $\tilde{A}$ are similar there is a $T \in GL_n(\mathbb{R})$ so that $\tilde{A} = TAT^{-1}$. Let $\lambda \in \mathbb{C}$ be an eigenvalue of $A$ and $v$ the corresponding eigenvector. Then

$$Av = \lambda v$$


Left-multiplying both sides by $T$ yields:

$$TAv = \lambda Tv$$

Since $T$ is invertible, set $u = Tv$, so that $v = T^{-1}u$. Therefore:

$$TAT^{-1}u = \lambda u \iff \tilde{A}u = \lambda u$$

which proves that $\lambda$ is an eigenvalue of $\tilde{A} = TAT^{-1}$ with eigenvector $u = Tv$.

Proposition 2. Let $A$ and $B$ be two similar matrices with $B = TAT^{-1}$, $T \in GL(n, \mathbb{R})$. If $v$ is an eigenvector of $A$, then $Tv$ is an eigenvector of $B$.

Proof. Let $v$ be an eigenvector of $A$ and $\lambda$ the corresponding eigenvalue. Then it holds that:

$$Av = \lambda v \quad (3.16)$$

$$\iff T^{-1}BTv = \lambda v \quad (3.17)$$

$$\iff B(Tv) = \lambda(Tv) \quad (3.18)$$

which shows that $Tv$ is an eigenvector of $B$.
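Propositions 1 and 2 can be checked numerically; the matrices $A$ and $T$ below are arbitrary illustrative choices, not taken from the notes.

```python
import numpy as np

# B = T A T^{-1} has the same eigenvalues as A, and if v is an eigenvector
# of A then Tv is an eigenvector of B.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])   # eigenvalues -1 and -2
T = np.array([[1.0, 1.0],
              [0.0, 1.0]])     # invertible (det = 1)
Bsim = T @ A @ np.linalg.inv(T)

eig_A, vecs_A = np.linalg.eig(A)
eig_B = np.linalg.eigvals(Bsim)

# Same spectrum (Proposition 1):
print(np.allclose(np.sort(eig_A.real), np.sort(eig_B.real)))  # True

# Tv is an eigenvector of B for each eigenpair of A (Proposition 2):
for lam, v in zip(eig_A, vecs_A.T):
    print(np.allclose(Bsim @ (T @ v), lam * (T @ v)))  # True
```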

Changes of coordinates are also used to establish the notion of equivalence between dynamical systems. Intuitively speaking, two systems are said to be equivalent if they exhibit qualitatively similar dynamic behaviour.

Definition 4 (Equivalent systems). Two systems $\Sigma_1[x, u]$ and $\Sigma_2[z, u]$ are said to be equivalent if there is a change of coordinates $\Phi$ that transforms $\Sigma_1$ into $\Sigma_2$. In that case we use the notation $\Sigma_1 \sim \Sigma_2$. In particular, when we need to emphasize that two systems are equivalent using a change of coordinates $\Phi$, we write $\Sigma_1 \sim_\Phi \Sigma_2$.

As expected, the binary relation $\sim$ is transitive, that is, for three systems $\Sigma_1$, $\Sigma_2$ and $\Sigma_3$, if $\Sigma_1 \sim \Sigma_2$ and $\Sigma_2 \sim \Sigma_3$, then $\Sigma_1 \sim \Sigma_3$. It is also obvious that this relation is symmetric, meaning that if $\Sigma_1 \sim \Sigma_2$ then $\Sigma_2 \sim \Sigma_1$ as well. Finally, for every system $\Sigma$ it holds that $\Sigma \sim \Sigma$. As a result, $\sim$ is an equivalence relation.

Whenever we need to tell whether two systems $\Sigma_1$ and $\Sigma_2$ are equivalent, according to the definition we have to find a coordinates transformation $\Phi$ so that $\Sigma_1 \sim_\Phi \Sigma_2$. It is, however, often easier to determine a system $\Sigma$ so that $\Sigma_1 \sim \Sigma$ and $\Sigma_2 \sim \Sigma$. In the next chapter we will describe this procedure in detail.

Example 2. The following LTI system is given:

$$\dot{x}(t) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{bmatrix} x(t) + \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} u(t) \quad (3.19a)$$

$$y(t) = x(t) \quad (3.19b)$$

And the change of coordinates $z = Tx$ where:

$$T = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & \sqrt{2} \end{bmatrix} \quad (3.20)$$


The matrix $T$ is invertible (as it can easily be seen that $|T| = \sqrt{2} \neq 0$), so $z = Tx$ establishes a linear coordinates transformation. The inverse matrix of $T$ is:

$$T^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & \frac{\sqrt{2}}{2} \\ 0 & 0 & \frac{\sqrt{2}}{2} \end{bmatrix} \quad (3.21)$$

This change of coordinates yields the following LTI system:

$$\dot{z}(t) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} z(t) + \begin{bmatrix} 1 \\ -1 \\ 3\sqrt{2} \end{bmatrix} u(t) \quad (3.22a)$$

$$y(t) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & \frac{\sqrt{2}}{2} \\ 0 & 0 & \frac{\sqrt{2}}{2} \end{bmatrix} z(t) \quad (3.22b)$$

Notice that in the transformed system, the matrix that multiplies the state vector in (3.22a) is diagonal, which simplifies the structure of the system, and any conclusion will be derived much more easily.


    Chapter 4

    Realizations of LTI systems

In this chapter we will introduce certain linear coordinates transformations that allow us to simplify the structure of LTI systems. Equivalence between LTI systems will be studied in more detail and various criteria for equivalence will be formulated.

    4.1 Equivalence of LTI systems

Consider the following state space model:

$$\dot{x}(t) = Ax(t) + Bu(t) \quad (4.1a)$$

$$y(t) = Cx(t) + Du(t) \quad (4.1b)$$

and let us assume that $x(0) = 0$. Let us use the notation $U(s) \equiv \mathcal{L}\{u(t)\}(s)$ and $X(s) \equiv \mathcal{L}\{x(t)\}(s)$ for the Laplace transforms of $u(t)$ and $x(t)$ respectively. Applying the Laplace transform to (4.1a) and (4.1b) we get:

$$(4.1a) \Rightarrow sX(s) = AX(s) + BU(s) \quad (4.2)$$

$$\iff (sI - A)X(s) = BU(s) \quad (4.3)$$

$$\iff X(s) = (sI - A)^{-1}BU(s) \quad (4.4)$$

and for the second equation we have:

$$(4.1b) \Rightarrow Y(s) = CX(s) + DU(s) \quad (4.5)$$

$$\iff Y(s) = \left[C(sI - A)^{-1}B + D\right]U(s) \quad (4.6)$$

So the way the input affects the output is reflected in the following transfer function:

$$H(s) = C(sI - A)^{-1}B + D \quad (4.7)$$

which is defined for all $s \in \mathbb{C}$ except for those for which the matrix $sI - A$ is singular (i.e. its determinant is zero: $|sI - A| = 0$), that is, for the eigenvalues of the matrix $A$. Before proceeding to the statement of a very important proposition we give the following result:


Proposition 3. Let $\Sigma_1$ and $\Sigma_2$ be two LTI systems with $\Sigma_1 \sim \Sigma_2$. Then there is a nonsingular matrix $T \in M_n(\mathbb{R})$ so that the change of coordinates $z = \Phi(x) = Tx$ is such that $\Sigma_1 \sim_\Phi \Sigma_2$.

Proof. The proof is left to the reader as an exercise. Hint: Use (3.6).

The meaning of Proposition 3 is that if two linear time-invariant systems are equivalent, then they are equivalent by means of a linear transformation. It is quite easy to guess intuitively that this holds, but it is good to verify it rigorously.

A very interesting result is stated in the following proposition:

Proposition 4. Let $\Sigma_1 = (A, B, C, D)$ be an LTI system with transfer function $H_1(s)$, and let $\Sigma_2$ be an LTI system with $\Sigma_1 \sim \Sigma_2$ and transfer function $H_2(s)$. Then $H_1(s) = H_2(s)$.

Proof. Since $\Sigma_1$ is equivalent to $\Sigma_2 = (\tilde{A}, \tilde{B}, \tilde{C}, \tilde{D})$, according to Proposition 3 there is a change of coordinates $z = F(x)$ so that $\Sigma_2 \sim_F \Sigma_1$, and since $\Sigma_1, \Sigma_2$ are LTI, $F(x)$ is a linear function, that is, there is an invertible matrix $T$ such that $F(x) = Tx$. The transfer function of $\Sigma_2$ will then be:

$$H_2(s) = \tilde{C}(sI - \tilde{A})^{-1}\tilde{B} + \tilde{D} \quad (4.8)$$

$$= CT^{-1}(sI - TAT^{-1})^{-1}TB + D \quad (4.9)$$

$$= C(sI - T^{-1}TAT^{-1}T)^{-1}B + D \quad (4.10)$$

$$= C(sI - A)^{-1}B + D \quad (4.11)$$

$$= H_1(s) \quad (4.12)$$

Note that in the above algebraic manipulations we used the fact that for any invertible matrices $A$ and $B$, $AB$ is invertible and $(AB)^{-1} = B^{-1}A^{-1}$. Equation (4.12) holds for all $s \in \mathbb{C}$ except for those that are eigenvalues of $A$. This completes the proof.

Corollary 5. Two LTI systems are equivalent if and only if they have the same impulse response.

The following proposition allows us to compare any two LTI systems:

Proposition 6. Let $\Sigma_1 = (A, B, C, D)$ and $\Sigma_2 = (\tilde{A}, \tilde{B}, \tilde{C}, \tilde{D})$ be two LTI systems, where $A, \tilde{A} \in M_n(\mathbb{R})$, $B, \tilde{B} \in M_{n \times m}(\mathbb{R})$, $C, \tilde{C} \in M_{p \times n}(\mathbb{R})$ and $D, \tilde{D} \in M_{p \times m}(\mathbb{R})$. The following are equivalent:

1. $\Sigma_1 \sim \Sigma_2$

2. $D = \tilde{D}$ and for $k = 0, 1, 2, \ldots, n-1$ it holds that $\tilde{C}\tilde{A}^k\tilde{B} = CA^kB$.

Proof. $1 \Rightarrow 2$. If the LTI systems $\Sigma_1$ and $\Sigma_2$ are equivalent, according to Proposition 3 there is an invertible matrix $T \in M_n(\mathbb{R})$ so that $\Sigma_2 \sim_T \Sigma_1$. Then $D = \tilde{D}$ and:

$$\tilde{C}\tilde{A}^k\tilde{B} = CT^{-1}(TAT^{-1})^kTB \quad (4.13)$$

$$= CT^{-1}TA^kT^{-1}TB \quad (4.14)$$

$$= CA^kB \quad (4.15)$$


and this holds for all $k = 0, 1, \ldots, n-1$.

$2 \Rightarrow 1$. In order to show that the two systems are equivalent, according to Corollary 5 it suffices to show that they have the same impulse response. Without loss of generality we may assume that $D = \tilde{D} = 0$. The impulse response of $\Sigma_1$ is:

$$y_1(t) = \mathcal{L}^{-1}\{H_1(s)\} = Ce^{At}B \quad (4.16)$$

Applying the Taylor expansion formula to (4.16) around $t = 0$ we get:

$$y_1(t) = CB + CABt + CA^2B\frac{t^2}{2!} + \cdots + CA^kB\frac{t^k}{k!} + \cdots \quad (4.17)$$

And the impulse response of $\Sigma_2$ will be:

$$y_2(t) = \tilde{C}\tilde{B} + \tilde{C}\tilde{A}\tilde{B}\,t + \tilde{C}\tilde{A}^2\tilde{B}\frac{t^2}{2!} + \cdots + \tilde{C}\tilde{A}^k\tilde{B}\frac{t^k}{k!} + \cdots \quad (4.18)$$

According to the Cayley-Hamilton theorem from Linear Algebra, if $\tilde{C}\tilde{A}^k\tilde{B} = CA^kB$ holds for $k = 0, 1, 2, \ldots, n-1$ then it also holds for all $k \in \mathbb{N}$, and consequently $y_1 = y_2$, which completes the proof.

Note: The matrices $M_k = CA^kB \in M_{p \times m}(\mathbb{R})$ for $k = 0, 1, \ldots, n-1$ are known as the Markov parameters of the LTI system.

Example 3. We will show that the following systems are equivalent. System $\Sigma_1$ is given by:

$$\dot{x}(t) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{bmatrix} x(t) + \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} u(t) \quad (4.19)$$

$$y(t) = x(t) \quad (4.20)$$

and $\Sigma_2$ is given by:

$$\dot{z}(t) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} z(t) + \begin{bmatrix} 1 \\ -1 \\ 3\sqrt{2} \end{bmatrix} u(t) \quad (4.21)$$

$$y(t) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & \frac{\sqrt{2}}{2} \\ 0 & 0 & \frac{\sqrt{2}}{2} \end{bmatrix} z(t) \quad (4.22)$$

Here, we have $n = 3$. We calculate the Markov parameters of the two systems:

$$CB = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \quad \text{and} \quad \tilde{C}\tilde{B} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & \frac{\sqrt{2}}{2} \\ 0 & 0 & \frac{\sqrt{2}}{2} \end{bmatrix} \begin{bmatrix} 1 \\ -1 \\ 3\sqrt{2} \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = CB \quad (4.23)$$

In the same way we have:

$$CAB = \begin{bmatrix} 1 \\ 5 \\ 6 \end{bmatrix} \quad (4.24)$$


and

$$\tilde{C}\tilde{A}\tilde{B} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & \frac{\sqrt{2}}{2} \\ 0 & 0 & \frac{\sqrt{2}}{2} \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \\ 3\sqrt{2} \end{bmatrix} \quad (4.25)$$

$$= \begin{bmatrix} 1 \\ 5 \\ 6 \end{bmatrix} = CAB \quad (4.26)$$

And finally:

$$CA^2B = \begin{bmatrix} 1 \\ 11 \\ 12 \end{bmatrix} \quad (4.28)$$

while

$$\tilde{C}\tilde{A}^2\tilde{B} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & \frac{\sqrt{2}}{2} \\ 0 & 0 & \frac{\sqrt{2}}{2} \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}^2 \begin{bmatrix} 1 \\ -1 \\ 3\sqrt{2} \end{bmatrix} \quad (4.29)$$

$$= \begin{bmatrix} 1 \\ 11 \\ 12 \end{bmatrix} = CA^2B \quad (4.30)$$

Let us note here that if $\Lambda = \mathrm{diag}\{d_j\}_{j \in J}$ is a diagonal matrix, then its $k$th power is given by $\Lambda^k = \mathrm{diag}\{d_j^k\}_{j \in J}$.
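The Markov-parameter computation of Example 3 can be replicated numerically. The entries (including the minus sign and the $\sqrt{2}$) follow a best-effort reconstruction of the source matrices:

```python
import numpy as np

# Sigma_1 = (A1, B1, C1) and Sigma_2 = (A2, B2, C2), as in Example 3.
A1 = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 1.0],
               [0.0, 0.0, 2.0]])
B1 = np.array([[1.0], [2.0], [3.0]])
C1 = np.eye(3)

s2 = np.sqrt(2.0)
A2 = np.diag([1.0, 1.0, 2.0])
B2 = np.array([[1.0], [-1.0], [3.0 * s2]])
C2 = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, s2 / 2.0],
               [0.0, 0.0, s2 / 2.0]])

# The Markov parameters C A^k B agree for k = 0, ..., n-1 (here n = 3).
for k in range(3):
    M1 = C1 @ np.linalg.matrix_power(A1, k) @ B1
    M2 = C2 @ np.linalg.matrix_power(A2, k) @ B2
    print(k, np.allclose(M1, M2))  # True for every k
```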

In the sequel we introduce certain linear changes of coordinates so that a linear system acquires a convenient form which allows its further study.

    4.2 Diagonal Realization

    In example 3 the equivalent system the matrix A had a diagonal form. This isin general feasible when A has n linearly independent eigenvectors as stated intheorem 7.

Theorem 7 (Spectral Decomposition). Assume that $A \in M_n(\mathbb{R})$ has $n$ linearly independent eigenvectors. Then there are matrices $V \in M_n(\mathbb{C})$ and $\Lambda \in M_n(\mathbb{C})$ so that $A$ is decomposed into:

$$A = V\Lambda V^{-1} \quad (4.32)$$

In particular,

$$\Lambda = \begin{bmatrix} \lambda_1(A) & & & \\ & \lambda_2(A) & & \\ & & \ddots & \\ & & & \lambda_n(A) \end{bmatrix} \quad (4.33)$$

is a diagonal matrix with the eigenvalues of $A$, and $V = [v_1 \ \cdots \ v_n]$ is the matrix of eigenvectors of $A$, where $v_i$ is the eigenvector that corresponds to the eigenvalue $\lambda_i(A)$. Then $A \sim_V \Lambda$ and $A$ is said to be diagonalizable.


Proof. Assume that there is some $V \in GL(n, \mathbb{R})$ so that $V^{-1}AV = \Lambda = \mathrm{diag}\{\lambda_i\}_{i=1}^n$ is diagonal (i.e. $A$ admits a diagonal realization). $A$ and $\Lambda$ are similar, hence they have the same eigenvalues, namely $\{\lambda_i\}_{i=1}^n$. The corresponding eigenvectors of $\Lambda$ are $e_1 = (1, 0, \ldots, 0)$, $e_2 = (0, 1, \ldots, 0)$, ..., $e_n = (0, 0, \ldots, 1)$. Then the eigenvectors of $A$ are the vectors $Ve_1, Ve_2, \ldots, Ve_n$ (see Proposition 2), which are again linearly independent.

Assume now that $A$ has $n$ linearly independent eigenvectors, namely $\{t_i\}_{i=1}^n$, and define:

$$V \equiv \begin{bmatrix} t_1 & t_2 & \ldots & t_n \end{bmatrix} \quad (4.34)$$

Let $\{\lambda_i\}_{i=1}^n$ be the eigenvalues of $A$. Then one has that:

$$V \mathrm{diag}\{\lambda_i\}_{i=1}^n = \begin{bmatrix} t_1 & t_2 & \ldots & t_n \end{bmatrix} \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix} \quad (4.35)$$

$$= \begin{bmatrix} \lambda_1 t_1 & \lambda_2 t_2 & \ldots & \lambda_n t_n \end{bmatrix} \quad (4.36)$$

$$= \begin{bmatrix} At_1 & At_2 & \ldots & At_n \end{bmatrix} \quad (4.37)$$

$$= AV \quad (4.38)$$

Therefore,

$$\mathrm{diag}\{\lambda_i\}_{i=1}^n = V^{-1}AV \quad (4.39)$$

which completes the proof.

The diagonalization of an LTI system is carried out using the linear transformation $z = \Phi(x) = V^{-1}x$. Then, the matrix $\tilde{A} = V^{-1}AV$ is diagonal (recall that $\tilde{A} = TAT^{-1}$ and here $T = V^{-1}$). Of course there are systems that do not admit a diagonal form (in that case the Jordan Canonical Form is employed). Diagonalization, when possible, provides complete input-to-state decoupling, i.e. it becomes clear which inputs affect which states and how much. The dynamics become much simpler and it becomes easy to solve the system dynamics analytically, from which the state trajectory accrues. In what follows we assume that $A$ is diagonalizable.
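The procedure can be sketched with numpy, using the system matrix of Example 2 as an illustration: the columns of the matrix returned by `np.linalg.eig` play the role of $V$ in Theorem 7, and $T = V^{-1}$.

```python
import numpy as np

A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 2.0]])
B = np.array([[1.0], [2.0], [3.0]])

eigvals, V = np.linalg.eig(A)   # columns of V are eigenvectors of A
Vinv = np.linalg.inv(V)

A_diag = Vinv @ A @ V           # diagonal matrix of eigenvalues
B_new = Vinv @ B                # input matrix in the new coordinates z = V^{-1} x

print(np.allclose(A_diag, np.diag(eigvals)))  # True
```

Note that the eigenvector scaling chosen by `np.linalg.eig` differs from the hand-picked $T$ of Example 2, so `B_new` will differ from the $\tilde{B}$ there by a per-row scaling; the diagonal $\tilde{A}$ is the same.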

    4.2.1 Diagonalizability Criteria

We know that a matrix $A \in M_n(\mathbb{R})$ is diagonalizable if and only if it has $n$ linearly independent eigenvectors, i.e. when its eigenvectors span the whole $\mathbb{R}^n$. This implies that one needs to calculate all eigenvectors of $A$ to check whether it is diagonalizable, which is not an easy task from the computational point of view. Here we give an alternative criterion which can be checked more easily. We give the following necessary definitions from Linear Algebra:

Definition 5 (Characteristic Polynomial). Let $A \in M_n(K)$ where $K$ is $\mathbb{R}$ or $\mathbb{C}$. The polynomial:

$$\chi_A(\lambda) = \det(A - \lambda I) \quad (4.40)$$

is called the characteristic polynomial of $A$ and is a polynomial over $\mathbb{C}$.


Definition 6 (Eigenvalues, Algebraic Multiplicity). If $\lambda$ is a root of the polynomial $\chi_A$ of multiplicity $\kappa$, it is called an eigenvalue of $A$ and $\kappa$ is its algebraic multiplicity. An eigenvalue is called simple if its algebraic multiplicity is 1.

Hereinafter we shall denote the algebraic multiplicity of an eigenvalue $\lambda$ by $\mu(\lambda)$.

Definition 7 (Eigenspace, Geometric Multiplicity). Let $\lambda$ be an eigenvalue of $A$. The vector space:

$$V(\lambda) = \ker(A - \lambda I) = \{x \in \mathbb{R}^n \mid (A - \lambda I)x = 0\} \quad (4.41)$$

is called the eigenspace of $\lambda$, and its dimension is the geometric multiplicity $\gamma(\lambda)$ of the eigenvalue $\lambda$.

Note: The geometric multiplicity of an eigenvalue can be calculated using the fact that for any matrix $X$ it holds that $\dim \ker X = n - \mathrm{rank}\, X$ (from the rank-nullity theorem, see [Rom00, pp. 57-58]). Therefore $\gamma(\lambda) = \dim \ker(A - \lambda I) = n - \mathrm{rank}(A - \lambda I)$.

Proposition 8. For every eigenvalue $\lambda$ of a matrix $A$ it holds that

$$\gamma(\lambda) \le \mu(\lambda) \quad (4.42)$$

Definition 8 (Semisimple eigenvalue). An eigenvalue $\lambda$ is said to be semisimple if $\gamma(\lambda) = \mu(\lambda)$.

Proposition 9. A matrix is diagonalizable if and only if all its eigenvalues are semisimple.

Proofs of Propositions 8 and 9 can be found in [Mey00]. From Proposition 9 it is clear that if a matrix in $M_n$ has $n$ distinct eigenvalues, then all of them are simple and a fortiori semisimple, thus the matrix is diagonalizable.

Example 4 (Diagonalizability). We need to check whether the following matrix is diagonalizable:

$$A = \begin{bmatrix} 5 & 1 & 0 \\ 0 & 5 & 1 \\ 0 & 0 & 5 \end{bmatrix} \quad (4.43)$$

Its characteristic polynomial is:

$$\chi_A(\lambda) = (5 - \lambda)^3 \quad (4.44)$$

So, it has one eigenvalue $\lambda = 5$ with algebraic multiplicity $\mu(5) = 3$. The corresponding eigenspace is:

$$V(5) = \ker(A - 5I) = \ker \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \quad (4.45)$$

From the rank-nullity theorem we have that:

$$\dim V(5) = \dim \ker \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} = 3 - \mathrm{rank} \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} = 1 \quad (4.46)$$


That is,

$$\gamma(5) = 1 < \mu(5) \quad (4.47)$$

Therefore $\lambda = 5$ is not semisimple and $A$ is not diagonalizable.

In this particular case we could reach the same conclusion a bit differently. Using reductio ad absurdum, let us assume that $A$ is diagonalizable. Then it should be similar to the following matrix:

$$L = \begin{bmatrix} 5 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 5 \end{bmatrix} = 5I \quad (4.48)$$

Then, there exists a $T \in GL(n, \mathbb{R})$ such that:

$$A = T^{-1}LT \quad (4.49)$$

$$= T^{-1}5IT \quad (4.50)$$

$$= 5I \quad (4.51)$$

which is obviously not true. In fact, the only matrix that is similar to $L$ is $L$ itself!
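The multiplicity computation of Example 4 can be reproduced with a rank computation:

```python
import numpy as np

# The Jordan-block-like matrix of Example 4, with eigenvalue 5.
A = np.array([[5.0, 1.0, 0.0],
              [0.0, 5.0, 1.0],
              [0.0, 0.0, 5.0]])
n = A.shape[0]

lam = 5.0
# Geometric multiplicity: gamma(lam) = n - rank(A - lam I), by rank-nullity.
geometric = n - np.linalg.matrix_rank(A - lam * np.eye(n))
# Algebraic multiplicity: lam = 5 is a triple root of chi_A(lam) = (5 - lam)^3.
algebraic = 3

print(geometric)               # 1
print(geometric < algebraic)   # True: the eigenvalue is not semisimple
```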

    4.2.2 Only real eigenvalues

If $A$ is diagonalizable and has only real eigenvalues, then in the new coordinates the system takes the following form:

$$\frac{d}{dt}\begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} = \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} + \begin{bmatrix} b_1^\top \\ b_2^\top \\ \vdots \\ b_n^\top \end{bmatrix} \begin{bmatrix} u_1 \\ \vdots \\ u_m \end{bmatrix} \quad (4.52)$$

where $b_i \in \mathbb{R}^m$, and each state (in the new coordinates $z = V^{-1}x$) satisfies an equation of the form:

$$\frac{dz_i(t)}{dt} = \lambda_i z_i(t) + b_i^\top u(t) \quad (4.53)$$

which is easily solved to give (for details see Proposition 19):

$$z_i(t) = e^{\lambda_i t} z_i(0) + \int_0^t e^{\lambda_i(t-s)}\, b_i^\top u(s)\, ds \quad (4.54)$$

Uncontrollable states show up as those whose coefficient $b_i$ is $0$. This way it is understood that there is no way the input can affect some states, a fact that was not obvious in the initial formulation. Equation (4.54) for $b_i = 0$ becomes:

$$z_i(t) = e^{\lambda_i t} z_i(0) \quad (4.55)$$

from which it is now obvious that $z_i$ is not affected by the input.
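As a sanity check of the closed-form solution (4.54), the sketch below specializes it to a scalar mode with a constant input $u(t) = u_0$ (so the integral evaluates in closed form) and compares it against a crude forward-Euler integration of (4.53). All numbers are arbitrary illustrative choices.

```python
import numpy as np

lam, b, u0, z0 = -2.0, 1.5, 1.0, 0.7   # mode lambda_i, coefficient b_i, input, z_i(0)
t_end, dt = 1.0, 1e-5

# Closed form from (4.54) with u(s) = u0:
#   z(t) = e^{lam t} z0 + b u0 * (e^{lam t} - 1) / lam
z_exact = np.exp(lam * t_end) * z0 + b * u0 * (np.exp(lam * t_end) - 1.0) / lam

# Crude forward-Euler integration of dz/dt = lam z + b u0.
z = z0
for _ in range(int(round(t_end / dt))):
    z += dt * (lam * z + b * u0)

print(abs(z - z_exact) < 1e-3)  # True: the two agree to integration accuracy
```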

Note: To this end, we examine the special case where the input is scalar ($m = 1$) and we provide a coordinates transformation so that the elements of $\tilde{B} = V^{-1}B$ are either 0 or 1. In that case the system's dynamics are given by an equation of the following form:

$$\frac{d}{dt}\begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} = \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \\ \vdots \\ z_n \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} u(t) \quad (4.56)$$

For all $i = 1, \ldots, n$ such that $b_i \neq 0$ we introduce the variable $\zeta_i(t) = z_i(t)/b_i$, and in case $b_i = 0$ we set $\zeta_i(t) = z_i(t)$. For example, in case all $b_i \neq 0$ the system becomes:

$$\frac{d}{dt}\begin{bmatrix} \zeta_1 \\ \zeta_2 \\ \vdots \\ \zeta_n \end{bmatrix} = \begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix} \begin{bmatrix} \zeta_1 \\ \zeta_2 \\ \vdots \\ \zeta_n \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix} u(t) \quad (4.57)$$

    4.2.3 Complex Diagonalization

The aforementioned procedure applies also when $A$ has complex eigenvalues. However, this leads to linear differential equations with complex coefficients that are hard to interpret. So for example the system:

$$\frac{d}{dt}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \end{bmatrix} u(t) \quad (4.58)$$

is diagonalized using the complex mapping $z = Tx$ with $T \in GL(n, \mathbb{C})$ (meaning that $z \in \mathbb{C}^n$), given by:

$$T = \begin{bmatrix} 1 & j \\ 1 & -j \end{bmatrix} \quad (4.59)$$

In the new coordinates the system becomes:

$$\frac{d}{dt}\begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \begin{bmatrix} 1 - j & \\ & 1 + j \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} + \begin{bmatrix} 1 + j \\ 1 - j \end{bmatrix} u(t) \quad (4.60)$$

Obviously, the states of this system are complex, not exactly the simplified version of the system we had in mind. Optionally, we may introduce new variables in order to normalize all input coefficients to 1:

$$\zeta_1 = \frac{z_1}{1 + j}, \quad \zeta_2 = \frac{z_2}{1 - j} \quad (4.61)$$

and the system in $\xi$-coordinates becomes:

\[
\frac{\mathrm{d}}{\mathrm{d}t}
\begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix}
=
\begin{bmatrix} 1-j & 0 \\ 0 & 1+j \end{bmatrix}
\begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix}
+
\begin{bmatrix} 1 \\ 1 \end{bmatrix}
u(t) \tag{4.62}
\]

Although one may use the complex diagonalization of a system to draw conclusions regarding its dynamics, it is generally advisable to use the real diagonalization technique that is provided in the next section.
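As a quick numerical check of this example, the eigenvalues of the system matrix of (4.58) are indeed the conjugate pair $1 \pm j$; the $-1$ entry below is the sign pattern consistent with the eigenvalues shown in (4.60):

```python
import numpy as np

# Matrix of (4.58); the -1 entry is an assumption, consistent with the
# eigenvalues 1 - j and 1 + j appearing in (4.60).
A = np.array([[1.0, 1.0],
              [-1.0, 1.0]])
lam = np.linalg.eigvals(A)

assert np.allclose(sorted(lam.real), [1.0, 1.0])
assert np.allclose(sorted(lam.imag), [-1.0, 1.0])
```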


    4.2.4 Real Diagonalization with complex eigenvalues

Complex eigenvalues of real matrices appear in conjugate pairs. So, if $\lambda_i$ is a complex eigenvalue of $A$ with $\lambda_i = a + jb$ (where $j = \sqrt{-1}$), then there is some other eigenvalue $\lambda_k$, $k \neq i$, of $A$ so that $\lambda_k = \bar{\lambda}_i = a - jb$ (without loss of generality we assume that $k = i + 1$). This gives rise to a pair of differential equations of the form:

\[
\dot{z}_i(t) = (a + jb)\, z_i(t) + u(t) \tag{4.63}
\]
\[
\dot{z}_{i+1}(t) = (a - jb)\, z_{i+1}(t) + u(t) \tag{4.64}
\]

Assume for simplicity and without loss of generality that the state and output vectors have the same dimension. Then, at the same time, the state-to-output relation is given by:

\[
y(t) = \begin{bmatrix} c_1 & \cdots & c_n \end{bmatrix} x(t) + d\, u(t) \tag{4.65}
\]

where the coefficients $c_i$ are in general complex. From (4.63) and (4.64) we make the simple observation that $z_i(t)$ and $z_{i+1}(t)$ are conjugate. Assume that $z_i(t)$ is written in the form:

\[
z_i(t) = \alpha(t) + j\beta(t) \tag{4.66}
\]

where $\alpha(t)$ and $\beta(t)$ stand for the real and the imaginary parts of $z_i(t)$. Then

\[
z_{i+1}(t) = \alpha(t) - j\beta(t) \tag{4.67}
\]

Therefore the dynamics of $z_{i+1}(t)$ can be simply retrieved from the dynamics of $z_i(t)$ - a fact that renders $z_{i+1}(t)$ redundant. Now, if we substitute (4.66) into (4.63) we get:

\[
\dot{z}_i(t) = \dot{\alpha}(t) + j\dot{\beta}(t) = (a + jb)(\alpha(t) + j\beta(t)) + u(t) \tag{4.68}
\]
\[
= (a\alpha(t) - b\beta(t) + u(t)) + j(b\alpha(t) + a\beta(t)) \tag{4.69}
\]

This yields the following pair of differential equations:

\[
\dot{\alpha}(t) = a\alpha(t) - b\beta(t) + u(t) \tag{4.70}
\]
\[
\dot{\beta}(t) = b\alpha(t) + a\beta(t) \tag{4.71}
\]

or in matrix form:

\[
\frac{\mathrm{d}}{\mathrm{d}t}
\begin{bmatrix} \alpha \\ \beta \end{bmatrix}
=
\begin{bmatrix} a & -b \\ b & a \end{bmatrix}
\begin{bmatrix} \alpha \\ \beta \end{bmatrix}
+
\begin{bmatrix} 1 \\ 0 \end{bmatrix}
u(t) \tag{4.72}
\]

If we replace $z_i$ and $z_{i+1}$ with $\alpha$ and $\beta$ we come up with the following system of equations:

\[
\frac{\mathrm{d}}{\mathrm{d}t}\zeta(t)
=
\begin{bmatrix}
\lambda_1 & & & & & \\
& \ddots & & & & \\
& & a & -b & & \\
& & b & a & & \\
& & & & \ddots & \\
& & & & & \lambda_n
\end{bmatrix}
\zeta(t)
+
\begin{bmatrix} 1 \\ \vdots \\ 1 \\ 0 \\ \vdots \\ 1 \end{bmatrix}
u(t) \tag{4.73}
\]


where

\[
\zeta(t) \triangleq
\begin{bmatrix} z_1 \\ \vdots \\ \alpha \\ \beta \\ \vdots \\ z_n \end{bmatrix} \tag{4.74}
\]

Notice that in this case the matrix that appears in (4.73) is no longer diagonal, but block-diagonal. In particular, for every real eigenvalue of $A$ in the initial coordinates, the corresponding diagonal entry is the eigenvalue itself, while for each conjugate complex pair of eigenvalues of the form $a \pm jb$ the corresponding diagonal entry is the block:

\[
\begin{bmatrix} a & -b \\ b & a \end{bmatrix} \tag{4.75}
\]

To this end, only one question remains unanswered: what is the linear coordinate transformation that yields the aforementioned equivalent realization as in (4.73)? It suffices to find a linear transformation that takes the complex diagonal form to the aforementioned real diagonal form. Thus, we need a similarity transformation $S_0 \in M_2(\mathbb{C})$ which carries out the following transformation:

\[
\begin{bmatrix} a + jb & 0 \\ 0 & a - jb \end{bmatrix}
\xrightarrow{\; S_0 \;}
\begin{bmatrix} a & -b \\ b & a \end{bmatrix} \tag{4.76}
\]

It is easy to show (how?) that the following matrix is such a similarity transformation:

\[
S_0 = \begin{bmatrix} j & 1 \\ 1 & j \end{bmatrix} \tag{4.77}
\]

in the sense that:

\[
S_0
\begin{bmatrix} a + jb & 0 \\ 0 & a - jb \end{bmatrix}
S_0^{-1}
=
\begin{bmatrix} a & -b \\ b & a \end{bmatrix} \tag{4.78}
\]

    Now suppose that the complex diagonal form of a matrix is:

    A = diag{1, . . . , k1, a +jb,a jb,k+2, . . . , n} (4.79)Then the similarity transformation S that yields the corresponding real Jordanform is:

    S =

    k k+1

    11

    . . .

    k j 1k+1 1 j

    . . .

    1

    (4.80)

    or equivalently, in block diagonal notation:

    S = block diag{Ik, S0, Ink2} (4.81)


The same can be extended easily when more than one pair of complex eigenvalues is present.
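scipy implements this complex-to-real conversion directly: `scipy.linalg.cdf2rdf` turns a complex diagonal form into the real block-diagonal form of (4.73), replacing each conjugate pair $a \pm jb$ by a real $2 \times 2$ block. A minimal sketch, reusing the matrix of (4.58) with its assumed sign pattern:

```python
import numpy as np
from scipy.linalg import cdf2rdf

A = np.array([[1.0, 1.0],
              [-1.0, 1.0]])      # eigenvalues 1 +/- j (sign pattern assumed)
w, v = np.linalg.eig(A)
wr, vr = cdf2rdf(w, v)           # real block-diagonal form and real basis

assert np.isrealobj(wr) and np.isrealobj(vr)
assert np.allclose(vr @ wr @ np.linalg.inv(vr), A)
```

Here `wr` contains the block of (4.75) with $a = 1$, $b = \pm 1$ depending on the eigenvalue ordering returned by `eig`.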

    4.2.5 Why is diagonalization useful?

If a matrix is diagonalizable and we know its diagonal form, then first of all we have our system in a much simpler representation in which inputs and outputs are decoupled. It becomes clear which input affects which output (in the new coordinates) and by how much - see for example equation (4.57). Additionally, if $\Lambda = \mathrm{diag}\{\lambda_i\}_{i=1}^k$ is a diagonal matrix, then operations with it become much simpler. For instance, the powers of $\Lambda$ are calculated using the formula:

\[
\Lambda^m = \mathrm{diag}\{\lambda_i^m\}_{i=1}^k \tag{4.82}
\]

Likewise, the inverse of the matrix (if it exists) is given by:

\[
\Lambda^{-1} = \mathrm{diag}\{\lambda_i^{-1}\}_{i=1}^k \tag{4.83}
\]

The matrix exponential is given by:

\[
e^{\Lambda} = \mathrm{diag}\{e^{\lambda_i}\}_{i=1}^k \tag{4.84}
\]

And in general, for any matrix mapping $f : M_n \to M_n$ it holds that:

\[
f(\Lambda) = \mathrm{diag}\{f(\lambda_i)\}_{i=1}^k \tag{4.85}
\]

In fact, this gives rise to the following extension of a function over the complex numbers to the group $\mathrm{dg}(n, \mathbb{C})$ of $n \times n$ diagonal matrices with entries from $\mathbb{C}$ [Spr00, Chapter 4]:

Definition 9 (Function Extension 1). Let $f : \mathbb{C} \to \mathbb{C}$ be a function defined at $a_1, a_2, \ldots, a_n$ and let $\Lambda \in \mathrm{dg}(n, \mathbb{C})$ be the diagonal matrix $\mathrm{diag}\{a_i\}_{i=1}^n$. Then we define the function $\bar{f} : \mathrm{dg}(n, \mathbb{C}) \to \mathrm{dg}(n, \mathbb{C})$ so that $\bar{f}(\Lambda) = \mathrm{diag}\{f(a_i)\}_{i=1}^n$.

This way we have extended all functions on complex numbers to the set of diagonal matrices over $\mathbb{C}$. It is also possible to extend a function $f : \mathbb{C} \to \mathbb{C}$ to a function over the set of diagonalizable matrices, which we shall denote by $\mathcal{D}(n, \mathbb{C})$.

Definition 10 (Function Extension 2). Let $f : \mathbb{C} \to \mathbb{C}$ be a function and $\bar{f}$ its extension over $\mathrm{dg}(n, \mathbb{C})$. Let $A$ be a diagonalizable matrix which is similar to the diagonal matrix $\Lambda$ of its eigenvalues, i.e. there is a $T \in M_n(\mathbb{C})$ such that $A = T\Lambda T^{-1}$. If $f$ is defined at every eigenvalue of $A$, we define the function $\hat{f} : \mathcal{D}(n, \mathbb{C}) \to \mathcal{D}(n, \mathbb{C})$ as $\hat{f}(A) = T\bar{f}(\Lambda)T^{-1}$.

Example 5. Let $A \in \mathcal{D}(n, \mathbb{R})$ and $f$ with $f(t) \triangleq \sqrt{t}$. Let $A = TDT^{-1}$ with $D = \mathrm{diag}\{d_i\}_{i=1}^n$ and $T \in GL(n, \mathbb{R})$. Then we define the extension of $f$ to be a function $\hat{f} : \mathcal{D}(n, \mathbb{R}) \to \mathcal{D}(n, \mathbb{C})$ so that $\hat{f}(A) = T\,\mathrm{diag}\{\sqrt{d_i}\}_{i=1}^n\,T^{-1}$. For example let us take

\[
A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \in \mathcal{D}(2, \mathbb{R}) \tag{4.86}
\]


This matrix is diagonalizable and

\[
A = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
\begin{bmatrix} 3 & 0 \\ 0 & -1 \end{bmatrix}
\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}^{-1} \tag{4.87}
\]

Hence

\[
\hat{f}(A) = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}
\begin{bmatrix} \sqrt{3} & 0 \\ 0 & j \end{bmatrix}
\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix}^{-1} \tag{4.88}
\]
\[
= \frac{1}{2}
\begin{bmatrix} \sqrt{3} + j & \sqrt{3} - j \\ \sqrt{3} - j & \sqrt{3} + j \end{bmatrix} \tag{4.89}
\]

Usually, for brevity, we use the same symbol for $f$ and $\hat{f}$, so in this example we can write $\sqrt{A}$ instead of $\hat{f}(A)$.
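Example 5 can be reproduced numerically. The sketch below builds $\sqrt{A}$ through the eigendecomposition of $A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix}$ and checks that squaring it recovers $A$:

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])
d, T = np.linalg.eig(A)                       # eigenvalues 3 and -1
# cast to complex so that sqrt(-1) yields j instead of nan
sqrtA = T @ np.diag(np.sqrt(d.astype(complex))) @ np.linalg.inv(T)

# the extension satisfies f(A) f(A) = A
assert np.allclose(sqrtA @ sqrtA, A)
# and matches the closed form (4.89)
expected = 0.5 * np.array([[np.sqrt(3) + 1j, np.sqrt(3) - 1j],
                           [np.sqrt(3) - 1j, np.sqrt(3) + 1j]])
assert np.allclose(sqrtA, expected)
```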

There is another well-known approach for extending an analytic function from $\mathbb{C}$ to $M_n(\mathbb{C})$ that employs the Taylor expansion of the initial function. According to this approach, if a function $f$ over $\mathbb{C}$ admits the expansion:

\[
f(t) = \sum_{i=0}^{\infty} f^{(i)}(0)\, \frac{t^i}{i!} \tag{4.90}
\]

where $f^{(0)}(t) = f(t)$, $f^{(1)}(t) = f'(t)$ and $f^{(i)}(t)$ is the $i$-th order derivative of $f$, then its extension over $M_n(\mathbb{C})$ is defined to be a function $\hat{f} : M_n(\mathbb{C}) \to M_n(\mathbb{C})$ so that:

\[
\hat{f}(A) = \sum_{i=0}^{\infty} f^{(i)}(0)\, \frac{A^i}{i!} \tag{4.91}
\]

where $A^i$ is defined through matrix multiplication and $A^0 = I$. An extension derived using the Taylor expansion approach is exactly the same as the one we get using diagonalization.

Diagonalization stands more as a theoretical tool than as a computational one. Without going into much detail, let us consider the following function, which plays a very important role in the study of stability:

\[
f(t) = e^{At}, \quad t \in \mathbb{R} \tag{4.92}
\]

where $A$ is a diagonalizable matrix; $f$ is a mapping $f : \mathbb{R} \to M_n(\mathbb{R})$. We need to know whether the following limit exists and converges:

\[
\lim_{t \to \infty} f(t) \tag{4.93}
\]

There is a $T \in GL(n, \mathbb{R})$ so that $A = TDT^{-1}$, where $D$ is the diagonal matrix of the eigenvalues of $A$. One has that:

\[
f(t) = e^{At} = T e^{Dt} T^{-1} = T
\begin{bmatrix}
e^{\lambda_1 t} & & & \\
& e^{\lambda_2 t} & & \\
& & \ddots & \\
& & & e^{\lambda_n t}
\end{bmatrix}
T^{-1} \tag{4.94}
\]


The limit converges if and only if all functions $e^{\lambda_i t}$ converge. If $\lambda_i \in \mathbb{R}$, then $e^{\lambda_i t}$ converges if and only if $\lambda_i < 0$. If $\lambda_i \in \mathbb{C}$, that is $\lambda_i = a_i + jb_i$, then

\[
e^{\lambda_i t} = e^{a_i t} e^{j b_i t} = e^{a_i t}(\cos b_i t + j \sin b_i t) \tag{4.95}
\]

and

\[
|e^{\lambda_i t}| = e^{a_i t} \tag{4.96}
\]

It is now obvious that we have convergence in $\mathbb{C}$ if and only if $\mathrm{Re}(\lambda_i) < 0$. As a result, we say that $f(t)$ converges as $t \to \infty$ if and only if the real part of every eigenvalue of $A$ is strictly negative. If there is one eigenvalue with positive real part then $\lim_{t\to\infty} \|f(t)\| = \infty$. Note that throughout this procedure we did not diagonalize any matrices; we just used the fact that $A$ can be represented in a simpler fashion to draw certain conclusions. However, our approach has the drawback that it works only for diagonalizable matrices. In the next section we provide a simple realization for not necessarily diagonalizable matrices.
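In practice this spectral test is one line of numerical linear algebra. A sketch with two hypothetical matrices, one with all eigenvalues in the open left half-plane and one without:

```python
import numpy as np
from scipy.linalg import expm

def converges_to_zero(A):
    """e^{At} -> 0 as t -> infinity iff max Re(lambda_i) < 0."""
    return np.max(np.linalg.eigvals(A).real) < 0

A_stable = np.array([[-1.0, 5.0],
                     [0.0, -2.0]])     # eigenvalues -1, -2
A_unstable = np.array([[0.5, 0.0],
                       [1.0, -1.0]])   # one eigenvalue 0.5 > 0

assert converges_to_zero(A_stable)
assert not converges_to_zero(A_unstable)
# the matrix exponential of the stable matrix is tiny for large t
assert np.allclose(expm(A_stable * 50.0), 0.0, atol=1e-12)
```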

    4.3 Jordan Canonical Form

    4.3.1 The Jordan Decomposition

Diagonalization applies to matrices whose eigenvectors are linearly independent (otherwise the linear transformation V, as it was defined in theorem 7, is singular). But what if A cannot be diagonalized? The condition that the eigenvectors of A be linearly independent is quite restrictive. The Jordan decomposition of a matrix is an extension of the diagonal decomposition. All matrices admit a Jordan form which, in case the matrix is diagonalizable, coincides with its diagonal form.

Definition 11 (Jordan Block). A Jordan block of size $k$ is defined as the following matrix:

\[
J_k(\lambda) =
\begin{bmatrix}
\lambda & 1 & & \\
& \lambda & \ddots & \\
& & \ddots & 1 \\
& & & \lambda
\end{bmatrix} \tag{4.97}
\]

where $J_k(\lambda) \in M_k(\mathbb{C})$ and $\lambda \in \mathbb{C}$.

A special case of a Jordan block appears when $k = 1$. Then $J_1(\lambda)$ is a scalar:

\[
J_1(\lambda) = \lambda \tag{4.98}
\]

Accordingly, the 2-dimensional Jordan block looks like:

\[
J_2(\lambda) = \begin{bmatrix} \lambda & 1 \\ 0 & \lambda \end{bmatrix} \tag{4.99}
\]

Definition 12 (Jordan Segment). A matrix that possesses the following block-diagonal form is called a Jordan segment:

\[
J(\lambda) =
\begin{bmatrix}
J_{k_1}(\lambda) & & & \\
& J_{k_2}(\lambda) & & \\
& & \ddots & \\
& & & J_{k_s}(\lambda)
\end{bmatrix} \tag{4.100}
\]


where $k_1 \geq k_2 \geq \ldots \geq k_s \geq 1$. More explicitly, we will be referring to Jordan segments using the notation $J_{k_1, k_2, \ldots, k_s}(\lambda)$.

Definition 13 (Jordan Matrix). A block-diagonal matrix of the following form is called a Jordan matrix:

\[
J =
\begin{bmatrix}
J(\lambda_1) & & & \\
& J(\lambda_2) & & \\
& & \ddots & \\
& & & J(\lambda_p)
\end{bmatrix} \tag{4.101}
\]

where all $J(\lambda_i)$ for $i = 1, 2, \ldots, p$ are Jordan segments.

Example 6. The following matrix is in Jordan form:

\[
J = J_{3,1}(7) \oplus J_2(0) \tag{4.102}
\]

which is expanded as:

\[
J =
\begin{bmatrix}
7 & 1 & 0 & 0 & 0 & 0 \\
0 & 7 & 1 & 0 & 0 & 0 \\
0 & 0 & 7 & 0 & 0 & 0 \\
0 & 0 & 0 & 7 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix} \tag{4.103}
\]

As we will see in what follows, every matrix in $M_n(\mathbb{K})$ - where $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$ - is similar to a Jordan matrix with entries in $\mathbb{C}$. Additionally, if a matrix is diagonalizable then its Jordan form coincides with its diagonal form, so the Jordan decomposition can be perceived as a generalization of diagonalization. First of all, we need to give the following definition:

Definition 14 (Index of an eigenvalue). Let $\lambda$ be an eigenvalue of a matrix $A \in M_n(\mathbb{K})$ - where $\mathbb{K}$ is either $\mathbb{R}$ or $\mathbb{C}$. We call index of the eigenvalue $\lambda$ the smallest positive integer $k$ so that

\[
\mathrm{rank}(A - \lambda I)^k = \mathrm{rank}(A - \lambda I)^{k+1} \tag{4.104}
\]

Hereinafter, we denote the index of an eigenvalue $\lambda$ by $i(\lambda)$.

Remark: Since $\lambda$ is an eigenvalue of $A$, $\det(A - \lambda I) = 0$, from which it follows that the matrix $A - \lambda I$ is not invertible and not full-rank: $\mathrm{rank}(A - \lambda I) < n$. It also holds that:

\[
\mathrm{rank}(A - \lambda I) \geq \mathrm{rank}(A - \lambda I)^2 \geq \ldots \geq \mathrm{rank}(A - \lambda I)^k \tag{4.105}
\]

We can now give the following theorem:

    We can now give the following theorem:

    Theorem 10 (Jordan Form). Let A Mn(K) - where K is eitherR or C -with distinct eigenvalues {1, 2, . . . , p}. ThenA is similar to a Jordan matrixJ Mn(C):

    J =

    J(1)J(2)

    . . .

    J(p)

    A (4.106)


which contains as many Jordan segments as the number of distinct eigenvalues of $A$. Additionally:

1. Each Jordan segment $J(\lambda_i)$ in $J$ consists of $\gamma(\lambda_i)$ Jordan blocks, where $\gamma(\lambda_i)$ is the geometric multiplicity of $\lambda_i$.

2. The index of $\lambda_i$ determines the size of the largest Jordan block in $J(\lambda_i)$.

3. Let $r_j(\lambda_i) \triangleq \mathrm{rank}(A - \lambda_i I)^j$, with $r_0(\lambda_i) \triangleq n$. Then the number of $j \times j$ Jordan blocks in $J(\lambda_i)$ is given by:

\[
c_j(\lambda_i) = r_{j-1}(\lambda_i) - 2 r_j(\lambda_i) + r_{j+1}(\lambda_i) \tag{4.107}
\]

Note: If an eigenvalue is semisimple then (and only then) the corresponding Jordan segment in the Jordan matrix is a diagonal matrix.

Example 7. We will find the Jordan form of the following 5-by-5 matrix:

\[
A =
\begin{bmatrix}
2 & 5 & 7 & 6 & 7 \\
2 & 3 & 2 & 8 & 2 \\
0 & 1 & 5 & 2 & 0 \\
1 & 4 & 0 & 9 & 1 \\
3 & 4 & 3 & 4 & 8
\end{bmatrix} \tag{4.108}
\]

The eigenvalues of $A$ are $\lambda_1 = 1$ with algebraic multiplicity $\alpha(1) = 2$ and $\lambda_2 = 5$ with $\alpha(5) = 3$. We already know that the Jordan matrix consists of two Jordan segments (one for each eigenvalue). We now calculate the following ranks for the first eigenvalue:

\[
r_1(1) = \mathrm{rank}(A - I) = 4 \tag{4.109}
\]
\[
r_2(1) = \mathrm{rank}(A - I)^2 = 3 \tag{4.110}
\]
\[
r_3(1) = \mathrm{rank}(A - I)^3 = 3 \tag{4.111}
\]

therefore $i(1) = 2$, thus the largest block in $J(1)$ has dimensions $2 \times 2$. From the rank-nullity theorem (see [Rom00, pp. 57-58]) we have that $\gamma(1) = 5 - r_1(1) = 1$, therefore only one Jordan block corresponds to the eigenvalue $\lambda_1 = 1$. This is apparently $J_2(1)$. Now for the eigenvalue $\lambda_2 = 5$ we have:

\[
r_1(5) = \mathrm{rank}(A - 5I) = 3 \tag{4.112}
\]
\[
r_2(5) = \mathrm{rank}(A - 5I)^2 = 2 \tag{4.113}
\]
\[
r_3(5) = \mathrm{rank}(A - 5I)^3 = 2 \tag{4.114}
\]

therefore $i(5) = 2$, thus the largest block in $J(5)$ has dimensions $2 \times 2$ - that will be $J_2(5)$. Also, $\gamma(5) = 5 - r_1(5) = 2$, hence we have two Jordan blocks corresponding to this eigenvalue, one of which is $J_2(5)$. The other Jordan block must have dimensions $1 \times 1$ so that $J$ is $5 \times 5$ (but we can also use equation (4.107) for this purpose). Finally, the Jordan form of $A$ is:

\[
J =
\begin{bmatrix}
J_2(5) & & \\
& J_1(5) & \\
& & J_2(1)
\end{bmatrix}
=
\begin{bmatrix}
5 & 1 & 0 & 0 & 0 \\
0 & 5 & 0 & 0 & 0 \\
0 & 0 & 5 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix} \tag{4.115}
\]

What we have not yet determined here is a matrix $T \in GL(n, \mathbb{C})$ so that $J = TAT^{-1}$.
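The rank recipe of Theorem 10 can be automated. Since the ranks $r_j$ are invariant under similarity, the sketch below applies (4.107) directly to the Jordan matrix $J$ obtained in Example 7 and recovers its block structure:

```python
import numpy as np

# The Jordan matrix found in Example 7: J_2(5), J_1(5), J_2(1).
J = np.array([[5, 1, 0, 0, 0],
              [0, 5, 0, 0, 0],
              [0, 0, 5, 0, 0],
              [0, 0, 0, 1, 1],
              [0, 0, 0, 0, 1]], dtype=float)

def block_counts(A, lam, jmax=3):
    """Number of j x j Jordan blocks at lam, j = 1..jmax, via (4.107)."""
    n = A.shape[0]
    M = A - lam * np.eye(n)
    r = [n] + [np.linalg.matrix_rank(np.linalg.matrix_power(M, j))
               for j in range(1, jmax + 2)]
    return [r[j - 1] - 2 * r[j] + r[j + 1] for j in range(1, jmax + 1)]

assert block_counts(J, 5) == [1, 1, 0]   # one 1x1 and one 2x2 block at 5
assert block_counts(J, 1) == [0, 1, 0]   # a single 2x2 block at 1
```

For a general matrix `A` the same function works, though `matrix_rank` needs well-separated eigenvalues to give reliable ranks in floating point.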


    4.3.2 Utility of the Jordan form

In section 4.2.5 we expounded how the diagonalization of a matrix - if possible - allows us to calculate easily all functions of matrices (powers, the inverse matrix if it exists, polynomials, the matrix exponential and the matrix logarithm) - see definition 10. The same is possible for non-diagonalizable matrices using the Jordan form of the matrix.

Definition 15 (Function Extension 3). Let $f : \mathbb{C} \to \mathbb{C}$ be a function which is $k-1$ times differentiable at $\lambda \in \mathbb{C}$. Then we define the extension of this function over the set of Jordan blocks to be

\[
f(J_k(\lambda)) =
\begin{bmatrix}
f(\lambda) & f'(\lambda) & \frac{f''(\lambda)}{2!} & \cdots & \frac{f^{(k-1)}(\lambda)}{(k-1)!} \\
& f(\lambda) & f'(\lambda) & \ddots & \vdots \\
& & \ddots & \ddots & \frac{f''(\lambda)}{2!} \\
& & & f(\lambda) & f'(\lambda) \\
& & & & f(\lambda)
\end{bmatrix} \tag{4.116}
\]

Example 8 (Matrix Polynomial). Consider the function $f : \mathbb{C} \to \mathbb{C}$ defined by $f(z) = z^3 + 1$. We want to apply the extension of $f$, as given by definition 15, to the Jordan block $J_3(-1)$, that is:

\[
J_3(-1) =
\begin{bmatrix}
-1 & 1 & 0 \\
0 & -1 & 1 \\
0 & 0 & -1
\end{bmatrix} \tag{4.117}
\]

$f$ is twice differentiable with $f'(z) = 3z^2$ and $f''(z) = 6z$. Therefore, $f$ applied to $J_3(-1)$ gives:

\[
f(J_3(-1)) =
\begin{bmatrix}
f(-1) & f'(-1) & \frac{f''(-1)}{2} \\
0 & f(-1) & f'(-1) \\
0 & 0 & f(-1)
\end{bmatrix}
=
\begin{bmatrix}
0 & 3 & -3 \\
0 & 0 & 3 \\
0 & 0 & 0
\end{bmatrix} \tag{4.118}
\]

The function $f$ corresponds to the matrix polynomial $f(J) = J^3 + I$.
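A direct numerical check of Example 8, computing $J^3 + I$ by plain matrix multiplication:

```python
import numpy as np

# The Jordan block J_3(-1) of (4.117)
J = np.array([[-1, 1, 0],
              [0, -1, 1],
              [0, 0, -1]], dtype=float)
fJ = np.linalg.matrix_power(J, 3) + np.eye(3)

# the closed form (4.118) from Definition 15
expected = np.array([[0, 3, -3],
                     [0, 0, 3],
                     [0, 0, 0]], dtype=float)
assert np.allclose(fJ, expected)
```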

Example 9 (Matrix Exponential). We extend the exponential function $f(z) = e^z$ from the complex numbers to the set of Jordan blocks according to definition 15. This way we define the function $f$ over the set of Jordan blocks and we denote $f(J) = e^J$. According to the definition, let us calculate:

\[
f(J_2(1)) = e^{J_2(1)} =
\begin{bmatrix}
f(1) & f'(1) \\
0 & f(1)
\end{bmatrix}
=
\begin{bmatrix}
e & e \\
0 & e
\end{bmatrix} \tag{4.119}
\]

A Jordan matrix $J$ is a block-diagonal matrix consisting of Jordan blocks, i.e. $J = \mathrm{block\,diag}\{J_{k_j}(\lambda_i)\}$. A function is extended to the space of Jordan matrices as follows:

Definition 16 (Function Extension 4). Let $f : \mathbb{C} \to \mathbb{C}$ be a function which is $k_1 - 1$ times differentiable at $\lambda$. Then we define the extension of this function over the set of Jordan segments to be

\[
f(J_{k_1, k_2, \ldots, k_s}(\lambda)) = \mathrm{block\,diag}\{f(J_{k_j}(\lambda))\}_{j=1}^{s} \tag{4.120}
\]


and the extension of $f$ to the space of all Jordan matrices as:

\[
f(J) = \mathrm{block\,diag}\{f(J(\lambda_i))\}_{i=1}^{p} \tag{4.121}
\]

Example 10. We will calculate the matrix polynomial $f(J) = J^2 + 3I$ (i.e. the extension of $\phi(z) = z^2 + 3$ for $z \in \mathbb{C}$) for the Jordan matrix:

\[
J =
\begin{bmatrix}
J_{3,2}(1) & & \\
& J_2(2) & \\
& & J_1(-1)
\end{bmatrix} \tag{4.122}
\]

that will be:

\[
J^2 + 3I =
\begin{bmatrix}
J_{3,2}(1)^2 + 3I & & \\
& J_2(2)^2 + 3I & \\
& & (-1)^2 + 3
\end{bmatrix} \tag{4.123}
\]

where:

\[
J_{3,2}(1)^2 + 3I =
\begin{bmatrix}
J_3(1)^2 + 3I & \\
& J_2(1)^2 + 3I
\end{bmatrix} \tag{4.124}
\]

and it is now easy to employ equation (4.116) to calculate:

\[
J_3(1)^2 + 3I =
\begin{bmatrix}
4 & 2 & 1 \\
0 & 4 & 2 \\
0 & 0 & 4
\end{bmatrix} \tag{4.125}
\]

Following this procedure we arrive at the following result:

\[
J^2 + 3I =
\begin{bmatrix}
4 & 2 & 1 & & & & & \\
0 & 4 & 2 & & & & & \\
0 & 0 & 4 & & & & & \\
& & & 4 & 2 & & & \\
& & & 0 & 4 & & & \\
& & & & & 7 & 4 & \\
& & & & & 0 & 7 & \\
& & & & & & & 4
\end{bmatrix} \tag{4.126}
\]

However, what is most important is that we can extend any (adequately smooth) function to the whole of $M_n(\mathbb{C})$.

Definition 17 (Function Extension 5). Let $f : \mathbb{C} \to \mathbb{C}$ be an adequately smooth function. Then for every matrix $A \in M_n(\mathbb{C})$, for which there exists a Jordan matrix $J$ such that $A = TJT^{-1}$, we define the extension of $f$ over the set $M_n(\mathbb{C})$ to be a function $\tilde{f} : M_n(\mathbb{C}) \to M_n(\mathbb{C})$ such that:

\[
\tilde{f}(A) = T f(J) T^{-1} \tag{4.127}
\]

Note: Hereinafter we use a common symbol for $f$, $\bar{f}$, $\hat{f}$ and $\tilde{f}$. In this sense we may write:

\[
f(A) = T f(J) T^{-1} \tag{4.128}
\]

For example:

\[
e^{A} = T e^{J} T^{-1} \tag{4.129}
\]

The following result is of extraordinary importance in control theory and is one of the most important results for the study of LTI systems stability.


Proposition 11. The function $f : \mathbb{R} \to \mathbb{R}^n$ defined by $f(t) = e^{At}x$ converges to 0 as $t \to \infty$ for every $x \in \mathbb{R}^n$ if and only if the real part of every eigenvalue of $A$ is strictly negative. If there is one eigenvalue of $A$ with positive real part then $\lim_{t\to\infty} \|f(t)\| = \infty$.

Proof. The proof is left to the reader as an exercise. Hint: Use the Jordan decomposition of $A$.

Note: A function $f : \mathbb{R} \to \mathbb{R}^n$ is said to converge to a vector $x$ as $t \to \infty$ (we denote $\lim_{t\to\infty} f(t) = x$) if:

\[
\lim_{t \to \infty} \|f(t) - x\| = 0 \tag{4.130}
\]

Or, what is exactly the same: let $f(t) = \begin{bmatrix} f_1(t) & f_2(t) & \cdots & f_n(t) \end{bmatrix}^\top$, where $f_i : \mathbb{R} \to \mathbb{R}$. We say that $f$ converges to $x = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}^\top$ as $t \to \infty$ if and only if:

\[
\lim_{t \to \infty} f_i(t) = x_i \tag{4.131}
\]

    4.4 Canonical Controllable Form

Let $\Sigma = (A, B, C, D)$ be an LTI system with a single input. Using a proper coordinate transformation, and under certain conditions, it is possible to bring the system into the following form:

\[
\dot{z} =
\begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
-a_1 & -a_2 & -a_3 & \cdots & -a_n
\end{bmatrix}
z(t) +
\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}
u(t) \tag{4.132}
\]

This is known as the Canonical Controllable Form of an LTI system. Using the new system of coordinates $z = Tx$, we observe that the input directly affects only one of the state variables, namely $z_n(t)$. Indeed, by the last row of (4.132), we have that:

\[
\dot{z}_n(t) = -a_1 z_1(t) - a_2 z_2(t) - \ldots - a_n z_n(t) + u(t) \tag{4.133}
\]

while for all other state variables it holds that:

\[
\dot{z}_k(t) = z_{k+1}(t), \quad \text{for } k = 1, 2, \ldots, n-1 \tag{4.134}
\]

For a given pair of matrices $(A, B)$ and under certain conditions, there is a linear change of coordinates $z = Tx$ such that the transformed system is in the canonical controllable form. Our goal here is to determine an invertible matrix $T \in M_n(\mathbb{R})$ such that:

\[
TAT^{-1} =
\begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
-a_1 & -a_2 & -a_3 & \cdots & -a_n
\end{bmatrix} \tag{4.135}
\]


and

\[
TB =
\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} \tag{4.136}
\]

Multiplying (4.135) from the right with $T$ yields:

\[
TA =
\begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
-a_1 & -a_2 & -a_3 & \cdots & -a_n
\end{bmatrix}
T \tag{4.137}
\]

Now let $T$ be written in the following form:

\[
T = \begin{bmatrix} t_1 \\ t_2 \\ \vdots \\ t_n \end{bmatrix} \tag{4.138}
\]

where the $t_i$ are row vectors. Substituting (4.138) into (4.137) one gets:

\[
\begin{bmatrix} t_1 \\ t_2 \\ \vdots \\ t_n \end{bmatrix} A
=
\begin{bmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
-a_1 & -a_2 & -a_3 & \cdots & -a_n
\end{bmatrix}
\begin{bmatrix} t_1 \\ t_2 \\ \vdots \\ t_n \end{bmatrix} \tag{4.139}
\]

Carrying out the multiplication we have:

\[
\begin{bmatrix} t_1 A \\ t_2 A \\ \vdots \\ t_{n-1} A \\ t_n A \end{bmatrix}
=
\begin{bmatrix} t_2 \\ t_3 \\ \vdots \\ t_n \\ -a_1 t_1 - a_2 t_2 - \ldots - a_n t_n \end{bmatrix} \tag{4.140}
\]

From the first $n-1$ rows we have that:

\[
t_1 A = t_2, \quad t_2 A = t_3, \quad \ldots, \quad t_{n-1} A = t_n \tag{4.141}
\]

Thus we have the recursive formula:

\[
t_{k+1} = t_k A, \quad \text{for } k = 1, 2, \ldots, n-1 \tag{4.142}
\]


So, if we know $t_1$, we can determine the whole sequence of $t_i$ for $i = 2, \ldots, n$. The vector $t_1$ will be determined by equation (4.136) as follows:

\[
\overset{(4.136),(4.138)}{\Longrightarrow}
\begin{bmatrix} t_1 \\ t_2 \\ \vdots \\ t_n \end{bmatrix} B
=
\begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} \tag{4.143}
\]
\[
\Leftrightarrow
\begin{bmatrix} t_1 \\ t_1 A \\ \vdots \\ t_1 A^{n-2} \\ t_1 A^{n-1} \end{bmatrix} B
=
\begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} \tag{4.144}
\]
\[
\Leftrightarrow
\begin{bmatrix} t_1 B \\ t_1 A B \\ \vdots \\ t_1 A^{n-2} B \\ t_1 A^{n-1} B \end{bmatrix}
=
\begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} \tag{4.145}
\]

To this end we give the following definition:

Definition 18 (Controllable Pair). For a pair of matrices $(A, B)$, the matrix

\[
\mathcal{C}(A, B) \triangleq \begin{bmatrix} B & AB & A^2 B & \cdots & A^{n-1} B \end{bmatrix} \tag{4.146}
\]

is called the controllability matrix of $(A, B)$. If $\mathcal{C}(A, B)$ is non-singular, then the pair $(A, B)$ is called controllable.

If the pair $(A, B)$ is controllable, then (4.145) can be solved as follows:

\[
(4.145) \Leftrightarrow t_1\, \mathcal{C}(A, B) = \begin{bmatrix} 0 & \cdots & 0 & 1 \end{bmatrix} \tag{4.147}
\]
\[
\Leftrightarrow t_1 = \begin{bmatrix} 0 & \cdots & 0 & 1 \end{bmatrix} \mathcal{C}(A, B)^{-1} \tag{4.148}
\]

that is, the vector $t_1$ is actually the last row of the matrix $\mathcal{C}(A, B)^{-1}$. Afterwards, $t_2, t_3, \ldots, t_n$ are calculated using the recursive formula (4.142). Equation (4.145) is in fact a linear system which can be solved using any of the available numerical solution methods (e.g. Gauss-Seidel, Krylov subspace methods etc.). The linear system is:

\[
\mathcal{C}(A, B)^\top t_1^\top = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} \tag{4.149}
\]

Note: Regarding equation (4.147) we used the following fact. Let $t_1 \in M_{1 \times n}(\mathbb{R})$ be a row vector and $x, y \in \mathbb{R}^n$ be column vectors. Then

\[
\begin{bmatrix} t_1 x \\ t_1 y \end{bmatrix}
=
\begin{bmatrix} x & y \end{bmatrix}^\top t_1^\top \tag{4.150}
\]

where $\begin{bmatrix} t_1 x \\ t_1 y \end{bmatrix} \in M_{2 \times 1}(\mathbb{R})$ and $\begin{bmatrix} x & y \end{bmatrix}^\top \in M_{2 \times n}(\mathbb{R})$.


Thus, the left-hand side of equation (4.145) can be written as:

\[
\begin{bmatrix} t_1 B \\ t_1 A B \\ \vdots \\ t_1 A^{n-1} B \end{bmatrix}
=
\begin{bmatrix} B & AB & A^2 B & \cdots & A^{n-1} B \end{bmatrix}^\top t_1^\top
= \mathcal{C}(A, B)^\top t_1^\top
\]

Example 11. In this example we will convert the following system into its canonical controllable form:

\[
\dot{x}(t) =
\begin{bmatrix}
1 & 2 & 1 \\
3 & 5 & 0 \\
0 & 1 & 0
\end{bmatrix}
x(t) +
\begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix}
u(t) \tag{4.151}
\]

We first calculate the controllability matrix of the system (note: in MATLAB, one may use the command ctrb):

\[
\mathcal{C}(A, B) =
\begin{bmatrix}
2 & 3 & 4 \\
-1 & 1 & 14 \\
3 & -1 & 1
\end{bmatrix} \tag{4.152}
\]

This matrix is invertible and its inverse is:

\[
\mathcal{C}(A, B)^{-1} = \frac{1}{151}
\begin{bmatrix}
15 & -7 & 38 \\
43 & -10 & -32 \\
-2 & 11 & 5
\end{bmatrix} \tag{4.153}
\]

Using (4.147) we have

\[
t_1 = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix} \mathcal{C}(A, B)^{-1} = \frac{1}{151} \begin{bmatrix} -2 & 11 & 5 \end{bmatrix} \tag{4.154}
\]

Then, using the recursive formula $t_{k+1} = t_k A$ for $k = 1, 2$, we have:

\[
t_2 = t_1 A = \frac{1}{151} \begin{bmatrix} 31 & 56 & -2 \end{bmatrix} \tag{4.155}
\]

and

\[
t_3 = t_2 A = \frac{1}{151} \begin{bmatrix} 199 & 340 & 31 \end{bmatrix} \tag{4.156}
\]

which yields the similarity matrix:

\[
T = \frac{1}{151}
\begin{bmatrix}
-2 & 11 & 5 \\
31 & 56 & -2 \\
199 & 340 & 31
\end{bmatrix} \tag{4.157}
\]

which transforms the given system into its canonical controllable form, namely:

\[
\dot{z}(t) =
\begin{bmatrix}
0 & 1 & 0 \\
0 & 0 & 1 \\
3 & 1 & 6
\end{bmatrix}
z(t) +
\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
u(t) \tag{4.158}
\]

    4.5 Canonical Observable Form


    Chapter 5

    Trajectories of LTI systems

In this chapter we study the input-to-state relationships in an LTI system. In particular, we examine the various properties of these differential equations as well as their solutions.

    5.1 General Properties of Solutions

Definition 19. Any function $\phi : \mathbb{R} \to \mathbb{R}^n$ such that

\[
\frac{\mathrm{d}}{\mathrm{d}t}\phi(t) = f(\phi(t), u(t)) \tag{5.1}
\]

for some function $u : \mathbb{R} \to \mathbb{R}^m$, is called a solution of

\[
\frac{\mathrm{d}}{\mathrm{d}t}x(t) = f(x(t), u(t)) \tag{5.2}
\]

Sometimes, in order to emphasize that $\phi$ is a solution with respect to the input $u$, we use the notation $\phi_u$. A solution which satisfies the initial condition $\phi(t_0) = x_0$ is denoted by $\phi(t; t_0, x_0)$ or $\phi_u(t; t_0, x_0)$. A very important property of a solution $\phi_u(t; t_0, x_0)$, which follows directly from (5.1), is the following:

\[
\phi_u(t; t_0, x_0) = x_0 + \int_{t_0}^{t} f(\phi_u(\tau; t_0, x_0), u(\tau))\, \mathrm{d}\tau \tag{5.3}
\]

The first result is due to Carathéodory and provides sufficient conditions for the existence of a solution of (5.2).

Theorem 12 (Carathéodory). Let $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$ be continuous and locally Lipschitz in its first argument, in the sense that for all $\rho > 0$ there exists an $L_\rho > 0$ such that $\|f(x, u) - f(y, u)\| \leq L_\rho \|x - y\|$ for all $x, y \in B_\rho(\mathbb{R}^n)$ and $u \in B_\rho(\mathbb{R}^m)$, where $B_\rho(\mathbb{R}^n)$ is the Euclidean ball of $\mathbb{R}^n$ of radius $\rho$. Then for all $x_0 \in \mathbb{R}^n$ and every locally Lebesgue-integrable function $u$, the initial value problem:

\[
\frac{\mathrm{d}}{\mathrm{d}t}x(t) = f(x(t), u(t)) \tag{5.4}
\]
\[
x(t_0) = x_0 \tag{5.5}
\]

has a unique solution $\phi_u(t; t_0, x_0)$.


For a proof, as well as more results on existence and uniqueness, see [Son98, Section C3]. We should now clarify the meaning of uniqueness in Carathéodory's theorem. The system (5.2) admits a unique solution if for every two functions $\phi_1(t)$, $\phi_2(t)$ such that

\[
\frac{\mathrm{d}}{\mathrm{d}t}\phi_1(t) = f(\phi_1(t), u(t))
\]
\[
\frac{\mathrm{d}}{\mathrm{d}t}\phi_2(t) = f(\phi_2(t), u(t))
\]

the following implication holds true:

\[
\phi_1(t_0) = \phi_2(t_0) \;\Rightarrow\; \phi_1(t) = \phi_2(t) \text{ for all } t \geq t_0 \tag{5.6}
\]

The second important result we will prove states that $\phi_u(t; x_0)$ is continuous with respect to $x_0$, but for that we first need the following lemma, which is due to Bellman and Gronwall. Readers not familiar with measure theory may either skip the proof or read "locally integrable" as "continuous" and "for almost all" as "for all".

Lemma 1 (Bellman-Gronwall). Let $I \subseteq \mathbb{R}$ be an open interval and let $c \geq 0$ be a constant. Also, let $\mu : I \to \mathbb{R}_+$ be a locally integrable function and let $\nu : I \to \mathbb{R}_+$ be continuous. Assume that for some $t_0 \in I$ it holds that:

\[
\nu(t) \leq c + \int_{t_0}^{t} \mu(\tau)\nu(\tau)\, \mathrm{d}\tau \tag{5.7}
\]

for all $t \in I$ such that $t \geq t_0$. Then:

\[
\nu(t) \leq c\, e^{\int_{t_0}^{t} \mu(\tau)\, \mathrm{d}\tau} \tag{5.8}
\]

Proof. We define

\[
\psi(t) \triangleq \int_{t_0}^{t} \mu(\tau)\nu(\tau)\, \mathrm{d}\tau \tag{5.9}
\]

for $t \in I$ with $t \geq t_0$. Then $\dot{\psi}(t) = \mu(t)\nu(t) \leq \mu(t)(c + \psi(t))$, therefore

\[
\dot{\psi}(t) - \mu(t)(c + \psi(t)) \leq 0 \tag{5.10}
\]

for almost all $t$. Now let us define:

\[
p(t) \triangleq (c + \psi(t))\, e^{-\int_{t_0}^{t} \mu(\tau)\, \mathrm{d}\tau} \tag{5.11}
\]

Then $p(t)$ is locally absolutely continuous and therefore differentiable almost everywhere, with

\[
\dot{p}(t) = \left[\dot{\psi}(t) - \mu(t)(c + \psi(t))\right] e^{-\int_{t_0}^{t} \mu(\tau)\, \mathrm{d}\tau} \tag{5.12}
\]

Obviously $\dot{p}(t) \leq 0$ and $p$ is non-increasing, so

\[
p(t) \leq p(t_0) \tag{5.13}
\]

where $p(t_0) = c$. Hence $\nu(t) \leq c + \psi(t) \leq c\, e^{\int_{t_0}^{t} \mu(\tau)\, \mathrm{d}\tau}$, which completes the proof.

    Using the Bellman-Gronwall lemma we can prove the following:


Theorem 13. Let $\phi_u(t; x)$ be the solution of (5.2), with $f$ satisfying the conditions of the Carathéodory theorem. Then $\phi_u(t; x)$ is continuous in $x$ in the following sense: for every $t \geq 0$ and $\varepsilon > 0$ there is a $\delta > 0$ (which depends on $t$ and $\varepsilon$) such that for every $x_1, x_2 \in \mathbb{R}^n$,

\[
\|x_1 - x_2\| < \delta \;\Rightarrow\; \|\phi_u(t; x_1) - \phi_u(t; x_2)\| < \varepsilon \tag{5.14}
\]

Proof. Using the integral form of the solution as in (5.3), the norm $\|\phi_u(t; x_1) - \phi_u(t; x_2)\|$ becomes:

\[
\|\phi_u(t; x_1) - \phi_u(t; x_2)\| = \left\| x_1 + \int_0^t f(\phi_u(\tau; x_1), u(\tau))\,\mathrm{d}\tau - x_2 - \int_0^t f(\phi_u(\tau; x_2), u(\tau))\,\mathrm{d}\tau \right\|
\]

and using the triangle inequality we proceed as:

\[
\leq \|x_1 - x_2\| + \int_0^t \left\| f(\phi_u(\tau; x_1), u(\tau)) - f(\phi_u(\tau; x_2), u(\tau)) \right\| \mathrm{d}\tau
\]

and since $f$ is Lipschitz in $x$ with some constant $L \geq 0$:

\[
\leq \|x_1 - x_2\| + L \int_0^t \left\| \phi_u(\tau; x_1) - \phi_u(\tau; x_2) \right\| \mathrm{d}\tau
\]

Therefore:

\[
\|\phi_u(t; x_1) - \phi_u(t; x_2)\| \leq \|x_1 - x_2\| + L \int_0^t \|\phi_u(\tau; x_1) - \phi_u(\tau; x_2)\|\, \mathrm{d}\tau \tag{5.15}
\]

Let $\nu(t) = \|\phi_u(t; x_1) - \phi_u(t; x_2)\|$. From (5.15) and by the Bellman-Gronwall lemma (with $\mu(t) = L$ and $c = \|x_1 - x_2\|$) we have:

\[
\|\phi_u(t; x_1) - \phi_u(t; x_2)\| \leq \|x_1 - x_2\|\, e^{\int_0^t L\, \mathrm{d}\tau} \tag{5.16}
\]
\[
= \|x_1 - x_2\|\, e^{Lt} \tag{5.17}
\]

For $\delta(\varepsilon, t) = \varepsilon e^{-Lt}$, the last equation gives $\|\phi_u(t; x_1) - \phi_u(t; x_2)\| < \varepsilon$ for all $\|x_1 - x_2\| < \delta$, which completes the proof.

Note: This proposition offers not only a very important result on the continuity of the trajectories of a system with respect to its initial conditions, but also an estimate of the norm $\|\phi_u(t; x_1) - \phi_u(t; x_2)\|$, that is:

\[
\|\phi_u(t; x_1) - \phi_u(t; x_2)\| \leq \|x_1 - x_2\|\, e^{Lt} \tag{5.18}
\]

We will now state and prove that the solutions of systems with similar dynamics are also similar. In particular, we show that two systems $\dot{x} = f(x, u)$ and $\dot{x} = g(x, u)$, under the same initial conditions and the same input, have trajectories that are close to each other as long as $f$ and $g$ are also close to each other. For that we need an upper bound on their trajectories which depends on the distance between $f$ and $g$. In order to define the distance between the two flow functions $f$ and $g$, for a given $u$ we introduce the norm:

\[
\|f - g\|_\infty \triangleq \sup_{x \in \mathbb{R}^n} \|f(x, u) - g(x, u)\| \tag{5.19}
\]


Proposition 14. Let $\phi_u(t; x_0)$ and $\psi_u(t; x_0)$ be the solutions of the systems $\dot{x} = f(x, u)$ and $\dot{x} = g(x, u)$ with initial conditions $x(0) = x_0$. Assume that $f$ is Lipschitz with constant $L$ and that:

\[
\|f - g\|_\infty \leq \delta \tag{5.20}
\]

Then

\[
\|\phi_u(t; x_0) - \psi_u(t; x_0)\| \leq \frac{\delta}{L}\left(e^{Lt} - 1\right) \tag{5.21}
\]

Proof. We have that:

\[
\|\phi_u(t; x_0) - \psi_u(t; x_0)\| \leq \int_0^t \left\| f(\phi_u(\tau; x_0), u(\tau)) - g(\psi_u(\tau; x_0), u(\tau)) \right\| \mathrm{d}\tau \tag{5.22}
\]

We now add and subtract $f(\psi_u(\tau; x_0), u(\tau))$ to get:

\[
(5.22) \leq \int_0^t \left( \left\|f(\phi_u(\tau; x_0), u(\tau)) - f(\psi_u(\tau; x_0), u(\tau))\right\| + \left\|f(\psi_u(\tau; x_0), u(\tau)) - g(\psi_u(\tau; x_0), u(\tau))\right\| \right) \mathrm{d}\tau \tag{5.23}
\]
\[
\leq \int_0^t \left( L \|\phi_u(\tau; x_0) - \psi_u(\tau; x_0)\| + \|f - g\|_\infty \right) \mathrm{d}\tau \tag{5.24}
\]
\[
\leq \int_0^t \left( L \|\phi_u(\tau; x_0) - \psi_u(\tau; x_0)\| + \delta \right) \mathrm{d}\tau \tag{5.25}
\]

Now, in order to apply the Bellman-Gronwall lemma, we perform the following manipulations:

\[
\|\phi_u(t; x_0) - \psi_u(t; x_0)\| \leq L \int_0^t \left( \|\phi_u(\tau; x_0) - \psi_u(\tau; x_0)\| + \frac{\delta}{L} \right) \mathrm{d}\tau \tag{5.26}
\]

Adding $\frac{\delta}{L}$ on both sides of the relation yields:

\[
\|\phi_u(t; x_0) - \psi_u(t; x_0)\| + \frac{\delta}{L} \leq \frac{\delta}{L} + L \int_0^t \left( \|\phi_u(\tau; x_0) - \psi_u(\tau; x_0)\| + \frac{\delta}{L} \right) \mathrm{d}\tau \tag{5.27}
\]

Applying the Bellman-Gronwall lemma with parameters $\nu(t) = \|\phi_u(t; x_0) - \psi_u(t; x_0)\| + \frac{\delta}{L}$, $c = \frac{\delta}{L}$ and $\mu(t) = L$ yields the result right away.

Remark: Note that $u$ was considered constant; otherwise $\delta$ is a function of $u$ and the inequality becomes:

\[
\|\phi_u(t; x_0) - \psi_u(t; x_0)\| \leq \frac{\delta(u)}{L}\left(e^{Lt} - 1\right) \tag{5.28}
\]

With this proposition we have proven the following continuity property of the solutions of a dynamical system, which is known as continuity with respect to the flow - the term flow is usually used to refer to the vector field $f$.

Corollary 15 (Continuity with respect to the flow). For every $t > 0$ and $\varepsilon > 0$ there is a $\delta > 0$ (which depends on $t$ and $\varepsilon$) so that:

\[
\|f - g\|_\infty < \delta \;\Rightarrow\; \|\phi_u(t; x_0) - \psi_u(t; x_0)\| < \varepsilon \tag{5.29}
\]


The previous proposition can be slightly modified to cater for time-varying systems.

Proposition 16. Let $\phi_u(t; x_0)$ and $\psi_u(t; x_0)$ be the solutions of the systems $\dot{x} = f(t, x, u)$ and $\dot{x} = g(t, x, u)$ with initial conditions $x(0) = x_0$. Assume that $f$ is Lipschitz with constant $L$ and that for some time interval $T$:

\[
\sup_{t \in T}\; \sup_{x \in \mathbb{R}^n} \|f(t, x, u) - g(t, x, u)\| \leq \delta \tag{5.30}
\]

Then for all $t \in T$:

\[
\|\phi_u(t; x_0) - \psi_u(t; x_0)\| \leq \frac{\delta}{L}\left(e^{Lt} - 1\right) \tag{5.31}
\]

Proof. It is left to the reader as an exercise to repeat the steps taken in the proof of proposition 14 and adapt them to this one. Note that, as before, $u$ is handled as a constant input signal.

    The last continuity property has to do with the input to the system.

Proposition 17 (Continuity with respect to the inputs). Let $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$ be a function that satisfies the conditions of the Carathéodory theorem and let $u(t)$ and $v(t)$ be two locally Lebesgue-integrable functions. Let $f$ be uniformly continuous with respect to its second argument and assume that for every $t \geq 0$:

\[
\|u(t) - v(t)\| \leq \delta \tag{5.32}
\]

Then there is an $M > 0$ such that:

\[
\|\phi_u(t; x_0) - \phi_v(t; x_0)\| \leq \frac{M}{L}\left(e^{Lt} - 1\right) \tag{5.33}
\]

Proof. The proof is based on proposition 14 and is left to the reader as an exercise. Hint: Define $F_u(x) = f(x, u)$ and $F_v(x) = f(x, v)$. Based on the uniform continuity property of $f$, there is some $M > 0$ such that:

\[
\|F_u(x) - F_v(x)\| < M \tag{5.34}
\]

The rest is left as an exercise.

One of the most important properties of $\phi_u(t; t_0, x_0)$ is the semigroup property, which is stated as follows:

Theorem 18 (Semigroup Property). Consider the system (5.2) and let $u$ be a given input signal so that the system admits a unique solution. Then

\[
\phi_u(t; t_0, x_0) = \phi_u(t - t_0; 0, x_0), \quad t \geq t_0 \tag{5.35}
\]

Proof. The function $y(t) = \phi_u(t; t_0, x_0)$ is a solution of (5.2), thus

\[
\dot{y}(t) = f(y(t), u(t)) \tag{5.36a}
\]
\[
y(t_0) = x_0 \tag{5.36b}
\]


Let $\psi(t) = \phi_u(t - t_0; 0, x_0)$. Then

\[
\dot{\psi}(t) = \dot{\phi}_u(t - t_0; 0, x_0) = f(\phi_u(t - t_0; 0, x_0), u(t)) = f(y(t), u(t)) = \dot{y}(t)
\]

Additionally,

\[
\psi(t_0) = \phi_u(0; 0, x_0) = x_0 = y(t_0)
\]

So, $\psi(t)$ is also a solution of (5.2) with the same initial value as $y(t)$, which, due to the uniqueness of solutions, implies:

\[
\psi(t) = y(t), \; t \geq t_0 \;\Leftrightarrow\; \phi_u(t; t_0, x_0) = \phi_u(t - t_0; 0, x_0), \; t \geq t_0
\]

which completes the proof.

Note: This result is particularly useful for the study of time-invariant systems, since one can see that the initial time is of no significance for the solution; hence we may always assume without loss of generality that $t_0 = 0$. For this reason one may omit $t_0$ and simply write $\phi_u(t; x_0)$ to refer to $\phi_u(t; t_0, x_0)$.
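For the zero-input case the semigroup property reduces to the flow-composition identity $e^{A(t-t_0)} = e^{A(t-t_1)} e^{A(t_1-t_0)}$, which is easy to check numerically; the matrix and times below are arbitrary illustrative values.

```python
import numpy as np
from scipy.linalg import expm

# Zero-input LTI flow: phi(t; t0, x0) = e^{A (t - t0)} x0.  The semigroup
# property says flowing from t0 to t directly equals flowing t0 -> t1 -> t.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # arbitrary stable example
x0 = np.array([1.0, -1.0])
t0, t1, t = 0.5, 1.2, 3.0
direct = expm(A * (t - t0)) @ x0
composed = expm(A * (t - t1)) @ (expm(A * (t1 - t0)) @ x0)
print(np.allclose(direct, composed))       # prints True
```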

Proposition 19. The solution of an LTI system $\Sigma = (A, B, C, D)$ with initial condition $x(0) = x_0$ is given by

\[ \phi_u(t; x_0) = e^{At} x_0 + \int_0^t e^{A(t - \tau)} B u(\tau) \, d\tau \quad (5.37) \]

Proof. The proof is based upon the observation that:

\[ \frac{d}{dt} \left( e^{-At} \phi(t) \right) = e^{-At} \dot{\phi}(t) - e^{-At} A \phi(t) \quad (5.38) \]

So, from the state equation of the LTI system we have:

\[ \dot{\phi}(t) = A \phi(t) + B u(t) \quad (5.39) \]
\[ \Rightarrow e^{-At} \dot{\phi}(t) - e^{-At} A \phi(t) = e^{-At} B u(t) \quad (5.40) \]
\[ \Rightarrow \frac{d}{dt} \left( e^{-At} \phi(t) \right) = e^{-At} B u(t) \quad (5.41) \]

Integrating the last equation from $0$ to $t$ gives:

\[ \int_0^t \frac{d}{d\tau} \left( e^{-A\tau} \phi(\tau) \right) d\tau = \int_0^t e^{-A\tau} B u(\tau) \, d\tau \quad (5.42) \]
\[ \Rightarrow e^{-At} \phi(t) - e^{-A \cdot 0} \phi(0) = \int_0^t e^{-A\tau} B u(\tau) \, d\tau \quad (5.43) \]
\[ \Rightarrow e^{-At} \phi(t) = \phi(0) + \int_0^t e^{-A\tau} B u(\tau) \, d\tau \quad (5.44) \]


And left-multiplying by $(e^{-At})^{-1} = e^{At}$ we have:

\[ \phi(t) = e^{At} \phi(0) + e^{At} \int_0^t e^{-A\tau} B u(\tau) \, d\tau \quad (5.45) \]
\[ \Rightarrow \phi(t) = e^{At} \phi(0) + \int_0^t e^{A(t - \tau)} B u(\tau) \, d\tau \quad (5.46) \]

Therefore

\[ \phi_u(t; x_0) = e^{At} x_0 + \int_0^t e^{A(t - \tau)} B u(\tau) \, d\tau \quad (5.47) \]

which is exactly (5.37).
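Formula (5.37) can be verified numerically. In the sketch below (all matrices and data are illustrative choices) the convolution integral is approximated by the midpoint rule for a constant input $u(t) = u_0$ and compared with the closed form $e^{At} x_0 + A^{-1}(e^{At} - I) B u_0$, which is valid when $A$ is nonsingular.

```python
import numpy as np
from scipy.linalg import expm

# Verify (5.37) for a constant input u(t) = u0 (illustrative data): the
# convolution integral is approximated by the midpoint rule and compared
# with the closed form e^{At} x0 + A^{-1}(e^{At} - I) B u0 (A nonsingular).
A = np.array([[-1.0, 2.0], [0.0, -3.0]])
B = np.array([[0.0], [1.0]])
x0 = np.array([1.0, 1.0])
u0 = np.array([2.0])
t, N = 1.5, 1000
h = t / N
taus = (np.arange(N) + 0.5) * h                       # midpoint nodes
integral = sum(expm(A * (t - tau)) @ B @ u0 for tau in taus) * h
phi = expm(A * t) @ x0 + integral
closed = expm(A * t) @ x0 + np.linalg.inv(A) @ (expm(A * t) - np.eye(2)) @ B @ u0
print(np.allclose(phi, closed, atol=1e-4))            # prints True
```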

Proposition 20 (Superposition Principle). The solution $\phi_u(t; 0)$ of an LTI system is linear with respect to $u$.

Proof. Let us pick two input functions $u_1$ and $u_2$ and two real numbers $\alpha_1, \alpha_2 \in \mathbb{R}$. Then:

\[ \phi_{\alpha_1 u_1 + \alpha_2 u_2}(t; 0) = \int_0^t e^{A(t - \tau)} B \left( \alpha_1 u_1(\tau) + \alpha_2 u_2(\tau) \right) d\tau \quad (5.48) \]
\[ = \alpha_1 \int_0^t e^{A(t - \tau)} B u_1(\tau) \, d\tau + \alpha_2 \int_0^t e^{A(t - \tau)} B u_2(\tau) \, d\tau \quad (5.49)\text{--}(5.50) \]
\[ = \alpha_1 \phi_{u_1}(t; 0) + \alpha_2 \phi_{u_2}(t; 0) \quad (5.51) \]

which completes the proof.
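For constant inputs the map $u \mapsto \phi_u(t; 0)$ collapses to a single matrix, so linearity can be exhibited directly; the matrices and coefficients below are illustrative.

```python
import numpy as np
from scipy.linalg import expm

# For constant inputs, phi_u(t; 0) = A^{-1}(e^{At} - I) B u, i.e. a matrix
# acting on u -- so linearity in u is explicit.  Data are illustrative.
A = np.array([[-1.0, 0.0], [1.0, -2.0]])
B = np.eye(2)
t = 2.0
Z = np.linalg.inv(A) @ (expm(A * t) - np.eye(2)) @ B   # u -> phi_u(t; 0)
u1 = np.array([1.0, -1.0])
u2 = np.array([0.5, 2.0])
a1, a2 = 2.0, -0.5
lhs = Z @ (a1 * u1 + a2 * u2)          # phi_{a1 u1 + a2 u2}(t; 0)
rhs = a1 * (Z @ u1) + a2 * (Z @ u2)    # a1 phi_{u1}(t;0) + a2 phi_{u2}(t;0)
print(np.allclose(lhs, rhs))           # prints True
```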

Note: Actually, it is the superposition principle that qualifies a system as a Linear Dynamical System. There are systems that are linear in this sense (i.e. satisfy the superposition principle) but cannot be realized in the standard LTI form we have been studying so far. A quite simple example is a system with a single input $u(t)$ and a single output $y(t)$ whose dynamics are given by the following equation:

\[ y(t) = \frac{d}{dt} u(t) \quad (5.52) \]

In principle, a linear system is represented in the form:

\[ y(t) = H(u)(t) \quad (5.53) \]

where $H : \mathcal{V} \to \mathcal{W}$ is a linear mapping from the input vector space $\mathcal{V}$ to the output vector space $\mathcal{W}$. So, for example, the linear system (5.52) can be written as:

\[ y(t) = D u(t) \quad (5.54) \]

where $D$ is the differentiation operator. A prime example of such linear systems is the ones defined through convolution. In particular, for $h \in L^2(\mathbb{R})$ we define the following dynamical system:

\[ y(t) = \int_{-\infty}^{\infty} h(\tau) u(t - \tau) \, d\tau \quad (5.55) \]

This system satisfies the superposition principle, so it qualifies as a Linear System, but again it does not admit a state space representation.


5.2 The state transition matrix

If the initial state of the system is the origin, that is $x_0 = 0$, then the solution becomes:

\[ \phi_u(t; 0) = \int_0^t e^{A(t - \tau)} B u(\tau) \, d\tau \quad (5.56) \]

So we can rewrite (5.37) as follows:

\[ \phi_u(t; x_0) = e^{At} x_0 + \phi_u(t; 0) \quad (5.57) \]

We give the following definition:

Definition 20. Given an LTI system $\Sigma = (A, B, C, D)$, the function

\[ \Phi(r, s) \triangleq e^{A(r - s)} \quad (5.58) \]

is called the State Transition Matrix of the system.

Using the definition of the State Transition Matrix in (5.58), equation (5.57) becomes:

\[ \phi_u(t; x_0) = \Phi(t, 0) x_0 + \phi_u(t; 0) \quad (5.59) \]

Applying the semigroup property, we have the following very important formula:

\[ \phi_u(t; t_0, x_0) = \Phi(t, t_0) x_0 + \phi_u(t; t_0, 0) \quad (5.60) \]

The right-hand side of the last equation consists of the term $\Phi(t, t_0) x_0$, which describes the evolution of the system's state without the action of any input, and the term $\phi_u(t; t_0, 0)$, which describes the state trajectory for $x_0 = 0$ under the input action $u(t)$. Indeed, if no input is applied to the system, the solution becomes:

\[ \phi(t; t_0, x_0) = \Phi(t, t_0) x_0 \quad (5.61) \]

Another important remark is that under zero input action ($u = 0$), if the system is at its equilibrium point $x = 0$ it will remain there, meaning that

\[ \phi_0(t; 0) = 0 \quad (5.62) \]

The basic properties of the state transition matrix are summarized in the following proposition:

Proposition 21. For all $r, s \in \mathbb{R}$:

1. $\Phi(r, r) = I$
2. $\Phi(r, 0) \Phi(s, 0) = \Phi(r + s, 0)$ and, in general, for all $t, v$ it is $\Phi(r, t) \Phi(s, v) = \Phi(r + s, t + v)$
3. $\Phi(-r, 0) = \Phi(r, 0)^{-1}$
4. If $A$ is diagonal then $\Phi(r, s)$ is also diagonal.
5. $\frac{\partial}{\partial r} \Phi(r, s) = A \Phi(r, s)$ and $\frac{\partial}{\partial s} \Phi(r, s) = -A \Phi(r, s)$
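These properties are easy to spot-check numerically; the matrix $A$ and the times $r, s$ below are arbitrary illustrative values, and the derivative in property 5 is approximated by a central difference.

```python
import numpy as np
from scipy.linalg import expm

# Spot-check of Proposition 21 for an arbitrary sample matrix A.
A = np.array([[0.0, 1.0], [-1.0, -0.5]])

def Phi(r, s):
    return expm(A * (r - s))   # state transition matrix (5.58)

r, s = 1.3, 0.4
assert np.allclose(Phi(r, r), np.eye(2))                   # property 1
assert np.allclose(Phi(r, 0) @ Phi(s, 0), Phi(r + s, 0))   # property 2
assert np.allclose(Phi(-r, 0), np.linalg.inv(Phi(r, 0)))   # property 3
h = 1e-6                                                   # property 5:
dPhi_dr = (Phi(r + h, s) - Phi(r - h, s)) / (2 * h)        # central difference
assert np.allclose(dPhi_dr, A @ Phi(r, s), atol=1e-6)
print("all properties verified")
```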


The first property simply implies that $\phi_u(t_0; t_0, x_0) = x_0$ regardless of $u$. Using the second property, the system's initial state can be retrieved from its state at some other time $t$. Left-multiplying both sides of (5.59) by $\Phi(-t, 0)$ we get:

\[ (5.59) \Rightarrow \Phi(-t, 0) \phi_u(t; x_0) = \Phi(-t, 0) \Phi(t, 0) x_0 + \Phi(-t, 0) \phi_u(t; 0) \quad (5.63) \]
\[ \Rightarrow x_0 = \Phi(-t, 0) \phi_u(t; x_0) - \Phi(-t, 0) \phi_u(t; 0) \quad (5.64) \]
\[ \Rightarrow x_0 = \Phi(t, 0)^{-1} \left( \phi_u(t; x_0) - \phi_u(t; 0) \right) \quad (5.65) \]

In particular, if no input is present, then $\phi_{u=0}(t; 0) = 0$ and the last equation becomes:

\[ x_0 = \Phi(t, 0)^{-1} \phi_u(t; x_0) \quad (5.66) \]

The fourth property is due to the fact that if $A$ is diagonal, in the form $A = \mathrm{diag}(a_1, a_2, \ldots, a_n)$, then $e^A = \mathrm{diag}(e^{a_1}, e^{a_2}, \ldots, e^{a_n})$.

5.3 Responses of LTI systems

The dynamic behaviour of LTI systems is better understood by studying their responses to impulses and steps.

5.3.1 The Impulse Response

Assume that the input to an LTI system is given by:

\[ u(t) = F \delta(t) = \begin{bmatrix} F_1 \\ \vdots \\ F_{m-1} \\ F_m \end{bmatrix} \delta(t) \quad (5.67) \]

where $\delta(t)$ is the Dirac functional, having the property

\[ \lim_{\varepsilon \to 0^+} \int_{-\varepsilon}^{\varepsilon} \delta(\tau) \, d\tau = 1 \quad (5.68) \]

while

\[ \delta(t) = 0, \quad t \neq 0 \quad (5.69) \]

Readers familiar with measure theory will notice that this definition of $\delta(t)$ is contradictory, since every almost-everywhere zero function has zero integral over its domain. The Dirac functional is a sort of convention; it is a symbol that facilitates certain calculus rather than a real function or functional (at least the way we use it here). A good reference for a rigorous definition and some basic facts on the Dirac functional is provided by Rudin in [Rud73, chap. 6]. In fact, the following property is used to define $\delta(t)$:

\[ \forall \varepsilon > 0 : \int_{-\varepsilon}^{\varepsilon} f(t) \delta(t) \, dt = f(0) \quad (5.70) \]

According to (5.37) we have:

\[ \phi_u(t; x_0) = e^{At} x_0 + \int_0^t e^{A(t - \tau)} B F \delta(\tau) \, d\tau \quad (5.71) \]
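The integral in (5.71) is evaluated in what follows; assuming the standard result that the impulse term contributes $e^{At} B F$ for $t > 0$, this limit can be checked by replacing $\delta$ with a narrow rectangular pulse. All numerical values below are illustrative.

```python
import numpy as np
from scipy.linalg import expm

# Approximate delta(t) by a pulse of width w and height 1/w: apply the
# constant input F/w on [0, w], then zero input up to time t, and compare
# with the assumed impulse-response limit e^{At}(x0 + B F).
A = np.array([[-1.0, 1.0], [0.0, -2.0]])
B = np.array([[1.0], [1.0]])
F = np.array([3.0])
x0 = np.array([1.0, 0.0])
t, w = 2.0, 1e-5
# state at the end of the pulse (closed-form step response, A nonsingular)
x_w = expm(A * w) @ x0 + np.linalg.inv(A) @ (expm(A * w) - np.eye(2)) @ B @ (F / w)
x_t = expm(A * (t - w)) @ x_w          # free evolution after the pulse
limit = expm(A * t) @ (x0 + B @ F)     # impulse response at time t
print(np.allclose(x_t, limit, atol=1e-4))   # prints True
```

Note how the impulse effectively resets the initial condition from $x_0$ to $x_0 + BF$.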


So now we have to calculate the integral $\int_0^t e^{-A\tau} \, d\tau$. In case $A$ is nonsingular, using (5.77) one has that:

\[ \phi_u(t; x_0) = e^{At} x_0 + e^{At} A^{-1} \left( I - e^{-At} \right) B F \quad (5.86) \]
\[ = e^{At} x_0 + A^{-1} \left( e^{At} - I \right) B F \quad (5.87) \]

otherwise, using (5.81),

\[ \phi_u(t; x_0) = e^{At} x_0 + e^{At} \sum_{i=0}^{\infty} \frac{(-A)^i t^{i+1}}{(i+1)!} B F \quad (5.88) \]


    Chapter 6

    Controllability

The controllability of a system is related to our ability to steer its state from any initial state $x_1 \in \mathbb{R}^n$ to any final state $x_2 \in \mathbb{R}^n$ in finite time. The ability to study the controllability of a state space system is absent in the Laplace transform analysis.

    6.1 Controllability of LTI systems

We give the definition of controllability for a state space system:

Definition 21. A system with state equation $\dot{x} = f(t, x(t), u(t))$ is said to be controllable if for every $(t_0, x_0) \in \mathbb{R}_+ \times \mathbb{R}^n$ and $x_1 \in \mathbb{R}^n$ there is a $t_1 \geq t_0$ and a function $u : [t_0, t_1] \to \mathbb{R}^m$ such that $\phi_u(t_1; t_0, x_0) = x_1$.

If a system is time-invariant, i.e. the function $f$ in its state equation is independent of time, then every solution $\phi_u(t; t_0, x_0)$ is actually independent of the initial time $t_0$. Therefore, without loss of generality we may assume that $t_0 = 0$ and simply denote $\phi_u(t; t_0, x_0) \equiv \phi_u(t; x_0)$, with the property $\phi_u(0; x_0) = x_0$. The definition of controllability is then restated more simply as follows:

Definition 22. A time-invariant system with state equation $\dot{x} = f(x(t), u(t))$ is said to be controllable if for every $x_0 \in \mathbb{R}^n$ and $x_1 \in \mathbb{R}^n$ there is a $T \geq 0$ and a function $u : [0, T] \to \mathbb{R}^m$ such that $\phi_u(T; x_0) = x_1$.

In general, for a nonlinear system the controllability study is far from trivial and usually only local results can be derived. For LTI systems, however, things are much simpler. For example, an LTI system is controllable if and only if we can steer its state from the origin to any state, an observation that simplifies our analysis. Note that such a property does not hold for nonlinear systems.

Proposition 22. For an LTI system, if for all $x \in \mathbb{R}^n$ there is a $T \geq 0$ and a $u : [0, T] \to \mathbb{R}^m$ such that $\phi_u(T; 0) = x$, then and only then the system is controllable.

Proof. If the system is controllable, the stated property obviously holds. For the converse, recall that the solution of an LTI system $\phi_u(t; x_0)$ is given by:

\[ \phi_u(t; x_0) = e^{At} x_0 + \int_0^t e^{A(t - s)} B u(s) \, ds \quad (6.1) \]


and for $x_0 = 0$ we have

\[ \phi_u(t; 0) = \int_0^t e^{A(t - s)} B u(s) \, ds \quad (6.2) \]

Combining (6.1) and (6.2) we have

\[ \phi_u(t; x_0) = e^{At} x_0 + \phi_u(t; 0) \quad (6.3) \]

Let $x \in \mathbb{R}^n$. We choose a $T \geq 0$ and a $u : [0, T] \to \mathbb{R}^m$ so that

\[ \phi_u(T; 0) = x - e^{AT} x_0 \quad (6.4) \]

Then

\[ \phi_u(T; x_0) = e^{AT} x_0 + x - e^{AT} x_0 = x \quad (6.5) \]

which completes the proof.

Another property, weaker than controllability, is known as reachability of the origin. A system is said to have reachable origin if, starting from any state, we can steer it to the origin in finite time. Formally, we give the following definition:

Definition 23. A system with state equation $\dot{x} = f(t, x(t), u(t))$ is said to have reachable origin if for every $(t_0, x_0) \in \mathbb{R}_+ \times \mathbb{R}^n$ there is a $t_1 \geq t_0$ and a function $u : [t_0, t_1] \to \mathbb{R}^m$ such that $\phi_u(t_1; t_0, x_0) = 0$.

This definition is simplified for time-invariant systems:

Definition 24. A time-invariant system with state equation $\dot{x} = f(x(t), u(t))$ is said to have reachable origin if for every $x_0 \in \mathbb{R}^n$ there is a $T \geq 0$ and a function $u : [0, T] \to \mathbb{R}^m$ such that $\phi_u(T; x_0) = 0$.

Any controllable system has reachable origin; however, the converse is not in general true. For LTI systems we will show that controllability is identical to the reachable-origin property. Therefore an LTI system is controllable if we can steer its state from any initial state to the origin.

Proposition 23. An LTI system is controllable if and only if its origin is reachable.

Proof. We assume that an LTI system has reachable origin and will show that it is controllable. Let $x_1 \in \mathbb{R}^n$. Then there is a $t_1 \geq 0$ and a $u : [0, t_1] \to \mathbb{R}^m$ so that:

\[ \phi_u(t_1; x_1) = 0 \quad (6.6) \]
\[ \Rightarrow \Phi(t_1, 0) x_1 + \phi_u(t_1; 0) = 0 \quad (6.7) \]
\[ \Rightarrow \Phi(0, t_1) \Phi(t_1, 0) x_1 + \Phi(0, t_1) \phi_u(t_1; 0) = 0 \quad (6.8) \]
\[ \Rightarrow x_1 + \Phi(0, t_1) \phi_u(t_1; 0) = 0 \quad (6.9) \]
\[ \Rightarrow x_1 = -\Phi(0, t_1) \phi_u(t_1; 0) \quad (6.10) \]
\[ \Rightarrow x_1 = -\int_0^{t_1} e^{-As} B u(s) \, ds \quad (6.11) \]


Now we left-multiply both sides of (6.11) by the matrix $\bar{A} = -e^{At_1}$, so we get

\[ -e^{At_1} x_1 = \int_0^{t_1} e^{A(t_1 - s)} B u(s) \, ds \quad (6.12) \]
\[ \Rightarrow \bar{A} x_1 = \phi_u(t_1; 0) \quad (6.13) \]

So, starting from the initial state $0$, it is possible to reach all states $z = \bar{A} x$ with $x \in \mathbb{R}^n$. Since the matrix $\bar{A}$ is invertible, any state $y \in \mathbb{R}^n$ can be written as $y = \bar{A} (\bar{A}^{-1} y)$; that is, setting $x_1 = \bar{A}^{-1} y$ we have $y = \bar{A} x_1$. So equation (6.13) reveals that there exists a proper $u : [0, t_1] \to \mathbb{R}^m$ so that the state of the system becomes $y$ in finite time, and this holds for any $y \in \mathbb{R}^n$, which, according to Proposition 22, implies that the system is controllable.

One intuitively understands at this point that $0$ in the previous two propositions can be replaced by any other point in $\mathbb{R}^n$. Indeed, this is stated in the following proposition, whose proof is left to the reader as an exercise.

Proposition 24. Let $w \in \mathbb{R}^n$. The following are equivalent:

1. The LTI system $\dot{x} = Ax + Bu$ is controllable
2. For every $x_1 \in \mathbb{R}^n$ there is a $t_1 \geq 0$ and a $u : [0, t_1] \to \mathbb{R}^m$ so that $\phi_u(t_1; x_1) = w$
3. For every $x_1 \in \mathbb{R}^n$ there is a $t_1 \geq 0$ and a $u : [0, t_1] \to \mathbb{R}^m$ so that $\phi_u(t_1; w) = x_1$

These propositions have given us an insight regarding the controllability of LTI systems, but the question of how to know whether a given LTI system is controllable is still to be answered. The answer is given in the following theorem, which makes use of the controllability matrix of the LTI system (see Definition 18).

Theorem 25. An LTI system is controllable if and only if its controllability matrix is full rank.

Proof. According to Proposition 22, an LTI system is controllable if it can reach any state in $\mathbb{R}^n$ in finite time starting from $0$ using some input $u$, that is, if

\[ \phi_u(t_1; 0) = \int_0^{t_1} e^{A(t_1 - s)} B u(s) \, ds \quad (6.14) \]

spans $\mathbb{R}^n$ (can be equal to any $n$-dimensional vector). For this to be true, the following condition should hold [the only vector that is normal to all other vectors in $\mathbb{R}^n$ is the zero vector]:

\[ y^\top \int_0^{t_1} e^{A(t_1 - s)} B u(s) \, ds = 0 \ \text{for every input } u \implies y = 0 \quad (6.15) \]

where $y \in \mathbb{R}^n$. Assume that $y^\top \int_0^{t_1} e^{A(t_1 - s)} B u(s) \, ds = 0$ for every input $u$. Then for all $s \in [0, t_1]$ it holds that $y^\top e^{A(t_1 - s)} B = 0$. Differentiating the last equation at $s = t_1$ we have:

\[ \frac{d}{ds} \left( y^\top e^{A(t_1 - s)} B \right) \Big|_{s = t_1} = 0 \iff y^\top A B = 0 \quad (6.16) \]


and, taking higher order derivatives on both sides of $y^\top e^{A(t_1 - s)} B = 0$, we have:

\[ (-1)^i y^\top A^i B = 0 \quad \text{for } i = 0, 1, \ldots \quad (6.17) \]

According to the Cayley–Hamilton theorem, it suffices to keep only the first $n$ terms; therefore (6.17) becomes

\[ y^\top [B \ \ AB \ \ A^2B \ \ \cdots \ \ A^{n-1}B] = 0 \iff y^\top \mathcal{C}(A, B) = 0 \quad (6.18) \]

In order to have $y = 0$ for all $y$ that satisfy (6.18), the controllability matrix has to be full rank.
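In practice the rank test of Theorem 25 is applied numerically. The helper and the two example pairs below are illustrative (not taken from the text); the first pair is controllable, the second is not.

```python
import numpy as np

# Rank test of Theorem 25.  The helper builds C(A,B) = [B AB ... A^{n-1}B].
def ctrb(A, B):
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])

A1 = np.array([[0.0, 1.0], [-2.0, -3.0]])
B1 = np.array([[0.0], [1.0]])
A2 = np.array([[-1.0, 0.0], [0.0, 0.0]])
B2 = np.array([[0.0], [1.0]])
print(np.linalg.matrix_rank(ctrb(A1, B1)))   # prints 2 -> controllable
print(np.linalg.matrix_rank(ctrb(A2, B2)))   # prints 1 -> not controllable
```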

Remark: A system is controllable if it is possible for an external system (e.g. a controller) to steer its state anywhere from any initial state in finite time. This might be considered too demanding. For example, in practice we might need only some of the state variables to be controllable while all other state variables behave in an asymptotically stable manner. In the sequel we give an example of a non-controllable system which is controlled by a simple linear feedback controller.

Example 12. The following system

\[ \dot{x}(t) = \begin{bmatrix} -1 & 0 \\ 0 & 0 \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u(t) \quad (6.19) \]
\[ y(t) = x(t) \quad (6.20) \]

is not controllable. Indeed, one can verify that its controllability matrix is:

\[ \mathcal{C}(A, B) = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \quad (6.21) \]

which obviously has rank 1. Let us use the following linear feedback:

\[ u(t) = -\begin{bmatrix} 0 & 1 \end{bmatrix} y(t) \quad (6.22) \]

Then the closed-loop system becomes:

\[ \dot{x}(t) = \begin{bmatrix} -1 & 0 \\ 0 & 0 \end{bmatrix} x(t) - \begin{bmatrix} 0 \\ 1 \end{bmatrix} \begin{bmatrix} 0 & 1 \end{bmatrix} y(t) \quad (6.23) \]
\[ = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} x(t) \quad (6.24) \]

The solution of this system is then:

\[ \phi_u(t; x_0) = \begin{bmatrix} e^{-t} & 0 \\ 0 & e^{-t} \end{bmatrix} x_0 \quad (6.25) \]

Notice now that for every $x_0 \in \mathbb{R}^2$ we have $\lim_{t \to \infty} \phi_u(t; x_0) = 0$, i.e. the trajectory of the closed-loop system will approach the origin asymptotically as $t \to \infty$. Although the system (6.19) does not have reachable origin, we can stabilize it asymptotically using a simple linear feedback controller. Note that it is not possible for the trajectory to reach $0$ in finite time (not with this, nor with any other feedback controller), otherwise the system would be controllable.
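The closed-loop computation of Example 12 can be reproduced in a few lines; the eigenvalue check below confirms the asymptotic stability argued above.

```python
import numpy as np

# Closed-loop check for Example 12: the pair (A, B) is uncontrollable,
# yet u = -[0 1] y places both closed-loop eigenvalues at -1.
A = np.array([[-1.0, 0.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
K = np.array([[0.0, 1.0]])
Acl = A - B @ K                 # x' = (A - BK) x, as in (6.24)
eigs = np.linalg.eigvals(Acl)
print(np.allclose(sorted(eigs.real), [-1.0, -1.0]))   # prints True
```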

Note: A notion close to controllability, but slightly weaker, is that of stabilizability. A system is said to be stabilizable if all its uncontrollable states are stable; hence, even if they cannot be controlled, all of them will remain bounded while the controllable states can be moved to the desired position.


6.2 Controllability and Equivalence

The property of controllability is transferred by coordinate transformations (or, otherwise put, controllability does not depend on the choice of coordinates for the system). The following is a very useful result:

Proposition 26. Let $\Sigma_1$ and $\Sigma_2$ be two equivalent LTI systems. Then $\Sigma_1$ is controllable if and only if $\Sigma_2$ is controllable.

Proof. Let $\Sigma_1 = (A, B)$ and $\Sigma_2 = (\bar{A}, \bar{B})$ with $\Sigma_2 \sim_T \Sigma_1$; that is, there is an invertible matrix $T \in M_n(\mathbb{R})$ so that $\bar{A} = TAT^{-1}$ and $\bar{B} = TB$. Let us assume that $\Sigma_1$ is controllable, so its controllability matrix

\[ \mathcal{C}(A, B) = [B \ \ AB \ \ \cdots \ \ A^{n-1}B] \quad (6.26) \]

is full rank. The controllability matrix of $\Sigma_2$ is then:

\[ \mathcal{C}(\bar{A}, \bar{B}) = [\bar{B} \ \ \bar{A}\bar{B} \ \ \cdots \ \ \bar{A}^{n-1}\bar{B}] \quad (6.27) \]
\[ = [TB \ \ TAT^{-1}TB \ \ \cdots \ \ TA^{n-1}T^{-1}TB] \quad (6.28) \]
\[ = [TB \ \ TAB \ \ TA^2B \ \ \cdots \ \ TA^{n-1}B] \quad (6.29) \]
\[ = T \, \mathcal{C}(A, B) \quad (6.30) \]

and since $T$ is invertible and $\mathcal{C}(A, B)$ is full rank, $\mathcal{C}(\bar{A}, \bar{B})$ is also full rank and $\Sigma_2$ is controllable.
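The identity $\mathcal{C}(\bar{A}, \bar{B}) = T\,\mathcal{C}(A, B)$ from the proof is easy to confirm numerically; the transformation $T$ and the pair $(A, B)$ below are arbitrary illustrative choices.

```python
import numpy as np

# Check C(Abar, Bbar) = T C(A, B) from the proof of Proposition 26.
def ctrb(A, B):
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
T = np.array([[1.0, 1.0], [0.0, 2.0]])       # arbitrary invertible T
Abar = T @ A @ np.linalg.inv(T)
Bbar = T @ B
print(np.allclose(ctrb(Abar, Bbar), T @ ctrb(A, B)))   # prints True
print(np.linalg.matrix_rank(ctrb(Abar, Bbar)))          # prints 2
```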

It is also easy to show that:

Proposition 27. If the matrices $A$ and $\bar{A}$ are similar and the pairs $(A, B)$ and $(\bar{A}, \bar{B})$ are controllable, then the systems $\dot{x} = Ax + Bu$ and $\dot{z} = \bar{A}z + \bar{B}u$ are equivalent.

Proof. The proof is left to the reader as an exercise.

It is quite straightforward to prove the following proposition, which provides a useful equivalent for controllable LTI systems: their Canonical Controllable Form.

Proposition 28. Every controllable LTI system is equivalent to its Canonical Controllable Form (see Section 4.4).

Proof. The proof is straightforward from (4.147): let $\Sigma$ be a given controllable LTI system. According to Theorem 25, its controllability matrix is invertible; therefore, $t_1$ in (4.147) can be calculated. Notice that, according to Proposition 26, and since $\Sigma$ is equivalent to its CCF, which is controllable, it is itself controllable.

6.3 Advanced topics

Before we proceed to the next result, we should recall the following property from Linear Algebra. Let $K, L \in M_n(\mathbb{R})$ with $K$ full rank. Then:

\[ \mathrm{rank}(KL) = \mathrm{rank}(L) \quad (6.31) \]

Also, for any matrix $H \in M_{l \times k}(\mathbb{R})$ written as:

\[ H = [h_1 \ \ h_2 \ \ \cdots \ \ h_k] \quad (6.32) \]

where $h_i \in \mathbb{R}^l$, it holds for all $i = 1, 2, \ldots, k$ that

\[ \mathrm{rank}\, H = \mathrm{rank}[h_1 \ \ h_2 \ \ \cdots \ \ h_k \,|\, h_i] \quad (6.33) \]

That is, the rank of a matrix does not change if we augment it by a column-vector that is already in it. The same holds for any linear combination of the columns of $H$:

\[ \mathrm{rank}\, H = \mathrm{rank}\Big[h_1 \ \ h_2 \ \ \cdots \ \ h_k \,\Big|\, \sum_{i=1}^k \alpha_i h_i\Big] \quad (6.34) \]

Proposition 29. Let $A \in M_n(\mathbb{R})$, $B \in M_{n \times m}(\mathbb{R})$ and $C \in M_{p \times n}(\mathbb{R})$. Assume that the pair $(A, B)$ is controllable and that the matrix

\[ J = \begin{bmatrix} A & B \\ C & 0 \end{bmatrix} \quad (6.35) \]

is full rank ($n + p$). Then the system

\[ \dot{z} = Jz + \begin{bmatrix} B \\ 0 \end{bmatrix} v \quad (6.36) \]

is controllable.

Proof. First of all, let us define

\[ \Gamma \triangleq \begin{bmatrix} B \\ 0 \end{bmatrix} \quad (6.37) \]

Since $J$ is full rank,

\[ \mathrm{rank}\{\mathcal{C}(J, \Gamma)\} = \mathrm{rank}\{J \, \mathcal{C}(J, \Gamma)\} \quad (6.38) \]

The matrix $\Gamma$ is already a block in $\mathcal{C}(J, \Gamma)$; appending it once more will not increase the rank, in the sense that:

\[ \mathrm{rank}\{\mathcal{C}(J, \Gamma)\} = \mathrm{rank}\{J \, \mathcal{C}(J, \Gamma)\} = \mathrm{rank}\{[J \, \mathcal{C}(J, \Gamma) \ \ \Gamma]\} \quad (6.39) \]

We now recall that

\[ \mathcal{C}(J, \Gamma) = [\Gamma \ \ J\Gamma \ \ \cdots \ \ J^{n+p-1}\Gamma] \quad (6.40) \]

Therefore

\[ \mathrm{rank}\{[J \, \mathcal{C}(J, \Gamma) \ \ \Gamma]\} = \mathrm{rank}[J\Gamma \ \ J^2\Gamma \ \ \cdots \ \ J^{n+p}\Gamma \ \ \Gamma] \quad (6.41) \]


Using (6.35) and (6.37), substituting them into (6.41), and reducing redundant column blocks by means of the column operations that, according to (6.33)–(6.34), do not affect the rank, we have

\[ \mathrm{rank}\{\mathcal{C}(J, \Gamma)\} = \mathrm{rank} \begin{bmatrix} AB & A^2B & \cdots & A^nB & B \\ CB & CAB & \cdots & CA^{n-1}B & 0 \end{bmatrix} \]
\[ = \mathrm{rank} \begin{bmatrix} A\,\mathcal{C}(A, B) & B \\ C\,\mathcal{C}(A, B) & 0 \end{bmatrix} \]
\[ = \mathrm{rank} \left( \begin{bmatrix} A & B \\ C & 0 \end{bmatrix} \begin{bmatrix} \mathcal{C}(A, B) & 0 \\ 0 & I_p \end{bmatrix} \right) \]
\[ = \mathrm{rank} \left( J \begin{bmatrix} \mathcal{C}(A, B) & 0 \\ 0 & I_p \end{bmatrix} \right) \]
\[ = \mathrm{rank} \begin{bmatrix} \mathcal{C}(A, B) & 0 \\ 0 & I_p \end{bmatrix} \]
\[ = n + p \]

which corroborates our initial claim and completes the proof.
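A concrete numerical instance of Proposition 29 (the matrices below are illustrative choices): $(A, B)$ is controllable, $J$ is full rank $n + p = 3$, and the augmented pair turns out controllable as claimed.

```python
import numpy as np

# Numerical instance of Proposition 29 with n = 2, m = p = 1.
def ctrb(A, B):
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])                       # p = 1
J = np.block([[A, B], [C, np.zeros((1, 1))]])
Gamma = np.vstack([B, np.zeros((1, 1))])         # [B; 0]
print(np.linalg.matrix_rank(ctrb(A, B)))         # prints 2: (A, B) controllable
print(np.linalg.matrix_rank(J))                  # prints 3: J full rank n + p
print(np.linalg.matrix_rank(ctrb(J, Gamma)))     # prints 3: (J, Gamma) controllable
```

This kind of augmented system arises, for instance, when integral action is appended to a state-space plant.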

The controllability matrix has very interesting algebraic properties. Let us recall from Linear Algebra that every matrix defines a linear subspace through its image. Let us briefly give a couple of definitions:

Definition 25 (Image). Let $K \in M_n(\mathbb{R})$. The image (or column-space) of the matrix is the following set:

\[ \mathrm{Im}\{K\} \equiv \mathcal{R}(K) = \{ y \in \mathbb{R}^n \mid \exists x \in \mathbb{R}^n : y = Kx \} \subseteq \mathbb{R}^n \quad (6.42) \]

It is easy to verify that for every $K \in M_n(\mathbb{R})$ its image is a linear subspace of $\mathbb{R}^n$. The rank of a matrix is actually the dimension of its image, and if a matrix is full rank then its image is $\mathbb{R}^n$. We now need to introduce the notion of complementary linear spaces:

Definition 26 (Complementary Spaces). Let $\mathcal{V}$ be a linear subspace of $\mathbb{R}^n$ (we denote $\mathcal{V} \leq \mathbb{R}^n$). There is a linear space $\mathcal{W} \leq \mathbb{R}^n$ with the following properties:

1. Their only common vector is the zero vector, i.e. $\mathcal{V} \cap \mathcal{W} = \{0\}$.
2. For every $x \in \mathbb{R}^n$ there is a $v \in \mathcal{V}$ and a $w \in \mathcal{W}$ such that $x = v + w$, i.e. $\mathcal{V}$ and $\mathcal{W}$ together span $\mathbb{R}^n$; we denote this by $\mathcal{V} + \mathcal{W} = \mathbb{R}^n$.
3. For every $v \in \mathcal{V}$ and $w \in \mathcal{W}$ it holds that $v^\top w = w^\top v = 0$, i.e. $v \perp w$.

Then $\mathcal{W}$ is called the complementary space with respect to $\mathcal{V}$ and we denote $\mathcal{W} = \mathcal{V}^\perp$.

Note 1: We leave it to the reader as an exercise to verify that there is such a subspace (the definition is well posed) and that every $x \in \mathbb{R}^n$ is written in a unique way as $x = v + w$ for $v \in \mathcal{V}$ and $w \in \mathcal{W}$.

Note 2: As a result of the rank–nullity theorem, if $\mathcal{V} \leq \mathbb{R}^n$ then:

\[ \dim \mathcal{V} + \dim \mathcal{V}^\perp = n \quad (6.43) \]


Example 13 (Complementary Space). Consider a subspace $\mathcal{V}$ of $\mathbb{R}^3$ spanned by the following vectors:

\[ x_1 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \quad \text{and} \quad x_2 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \quad (6.44) \]

i.e. $\mathcal{V} = \mathrm{span}\{x_1, x_2\}$ or, what is the same, $\mathcal{V}$ is the image of the matrix $T = \begin{bmatrix} x_1 & x_2 \end{bmatrix}$. From the rank–nullity theorem we know that $\dim \mathcal{V}^\perp = 1$. Hence we are looking for a vector $y$ such that $y \perp x_1$ and $y \perp x_2$, i.e. $x_1^\top y = 0$ and $x_2^\top y = 0$. This can be written in matrix notation as follows:

\[ T^\top y = 0 \iff y \in \ker\{T^\top\} \quad (6.45) \]

This way we find:

\[ y = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} \quad (6.46) \]

We now give the definition of an $A$-invariant linear subspace:

Definition 27 (Invariant Subspace). Let $A \in M_n(\mathbb{R})$ and $\mathcal{V} \leq \mathbb{R}^n$. $\mathcal{V}$ is said to be $A$-invariant if

\[ v \in \mathcal{V} \implies Av \in \mathcal{V} \quad (6.47) \]

For the study of controllability of linear systems, the linear space induced by the controllability matrix plays an important role. It defines the controllability space as follows:

Definition 28 (Controllability Space). The image of the controllability matrix of an LTI system is called the Controllability Space of the system.
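Both the controllability space and its orthogonal complement (in the sense of Example 13, i.e. as the kernel of the transpose) can be computed numerically; the uncontrollable pair below, taken in the spirit of Example 12, is illustrative.

```python
import numpy as np
from scipy.linalg import null_space, orth

# Controllability space Im C(A,B) of an uncontrollable pair and its
# orthogonal complement ker(C(A,B)^T).
A = np.array([[-1.0, 0.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Cm = np.hstack([B, A @ B])                 # C(A, B) = [B  AB]
basis = orth(Cm)                           # orthonormal basis of Im C(A,B)
comp = null_space(Cm.T)                    # basis of the complement
print(basis.shape[1], comp.shape[1])       # prints: 1 1 (dimensions sum to n = 2)
```

The two dimensions summing to $n$ is exactly relation (6.43) applied to the controllability space.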