
Kalman Filter and Parameter Identification

Florian Herzog

2013


Continuous-time Kalman Filter

In this chapter, we shall use stochastic processes with independent increments w_1(·) and w_2(·) at the input and the output, respectively, of a dynamic system. We shall switch back and forth between the mathematically precise description of these (normalized) Brownian motions by their increments and the sloppy description by the corresponding white noises v(·) and r(·), respectively, which is preferred in engineering circles:

Q^{1/2}(t)\,dw_1(t) = [v(t) - u(t)]\,dt    (1)
R^{1/2}(t)\,dw_2(t) = [r(t) - \bar{r}(t)]\,dt,    (2)

where Q^{1/2}(t) and R^{1/2}(t) are the volatility parameters.

We assume a linear dynamical system, where the inputs u(t) and the measurements y(t) are disturbed by noise.


Consider the following linear time-varying dynamic system of order n which is driven by the m-vector-valued white noise v(·). Its initial state x(t_0) is a random vector ξ and its p-vector-valued output y(·) is corrupted by the additive white noise r(·).

System description in the mathematically precise form:

dx(t) = A(t)x(t)\,dt + B(t)\,dv(t)
      = [A(t)x(t) + B(t)u(t)]\,dt + B(t)Q^{1/2}(t)\,dw_1(t)    (3)
x(t_0) = ξ    (4)
y(t)\,dt = C(t)x(t)\,dt + dr(t)
         = [C(t)x(t) + \bar{r}(t)]\,dt + R^{1/2}(t)\,dw_2(t)    (5)


System description in the engineer’s form:

\dot{x}(t) = A(t)x(t) + B(t)v(t)    (7)
x(t_0) = ξ    (8)
y(t) = C(t)x(t) + r(t),    (9)

where

ξ : N(x_0, Σ_0)    (10)
v(t) : N(u(t), Q(t))    (11)
r(t) : N(r̄(t), R(t)).    (12)

Note that in the last two lines, we have deleted the factor 1/dt which ought to multiply Q(t) and R(t).
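As a minimal illustration of this system description, the following sketch simulates a scalar instance of (7)-(9) with an Euler-Maruyama discretization of the white-noise model; the numerical values of A, B, C, Q, R, the input u, and the step size are hypothetical choices, not taken from the text.

    import numpy as np

    # Euler-Maruyama simulation of the engineer's form (7)-(9):
    #   x_dot = A x + B v,   y = C x + r,
    # with v ~ N(u, Q/dt) and r ~ N(r_bar, R/dt) as white-noise approximations.
    rng = np.random.default_rng(0)

    A, B, C = -1.0, 1.0, 1.0          # hypothetical scalar system
    Q, R = 0.1, 0.05                  # noise intensities
    u, r_bar = 0.5, 0.0               # known noise means (u is the control input)
    dt, N = 1e-3, 5000

    x = 0.0
    xs, ys = [], []
    for _ in range(N):
        v = u + np.sqrt(Q / dt) * rng.standard_normal()      # input white noise
        r = r_bar + np.sqrt(R / dt) * rng.standard_normal()  # measurement white noise
        x = x + (A * x + B * v) * dt
        xs.append(x)
        ys.append(C * x + r)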


The filtering problem is stated as follows: find the filter with the state vector x̂(t) which is optimal in the following sense:

• The state estimate is bias-free: E{x(t) − x̂(t)} ≡ 0.

• The error of the state estimate has infimal covariance matrix: Σ_opt(t) ≤ Σ_subopt(t).


With the above-mentioned assumptions that ξ, v, and r are mutually independent and Gaussian, the optimal filter is the following linear dynamic system:

\dot{\hat{x}}(t) = A(t)\hat{x}(t) + B(t)u(t) + H(t)[y(t) - \bar{r}(t) - C(t)\hat{x}(t)]
                 = [A(t) - H(t)C(t)]\hat{x}(t) + B(t)u(t) + H(t)[y(t) - \bar{r}(t)]    (13)
\hat{x}(t_0) = x_0    (14)

with

H(t) = \Sigma(t)C^T(t)R^{-1}(t),    (15)

where Σ(t) is the covariance matrix of the error x(t) − x̂(t) of the state estimate x̂(t).


Σ(t) is the covariance matrix of the error x(t) − x̂(t) of the state estimate x̂(t), satisfying the following matrix Riccati differential equation:

\dot{\Sigma}(t) = A(t)\Sigma(t) + \Sigma(t)A^T(t) - \Sigma(t)C^T(t)R^{-1}(t)C(t)\Sigma(t) + B(t)Q(t)B^T(t)    (16)
\Sigma(t_0) = \Sigma_0.    (17)

The estimation error e(t) = x(t) − x̂(t) satisfies the differential equation

\dot{e} = \dot{x} - \dot{\hat{x}}
        = Ax + Bu + B(v - u) - [A - HC]\hat{x} - Bu - HCx - H(r - \bar{r})
        = [A - HC]e + B(v - u) - H(r - \bar{r}).    (18)
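A minimal numerical sketch of the filter (13)-(15) together with the Riccati equation (16)-(17): x̂(t) and Σ(t) are propagated jointly, here with a plain Euler step and hypothetical constant system matrices (the matrices, the step size, and the treatment of y(t) as an externally supplied signal are illustrative assumptions, not part of the text).

    import numpy as np

    # One Euler step of the continuous-time Kalman filter: the state estimate
    # follows (13) with gain (15), the error covariance follows the Riccati
    # equation (16). A, B, C, Q, R are assumed constant.
    A = np.array([[0.0, 1.0], [-2.0, -0.5]])
    B = np.array([[0.0], [1.0]])
    C = np.array([[1.0, 0.0]])
    Q = np.array([[0.1]])
    R = np.array([[0.05]])
    r_bar = np.zeros(1)

    def kf_step(x_hat, Sigma, u, y, dt):
        H = Sigma @ C.T @ np.linalg.inv(R)                         # gain (15)
        dx_hat = A @ x_hat + B @ u + H @ (y - r_bar - C @ x_hat)   # (13)
        dSigma = (A @ Sigma + Sigma @ A.T
                  - Sigma @ C.T @ np.linalg.inv(R) @ C @ Sigma
                  + B @ Q @ B.T)                                   # (16)
        return x_hat + dt * dx_hat, Sigma + dt * dSigma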


The covariance matrix Σ(t) of the state estimation error e(t) is governed by the matrix differential equation

\dot{\Sigma} = [A - HC]\Sigma + \Sigma[A - HC]^T + BQB^T + HRH^T    (19)
             = A\Sigma + \Sigma A^T + BQB^T - \Sigma C^T R^{-1} C\Sigma
               + [H - \Sigma C^T R^{-1}]\,R\,[H - \Sigma C^T R^{-1}]^T
             \geq A\Sigma + \Sigma A^T + BQB^T - \Sigma C^T R^{-1} C\Sigma.    (20)

Obviously, Σ(t) is infimized at all times t for

H(t) = \Sigma(t)C^T(t)R^{-1}(t).    (21)


Given (19) for the covariance matrix Σ(t):

\dot{\Sigma} = [A - HC]\Sigma + \Sigma[A - HC]^T + BQB^T + HRH^T.

Its first differential with respect to H, at some value of H, with increment dH can be formulated as follows:

d\dot{\Sigma}(H, dH) = \frac{\partial \dot{\Sigma}(H)}{\partial H}\,dH
                     = -\,dH\,C\Sigma - \Sigma C^T dH^T + dH\,RH^T + HR\,dH^T
                     = U\{[HR - \Sigma C^T]\,dH^T\},    (22)

where ^T is the matrix transposing operator and U is the operator which adds the transposed matrix to its matrix argument. Setting this differential to zero for every increment dH requires HR − ΣC^T = 0, which again yields the optimal gain (21).


Consider the following stochastic system of first order with the random initial condition ξ : N(x_0, Σ_0) and the uncorrelated white noises v : N(0, Q) and r : N(0, R):

\dot{x}(t) = a\,x(t) + b\,[u(t) + v(t)]
x(0) = ξ
y(t) = x(t) + r(t).

Find the time-invariant Kalman filter! The resulting time-invariant Kalman filter is described by the following equations:

\dot{\hat{x}}(t) = -\sqrt{a^2 + b^2 Q/R}\;\hat{x}(t) + b\,u(t) + \Big(a + \sqrt{a^2 + b^2 Q/R}\Big)\,y(t)
\hat{x}(0) = x_0.
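As a quick numerical cross-check of this example (with hypothetical values for a, b, Q, R), the sketch below integrates the scalar Riccati equation (16) to stationarity and compares the resulting gain with the closed-form value a + sqrt(a^2 + b^2 Q/R) read off the filter above.

    import numpy as np

    # Scalar example: integrate the Riccati ODE (16) until it settles and
    # compare the stationary gain H = Sigma/R with the closed-form expression.
    a, b, Q, R = -0.5, 1.0, 0.2, 0.1     # hypothetical parameters
    Sigma, dt = 1.0, 1e-4                # initial covariance Sigma_0 and step size

    for _ in range(200_000):
        dSigma = 2 * a * Sigma - Sigma**2 / R + b**2 * Q
        Sigma += dt * dSigma

    H_numeric = Sigma / R                          # C = 1
    H_closed = a + np.sqrt(a**2 + b**2 * Q / R)    # from the slide
    print(H_numeric, H_closed)                     # the two values should agree closely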


Continuous/discrete-time Kalman Filter

The continuous-time measurement (9), which is corrupted by the white noise error r(t) with infinite covariance,

y(t) = C(t)x(t) + r(t),

must be replaced by an "averaged" discrete-time measurement

y_k = C_k x(t_k) + r_k    (23)
r_k : N(r̄_k, R_k)    (24)
Cov(r_i, r_j) = R_i δ_{ij}.    (25)


At any sampling time t_k we denote the state estimate by x̂(t_k|t_{k-1}) and the error covariance matrix by Σ(t_k|t_{k-1}) before the new measurement y_k has been processed, and the state estimate by x̂(t_k|t_k) and the corresponding error covariance matrix by Σ(t_k|t_k) after the new measurement y_k has been processed.

As a consequence, the continuous-time/discrete-time Kalman filter alternatingly performs update steps at the sampling times t_k, processing the latest measurement information y_k, and open-loop extrapolation steps between the times t_k and t_{k+1}, with no measurement information available:


The Continuous-Time/Discrete-Time Kalman Filter

Update at time tk:

\hat{x}(t_k|t_k) = \hat{x}(t_k|t_{k-1})
                 + \Sigma(t_k|t_{k-1})C_k^T\,[C_k\Sigma(t_k|t_{k-1})C_k^T + R_k]^{-1}\,[y_k - \bar{r}_k - C_k\hat{x}(t_k|t_{k-1})]    (26)

\Sigma(t_k|t_k) = \Sigma(t_k|t_{k-1})
                - \Sigma(t_k|t_{k-1})C_k^T\,[C_k\Sigma(t_k|t_{k-1})C_k^T + R_k]^{-1}\,C_k\Sigma(t_k|t_{k-1})    (27)

or, equivalently:

\Sigma^{-1}(t_k|t_k) = \Sigma^{-1}(t_k|t_{k-1}) + C_k^T R_k^{-1} C_k.    (28)


Continuous-time extrapolation from t_k to t_{k+1} with the initial conditions x̂(t_k|t_k) and Σ(t_k|t_k):

\dot{\hat{x}}(t|t_k) = A(t)\hat{x}(t|t_k) + B(t)u(t)    (29)
\dot{\Sigma}(t|t_k) = A(t)\Sigma(t|t_k) + \Sigma(t|t_k)A^T(t) + B(t)Q(t)B^T(t).    (30)

Assuming that the Kalman filter starts with an update at the initial time t_0, this Kalman filter is initialized at t_0 as follows:

\hat{x}(t_0|t_{-1}) = x_0    (31)
\Sigma(t_0|t_{-1}) = \Sigma_0.    (32)
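A small sketch of the extrapolation step (29)-(30), assuming A, B, Q and the input u are constant over the sampling interval; it propagates x̂ and Σ from one sampling instant to the next with scipy's ODE solver (the assumption of constant matrices and the choice of solver are illustrative, not prescribed by the text).

    import numpy as np
    from scipy.integrate import solve_ivp

    # Open-loop extrapolation (29)-(30) of x_hat and Sigma from t_k to t_{k+1},
    # with A, B, Q, u held constant over the interval of length dt.
    def extrapolate(x_hat, Sigma, u, A, B, Q, dt):
        n = x_hat.size

        def rhs(t, z):
            x = z[:n]
            S = z[n:].reshape(n, n)
            dx = A @ x + B @ u                       # (29)
            dS = A @ S + S @ A.T + B @ Q @ B.T       # (30)
            return np.concatenate([dx, dS.ravel()])

        z0 = np.concatenate([x_hat, Sigma.ravel()])
        sol = solve_ivp(rhs, (0.0, dt), z0, method="RK45")
        zT = sol.y[:, -1]
        return zT[:n], zT[n:].reshape(n, n)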


Discrete-time Kalman Filter

We consider the following discrete-time stochastic system:

x_{k+1} = F_k x_k + G_k v_k    (33)
x_0 = ξ    (34)
y_k = C_k x_k + r_k    (35)
ξ : N(x_0, Σ_0)    (36)
v_k : N(u_k, Q_k)    (37)
Cov(v_i, v_j) = Q_i δ_{ij}    (38)
r_k : N(r̄_k, R_k)    (39)
Cov(r_i, r_j) = R_i δ_{ij}    (40)
ξ, v, r : mutually independent.    (41)


We have the following (approximate) "zero-order-hold-equivalence" correspondences to the continuous-time stochastic system:

F_k = \Phi(t_{k+1}, t_k)    (transition matrix)    (42)
G_k = \int_{t_k}^{t_{k+1}} \Phi(t_{k+1}, t)\,B(t)\,dt    (zero-order-hold equivalence)    (43)
C_k = C(t_k)    (44)
Q_k = \frac{Q(t_k)}{t_{k+1} - t_k}    (45)
R_k = \frac{R(t_k)}{\Delta t_k},    \Delta t_k = averaging time at time t_k.    (46)
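For a time-invariant A and B, the transition matrix and the input integral in (42)-(43) can be obtained from a single matrix exponential of an augmented matrix; the sketch below uses this standard trick (the numerical matrices and the sampling time are illustrative assumptions).

    import numpy as np
    from scipy.linalg import expm

    # Zero-order-hold discretization (42)-(43) for constant A, B:
    # expm([[A, B], [0, 0]] * dt) = [[F, G], [0, I]].
    def zoh_discretize(A, B, dt):
        n, m = B.shape
        M = np.zeros((n + m, n + m))
        M[:n, :n] = A
        M[:n, n:] = B
        Md = expm(M * dt)
        F = Md[:n, :n]          # F_k = Phi(t_{k+1}, t_k) = e^{A dt}
        G = Md[:n, n:]          # G_k = integral of Phi(t_{k+1}, t) B dt
        return F, G

    A = np.array([[0.0, 1.0], [-2.0, -0.5]])   # hypothetical system
    B = np.array([[0.0], [1.0]])
    F, G = zoh_discretize(A, B, dt=0.1)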


The Discrete-Time Kalman Filter

Update at time tk:

\hat{x}_{k|k} = \hat{x}_{k|k-1}
              + \Sigma_{k|k-1}C_k^T\,[C_k\Sigma_{k|k-1}C_k^T + R_k]^{-1}\,[y_k - \bar{r}_k - C_k\hat{x}_{k|k-1}]    (47)

\Sigma_{k|k} = \Sigma_{k|k-1} - \Sigma_{k|k-1}C_k^T\,[C_k\Sigma_{k|k-1}C_k^T + R_k]^{-1}\,C_k\Sigma_{k|k-1}    (48)

or, equivalently:

\Sigma_{k|k}^{-1} = \Sigma_{k|k-1}^{-1} + C_k^T R_k^{-1} C_k.    (49)


Extrapolation from tk to tk+1:

\hat{x}_{k+1|k} = F_k\hat{x}_{k|k} + G_k u_k    (50)
\Sigma_{k+1|k} = F_k\Sigma_{k|k}F_k^T + G_k Q_k G_k^T    (51)

Initialization at time t_0:

\hat{x}_{0|-1} = x_0    (52)
\Sigma_{0|-1} = \Sigma_0.    (53)
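The complete discrete-time filter (47)-(53) fits in a few lines of code; the sketch below is a minimal numpy version under simplifying assumptions (known noise means u and r̄, time-invariant matrices), not a definitive implementation.

    import numpy as np

    def kalman_filter(ys, F, G, C, Q, R, u, r_bar, x0, Sigma0):
        """Discrete-time Kalman filter (47)-(53) with constant matrices."""
        x_hat, Sigma = x0.copy(), Sigma0.copy()     # initialization (52)-(53)
        estimates = []
        for y in ys:
            # update (47)-(48)
            S = C @ Sigma @ C.T + R                 # innovation covariance
            K = Sigma @ C.T @ np.linalg.inv(S)      # gain
            x_hat = x_hat + K @ (y - r_bar - C @ x_hat)
            Sigma = Sigma - K @ C @ Sigma
            estimates.append(x_hat.copy())
            # extrapolation (50)-(51)
            x_hat = F @ x_hat + G @ u
            Sigma = F @ Sigma @ F.T + G @ Q @ G.T
        return np.array(estimates)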


Extended Kalman Filter

The linearized problem may not be a good approximation, so we stick to the following philosophy:

• Where it does not hurt: analyze the nonlinear system.

• Where it hurts: use linearization.

We consider the following nonlinear stochastic system:

\dot{x}(t) = f(x(t), u(t), t) + B(t)v(t)    (54)
x(t_0) = ξ    (55)
y(t) = g(x(t), t) + r(t)    (56)
ξ : N(x_0, Σ_0)    (57)
v : N(v̄(t), Q(t))    (58)
r : N(r̄(t), R(t)).    (59)


The extended Kalman filter

Dynamics of the state estimation:

\dot{\hat{x}}(t) = f(\hat{x}(t), u(t), t) + B(t)\bar{v}(t) + H(t)[y(t) - \bar{r}(t) - g(\hat{x}(t), t)]    (60)
\hat{x}(t_0) = x_0    (61)

with H(t) = \Sigma(t)C^T(t)R^{-1}(t). The "error covariance" matrix Σ(t) must be calculated in real-time using the following matrix Riccati differential equation:

\dot{\Sigma}(t) = A(t)\Sigma(t) + \Sigma(t)A^T(t) - \Sigma(t)C^T(t)R^{-1}(t)C(t)\Sigma(t) + B(t)Q(t)B^T(t)    (62)
\Sigma(t_0) = \Sigma_0.    (63)


The matrices A(t) and C(t) correspond to the dynamics matrix and the output matrix, respectively, of the linearization of the nonlinear system around the estimated trajectory:

A(t) = \frac{\partial f}{\partial x}(\hat{x}(t), u(t), t)    (64)
C(t) = \frac{\partial g}{\partial x}(\hat{x}(t), t).    (65)
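In practice the Jacobians (64)-(65) are either derived analytically or approximated numerically; the sketch below shows a simple forward-difference approximation (the function name and step size are illustrative choices, not from the text).

    import numpy as np

    def jacobian(fun, x, eps=1e-6):
        """Forward-difference Jacobian of fun at x, usable for A(t) and C(t)."""
        fx = np.atleast_1d(fun(x))
        J = np.zeros((fx.size, x.size))
        for i in range(x.size):
            dx = np.zeros_like(x)
            dx[i] = eps
            J[:, i] = (np.atleast_1d(fun(x + dx)) - fx) / eps
        return J

    # Example: A(t) = d f / d x evaluated at the current estimate x_hat,
    # where the wrapped function holds u(t) and t fixed.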


Because the dynamics of the "error covariance matrix" Σ(t) correspond to the linearized system, we face the following problems:

• This filter is not optimal.

• The state estimate will be biased.

• The reference point for the linearization is questionable.

• The matrix Σ(t) is only an approximation of the state error covariance matrix.

• The state vectors x(t) and x̂(t) are not Gaussian even if ξ, v, and r are.


Again, the continuous-time measurement corrupted with the white noise r with infinite covariance,

y(t) = g(x(t), t) + r(t),    (66)

must be replaced by an "averaged" discrete-time measurement

y_k = g(x(t_k), t_k) + r_k    (67)
r_k : N(r̄_k, R_k)    (68)
Cov(r_i, r_j) = R_i δ_{ij}.    (69)


Continuous-time/discrete-time extended Kalman filter

Update at time tk:

\hat{x}(t_k|t_k) = \hat{x}(t_k|t_{k-1})
                 + \Sigma(t_k|t_{k-1})C_k^T\,[C_k\Sigma(t_k|t_{k-1})C_k^T + R_k]^{-1}\,[y_k - \bar{r}_k - g(\hat{x}(t_k|t_{k-1}), t_k)]    (70)

\Sigma(t_k|t_k) = \Sigma(t_k|t_{k-1})
                - \Sigma(t_k|t_{k-1})C_k^T\,[C_k\Sigma(t_k|t_{k-1})C_k^T + R_k]^{-1}\,C_k\Sigma(t_k|t_{k-1})    (71), (72)

Extrapolation between t_k and t_{k+1}:

\dot{\hat{x}}(t|t_k) = f(\hat{x}(t|t_k), u(t), t) + B(t)\bar{v}(t)    (73)
\dot{\Sigma}(t|t_k) = A(t)\Sigma(t|t_k) + \Sigma(t|t_k)A^T(t) + B(t)Q(t)B^T(t)    (74)


Again, the matrices A(t) and C_k are the dynamics matrix and the output matrix, respectively, of the nonlinear system linearized around the estimated trajectory:

A(t) = \frac{\partial f}{\partial x}(\hat{x}(t), u(t), t)    (75)
C_k = \frac{\partial g}{\partial x}(\hat{x}(t_k|t_{k-1}), t_k)    (76)

Initialization at t_0:

\hat{x}(t_0|t_{-1}) = x_0    (77)
\Sigma(t_0|t_{-1}) = \Sigma_0.    (78)
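A compact sketch of the EKF measurement update (70)-(72): C here would be the Jacobian of g evaluated at x̂(t_k|t_{k-1}), e.g. via the numerical Jacobian sketch above; g_fun and the argument names are placeholders, not notation from the text.

    import numpy as np

    def ekf_update(x_pred, Sigma_pred, y, g_fun, C, R, r_bar):
        """EKF measurement update (70)-(72); C is the Jacobian of g at x_pred."""
        S = C @ Sigma_pred @ C.T + R                    # innovation covariance
        K = Sigma_pred @ C.T @ np.linalg.inv(S)
        x_upd = x_pred + K @ (y - r_bar - g_fun(x_pred))
        Sigma_upd = Sigma_pred - K @ C @ Sigma_pred
        return x_upd, Sigma_upd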


Kalman Filter and Parameter Identification

The system and the measurement equations in discrete time are given by

x_{k+1} = F_k x_k + G_k u_k + [G_k Q_k^{1/2}]\,w_k^{(1)}    (79)
y_k = C_k x_k + \bar{r}_k + [R_k^{1/2}]\,w_k^{(2)}    (80)

The update equations of the Kalman Filter are

\hat{x}_{k|k} = \hat{x}_{k|k-1} + \Sigma_{k|k-1}C_k^T\,[C_k\Sigma_{k|k-1}C_k^T + R_k]^{-1}\,v_k    (81)
v_k = y_k - \bar{r}_k - C_k\hat{x}_{k|k-1}    (82)
\Sigma_{k|k} = \Sigma_{k|k-1} - \Sigma_{k|k-1}C_k^T\,[C_k\Sigma_{k|k-1}C_k^T + R_k]^{-1}\,C_k\Sigma_{k|k-1}    (83)

where v_k denotes the prediction error.


The prediction equations are given by

\hat{x}_{k|k-1} = F_{k-1}\hat{x}_{k-1|k-1} + G_{k-1}u_{k-1}    (84)
\Sigma_{k|k-1} = F_{k-1}\Sigma_{k-1|k-1}F_{k-1}^T + G_{k-1}Q_{k-1}G_{k-1}^T    (85)

The algorithm is carried out in two distinct parts:

• Prediction Step
The first step consists of forming an optimal predictor of the next observation, given all the information available up to time t_{k-1}. This results in the so-called a priori estimate for time t_k.

• Updating Step
The a priori estimate is updated with the new information arriving at time t_k, which is combined with the information already available from time t_{k-1}. The result of this step is the filtered estimate, or the a posteriori estimate.


The prediction error is defined by

v_k = y_k - \hat{y}_{k|k-1} = y_k - E[y_k|\mathcal{F}_{k-1}] = y_k - \bar{r}_k - C_k\hat{x}_{k|k-1}

and its covariance matrix can be calculated by

K_{k|k-1} = Cov(v_k|\mathcal{F}_{k-1})
          = E[v_k v_k^T|\mathcal{F}_{k-1}]
          = C_k\,E[(x_k - \hat{x}_{k|k-1})(x_k - \hat{x}_{k|k-1})^T|\mathcal{F}_{k-1}]\,C_k^T + E[R_k^{1/2}w_k^{(2)}(R_k^{1/2}w_k^{(2)})^T|\mathcal{F}_{k-1}]
          = C_k\Sigma_{k|k-1}C_k^T + R_k

This result is important for forming the log-likelihood estimator.


We now examine the appropriate log-likelihood function within our framework of the Kalman Filter. Let ψ ∈ R^d be the vector of unknown parameters, which belongs to the admissible parameter space Ψ. The various system matrices of the state space model, as well as the variances of the stochastic processes given in the previous section, depend on ψ. The likelihood function of the state space model is given by the joint density of the observational data y = (y_N, y_{N-1}, ..., y_1):

l(y, ψ) = p(y_N, y_{N-1}, ..., y_1; ψ)    (86)

which reflects how likely it would have been to have observed the data if ψ were the true value of the parameters.


Using the definition of conditional probability and employing Bayes' theorem recursively, we now write the joint density as a product of conditional densities:

l(y, ψ) = p(y_N|y_{N-1}, ..., y_1; ψ) · ... · p(y_k|y_{k-1}, ..., y_1; ψ) · ... · p(y_1; ψ)    (87)

where we approximate the initial density function p(y_1; ψ) by p(y_1|y_0; ψ). Furthermore, since we deal with a Markovian system, future values y_l with l > k depend on (y_k, y_{k-1}, ..., y_1) only through the current value y_k, so the expression in (87) can be rewritten to depend only on the last observed value; it reduces to

l(y, ψ) = p(y_N|y_{N-1}; ψ) · ... · p(y_k|y_{k-1}; ψ) · ... · p(y_1; ψ)    (88)


For the purpose of estimating the parameter vector ψ, we express the likelihood function in terms of the prediction error. Since the variance of the prediction error v_k is the same as the conditional variance of y_k, i.e.,

Cov[v_k|\mathcal{F}_{k-1}] = Cov[y_k|\mathcal{F}_{k-1}],    (89)

we are able to state the density function. The density function p(y_k|y_{k-1}; ψ) is a Gaussian (normal) distribution with conditional mean

E[y_k|\mathcal{F}_{k-1}] = \bar{r}_k + C_k\hat{x}_{k|k-1}    (90)

and conditional covariance matrix

Cov[y_k|\mathcal{F}_{k-1}] = K_{k|k-1}.    (91)


The density function of an n-dimensional normal distribution can be written in matrix notation as

\frac{1}{(2\pi)^{n/2}\sqrt{|\Sigma|}}\; e^{-\frac{1}{2}(x-\mu)^T\Sigma^{-1}(x-\mu)},

where μ denotes the mean value and Σ the covariance matrix. In our case x − μ is replaced by y_k − r̄_k − C_k x̂_{k|k-1} = v_k and the covariance matrix is K_{k|k-1}. The probability density function thus takes the form

p(y_k|y_{k-1}; ψ) = \frac{1}{(2\pi)^{p/2}\sqrt{|K_{k|k-1}|}}\; e^{-\frac{1}{2}v_k^T K_{k|k-1}^{-1} v_k},    (92)

where v_k ∈ R^p.


Taking the logarithm of (92) yields

\ln p(y_k|y_{k-1}; ψ) = -\frac{p}{2}\ln(2\pi) - \frac{1}{2}\ln|K_{k|k-1}| - \frac{1}{2}v_k^T K_{k|k-1}^{-1}v_k,    (93)

which gives us the whole log-likelihood function as the sum

L(y, ψ) = -\frac{1}{2}\sum_{k=1}^{N}\Big[p\,\ln(2\pi) + \ln|K_{k|k-1}| + v_k^T K_{k|k-1}^{-1}v_k\Big].    (94)

In order to estimate the unknown parameters from the log-likelihood function in (94), we employ an appropriate optimization routine which maximizes L(y, ψ) with respect to ψ.
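Putting the filter and (94) together, the sketch below evaluates the innovation-form log-likelihood for a given parameter vector; build_model is a hypothetical helper that maps ψ to the system matrices, and the routine assumes constant matrices for brevity.

    import numpy as np

    def log_likelihood(psi, ys, build_model):
        """Evaluate L(y, psi) of (94) by running the Kalman filter (81)-(85)."""
        F, G, C, Q, R, u, r_bar, x0, Sigma0 = build_model(psi)   # hypothetical mapping
        x_hat, Sigma = x0, Sigma0
        p = C.shape[0]
        L = 0.0
        for y in ys:
            # prediction error (82) and its covariance K_{k|k-1}
            v = y - r_bar - C @ x_hat
            K = C @ Sigma @ C.T + R
            sign, logdet = np.linalg.slogdet(K)
            L -= 0.5 * (p * np.log(2 * np.pi) + logdet + v @ np.linalg.solve(K, v))
            # update (81), (83)
            gain = Sigma @ C.T @ np.linalg.inv(K)
            x_hat = x_hat + gain @ v
            Sigma = Sigma - gain @ C @ Sigma
            # prediction (84)-(85)
            x_hat = F @ x_hat + G @ u
            Sigma = F @ Sigma @ F.T + G @ Q @ G.T
        return L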


Given the simplicity of the innovation form of the likelihood function, we can find the values of the conditional expectation of the prediction error and of the conditional covariance matrix for every given parameter vector ψ using the Kalman filter algorithm. Hence, we are able to calculate the value of the conditional log-likelihood function numerically. The parameters are estimated by the optimization

ψ_ML = \arg\max_{ψ ∈ Ψ} L(y, ψ),    (95)

where Ψ denotes the parameter space. The optimization is an unconstrained optimization problem if Ψ = R^d, or a constrained optimization if the parameter space Ψ is a strict subset of R^d.
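A minimal sketch of (95) under the assumptions above: the negative log-likelihood is minimized with scipy; the initial guess psi0 and any parameter bounds are problem-specific placeholders, and the choice of optimizer is illustrative.

    import numpy as np
    from scipy.optimize import minimize

    # Maximum-likelihood estimation (95): maximize L(y, psi) by minimizing -L.
    # log_likelihood and build_model are the sketches defined above; psi0 and
    # the bounds are hypothetical and must be chosen for the model at hand.
    def estimate_parameters(ys, build_model, psi0, bounds=None):
        objective = lambda psi: -log_likelihood(psi, ys, build_model)
        result = minimize(objective, psi0, method="L-BFGS-B", bounds=bounds)
        return result.x          # psi_ML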
