CONTROL IN STOCHASTIC SYSTEMS AND UNDER UNCERTAINTY CONDITIONS

Optimal Structure of Continuous Nonlinear Reduced-Order Pugachev Filter

E. A. Rudenko
Moscow Aviation Institute, Volokolamskoe sh. 43, Moscow, 125993 Russia

Received April 22, 2013; in final form, July 12, 2013

Abstract—A nonparametric synthesis problem is considered for a fast nonlinear low-memory filter consisting of as many equations as there are informative components in the diffusion Markovian state vector of the observation plant. An algorithm is obtained for finding the mean-square locally optimal structural functions of the filter, together with the reduced Fokker–Planck–Kolmogorov equation from which the corresponding instantaneously conditional probability distribution is found. The proposed filter of full order is proved to coincide with the linear Kalman–Bucy filter in the various linear Gaussian cases. Ways to construct Gaussian and linearized suboptimal filters are proposed. An example is given in which the latter are compared with their analogues.

DOI: 10.1134/S1064230713060099

ISSN 1064-2307, Journal of Computer and Systems Sciences International, 2013, Vol. 52, No. 6, pp. 866–892. © Pleiades Publishing, Ltd., 2013. Original Russian Text © E.A. Rudenko, 2013, published in Izvestiya Akademii Nauk. Teoriya i Sistemy Upravleniya, 2013, No. 6, pp. 25–51.



INTRODUCTION

Nonlinear filters for random processes that are most accurate at each observation instant (instantaneously optimal) in extracting the current value of a Markovian useful signal from its nonadditive mixture with white noise are in demand in radio engineering [1], navigation [2], medicine [3], communication [4], finance [5], etc.

When such filters are constructed in the framework of the classical Bayesian–Stratonovich approach [6, 7], based on the search for an a posteriori probability distribution, the principal practical challenge is the infinite order of the optimal nonlinear filter regarded as a dynamic system with lumped parameters. This hampers its exact online implementation on a cheap processor with little memory, for instance onboard an aircraft. The reason is that the dimension of the state vector of the filter, represented as a set of sufficient coordinates (certain numerical characteristics of the a posteriori probability), is generally unbounded, so an exact filter requires infinite memory and computation time. This forces one to use approximate (suboptimal) finite-dimensional filtering algorithms like the extended (nonlinear) Kalman filter [7, 8], losing estimation accuracy, or to create banks of such filters, which complicates the computer or increases measurement processing time. To this end, other approximate methods for solving the Stratonovich–Kushner stochastic partial differential equation are being improved, especially its non-normalized Duncan–Mortensen–Zakai modification or the robust (nonstochastic) version of the latter [9], such as the quasi-stationary approximation [10]. The numerical Monte Carlo method is also common [11, 12], but it likewise requires large memory and high performance. Attempts have also been made to find special classes of observable systems whose exact nonlinear filter is still finite-dimensional [13], its order depending on the form of this particular system. In practice, it may be critical to decrease the order even of a linear Kalman filter when the dimension of the state vector of the plant is too high while the number of its components informative for the user is small [14]. For a way to decrease the order of suboptimal nonlinear filters, one may refer, for instance, to [15].

Rejecting the highly complicated search for the a posteriori distribution, an alternative approach to constructing nonlinear filters searches directly for an ordinary differential state equation of a fast filter that, like the linear Kalman–Bucy filter, has the order of the observable plant. Generally, one part of its equation, viz. the drift function, is found so that the estimate is unbiased, while the other part, viz. the diffusion function (the gain coefficient of the updating process), is synthesized only by an integral criterion (on average, over a time segment) [16, 17], which does not ensure that the filter is the best at each instant. To construct an instantaneously optimal nonlinear filter of the plant's order, Pugachev [7, 18] proposed a local parametric approach: all structural functions (both drift and diffusion) are fixed, and their variable coefficients are found so as to minimize the increment of the mean-square criterion over the infinitesimal time interval after the previous measurement, i.e., under local optimality conditions. Like the gain coefficient of the linear Kalman–Bucy filter, these coefficients, which are not sufficient coordinates, are


found beforehand, prior to the estimation process. This requires solving the Fokker–Planck–Kolmogorov (FPK) partial differential equation for the joint probability distribution of the current states of the plant and the filter, along with a system of equations for the sought coefficients, rather than the differential Riccati equation for the covariance matrix. Since the system of equations for the sought coefficients involves only mathematical expectations of functions of these states, it also proved convenient to find the coefficients of the Pugachev filter, which he called conditionally optimal (COF), numerically by the Monte Carlo method applied offline rather than onboard an aircraft. This also removes a restriction of the classical approach, where the continuous white noise of the measurement could only be additive: it is now allowed to be multiplicative as well. The structure of such a filter, which obviously influences estimation accuracy, was proposed to be chosen only heuristically [19], by comparing its equation with the estimation equation of some approximation to the Stratonovich filter, called the absolutely optimal filter (AOF) for definiteness in what follows.

In this work, we develop the Pugachev approach, viz. the existing way to synthesize the optimal nonlinear structure of a filter of the plant's order [20–22], which is given in [23, 24] for discrete time. Here, the case of continuous time is presented more rigorously and is not bounded by rigid existence conditions on probability densities. Moreover, we generalize it to the problem of estimating only the informative part of the state variables, so that the filter order can be decreased even further, which naturally results in a certain loss of accuracy. The linear Kalman–Bucy filter is shown to be a special case of the proposed optimal structure filter (OSF). The nonlinear drift and diffusion functions of the latter can be expressed explicitly via the instantaneously conditional (rather than the classical a posteriori) probability distribution of the vector to be estimated, its condition consisting only of the current (instantaneous) value of the filter state and, possibly, the value of the last measurement, rather than of the entire set of previous measurements. As a result, to synthesize the OSF, we preliminarily solve a new deterministic partial differential equation for this conditional distribution. In fact, it is the result of decomposing the FPK equation for the respective joint density into two equations: one for the conditional density and one for the partial density of its conditions. This can also be done beforehand by the Monte Carlo method, yet it would involve the sophisticated construction of histograms of the sought functions. Therefore, we also consider the two simplest approximate ways to find the OSF structure analytically, based either on the Gaussian approximation of the probability distribution alone or also on the additional linearization of the equations of the plant and sensor in the neighborhood of the estimate. We also found that the structure of the equations of such a suboptimal OSF is similar to the structure of part of the equations of similar approximations to the AOF, with the random a posteriori covariance in them replaced by deterministic numerical parameters that can easily be found by a standard Monte Carlo procedure. Finally, we give a simple scalar example that compares the given suboptimal versions of the proposed one-dimensional OSF both with the one-dimensional COF and with the similar two-dimensional approximations to the AOF and demonstrates the advantages of the former in terms of both implementation simplicity and estimation accuracy.

1. STATEMENT OF THE FILTER SYNTHESIS PROBLEM

We give the mathematical models of the observable system and the proposed filter as well as the idea of the alternative approach. Methodically, as the principal case we take accurate measurements of part of the state variables of the observable system (accurate incomplete measurements), since the variant of inaccurate measurements of the plant's states with white noise (complete stochastic measurements), more common in publications, is its special case. We understand all stochastic differential equations in the Ito sense.

1.1. Equations of the Plant, Sensor, and Filter

Suppose $t \ge 0$ is time, $X_t$ is the $n$-dimensional vector of the useful signal, $Y_t$ is the $m$-dimensional vector of its indirect measurement, and $W_t$ is the $l$-dimensional standard Wiener process independent of their random initial values $X_0$ and $Y_0$. Generally, one considers a variant where the equations of the observation plant (the forming filter of the signal $X_t$) and the sensor are given separately [8]:

$$dX_t = a(t, X_t)\,dt + B(t, X_t)\,dW_t, \qquad dY_t = c(t, X_t)\,dt + D(t, X_t)\,dW_t.$$

We consider this case in Section 6. First, we study the more general, combined system of equations of the observation plant and sensor [6]:

$$dX_t = a(t, X_t, Y_t)\,dt + B(t, X_t, Y_t)\,dW_t, \qquad dY_t = c(t, X_t, Y_t)\,dt + D(t, X_t, Y_t)\,dW_t. \tag{1.1}$$


Here, the vector functions $a(t, x, y)$ and $c(t, x, y)$ and the matrix functions $B(t, x, y)$ and $D(t, x, y)$ of the variables $t \ge 0$, $x \in \mathbb{R}^n$, $y \in \mathbb{R}^m$ are given. In what follows, we denote the noise intensity matrices of system (1.1) as $Q = BB^{\mathrm T}$, $S = BD^{\mathrm T}$, and $R = DD^{\mathrm T} > 0$; since the latter is positive definite, the white noise $D\,dW_t/dt$ of the measurement is nondegenerate.
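To make the simulation setting concrete, here is a minimal sketch (not from the paper) that propagates one Euler trajectory of a scalar instance of model (1.1), anticipating the scheme of Section 2.6; the particular nonlinearities $a, B, c, D$ and all numerical values are illustrative assumptions. Note that the plant and sensor equations are driven by the same Wiener increment, which is what makes the cross intensity $S = BD^{\mathrm T}$ nonzero.

```python
import numpy as np

# Minimal sketch: one Euler-Maruyama trajectory of the combined
# plant/sensor model (1.1) with n = m = l = 1. The concrete
# nonlinearities a, B, c, D below are illustrative choices only.
a = lambda t, x, y: -x + 0.1 * np.sin(y)   # plant drift
B = lambda t, x, y: 0.5                    # plant diffusion
c = lambda t, x, y: x                      # sensor drift
D = lambda t, x, y: 0.2                    # sensor diffusion (R = D^2 > 0)

rng = np.random.default_rng(0)
dt, n_steps = 1e-3, 5000
X, Y = rng.normal(), 0.0                   # random X0, fixed Y0
for k in range(n_steps):
    t = k * dt
    dW = rng.normal(scale=np.sqrt(dt))     # common Wiener increment
    X, Y = (X + a(t, X, Y) * dt + B(t, X, Y) * dW,
            Y + c(t, X, Y) * dt + D(t, X, Y) * dW)
print(X, Y)
```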

Suppose the initial non-measurable vector $X_0$ of system (1.1) is given by the known conditional probability (probabilistic measure) $\eta_0(A \,|\, y) = \mathrm P[X_0 \in A \,|\, Y_0 = y]$, where $A \subset \mathbb{R}^n$ is an arbitrary Borel subset of the Euclidean space $\mathbb{R}^n$, while the distribution law $q_0(B) = \mathrm P[Y_0 \in B]$, $B \subset \mathbb{R}^m$, of the initial measurement $Y_0$ is not necessarily given in the general case. If it is given, the pair $(X_0, Y_0)$ has the known joint distribution law $p_0(A, B)$ that can be found as the Lebesgue–Stieltjes integral of $\eta_0(A \,|\, y)$ with respect to the measure $q_0$:

$$p_0(A, B) = \int_B \eta_0(A \,|\, y)\, q_0(dy).$$

Using all previous measurements $Y_0^t = \{Y_\tau : 0 \le \tau \le t\}$, at each instant $t$ we estimate the most interesting $n'$-dimensional part $X'_t$ of the non-measurable vector $X_t$. Without loss of generality, we compose it of the first $n' \le n$ components of $X_t$. To assess the quality of the estimate $Z_t \in \mathbb{R}^{n'}$ of this information vector at any instant $t \ge 0$, we use the conventional requirements for it to be unbiased,

$$\mathrm M[X'_t - Z_t] = 0, \tag{1.2}$$

and for its mean square error to be minimum,

$$I_t = \mathrm M[\psi(t, X'_t, Z_t)] \to \min. \tag{1.3}$$

Here, $\mathrm M$ is the operator of mathematical expectation and $\psi(t, x', z) = (x' - z)^{\mathrm T} L_t\,(x' - z)$ is the quadratic loss function with the positive definite, continuously differentiable matrix of weight coefficients $L_t > 0$.

It is because of the mentioned uncertainty of the distribution of the measurement $Y_0$ that the classical theory of optimal nonlinear filtering yields the estimate of the entire vector $X_t$ most accurate in the sense of (1.3). Indeed, to find the a posteriori probability density $\tilde p(t, x \,|\, Y_0^t)$ using the Stratonovich–Kushner stochastic partial differential equation [6, 7], it is sufficient to know only the conditional initial density $\tilde\eta_0$, since $\tilde p(0, x \,|\, Y_0^0) = \tilde\eta_0(x \,|\, Y_0)$. The equation itself holds only when the intensity of the measurement noise $R(\cdot)$ is independent of the vector $X_t$:

$$R(t, x, y) = R(t, y) = \mathrm{inv}(x). \tag{1.4}$$

Here and in what follows, the equality $f(u, \mathrm v) = \mathrm{inv}(u)$ emphasizes that the function is independent of (invariant to) one of its arguments: $f(u, \mathrm v) = f(w, \mathrm v)\ \forall u, w, \mathrm v$. Moving from the a posteriori density to its numerical characteristics, one can obtain the known equation of the infinite-dimensional Stratonovich AOF

$$dS_t = \varphi(t, Y_t, S_t)\,dt + \Psi(t, Y_t, S_t)\,dY_t, \qquad S_0 = \xi(Y_0), \tag{1.5}$$

where $S_t \in \mathbb{R}^\infty$ is the vector of sufficient coordinates, with the estimate $\hat X_t$ of the entire vector $X_t$ included in it. Its alternative, the finite-dimensional Pugachev COF

$$d\hat X_t = \alpha_t\,\xi(t, Y_t, \hat X_t)\,dt + \beta_t\,\Theta(t, Y_t, \hat X_t)\,dY_t + \gamma_t\,dt, \tag{1.6}$$

includes the pre-chosen structural functions $\xi(t, y, z) \in \mathbb{R}^p$ and $\Theta(t, y, z) \in \mathbb{R}^{q \times m}$ of any dimensions $p$ and $q$, while $\alpha_t \in \mathbb{R}^{n \times p}$, $\beta_t \in \mathbb{R}^{n \times q}$, and $\gamma_t \in \mathbb{R}^n$ are its coefficients to be optimized.

Unlike this, we search for the estimate $Z_t$ of part of the vector $X_t$ in the class of implicit functionals $Z_t = F_t[Y_0^t]$ of all measurements, which is narrower than the classical class of explicit functionals $\hat X_t = \hat F_t[Y_0^t]$ leading to (1.5). We define it by the following differential equation of the $n'$-dimensional OSF with its initial condition:

$$dZ_t = f(t, Y_t, Z_t)\,dt + G(t, Y_t, Z_t)\,dY_t, \qquad Z_0 = h(Y_0). \tag{1.7}$$


Here, the drift $f(t, y, z) \in \mathbb{R}^{n'}$, diffusion $G(t, y, z) \in \mathbb{R}^{n' \times m}$, and initial $h(y) \in \mathbb{R}^{n'}$ functions are arbitrary nonlinearities that control the estimation process at the current and initial instants, respectively. As usual, we find the function $h(y)$ of the initial state of the filter from the condition of minimum of the variable $I_0$. We obtain the drift $f(\cdot)$ and diffusion $G(\cdot)$ functions so that the estimate is unbiased (see (1.2)) and mean square error (1.3) is minimum at each instant $t > 0$. For the sake of brevity, we combine these functions into one filter control function

$$u(t, y, z) = (f(t, y, z), G(t, y, z)) \tag{1.8}$$

and distinguish between its section $u_t(\cdot)$ at the instant $t$ (the current control) and the set $u_0^t(\cdot)$ of all previous sections (previous controls):

$$u_t(y, z) = u(t, y, z), \qquad u_0^t(\cdot) = \{u_\tau(\cdot) : 0 \le \tau < t\}.$$

For the admissible sets $\Phi \ni u(\cdot)$ and $\Psi \ni h(\cdot)$ of Borel measurable sought OSF functions (1.7) and for the functions of original system (1.1) and its initial conditions, we make the following assumptions.

(1) The function $h(\cdot) \in \Psi$ of the initial state of the filter and the initial conditions of system (1.1) have finite second moments, so that $\mathrm M[|X_0|^2 + |Y_0|^2 + |Z_0|^2] < \infty$.

(2) The filter control function $u(\cdot) \in \Phi$ is piecewise right continuous with respect to time $t$: $u_{t-0}(\cdot)$ does not necessarily coincide with $u_{t+0}(\cdot) = u_t(\cdot)$. Therefore, the current control does not depend on the previous controls: $u_t(\cdot) = \mathrm{inv}[u_0^t(\cdot)]$.

(3) The functions of the filter $u(\cdot) \in \Phi$ and of the original system $a(\cdot)$, $B(\cdot)$, $c(\cdot)$, and $D(\cdot)$ for any $t$ satisfy the conditions of Lipschitz continuity with respect to $x, y, z$ and of bounded rate of growth as $|x| + |y| + |z| \to \infty$, which is sufficient [25] for the strong solution $(X_t, Y_t, Z_t)$ of the system of equations (1.1), (1.7) to exist and be unique.

Note that conditions (1) and (3) ensure that all second moments are finite, i.e., $\mathrm M[|X_t|^2 + |Y_t|^2 + |Z_t|^2] < \infty$ for any $t > 0$ as well. In particular, this ensures the existence of accuracy characteristics (1.2), (1.3) for the chosen class of estimates (1.7).

1.2. Justifying the Local Criterion

It is challenging to find the structure of finite-dimensional filter (1.7) that minimizes mean square error (1.3) at any instant $t$, i.e., is instantaneously optimal, by the traditional method of Lagrange multipliers. Indeed, there is a family of Mayer terminal control problems parameterized by time $t$, on the segment $\tau \in [0, t]$, for stochastic differential system (1.1), (1.7) with criterion (1.3). Given certain smoothness conditions, the latter can be represented as a common integral

$$I_t = \int \psi(t, x', z)\, \tilde r(t, x, y, z)\, dx\, dy\, dz, \tag{1.9}$$

where $\tilde r(\tau, x, y, z)$ is the joint probability density of the states $X_\tau, Y_\tau, Z_\tau$ of the observable system and the filter at the instant $\tau$. Here and in what follows, unless otherwise stated, the integrals are calculated over the entire respective Euclidean space. The density $\tilde r(\tau, \cdot)$ on the entire segment $\tau \in [0, t]$ satisfies the respective FPK partial differential equation, whose drift and diffusion coefficients include the sought control function $u(\tau, y, z)$. Hence, at the instant $t$, the density $\tilde r(t, \cdot)$ is a functional of all previous controls $u_0^t(\cdot)$, and to synthesize the filter, one needs to solve the optimal control problem for the system with distributed parameters given by the FPK equation. For any $t$, this program control should ensure the minimum of terminal criterion (1.9), which is linear with respect to the stated density. By the Pontryagin maximum principle, the necessary optimality condition of each problem of this family represents a two-point boundary value problem for a system of two partial differential equations. The latter consists of the FPK equation for the probability density and the equation for the corresponding conjugate function $\lambda(t, x, y, z)$. The right boundary condition (at the instant $\tau = t$) for this function is given by the loss function $\psi(t, x', z)$ [22]. As a result, the solution of a "short" problem from this family is not necessarily a part of the solution of a "longer" one.


Therefore, following Pugachev, we choose structure (1.8) of OSF equation (1.7) (the program control $u(\cdot)$) from the admissible set $\Phi$ using another condition based on the following fitting (alignment) idea. Suppose at an arbitrary instant $t > 0$ the value $I_t[u_0^t(\cdot)]$ of criterion (1.3) has already been attained by some choice of the previous controls $u_0^t(\cdot)$ on the interval $[0, t)$ and, by assumption (2) on piecewise continuity, it is independent of the current control:

$$I_t[u_0^t(\cdot)] = \mathrm{inv}[u_t(\cdot)]. \tag{1.10}$$

We take an arbitrarily short future time segment $[t, t + \Delta t]$, $\Delta t > 0$, and choose the control $u_t^{t+\Delta t}(\cdot)$ on it minimizing the value of the criterion $I_{t+\Delta t}$:

$$u_t^{t+\Delta t, o}(\cdot) = \arg\min_{u_t^{t+\Delta t}(\cdot) \in \Phi} I_{t+\Delta t}[u_0^t(\cdot), u_t^{t+\Delta t}(\cdot)].$$

Passing to the limit as $\Delta t \downarrow 0$ and using $u_t(\cdot) = \lim_{\Delta t \downarrow 0} u_t^{t+\Delta t}(\cdot)$, we obtain the problem of searching for the best section

$$u_t^o(\cdot) = \arg\min_{u_t(\cdot) \in \Phi}\Big( \lim_{\Delta t \downarrow 0} I_{t+\Delta t}[u_0^t(\cdot), u_t^{t+\Delta t}(\cdot)] \Big). \tag{1.11}$$

To solve it, we consider the increment $\Delta_{t, \Delta t} I = I_{t+\Delta t} - I_t$ of the criterion over this interval, which is determined by the nearest future control $u_t^{t+\Delta t}(\cdot)$. We introduce the right-hand derivative of the criterion with respect to time,

$$\frac{dI_t}{dt}[u_t(\cdot)] = \lim_{\Delta t \downarrow 0} \frac{1}{\Delta t}\, \Delta_{t, \Delta t} I[u_t^{t+\Delta t}(\cdot)], \tag{1.12}$$

which depends directly only on the sought section $u_t(\cdot)$, since the current control is independent of the already selected previous controls $u_0^t(\cdot)$. Then, instead of (1.11), we have the following simpler condition.

Theorem 1. If the criterion $I_t$ has derivative (1.12) with respect to time and it is finite, $dI_t/dt < \infty$, then, for a fixed current value $I_t$, its value $I_{t+\Delta t}$ at the future, arbitrarily close instant $t + \Delta t$ is optimized by choosing the best section $u_t^o(\cdot)$ of the control so that this derivative is minimum:

$$\frac{dI_t}{dt}[u_t^o(\cdot)] \le \frac{dI_t}{dt}[u_t(\cdot)] \quad \forall u_t(\cdot) \in \Phi, \ \forall t > 0. \tag{1.13}$$

The principal propositions are proved in the Appendix.

Emphasizing the need to take into account independence (1.10) of the attained value of the criterion $I_t$ from the sought section, we write condition (1.13) as

$$u_t^o(\cdot) = \arg\min_{u_t(\cdot) \in \Phi} J_t[u_t(\cdot)] \quad \forall t > 0, \tag{1.14}$$

where $J_t$ is derivative (1.12) found under this independence:

$$J_t[u_t(\cdot)] = \left.\frac{dI_t}{dt}[u_t(\cdot)]\right|_{I_t[u_0^t(\cdot)] = \mathrm{inv}}. \tag{1.15}$$

Functional (1.15) is known as the local optimality criterion [26], and condition (1.14) was applied in control problems for distributed systems [27]. As shown above, it reduces the conditional extremum of the parameterized functional $I_t$ to the unconditional extremum of its right-hand derivative $dI_t/dt$.

Corollary 1. If the controls $u_\tau^o(\cdot)$ chosen at each instant $\tau$ on the interval $\tau \in [0, t)$ are locally optimal, they ensure optimality of the criterion as well:

$$I_t[u_0^{t,o}(\cdot)] \le I_t[u_0^t(\cdot)] \quad \forall u_0^t(\cdot) \in \Phi, \ \forall t > 0. \tag{1.16}$$


In what follows, we omit the optimality symbol $o$.

Thus, local optimality (1.14) of the estimate $Z_t$ of proposed filter (1.7) for all $t > 0$ leads to its global instantaneous optimality (1.3), but only among estimates from this bounded class of implicit measurement functionals. In contrast, the estimate $\hat X_t$ of Stratonovich AOF (1.5) is instantaneously optimal in the broader class of all explicit measurement functionals. Similarly to Corollary 1, both of them are also integrally optimal on any time segment. Obviously, OSF class (1.7) includes Pugachev COF (1.6) for $n' = n$. However, being finite-dimensional, filter (1.7) itself is included in the set of infinite-dimensional Stratonovich AOFs (1.5). Therefore, with respect to estimation accuracy, the proposed OSF must occupy an intermediate position, being a fortiori worse than the AOF yet potentially better than the COF:

$$I_t(\mathrm{AOF}) \le I_t(\mathrm{OSF}) \le I_t(\mathrm{COF}). \tag{1.17}$$

2. FINDING OPTIMAL FILTER STRUCTURE

First, we find the explicit dependence of derivative (1.12) of mean square error (1.3) on the respective section of control function (1.8) of filter (1.7).

2.1. Explicit Form of Local Criterion

Suppose $A \subset \mathbb{R}^n$, $B \subset \mathbb{R}^m$, and $C \subset \mathbb{R}^{n'}$ are arbitrary Borel subsets of the respective Euclidean spaces; $r(t, A, B, C) = \mathrm P[X_t \in A, Y_t \in B, Z_t \in C]$ is the joint probability (probabilistic measure) of the random variables $X_t, Y_t, Z_t$; and $r_t(A, B, C)$ is its section at the fixed instant $t$. In what follows, for the sake of brevity, we use angle brackets to designate the mathematical expectation of the random variable $\varphi(t, X_t, Y_t, Z_t)$, where $\varphi(t, x, y, z)$ is a Borel function on $\mathbb{R}^{n + m + n'}$. It can also be specified as the Lebesgue–Stieltjes integral of the function averaged with respect to the measure $r_t(\cdot)$:

$$\langle \varphi(t, x, y, z), r_t \rangle = \mathrm M[\varphi(t, X_t, Y_t, Z_t)] = \int \varphi(t, x, y, z)\, r(t, dx, dy, dz). \tag{2.1}$$

Then, mean square error (1.3) takes the form of a functional more general than (1.9), linear with respect to $r_t(\cdot)$:

$$I_t = \langle \psi(t, x', z), r_t \rangle. \tag{2.2}$$

Together with independence $I_t = \mathrm{inv}[u_t(\cdot)]$ of the mean square error from the current control, this means that the section of probability is independent of it as well: $r_t(\cdot) = \mathrm{inv}[u_t(\cdot)]$.

Combining (1.1) with (1.7), we find that the vector $V_t = (X_t^{\mathrm T}\ Y_t^{\mathrm T}\ Z_t^{\mathrm T})^{\mathrm T}$ has the stochastic differential $dV_t = \varepsilon(t, V_t)\,dt + \Theta(t, V_t)\,dW_t$, where

$$\varepsilon = [a^{\mathrm T}\ \ c^{\mathrm T}\ \ (f + Gc)^{\mathrm T}]^{\mathrm T}, \qquad \Theta = [B^{\mathrm T}\ \ D^{\mathrm T}\ \ (GD)^{\mathrm T}]^{\mathrm T}.$$

Suppose $\varphi(t, \mathrm v)$ is an arbitrary non-random scalar function continuously differentiable once with respect to $t$ and twice with respect to $\mathrm v$. Then the random process $\varphi(t, V_t)$ also has a stochastic differential, which can be found by the Ito formula [7, 25]:

$$d\varphi(t, V_t) = \Big\{ \frac{\partial}{\partial t}\varphi(t, V_t) + K^*\varphi(t, V_t) \Big\}\,dt + [\nabla^{\mathrm T}\varphi(t, V_t)]\,\Theta(t, V_t)\,dW_t, \tag{2.3}$$

where $K^*$ is the conjugate generating operator of the process $V_t$:

$$K^* = \varepsilon^{\mathrm T}\nabla + \frac{1}{2}\,\mathrm{tr}[\Theta\Theta^{\mathrm T}\nabla\nabla^{\mathrm T}]. \tag{2.4}$$

Here, $\nabla = (\nabla_x^{\mathrm T}, \nabla_y^{\mathrm T}, \nabla_z^{\mathrm T})^{\mathrm T}$ is the gradient operator and $\mathrm{tr}$ is the matrix trace operator.

If the function $\varphi(t, V_t)$ has the mathematical expectation $\mathrm M[\varphi(t, V_t)]$, then, given that the product $\nabla^{\mathrm T}\varphi(t, V_t)\,\Theta(t, V_t)$ is independent of $dW_t$ and given the equality $\mathrm M[dW_t] = 0$, we can obtain from (2.3)

$$d\,\mathrm M[\varphi(t, V_t)] = \mathrm M\Big\{ \frac{\partial}{\partial t}\varphi(t, V_t) + K^*\varphi(t, V_t) \Big\}\,dt,$$


or, using the designation $\mathrm v = (x, y, z)$ and (2.1),

$$\frac{d}{dt}\langle \varphi(t, \mathrm v), r_t \rangle = \Big\langle \frac{\partial}{\partial t}\varphi(t, \mathrm v) + K^*\varphi(t, \mathrm v),\, r_t \Big\rangle \quad \forall \varphi(t, \mathrm v) \in C^{1,2}. \tag{2.5}$$

Since $\varphi(t, \mathrm v)$ is arbitrary, this relation is a linear integro-differential identity that, together with the obvious initial condition

$$r_0(A, B, C) = \int_B \eta_0(A \,|\, y)\, \chi_C[h(y)]\, q_0(dy), \tag{2.6}$$

completely defines the probabilistic measure $r(t, A, B, C)$ differentiable with respect to $t$. Here, $\chi_C(z)$ is the indicator of the set $C$: $\chi_C(z) = 1$ for $z \in C$ and $\chi_C(z) = 0$ for $z \notin C$.

If this measure is absolutely continuous with respect to the Lebesgue measure, i.e., there exists the density $\tilde r(t, x, y, z)$ such that

$$r(t, A, B, C) = \int_{A \times B \times C} \tilde r(t, x, y, z)\, dx\, dy\, dz,$$

then (2.5) gives the generalized solution of the respective FPK equation [28]. One can use any existing method for analyzing stochastic differential systems [7, 22, 26, 29] to find the probability $r(t, A, B, C)$ from (2.5), (2.6), given the known distribution $q_0(B)$ of the initial measurement $Y_0$.

In what follows, we use one or two primes to designate the $n'$-row upper and the $(n - n')$-row lower blocks of any $n$-row matrix $\Sigma \in \mathbb{R}^{n \times q}$,

$$\Sigma^{\mathrm T} = [\Sigma'^{\,\mathrm T}\ \ \Sigma''^{\,\mathrm T}], \qquad \Sigma' \in \mathbb{R}^{n' \times q}, \quad \Sigma'' \in \mathbb{R}^{(n - n') \times q},$$

respectively.

Lemma 1. If assumptions (1)–(3) are met, the local mean square optimality criterion (1.15) of filter (1.7) represents the following explicit functional of the sections of its structural functions (1.8):

$$J_t[f_t, G_t] = \big\langle \mathrm{tr}\big\{ L_t \big[ 2\big(f_t(y,z) + G_t(y,z)\,c(t,x,y)\big)(z - x')^{\mathrm T} - 2 S'(t,x,y)\, G_t^{\mathrm T}(y,z) + G_t(y,z)\, R(t,x,y)\, G_t^{\mathrm T}(y,z) \big] \big\},\, r_t \big\rangle + C(t), \tag{2.7}$$

where the section $r_t(\cdot)$ of the measure and the term $C(t)$ are independent of these functions: $(r_t(\cdot);\, C(t)) = \mathrm{inv}[u_t]$.

2.2. Sufficient Condition for the Estimate to Be Unbiased

Since the section $r_t(\cdot)$ of the probability was proved in Lemma 1 to be independent of the section $u_t(\cdot)$ of the control, one can choose the latter without taking this influence into account, unlike what is usually done. Note that the control function $u_t(y, z)$ of filter (1.7) is independent of the variable $x$. Therefore, we introduce the partial probability only for the variables $Y_t, Z_t$,

$$s(t, B, C) = \int r(t, dx, B, C), \tag{2.8}$$

and the conditional probability $\rho(t, A \,|\, y, z) = \mathrm P[X_t \in A \,|\, Y_t = y, Z_t = z]$. The latter is given as the Radon–Nikodym derivative of the joint measure $r(\cdot)$ with respect to the partial probability,

$$\rho(t, A \,|\, y, z) = \frac{dr(t, A, \cdot, \cdot)}{ds(t, \cdot, \cdot)}(y, z), \tag{2.9}$$

so that the formula for multiplying probabilities takes the form

$$r(t, dx, dy, dz) = \rho(t, dx \,|\, y, z)\, s(t, dy, dz). \tag{2.10}$$

The conditional probability $\rho(\cdot)$ exists since the joint measure $r(\cdot)$ is absolutely continuous with respect to the partial measure $s(\cdot)$ [30, 31].
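Decomposition (2.8)–(2.10) can be checked on a toy discrete distribution. The following sketch (ours, with arbitrary numbers) verifies that averaging with the joint measure equals conditional averaging followed by averaging over the partial measure, which is exactly relation (2.13) introduced below.

```python
import numpy as np

# Toy check of r = rho * s (2.10) and of the iterated-average relation
# (2.13) on a discrete joint distribution r[i, j] of a pair (X, W),
# where W stands for the pair (Y, Z). All numbers are arbitrary.
r = np.array([[0.10, 0.20],
              [0.30, 0.40]])            # joint probabilities, sum to 1
s = r.sum(axis=0)                       # partial measure of W (2.8)
rho = r / s                             # conditional measure of X given W (2.9)

phi = np.array([[1.0, 2.0],
                [3.0, 4.0]])            # arbitrary test function phi(x, w)
full_avg = (phi * r).sum()                       # <phi, r>, cf. (2.1)
iterated = ((phi * rho).sum(axis=0) * s).sum()   # {[[phi, rho]], s}, cf. (2.13)
assert np.isclose(full_avg, iterated)
```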


We use curly brackets, similarly to (2.1), to designate the integral of the mathematical expectation of a function $\xi(t, Y_t, Z_t)$ that depends on the measurement $Y_t$ and estimate $Z_t$ and is independent of $X_t$,

$$\{\xi(t, y, z), s_t\} = \mathrm M[\xi(t, Y_t, Z_t)] = \int \xi(t, y, z)\, s(t, dy, dz), \tag{2.11}$$

and double square brackets to designate the integral of the conditional averaging of a function $\varphi(t, X_t, Y_t, Z_t)$ only with respect to its argument $X_t$ with measure (2.9),

$$[\![\varphi(t, x, y, z), \rho_t]\!] = \mathrm M[\varphi(t, X_t, Y_t, Z_t) \,|\, Y_t = y, Z_t = z] = \int \varphi(t, x, y, z)\, \rho(t, dx \,|\, y, z), \tag{2.12}$$

or, for even more brevity, the overline, so that $[\![\varphi(t, x, y, z), \rho_t]\!] = \bar\varphi(t, y, z)$.

Then, the following relation of the three averages (2.1), (2.11), and (2.12) follows from (2.10):

$$\langle \varphi(t, x, y, z), r_t \rangle = \{\bar\varphi(t, y, z), s_t\} = \{[\![\varphi(t, x, y, z), \rho_t]\!], s_t\}. \tag{2.13}$$

Theorem 2. For the current estimate $Z_t$ of filter (1.7) to be unbiased for any partial probabilities $s_t(\cdot)$, it is sufficient for the condition

$$\overline{x' - z} = [\![x' - z, \rho_t]\!] = 0 \quad \forall t > 0 \tag{2.14}$$

to be met.

We can also write (2.14) as $\bar x' = z$ or $\mathrm M[X'_t \,|\, y, z] = z$; hence, we have $Z_t = \mathrm M[X'_t \,|\, Y_t, Z_t]$. The latter shows that the current measurement $Y_t$ does not influence the estimate $Z_t$, so that the estimate depends only on the previous values $Y_0^t$.

Corollary 2. Sufficient unbiasedness condition (2.14) ensures for any $s_t(\cdot)$ that the error $E_t = X'_t - Z_t$ is orthogonal both to the estimate and to the measurement:

$$\mathrm{cov}[E_t, Z_t] = O_{n' \times n'}, \qquad \mathrm{cov}[E_t, Y_t] = O_{n' \times m} \quad \forall t > 0. \tag{2.15}$$

Here and in what follows, $O$ is the zero matrix.

2.3. Minimizing the Local Criterion

Using relation (2.13) between the averages and given that the sought functions $f_t(y, z)$ and $G_t(y, z)$ are independent of $x$, we can represent functional (2.7) as the integral over the partial measure $s(\cdot)$:

$$J_t[f_t, G_t] = \{\xi(t, y, z, f_t(y, z), G_t(y, z)), s_t\} + C(t).$$

Here, the function $\xi(\cdot)$ of the group of variables $\zeta = (t, y, z)$, $e \in \mathbb{R}^{n'}$, and $H \in \mathbb{R}^{n' \times m}$ has the form

$$\xi(\zeta, e, H) = [\![\mathrm{tr}\{L_t[2(z - x')(e + H\,c(t, x, y))^{\mathrm T} - 2 S'(t, x, y)\,H^{\mathrm T} + H\,R(t, x, y)\,H^{\mathrm T}]\}, \rho_t]\!].$$

It is separable with respect to $e$ and $H$, $\xi(\zeta, e, H) = \xi_1(\zeta, e) + \xi_2(\zeta, H)$, with its first term linear with respect to $e$,

$$\xi_1(t, y, z, e) = 2[\![(z - x')^{\mathrm T} L_t\, e, \rho_t]\!] = 2(z - \bar x')^{\mathrm T} L_t\, e,$$

while the second term is quadratic with respect to $H$:

$$\xi_2(t, y, z, H) = \mathrm{tr}\,[\![L_t\{H\,R(t, x, y)\,H^{\mathrm T} - 2(S'(t, x, y) + (x' - z)\,c^{\mathrm T}(t, x, y))\,H^{\mathrm T}\}, \rho_t]\!]. \tag{2.16}$$

It follows from condition (2.14) that $\xi_1(\zeta, e) = 0$ for any $e$; therefore $\xi(\zeta, e, H) = \xi_2(\zeta, H)$. As a result, the local criterion of the unbiased filter depends only on its diffusion function, $J_t[f_t, G_t] = \tilde J_t[G_t] + C(t)$, where

$$\tilde J_t[G_t] = \{\xi_2(t, y, z, G_t(y, z)), s_t\}, \tag{2.17}$$

so one only needs to find the minimum of the last quadratic functional.


Theorem 3. If unbiasedness condition (2.14) is met, the unconditional minimum of local criterion (2.7) is attained on the solution of the equation

$$[\![\,G_t(y, z)\,R(t, x, y) - (x' - z)\,c^{\mathrm T}(t, x, y) - S'(t, x, y),\, \rho_t\,]\!] = O_{n' \times m} \quad \forall t > 0. \tag{2.18}$$
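As a quick numerical illustration (not part of the paper's derivation), the following sketch checks that $H^* = N\bar R^{-1}$, the solution of (2.18) with $N$ denoting the conditional mean $[\![S' + (x' - z)c^{\mathrm T}]\!]$, indeed minimizes the quadratic form $\xi_2$ from (2.16). The matrices are randomly generated assumptions.

```python
import numpy as np

# Numerical check (illustration only) that H* = N @ inv(Rbar) minimizes
# xi2(H) = tr{ L (H Rbar H^T - 2 N H^T) } from (2.16), which is what the
# optimality condition (2.18) states, with N = [[ S' + (x' - z) c^T ]].
rng = np.random.default_rng(1)
n1, m = 2, 3                                       # n' and m
L = np.eye(n1)                                     # any L > 0 yields the same H*
A = rng.normal(size=(m, m))
Rbar = A @ A.T + m * np.eye(m)                     # Rbar > 0
N = rng.normal(size=(n1, m))

xi2 = lambda H: np.trace(L @ (H @ Rbar @ H.T - 2 * N @ H.T))
H_star = N @ np.linalg.inv(Rbar)
for _ in range(100):                               # compare with perturbed H
    H = H_star + rng.normal(size=(n1, m))
    assert xi2(H_star) <= xi2(H) + 1e-12
```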

2.4. Equation for Conditional Probability

We show that the uncertainty of both the function $h(y)$ of the initial state of filter (1.7) and the probabilistic measure $q_0(B)$ of the initial measurement results in the fact that only partial probability (2.8) of the random values $(Y_t, Z_t)$, rather than the entire joint probability $r(t, A, B, C)$ of the vectors $X_t$, $Y_t$, and $Z_t$, is arbitrary, while conditional probability (2.9) satisfies a certain equation. When an attempt was made in [22] to do without this equation and to use only the identity for $r(\cdot)$ and its respective FPK equation, it resulted in an excessive term in the drift function $f(\cdot)$ that worsened the estimation accuracy.

Lemma 2. The partial probability $s(t, B, C)$ is given by the identity

$$\frac{d}{dt}\{\xi(t, y, z), s_t\} = \Big\{ \frac{\partial}{\partial t}\xi(t, y, z) + L^*_{yz}\,\xi(t, y, z),\, s_t \Big\} \quad \forall \xi(t, y, z) \in C^{1,2,2} \tag{2.19}$$

with the operator

$$L^*_{yz} = \bar c^{\,\mathrm T}\nabla_y + (f + G\bar c)^{\mathrm T}\nabla_z + \frac{1}{2}\,\mathrm{tr}\big[\bar R\,\nabla_y\nabla_y^{\mathrm T} + 2G\bar R\,\nabla_y\nabla_z^{\mathrm T} + G\bar R G^{\mathrm T}\,\nabla_z\nabla_z^{\mathrm T}\big], \tag{2.20}$$

whose coefficients $\bar c(t, y, z)$, $\bar R(t, y, z)$ depend on the conditional probability $\rho_t(\cdot)$, and with the initial condition

$$s_0(B, C) = \int_B \chi_C[h(y)]\, q_0(dy).$$

It follows from Lemma 2 that the uncertainties of $h(y)$ and $q_0(B)$ make the partial probability $s(t, B, C)$ arbitrary for any $t \ge 0$.

Theorem 4. If the conditional probability $\rho(t, A \,|\, y, z)$ is twice continuously differentiable with respect to $y$ and $z$, it is given by the integro-differential identity

$$\frac{\partial}{\partial t}[\![\varphi(t, \mathrm v), \rho_t]\!] = \Big[\!\!\Big[ \frac{\partial}{\partial t}\varphi(t, \mathrm v) + K^*\varphi(t, \mathrm v),\, \rho_t \Big]\!\!\Big] - L^*_{yz}\,[\![\varphi(t, \mathrm v), \rho_t]\!] \quad \forall \varphi(t, \mathrm v) \in C^{1,2} \tag{2.21}$$

with the initial condition $\rho_0(A \,|\, y, z) = \eta_0(A \,|\, y)$.

We show that if the conditional probability has a density $\tilde\rho(t, x \,|\, y, z)$, so that $\rho(t, dx \,|\, y, z) = \tilde\rho(t, x \,|\, y, z)\,dx$, then identity (2.21) gives the generalized solution to a certain partial differential equation for this density.

Corollary 3. Suppose the conditional probability density $\tilde\rho(t, x \,|\, y, z)$ exists and is twice continuously differentiable with respect to the variables $x$, $y$, and $z$, while the drift function $a(t, x, y)$ of original system (1.1) and its diffusion function $B(t, x, y)$ are continuously differentiable with respect to $x$ once and twice, respectively. Then this density satisfies the equation

$$\frac{\partial}{\partial t}\tilde\rho(t, x \,|\, y, z) = M^*_x\,\tilde\rho(t, x \,|\, y, z) - L^*_{yz}\,\tilde\rho(t, x \,|\, y, z), \qquad M^*_x = -\nabla_x^{\mathrm T}\,a + \frac{1}{2}\,\mathrm{tr}[\nabla_x\nabla_x^{\mathrm T}\,Q], \tag{2.22}$$

with the initial condition in the form of the known conditional probability density of the random initial value $X_0$ of the vector to be estimated: $\tilde\rho_0(x \,|\, y, z) = \tilde\eta_0(x \,|\, y)$.

Remark 1. In the expanded writing, (2.22) takes the form

$$\frac{\partial\tilde\rho}{\partial t} = -\nabla_x^{\mathrm T}(a\tilde\rho) + \frac{1}{2}\,\mathrm{tr}[\nabla_x\nabla_x^{\mathrm T}(Q\tilde\rho)] - \Big(\int c\,\tilde\rho\,dx\Big)^{\!\mathrm T}\tilde\rho_y - \Big(f + G\!\int c\,\tilde\rho\,dx\Big)^{\!\mathrm T}\tilde\rho_z - \frac{1}{2}\,\mathrm{tr}\big[\bar R\,\tilde\rho_{yy} + 2G\bar R\,\tilde\rho_{yz} + G\bar R G^{\mathrm T}\,\tilde\rho_{zz}\big],$$

i.e., it is a nonlinear integro-differential deterministic equation. Here, for the sake of brevity, we use subscripts $y$ and $z$ to designate the vectors of the first and the matrices of the second partial derivatives of the scalar


function with respect to the corresponding variables, for instance, $\tilde\rho_y = \nabla_y\tilde\rho$, $\tilde\rho_{yz} = \nabla_y\nabla_z^{\mathrm T}\tilde\rho$. In what follows, we never use the latter equation.

2.5. Explicit Form of Structural Functions of the Filter

To find the function of the initial state of filter (1.7), we minimize the functional

$$I_0[h(\cdot)] = \mathrm M[(X'_0 - h(Y_0))^{\mathrm T} L_0\,(X'_0 - h(Y_0))]$$

on the set of all Borel functions $h(\cdot) \in \Psi$. By the known theorem on the best mean square regression [30], the optimal function $h^o(y)$ exists and is the conditional mean

$$h^o(y) = \mathrm M[X'_0 \,|\, Y_0 = y] = [\![x', \eta_0]\!], \tag{2.23}$$

which is given by the known conditional probability $\eta_0(A \,|\, y)$. In what follows, we omit the superscript $o$ standing for the optimum. The optimal estimate $Z_0 = \mathrm M[X'_0 \,|\, Y_0]$ does have the finite second moment $\mathrm M[|Z_0|^2] < \infty$ stated in assumption (1), is unbiased, $\mathrm M[X'_0 - Z_0] = 0$, and its error $E_0 = X'_0 - Z_0$ possesses orthogonality properties (2.15) as well.

From extremum condition (2.18), given that the diffusion function $G(t, y, z)$ is independent of the integration variable $x$ and that $R(t, x, y) > 0$ by assumption, we have

$$G(t, y, z) = \big[\,\overline{(x' - z)\,c^{\mathrm T}}(t, y, z) + \bar S'(t, y, z)\,\big]\,\bar R^{-1}(t, y, z). \tag{2.24}$$

To obtain the explicit form of the function $f(\cdot)$, we differentiate condition (2.14) with respect to $t$.

Theorem 5. If sufficient unbiasedness condition (2.14) is met and identity (2.21) holds for the conditional probability, the drift function of filter (1.7) has the form

$$f(t, y, z) = \bar a'(t, y, z) - G(t, y, z)\,\bar c(t, y, z). \tag{2.25}$$

As a result, using (2.25), we can write optimal filter equation (1.7) in the brief form

$$dZ_t = \bar a'(t, Y_t, Z_t)\,dt + G(t, Y_t, Z_t)[dY_t - \bar c(t, Y_t, Z_t)\,dt]$$

and, taking into account (2.24), in the detailed form, via conditional means (2.12) of the functions of system (1.1):

$$dZ_t = \bar a'(t, \Upsilon_t)\,dt + \big[\,\overline{(x' - z)\,c^{\mathrm T}}(t, \Upsilon_t) + \bar S'(t, \Upsilon_t)\,\big]\,\bar R^{-1}(t, \Upsilon_t)\,[dY_t - \bar c(t, \Upsilon_t)\,dt].$$

Here, $\Upsilon_t = (Y_t, Z_t)$ is the repetitive argument.

2.6. Algorithm for Synthesizing Optimal Filter

Thus, the drift function $f(t, y, z)$ and diffusion function $G(t, y, z)$ of filter (1.7) can be expressed by formulas (2.24) and (2.25) via integrals over the conditional probabilistic measure $\rho(t, A \,|\, y, z)$. Therefore, if the initial probability $q_0(B)$ is unknown, OSF construction is reduced to the complicated problem of finding this measure from nonlinear identity (2.21) or to solving nonlinear partial integro-differential equation (2.22) for the conditional probability density. Before doing that, one has to substitute $f(\cdot)$ and $G(\cdot)$ expressed via $\rho(\cdot)$ into these relations, which makes them even more complicated to solve.

In the special case when the measure $q_0(B)$ is given, the joint measures $p_0(A, B)$ and (2.6) are known. Then we can search for the joint probability $r(t, A, B, C)$ from linear identity (2.5) or integrate the simpler FPK equation for the joint density if it exists, and then use formulas (2.24), (2.25) and representation (2.9) of the conditional probability via the joint one. In this case, the known methods of analyzing stochastic differential systems can be applied. However, this way of obtaining the OSF is also very challenging.

A handy alternative to the latter is a numerical way that finds the functions $f(\cdot)$ and $G(\cdot)$ sufficiently accurately via the following procedure of the Monte Carlo method. One needs to perform multiple step-by-step statistical simulation of the stochastic differential equations of system (1.1) and filter (1.7)


using existing difference schemes [32]. For instance, by the Euler method, for a sufficiently small integration step $\Delta t$ we have

$$\begin{aligned}
X_{k+1} &= X_k + a(t_k, X_k, Y_k)\,\Delta t + B(t_k, X_k, Y_k)\,\Delta W_k,\\
Y_{k+1} &= Y_k + c(t_k, X_k, Y_k)\,\Delta t + D(t_k, X_k, Y_k)\,\Delta W_k,\\
Z_{k+1} &= Z_k + f(t_k, Y_k, Z_k)\,\Delta t + G(t_k, Y_k, Z_k)(Y_{k+1} - Y_k),
\end{aligned} \tag{2.26}$$

where $k = 0, 1, \ldots$ is the number of the step, $V_k$ is the approximate value of the random vector $V_t$ at the clock instant $t_k = k\Delta t$, and $\Delta W_k \sim N(\Delta w \,|\, 0, \Delta t\,E)$ is the Gaussian vector of the increment of the standard Wiener process $W_t$ during the time $\Delta t$, which has zero mathematical expectation and the scalar covariance matrix $\Delta t\,E$, where $E$ is the unit matrix. Here and in what follows, $N(u \,|\, m, D)$ is the density of the Gaussian (normal) distribution of a random vector $U$ of dimension $\dim(u)$ with the parameters $m$ and $D$:

$$N(u \,|\, m, D) = [(2\pi)^{\dim(u)} \det D]^{-1/2} \exp[-(u - m)^{\mathrm T} D^{-1}(u - m)/2]. \tag{2.27}$$

Statistical simulation is to be performed step by step: using the current batch of realizations of the random variables $\{X_k, Y_k, Z_k\}$ to find sample approximations to conditional means (2.12) of the various nonlinearities of system (1.1) needed to obtain the structural functions $f(t_k, y, z)$, $G(t_k, y, z)$ from (2.24) and (2.25) alternates with generating the set of realizations of the noise $\{\Delta W_k\}$ and using (2.26) to calculate the new batch $\{X_{k+1}, Y_{k+1}, Z_{k+1}\}$. Conditional means (2.12) are found similarly to how the histogram of the density $\tilde s(t_k, y, z)$ is obtained. Taking into account that, by (2.23), the function $h(y)$ of the initial condition of the filter is also a conditional mean, we obtain the following scheme for the numerical synthesis of filter (1.7) by the step-by-step Monte Carlo method.

Proposition 1. To synthesize optimal filter (1.7) by statistical simulation, one needs to perform the following chain of calculations:

$$p_0(A, B) \xrightarrow{\text{generate}} \{X_0, Y_0\} \xrightarrow{(2.23)} h(y) \xrightarrow{(1.7)} \{Z_0\} \xrightarrow{(2.24)} G_0(y, z) \xrightarrow{(2.25)} f_0(y, z) \xrightarrow{(2.26)} \ldots$$
$$\ldots \to \{X_k, Y_k, Z_k\} \xrightarrow{(2.24)} G_{t_k}(y, z) \xrightarrow{(2.25)} f_{t_k}(y, z) \xrightarrow{(2.26)} \{X_{k+1}, Y_{k+1}, Z_{k+1}\} \to \ldots \tag{2.28}$$

However, the numerical procedure that finds the structural functions of the filter as analogues of the histogram of the multidimensional distribution of the system of random variables $Y_k$ and $Z_k$ is rather cumbersome. The second drawback of algorithm (2.28) is that these functions are obtained only as large arrays of their values at the points of a grid over all arguments. To use them further, one needs additional processing of the arrays to obtain more practical analytical representations. Therefore, what we consider below are analytical ways of exact and approximate OSF synthesis.
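A minimal sketch of one cycle of chain (2.28) for a hypothetical scalar system follows; conditional means (2.12) are approximated here by nearest-neighbour regression over the ensemble instead of the histograms mentioned above, which is just one of several possible sample estimators. All model functions and constants are illustrative assumptions.

```python
import numpy as np

# Sketch of one cycle of the step-by-step Monte Carlo synthesis (2.28)
# for a scalar system (n = n' = m = 1). Conditional means (2.12) are
# approximated by k-nearest-neighbour regression over the ensemble.
rng = np.random.default_rng(2)
a = lambda t, x, y: -x                     # plant drift (illustrative)
c = lambda t, x, y: x                      # sensor drift (illustrative)
B_, D_ = 0.5, 0.2
S, R = B_ * D_, D_**2                      # S = B D^T, R = D D^T > 0

N, dt, t = 2000, 1e-2, 0.0
X = rng.normal(size=N)                     # batch {X_k}
Y = np.zeros(N)                            # batch {Y_k}
Z = np.full(N, X.mean())                   # Z_0 = h(Y_0) = M[X_0] here

def cond_mean(vals, y, z, k=50):
    """Sample estimate of E[vals | Y_t = y, Z_t = z] from the ensemble."""
    idx = np.argsort((Y - y)**2 + (Z - z)**2)[:k]
    return vals[idx].mean()

# structural functions (2.24), (2.25) evaluated at each particle
cX = c(t, X, Y)
G = np.array([(cond_mean((X - z) * cX, y, z) + S) / R for y, z in zip(Y, Z)])
f = np.array([cond_mean(a(t, X, Y), y, z) for y, z in zip(Y, Z)]) \
    - G * np.array([cond_mean(cX, y, z) for y, z in zip(Y, Z)])

# Euler step (2.26) for plant, sensor, and filter
dW = rng.normal(scale=np.sqrt(dt), size=N)
Xn = X + a(t, X, Y) * dt + B_ * dW
Yn = Y + cX * dt + D_ * dW
Zn = Z + f * dt + G * (Yn - Y)
```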

3. ACCURATE ANALYTICAL SYNTHESIS OF THE FILTER

We give two almost obvious examples to test the algorithm.

3.1. Incomplete Linear Gaussian Case

Theorem 6. Suppose the right-hand sides of the equations of system (1.1) are linear only with respect to the part $X_t$ of its state vector that we estimate,

$$dX_t = [A(t, Y_t)\,X_t + e(t, Y_t)]\,dt + B(t, Y_t)\,dW_t, \tag{3.1}$$

$$dY_t = [C(t, Y_t)\,X_t + g(t, Y_t)]\,dt + D(t, Y_t)\,dW_t, \tag{3.2}$$

and its initial value $X_0$ is conditionally Gaussian,

$$\tilde\eta_0(x \,|\, y) = N[x \,|\, m_0(y), D_0(y)], \tag{3.3}$$

with the parameters $m_0(y) = \mathrm M[X_0 \,|\, Y_0 = y]$, $D_0(y) = \mathrm{cov}[X_0 \,|\, Y_0 = y]$. Then, if we need to estimate the entire vector $X_t$, OSF (1.7) degenerates into the semi-linear Liptser–Shiryaev filter optimal for this case [6, 7]:

$$dZ_t = [A(t, Y_t)\,Z_t + e(t, Y_t)]\,dt + G(t, Y_t)\{dY_t - [C(t, Y_t)\,Z_t + g(t, Y_t)]\,dt\}, \tag{3.4}$$

$$d\Gamma(\Lambda_t) = [A(\Lambda_t)\,\Gamma(\Lambda_t) + \Gamma(\Lambda_t)\,A^{\mathrm T}(\Lambda_t) + Q(\Lambda_t) - G(\Lambda_t)\,R(\Lambda_t)\,G^{\mathrm T}(\Lambda_t)]\,dt, \tag{3.5}$$


where the argument $\Lambda_t = (t, Y_t)$, the initial conditions have the form $Z_0 = m_0(Y_0)$, $\Gamma(0, Y_0) = D_0(Y_0)$, and the gain matrix of the updating process is given by

$$G(t, y) = [\Gamma(t, y)\,C^{\mathrm T}(t, y) + S(t, y)]\,R^{-1}(t, y). \tag{3.6}$$

Remark 2. In [7, p. 238], generalization (3.4), (3.5) of the linear Kalman–Bucy filter is treated as a filter of the order $n(n + 3)/2$, while one can also consider it a filter of the order $n$ included in class (1.7). Indeed, there is no need to solve matrix Riccati equation (3.5) as the results of observations $Y_t$ arrive, since it is a non-stochastic differential equation with the parameter $Y_t = y$. Therefore, we can also obtain the matrix $\Gamma(t, y)$ beforehand by solving this equation for each admissible $y \in \mathbb{R}^m$.

3.2. Complete Linear Gaussian Case

In what follows, we use the symbol $\oplus$ of the Moore–Penrose pseudoinverse of a square matrix of not necessarily full rank.

Corollary 4. Suppose the system of equations (1.1) is linear with respect to both variables,

$$\begin{aligned}
dX_t &= [A(t)\,X_t + K(t)\,Y_t + i(t)]\,dt + B(t)\,dW_t,\\
dY_t &= [C(t)\,X_t + L(t)\,Y_t + j(t)]\,dt + D(t)\,dW_t,
\end{aligned} \tag{3.7}$$

while the entire vector $(X_0, Y_0)$ of its initial conditions is Gaussian,

$$\tilde p_0(x, y) = N(x, y \,|\, m_0^x, m_0^y, D_0^x, D_0^y, D_0^{xy}), \tag{3.8}$$

with the parameters $m_0^\lambda = \mathrm M[\Lambda_0]$, $D_0^\lambda = \mathrm{cov}[\Lambda_0]$, $D_0^{\lambda\theta} = \mathrm{cov}[\Lambda_0, \Theta_0]$ for $\Lambda, \Theta = X, Y$. Then, if we need to estimate the entire vector $X_t$, OSF (1.7) degenerates into the linear Kalman–Bucy filter optimal for this case, which, in our case of incomplete accurate measurements, is described by the following equations [7]:

$$\begin{aligned}
dZ_t &= [A(t)\,Z_t + K(t)\,Y_t + i(t)]\,dt + G(t)\{dY_t - [C(t)\,Z_t + L(t)\,Y_t + j(t)]\,dt\},\\
d\Gamma_t &= [A(t)\,\Gamma_t + \Gamma_t\,A^{\mathrm T}(t) + Q(t) - G(t)\,R(t)\,G^{\mathrm T}(t)]\,dt,
\end{aligned} \tag{3.9}$$

with the initial conditions

$$Z_0 = m_0^x + D_0^{xy}(D_0^y)^{\oplus}(Y_0 - m_0^y), \qquad \Gamma_0 = D_0^x - D_0^{xy}(D_0^y)^{\oplus} D_0^{yx}, \tag{3.10}$$

and the gain matrix of the updating process of the form

$$G(t) = [\Gamma_t\,C^{\mathrm T}(t) + S(t)]\,R^{-1}(t). \tag{3.11}$$
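For a scalar time-invariant instance of (3.7), the gain (3.11) can be tabulated beforehand by integrating the deterministic Riccati equation from (3.9), e.g. as in the following sketch; all coefficient values are illustrative assumptions.

```python
import numpy as np

# Minimal sketch: precomputing the deterministic gain (3.11) of the
# Kalman-Bucy filter by integrating the Riccati equation from (3.9) for
# a scalar time-invariant instance of (3.7).
A_, C_ = -1.0, 1.0
B_, D_ = 0.5, 0.2
Q, S, R = B_**2, B_ * D_, D_**2

dt, n_steps = 1e-3, 5000
Gamma = 1.0                                  # Gamma_0 from (3.10)
gains = []
for _ in range(n_steps):
    G = (Gamma * C_ + S) / R                 # gain (3.11)
    gains.append(G)
    # d(Gamma) = [A Gamma + Gamma A^T + Q - G R G^T] dt, scalar form
    Gamma += (2 * A_ * Gamma + Q - G * R * G) * dt
print(Gamma, gains[-1])                      # steady-state covariance and gain
```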

4. STRUCTURE OF SUBOPTIMAL GAUSSIAN FILTER

In the general case, when the equations of system (1.1) are nonlinear or the conditional probability $\eta_0(\cdot)$ of the part of its initial state is non-Gaussian, and since algorithm (2.28) for finding the structural functions $f(\cdot)$ and $G(\cdot)$ of filter (1.7) exactly is complicated, approximate ways to calculate them analytically are topical. We consider the simplest one, based on the Gaussian approximation.

We approximate the joint probability $r(\cdot)$ by the Gaussian one with density (2.27) and define its parameters to equal the respective mathematical expectations and covariances of the approximated distribution:

$$r(t, A, B, C) \approx \int_{A \times B \times C} N(x, y, z \,|\, m_t^x, m_t^y, m_t^z, D_t^x, D_t^y, D_t^z, D_t^{xy}, D_t^{xz}, D_t^{yz})\, dx\, dy\, dz. \tag{4.1}$$

To find conditional means (2.12), we need the conditional probability.

Lemma 3. For (4.1), the conditional probability can be approximated by the Gaussian one,

$$\rho(t, A \,|\, y, z) \approx \int_A N[x \,|\, \mu(t, y, z), \Gamma_t]\, dx, \tag{4.2}$$


with the function of the conditional mean linear with respect to the variables $y$ and $z$,

$$\mu(t, y, z) = [z^{\mathrm T}\ \ \nu^{\mathrm T}(t, y, z)]^{\mathrm T}, \qquad \nu(t, y, z) = \kappa''_t + \Delta''_t\,[y^{\mathrm T}\ z^{\mathrm T}]^{\mathrm T}, \tag{4.3}$$

and the conditional covariance depending on time only,

$$\Gamma_t = D_t^x - \Delta_t\,[D_t^{xy}\ \ D_t^{xz}]^{\mathrm T}. \tag{4.4}$$

Here, the gain matrix has the form

$$\Delta_t = [D_t^{xy}\ \ D_t^{xz}] \begin{bmatrix} D_t^y & D_t^{yz} \\ D_t^{zy} & D_t^z \end{bmatrix}^{\oplus}, \tag{4.5}$$

$\Delta''_t$ are its last $(n - n')$ rows, and the shift vector can be found as

$$\kappa''_t = m_t^{x''} - \Delta''_t\,[m_t^{y\,\mathrm T}\ m_t^{z\,\mathrm T}]^{\mathrm T}. \tag{4.6}$$

Further, we use the caret symbol over a function to denote its Gaussian mean,

$$\hat\varphi(t, y, z, \mu, \Gamma) = \mathrm M_N[\varphi(t, X, y, z) \,|\, \mu, \Gamma] = \int \varphi(t, x, y, z)\, N(x \,|\, \mu, \Gamma)\, dx, \tag{4.7}$$

and use the respective operator $\mathrm M_N[\cdot \,|\, \mu, \Gamma]$. Function (4.7) is known [26] as the statistical linearization characteristic of the nonlinearity $\varphi(t, x, y, z)$ with respect to the variable $x$.
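Parameters (4.3)–(4.6) are the normal-correlation formulas applied to the moments of $(X_t, Y_t, Z_t)$; the sketch below (a toy example with a synthetic ensemble, $n = 2$, $n' = 1$, $m = 1$, not from the paper) computes them with a pseudoinverse, as in (4.5).

```python
import numpy as np

# Sketch: parameters (4.3)-(4.6) of the Gaussian conditional law (4.2)
# from sample moments, for n = 2, n' = 1, m = 1 (theorem on normal
# correlation). The ensemble below is synthetic and illustrative.
rng = np.random.default_rng(3)
N = 10000
X = rng.normal(size=(N, 2))                     # states (X', X'')
Y = X[:, :1] + 0.3 * rng.normal(size=(N, 1))    # measurement
Z = Y + 0.1 * rng.normal(size=(N, 1))           # filter state
W = np.hstack([Y, Z])                           # conditioning pair (y, z)

mX, mW = X.mean(0), W.mean(0)
Dx = np.cov(X.T)                                # D^x
Dxw = (X - mX).T @ (W - mW) / (N - 1)           # [D^xy  D^xz]
Dw = np.cov(W.T)                                # [[D^y, D^yz], [D^zy, D^z]]

Delta = Dxw @ np.linalg.pinv(Dw)                # gain (4.5)
Gamma = Dx - Delta @ Dxw.T                      # conditional covariance (4.4)
Delta2 = Delta[1:, :]                           # last n - n' rows
kappa2 = mX[1:] - Delta2 @ mW                   # shift (4.6)

def mu(y, z):
    """Conditional mean (4.3): z for X', linear in (y, z) for X''."""
    return np.concatenate([[z], kappa2 + Delta2 @ np.array([y, z])])

print(Gamma, mu(0.1, 0.2))
```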

Assertion 1. The structural functions of the Gaussian approximation to OSF (1.7) based on approximation (4.1) can be expressed via the nonlinearities of observable system (1.1) using its Gaussian means (4.7) by the formulas

$$G^N(t, y, z) = [\Gamma'_t\,\hat c_\mu^{\mathrm T}(t, y, \mu(t, y, z), \Gamma_t) + \hat S'(t, y, \mu(t, y, z), \Gamma_t)]\,\hat R^{-1}(t, y, \mu(t, y, z), \Gamma_t), \tag{4.8}$$

$$f^N(t, y, z) = \hat a'(t, y, \mu(t, y, z), \Gamma_t) - G^N(t, y, z)\,\hat c(t, y, \mu(t, y, z), \Gamma_t), \tag{4.9}$$

where $\mu(t, y, z)$ is function (4.3) and $\hat c_\mu = \partial\hat c/\partial\mu$ is the Jacobi matrix of partial derivatives. The arguments of the Gaussian means include parameters (4.4)–(4.6).

As a result, we have from (1.7), (4.9) the equation of the Gaussian approximation to the OSF

$$dZ_t = \hat a'(t, Y_t, Z_t, V_t, \Gamma_t)\,dt + G^N(t, Y_t, Z_t)[dY_t - \hat c(t, Y_t, Z_t, V_t, \Gamma_t)\,dt], \tag{4.10}$$

where the random variable $V_t = \nu(t, Y_t, Z_t)$ is found using the second function in (4.3). This equation has almost the same form and uses the same nonlinearities $\hat a, \hat c, \hat S$ as the first equation of the Gaussian approximation to the AOF known as the normal approximation filter (NAF) [7, 8]:

$$d\hat X_t = \hat a(t, Y_t, \hat X_t, P_t)\,dt + H(t, Y_t, \hat X_t, P_t)[dY_t - \hat c(t, Y_t, \hat X_t, P_t)\,dt], \tag{4.11}$$

$$dP_t = [\hat a_x(\Lambda_t)\,P_t + P_t\,\hat a_x^{\mathrm T}(\Lambda_t) - H(\Lambda_t)\,R(t, Y_t)\,H^{\mathrm T}(\Lambda_t) + \hat Q(\Lambda_t)]\,dt + \Xi(\Lambda_t) * R^{-1}(t, Y_t)[dY_t - \hat c(\Lambda_t)\,dt]. \tag{4.12}$$

Here, $\hat X_t = \mathrm M[X_t \,|\, Y_0^t]$ is the estimate of the entire state vector, $P_t = \mathrm{cov}[X_t \,|\, Y_0^t]$ is the matrix of its a posteriori covariances, $\Lambda_t = (t, Y_t, \hat X_t, P_t)$ is a bulky argument,

$$H(t, y, z, P) = [P\,\hat c_z^{\mathrm T}(t, y, z, P) + \hat S(t, y, z, P)]\,R^{-1}(t, y),$$

and $*$ is the symbol of the operation of multiplication of the three-dimensional matrix $\Xi = \|\Xi_{ijk}\|$, $i, j = 1, \ldots, n$, $k = 1, \ldots, m$, by a vector $u$:

$$(\Xi * u)_{ij} = \sum_{k=1}^m \Xi_{ijk}\,u_k.$$

The elements of this matrix are given by

$$\Xi_{ijk}(t, y, z, P) = \mathrm M_N[\mathring X_i\,\mathring X_j\,\mathring c_k + \mathring X_i\,S_{jk} + \mathring X_j\,S_{ik} \,|\, z, P],$$


where $\mathring X = X - z$ and $\mathring c = c(t,X,y) - \hat c(t,y,z,P)$. The initial conditions for these equations are random values of functions of the mathematical expectation and covariance of the conditional distribution $\eta_0$: $\hat X_0 = \mathrm M[X_0|Y_0]$, $P_0 = \mathrm{cov}[X_0|Y_0]$. Equations (4.10) and (4.11) even coincide when restriction (1.4) is met and the entire vector $X_t$ is estimated, when $n' = n$ and the argument $V_t$ vanishes, because then $\mu(t,y,z) = z$. The principal difference between (4.10) and (4.11) lies in the arguments of the nonlinearities. The random covariance $P_t$, which is obtained in the NAF algorithm by special equation (4.12) that increases the order of this filter up to $n(n+3)/2$ and includes the complicated matrix $\hat\Xi(\cdot)$, is replaced in (4.10) by the deterministic covariance $\Gamma_t$ calculated beforehand using (4.4).

Finally, for the given probability $q_0(B)$, we can find a linear approximation to nonlinear function (2.23) of the initial OSF state by the similar Gaussian approximation of the distribution of the entire initial state $(X_0, Y_0)$ of system (1.1):

$$p_0(A,B) \approx \int_{A\times B} N(x, y\,|\,m_0^x, m_0^y, D_0^x, D_0^y, D_0^{xy})\,dx\,dy.$$

Then, by the theorem on normal correlation, we immediately obtain the approximation itself and its parameters:

$$\hat h_N(y) = H_0y + e_0, \qquad H_0 = D_0^{x'y}(D_0^y)^{\oplus}, \qquad e_0 = m_0^{x'} - H_0m_0^y. \tag{4.13}$$

Therefore, the initial condition for (4.10) can be chosen to be both accurate, $Z_0 = \mathrm M[X_0'|Y_0]$, and approximately Gaussian, $Z_0 = H_0Y_0 + e_0$.

After the form of the functions of Gaussian approximations (4.8) and (4.9) is found, one needs to find the numerical parameters of Gaussian OSF (4.10) to synthesize it. These are the deterministic covariance $\Gamma_t$ and the coefficients $\Delta_t''$, $H_0$ and the free terms $\kappa_t''$, $e_0$ of linear functions (4.3), (4.13). If an OSF of the full order $n' = n$ is synthesized, only $\Gamma_t$, $H_0$, $e_0$ are left. For the unknown probability $q_0(B)$, to obtain the current parameters (4.4)–(4.6), one needs to find the conditional probability $\rho_t(\cdot)$ beforehand from identity (2.21), where one can use approximations (4.8) and (4.9). If the probability $q_0(B)$ is given, they can be found using identity (2.5).

However, one can note that all these parameters can be expressed via the first two moments $m$ and $D$ of the known random variables. Therefore, the problem is reduced to finding only the mathematical expectations and covariances of both the vector $(X_0, Y_0)$ of the initial conditions and the current state vector $(X_t, Y_t, Z_t)$ of nonlinear stochastic system (1.1), (4.10). To solve this problem, one can also apply Monte Carlo method (2.28) step by step. Yet now, having used difference schemes (2.26) with functions (4.8) and (4.9), instead of the complex construction of histograms we only need a simple procedure of calculating sample values of the sought moments, after which (4.4)–(4.6) give the current parameters and (4.13) the initial ones. This leads to the following algorithm.

P r o p o s i t i o n 2. To find the parameters of Gaussian filter of suboptimal structure (4.10), one needs to perform the following computations:

$$p_0(A,B) \xrightarrow{\text{generate}} \{X_0, Y_0\} \to (m_0^x, m_0^y, D_0^x, D_0^y, D_0^{xy}) \xrightarrow{(4.13)} (H_0, e_0) \xrightarrow{(1.7)} \{Z_0\} \xrightarrow{(2.26)} \dots$$
$$\dots \to \{X_k, Y_k, Z_k\} \to (m_k, D_k) \xrightarrow{(4.5)} \Delta_{t_k}'' \xrightarrow{(4.4)} \Gamma_{t_k} \xrightarrow{(4.6)} \kappa_{t_k}'' \xrightarrow{(4.3)} \mu_{t_k}(y,z) \xrightarrow{(4.8)} G^N_{t_k}(y,z) \xrightarrow{(4.9)} f^N_{t_k}(y,z) \xrightarrow{(2.26)} \{X_{k+1}, Y_{k+1}, Z_{k+1}\} \to \dots \tag{4.14}$$

where $m_k = (m_k^x, m_k^y, m_k^z)$ and $D_k = (D_k^x, D_k^y, D_k^z, D_k^{xy}, D_k^{xz}, D_k^{yz})$ are the sets of moments.

At the same time, at each simulation cycle $k$, one can easily obtain sample approximations to the moments $m_{t_k}^{\varepsilon}$ and $D_{t_k}^{\varepsilon}$ of the estimation error $E_t = X_t' - Z_t$ to check whether the estimate is unbiased, i.e., $m_{t_k}^{\varepsilon} \approx 0$, and to analyze the accuracy of the Gaussian filter, since mean square error (1.3) is expressed via them as $I_t = \mathrm{tr}[L_t(D_t^{\varepsilon} + m_t^{\varepsilon}(m_t^{\varepsilon})^{\mathrm T})]$.
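As an illustration of one cycle of scheme (4.14), the sketch below extracts the sample moments from an ensemble and forms the filter parameter for the full-order case, where (as Remark 4 shows later for system (6.1)) only $\Gamma_t = D_t^x - D_t^z$ remains; the reduced-order case would instead assemble $\Delta_t''$, $\Gamma_t$, $\kappa_t''$ from the cross-moments by (4.4)–(4.6). The function name and signature are ours.

```python
import numpy as np

def gaussian_osf_cycle_moments(X, Z, L=None):
    """Sample moments for one cycle k of scheme (4.14), full-order case:
    X, Z are (N, n) ensembles of plant states and estimates produced by
    difference scheme (2.26)/(6.6).  Returns the parameter Gamma_t and the
    error moments used to check unbiasedness and accuracy."""
    D_x = np.atleast_2d(np.cov(X, rowvar=False))   # sample covariance D_k^x
    D_z = np.atleast_2d(np.cov(Z, rowvar=False))   # sample covariance D_k^z
    Gamma = D_x - D_z                              # full-order parameter, cf. (6.8)
    E = X - Z                                      # estimation errors E_t = X_t - Z_t
    m_eps = E.mean(axis=0)                         # should stay near zero (unbiasedness)
    D_eps = np.atleast_2d(np.cov(E, rowvar=False))
    if L is None:
        L = np.eye(X.shape[1])
    I_t = np.trace(L @ (D_eps + np.outer(m_eps, m_eps)))   # error (1.3)
    return Gamma, m_eps, D_eps, I_t
```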


5. SUBOPTIMAL LINEARIZED FILTER

We consider a simpler way of approximate analytical search for the structural functions of Gaussian OSF (4.10) that helps avoid the complicated integral procedure (4.7) used to calculate the statistical linearization characteristics of the nonlinearities. Since it leverages an additional Taylor approximation of the nonlinearities of the observable system, one can apply it given their certain smoothness and for small intensities of the perturbations and spread of the initial conditions.

Suppose the nonlinearities of system (1.1) are continuously differentiable with respect to the variable $x$. Then, its equations can be linearized with respect to the components $X_t'$ and $X_t''$ of the vector $X_t$ in the neighborhood of their Gaussian estimates $Z_t$ and $V_t = v(t,Y_t,Z_t)$, respectively, where $v(\cdot)$ is the function from (4.3), assuming that the errors $X_t' - Z_t$, $X_t'' - V_t$ are sufficiently small. This yields approximations to the nonlinearities that are linear with respect to $x$:

$$a(t,x,y) \approx a(\lambda) + a_z(\lambda)(x' - z) + a_v(\lambda)(x'' - v), \qquad B(t,x,y) \approx B(\lambda),$$
$$c(t,x,y) \approx c(\lambda) + c_z(\lambda)(x' - z) + c_v(\lambda)(x'' - v), \qquad D(t,x,y) \approx D(\lambda), \tag{5.1}$$

where $\lambda = (t,z,v,y)$ is the repetitive argument and $a_z = \partial a/\partial z$ is the Jacobi matrix. Then, to find approximations of the statistical linearization characteristics, we apply Gaussian averaging (4.7) to these functions. Since, by (4.3), $\mu = [z^{\mathrm T}\ v^{\mathrm T}]^{\mathrm T}$ and (4.2) leads to $\mathrm M_N[X_t'\,|\,\mu,\Gamma] = z$, $\mathrm M_N[X_t''\,|\,\mu,\Gamma] = v$, we can easily obtain

$$\hat a'(t,y,\mu,\Gamma) \approx a'(\lambda), \quad \hat c(t,y,\mu,\Gamma) \approx c(\lambda), \quad \hat S'(t,y,\mu,\Gamma) \approx S'(\lambda), \quad \hat R(t,y,\mu,\Gamma) \approx R(\lambda)$$

and $\hat c_\mu(t,y,\mu,\Gamma) \approx [c_z(t,z,v,y)\ \ c_v(t,z,v,y)]$. Substituting these expressions into (4.8), (4.9), we arrive at the following obvious conclusion.

A s s e r t i o n 2. The structural functions of the linearized approximation to OSF (1.7) based on Gaussian (4.1) and Taylor (5.1) approximations can be directly expressed via the nonlinearities of observable system (1.1) and the partial derivatives of one of them by the formulas

$$G^L(t,y,z) = [\Gamma_t'F^{\mathrm T}(t,z,v(t,y,z),y) + S'(t,z,v(t,y,z),y)]\,R^{-1}(t,z,v(t,y,z),y),$$
$$f^L(t,y,z) = a'(t,z,v(t,y,z),y) - G^L(t,y,z)\,c(t,z,v(t,y,z),y), \tag{5.2}$$

where $v(t,y,z)$ is the function from (4.3), whereas $F(t,z,v,y) = [c_z(t,z,v,y)\ \ c_v(t,z,v,y)]$.

As a result, the equation of the linearized approximation to OSF has the form

$$dZ_t = a'(t,Z_t,v(t,Y_t,Z_t),Y_t)\,dt + G^L(t,Y_t,Z_t)[dY_t - c(t,Z_t,v(t,Y_t,Z_t),Y_t)\,dt]. \tag{5.3}$$

It resembles the first equation of the linearized approximation to AOF known as the extended Kalman filter (EKF) [7, 8]

$$d\hat X_t = a(t,\hat X_t,Y_t)\,dt + K(t,\hat X_t,Y_t,P_t)[dY_t - c(t,\hat X_t,Y_t)\,dt],$$
$$dP_t = [a_z(\Lambda_t)P_t + P_ta_z^{\mathrm T}(\Lambda_t) + Q(\Lambda_t) - K(\Lambda_t,P_t)R(t,Y_t)K^{\mathrm T}(\Lambda_t,P_t)]\,dt, \tag{5.4}$$

where $\Lambda_t = (t,\hat X_t,Y_t)$ is the repetitive argument and the gain coefficient has the form

$$K(t,z,y,P) = [Pc_z^{\mathrm T}(t,z,y) + S(t,z,y)]\,R^{-1}(t,y).$$

Like the pair (4.10), (4.11), these equations also almost coincide when the entire vector $X_t$ is estimated and restriction (1.4) is met, the deterministic covariance $\Gamma_t$ being their important difference.

R e m a r k 3. If we do not use Gaussian feature (4.1) of the joint distribution $r(\cdot)$ for linearization, we can again obtain, similarly, approximations (5.2) for optimals (2.24), (2.25) directly from (2.12), (4.2), and (5.1), yet with the unknown functions $\Gamma'(t,y,z)$ and $v(t,y,z)$.

To conclude, note that the parameters $\Delta_t''$, $\kappa_t''$, $\Gamma_t'$ of linearized OSF (5.3) can be calculated fully according to above-given algorithm (4.14) for finding the parameters of the Gaussian OSF.
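To make Assertion 2 concrete, here is a minimal sketch of one Euler step of (5.3) in the full-order case $n' = n$, where the argument $v(\cdot)$ drops out and $F$ reduces to the Jacobian $c_z$; the callables `a`, `c`, `c_z`, `S`, `R` are assumptions standing for the model functions of (1.1).

```python
import numpy as np

def losf_full_order_step(z, y, dy, dt, t, Gamma, a, c, c_z, S, R):
    """One Euler step of linearized OSF (5.3) for n' = n: gain (5.2)
    degenerates to G^L = [Gamma c_z^T + S] R^{-1}, drift to f^L = a - G^L c,
    with all functions evaluated at the current estimate z."""
    G_L = (Gamma @ c_z(t, z, y).T + S(t, z, y)) @ np.linalg.inv(R(t, y))
    return z + a(t, z, y) * dt + G_L @ (dy - c(t, z, y) * dt)
```

Note that only the precomputed deterministic $\Gamma_t$ enters the gain, so, unlike EKF (5.4), no covariance equation has to be integrated online.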


6. INACCURATE MEASUREMENTS

We consider the equations of the special case of observable system (1.1)

$$dX_t = a(t,X_t)\,dt + B(t,X_t)\,dW_t, \qquad dY_t = c(t,X_t)\,dt + D(t,X_t)\,dW_t \tag{6.1}$$

with the initial conditions $X_0 \sim \eta_0(A)$, $Y_0 = 0$. Then, the equations of Stratonovich AOF (1.5) and Pugachev COF (1.6) have the simpler forms [7, 18]

$$dS_t = \varphi(t,S_t)\,dt + \Psi(t,S_t)\,dY_t, \qquad S_0 = \xi, \tag{6.2}$$

$$dZ_t = \alpha_t\xi(t,Z_t)\,dt + \beta_t\Theta(t,Z_t)\,dY_t + \gamma_t\,dt. \tag{6.3}$$

We also simplify OSF (1.7) to be synthesized. Now, we search for the following equation with the deterministic initial condition for the estimate $Z_t$:

$$dZ_t = f(t,Z_t)\,dt + G(t,Z_t)\,dY_t, \qquad Z_0 = h. \tag{6.4}$$

We find its structural functions $f(t,z)$ and $G(t,z)$, as above, under unbiasedness requirements (1.2) of the estimate $Z_t$ and its local optimality (1.14), while we find the variable $h$ by minimizing the initial mean square error $I_0$ directly.

We also consider assumptions similar to (1) and (2), but without the variable $y$ and the vector $Y_0$, to be met, while the sufficient existence and uniqueness conditions of the strong solution of the system of equations (6.1) and (6.4) [33] are met instead of (3). Therefore, for any $t \ge 0$, the second moments of the plant's and filter's states are also finite, $\mathrm M(|X_t|^2 + |Z_t|^2) < \infty$, which ensures existence of mean square error (1.3) for class of estimates (6.4) as well.

As a result, we have the problem that is the special case of the one considered above; to solve it, one only needs to remove the variable $y$ from the previous results. This solution was obtained directly from the problem statement in [22], yet for the simpler case of estimating all components of the state vector $X_t$, with the noises of the plant and sensor being independent, $S = BD^{\mathrm T} = O$, and given that there exists the probability density $\tilde r(t,x,z)$, so that the FPK equation could be applied. However, the equation for conditional density (2.22) was not used there, which resulted in an excessive term in the drift function $f(\cdot)$ that worsened the estimation accuracy. Therefore, we consider the following reasoning.

It follows from (6.1) and (6.4) that now $(X_t,Z_t)$ is a diffusion process with the drift $\varepsilon = [a^{\mathrm T}\ (f + Gc)^{\mathrm T}]^{\mathrm T}$ and diffusion $\Theta = [B^{\mathrm T}\ (GD)^{\mathrm T}]^{\mathrm T}$ functions. It is fully characterized by the joint probability $r_t(A,C) = \mathrm P[X_t \in A, Z_t \in C]$ that is defined by integral identity (2.5) as well, now with the function $\varphi(t,x,z)$; it has the same operator (2.4), yet with the gradient $\nabla = [\nabla_x^{\mathrm T}\ \nabla_z^{\mathrm T}]^{\mathrm T}$, and its initial condition (2.6) transforms to $r_0(A,C) = \eta_0(A)\chi_C(h)$. Then, the expression for local optimality criterion (2.7) remains valid as well if we exclude the variable $y$ from the functions' arguments. Unbiasedness (2.14) and local optimality (2.18) conditions also remain as they were, yet with the conditional probability $\rho_t(A|z)$, and estimation error (2.15) stops being orthogonal to the measurement $Y_t$. The conditional probability itself is also given by identity (2.21), yet with the function $\varphi(t,x,z)$ and the initial condition $\rho_0(A|z) = \eta_0(A)$. Similar to the equation for conditional density (2.22), this identity is simplified at the expense of the subtrahend, operator (2.20) of which now has the form

$$L_z^* = (f + Gc)^{\mathrm T}\nabla_z + \frac12\mathrm{tr}[GRG^{\mathrm T}\nabla_z\nabla_z^{\mathrm T}].$$

The conditional means themselves are given as

$$\langle\varphi(t,x,z),\rho_t\rangle = \tilde\varphi(t,z) = \int\varphi(t,x,z)\,\rho_t(dx\,|\,z)$$

instead of (2.12).

As a result, the expressions for the optimal functions are also similar to (2.24), (2.25):

$$G(t,z) = [\widetilde{(x'-z)c^{\mathrm T}}(t,z) + \tilde S'(t,z)]\,\tilde R^{-1}(t,z), \qquad f(t,z) = \tilde a'(t,z) - G(t,z)\,\tilde c(t,z), \tag{6.5}$$


and the initial value of the estimate can be found as $h = \mathrm M[X_0']$ rather than by (2.23). In this case, the numerical algorithm of synthesizing functions (6.5) differs from (2.28) mainly in that difference equations (2.26) are replaced by the following system:

$$X_{k+1} = X_k + a(t_k,X_k)\Delta t + B(t_k,X_k)\Delta W_k,$$
$$Z_{k+1} = Z_k + f(t_k,Z_k)\Delta t + G(t_k,Z_k)[c(t_k,X_k)\Delta t + D(t_k,X_k)\Delta W_k]. \tag{6.6}$$

The exact analytical synthesis of filter (6.4) can also be performed in the linear Gaussian case

$$dX_t = [A(t)X_t + i(t)]\,dt + B(t)\,dW_t, \qquad dY_t = [C(t)X_t + j(t)]\,dt + D(t)\,dW_t,$$

$\tilde\rho_0(x) = N(x\,|\,m_0^x, D_0^x)$, and given that the entire vector $X_t$ is estimated. Indeed, then from (3.9) we have the equation of the standard Kalman–Bucy filter [7, 26, 34]

$$dZ_t = [A(t)Z_t + i(t)]\,dt + G(t)\{dY_t - [C(t)Z_t + j(t)]\,dt\}, \qquad Z_0 = m_0^x,$$

where the gain matrix $G(t)$ is still defined by formula (3.11) and the same Riccati equation

$$\dot\Gamma_t = A(t)\Gamma_t + \Gamma_tA^{\mathrm T}(t) + Q(t) - G(t)R(t)G^{\mathrm T}(t),$$

yet the initial condition of the latter is now $\Gamma_0 = D_0^x$.

To find the structure of OSF (6.4) in the Gaussian approximation, we use the approximation of the probability $r(t,A,C)$ similar to (4.1). Then, from (4.3)–(4.6) we have a simpler form of the linear function of the conditional mean and its parameters:

$$\mu(t,z) = [z^{\mathrm T}\ v^{\mathrm T}(t,z)]^{\mathrm T}, \qquad v(t,z) = \Delta_t''z + \kappa_t'',$$
$$\Delta_t'' = D_t^{x''z}(D_t^z)^{\oplus}, \qquad \Gamma_t = D_t^x - \Delta_tD_t^{zx}, \qquad \kappa_t'' = m_t^{x''} - \Delta_t''m_t^z. \tag{6.7}$$

R e m a r k 4. For filter (6.4) of the full order $n$, from (2.15) we have $D_t^{xz} = D_t^z$ and then in (6.7) $\mu = z$, so that this Gaussian OSF has only one parameter

$$\Gamma_t = D_t^x - D_t^z. \tag{6.8}$$

As a result, in case (6.1), approximations (4.8), (4.9) remain as they were,

$$G^N(t,z) = [\Gamma_t\hat c_\mu^{\mathrm T}(t,\mu(t,z),\Gamma_t) + \hat S'(t,\mu(t,z),\Gamma_t)]\,\hat R^{-1}(t,\mu(t,z),\Gamma_t),$$
$$f^N(t,z) = \hat a'(t,\mu(t,z),\Gamma_t) - G^N(t,z)\,\hat c(t,\mu(t,z),\Gamma_t), \tag{6.9}$$

and differ only by the arguments of the statistical linearization characteristics

$$\hat\varphi(t,\mu,\Gamma) = \mathrm M_N[\varphi(t,X)\,|\,\mu,\Gamma] = \int\varphi(t,x)\,N(x\,|\,\mu,\Gamma)\,dx. \tag{6.10}$$

We find expressions for the functions of OSF (6.4) in the linearized approximation in a similar way. Omitting the variable $y$ in (5.2), we have

$$G^L(t,z) = [\Gamma_t'F^{\mathrm T}(t,z,v(t,z)) + S'(t,z,v(t,z))]\,R^{-1}(t,z,v(t,z)),$$
$$f^L(t,z) = a'(t,z,v(t,z)) - G^L(t,z)\,c(t,z,v(t,z)), \tag{6.11}$$

where $F(t,z,v) = [c_z(t,z,v)\ \ c_v(t,z,v)]$.

Thus, for the system with inaccurate measurements (6.1), the Gaussian and linearized suboptimal filters have the equations

$$dZ_t = \hat a'(t,Z_t,v(t,Z_t),\Gamma_t)\,dt + G^N(t,Z_t)[dY_t - \hat c(t,Z_t,v(t,Z_t),\Gamma_t)\,dt], \tag{6.12}$$

$$dZ_t = a'(t,Z_t,v(t,Z_t))\,dt + G^L(t,Z_t)[dY_t - c(t,Z_t,v(t,Z_t))\,dt] \tag{6.13}$$

with the gain matrices $G^N(t,z)$, $G^L(t,z)$ from (6.9), (6.11), respectively, and the initial condition $Z_0 = m_0^{x'}$.

In this case of inaccurate measurements, these suboptimal filters have almost the same number of current parameters as in the previous case, with only the initial parameters $H_0$, $e_0$ being absent. We still have the vector $\kappa_t''$ and the matrices $\Delta_t''$ and $\Gamma_t$ that are expressed by formulas (6.7) via the first two moments of


the random vector $(X_t,Z_t)$. To find them, one can apply algorithm (4.14) of the Monte Carlo method as well, but now with formulas (6.9), (6.11) used for the structural functions and equations (6.6) used instead of difference scheme (2.26).
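A sketch of one cycle of system (6.6) as used inside algorithm (4.14): plant and filter ensembles are advanced by the Euler method on the same simulated noise, so the sample moments of $(X_k, Z_k)$ are available at every step. The signatures are our assumptions.

```python
import numpy as np

def euler_cycle_66(X, Z, t, dt, a, B, c, D, f, G, rng):
    """One cycle of difference scheme (6.6): joint Euler propagation of the
    plant ensemble X (N, n) and filter ensemble Z (N, n'), both driven by the
    same simulated noise increments; a, B and c, D are the plant and sensor
    functions of (6.1), f and G the current structural functions of (6.4)."""
    N = X.shape[0]
    q = B(t, X[0]).shape[-1]                   # dimension of the noise W_t
    dW = rng.normal(0.0, np.sqrt(dt), size=(N, q))
    X_new, Z_new = np.empty_like(X), np.empty_like(Z)
    for i in range(N):
        dY = c(t, X[i]) * dt + D(t, X[i]) @ dW[i]      # simulated measurement
        X_new[i] = X[i] + a(t, X[i]) * dt + B(t, X[i]) @ dW[i]
        Z_new[i] = Z[i] + f(t, Z[i]) * dt + G(t, Z[i]) @ dY
    return X_new, Z_new
```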

7. COMPARISON OF FILTERS: EXAMPLE

To demonstrate the analytical procedure of approximating the OSF and to check the approximation efficiency numerically, we consider an example of system (6.1) that includes a number of known ones [7, 19, 35]. Suppose the equations of the plant and the sensor with the scalar state $X_t$ and measurement $Y_t$, polynomial nonlinearities, and independent multiplicative scalar noises $W_{1t}$ and $W_{2t}$ have the following form:

$$dX_t = [a_0(t) + a_1(t)X_t + a_3(t)X_t^3]\,dt + [b_0(t) + b_1(t)X_t]\,dW_{1t},$$
$$dY_t = [c_1(t)X_t + c_2(t)X_t^2]\,dt + [d_0(t) + d_1(t)X_t]\,dW_{2t}. \tag{7.1}$$

We consider the initial state $X_0$ to be Gaussian with the parameters $m_0^x$ and $D_0^x$. In what follows, we omit the argument $t$ in the coefficients of the nonlinearities and, for the sake of brevity, write the equations of the filters via the derivatives like $\dot Z_t = dZ_t/dt$.

Comparing (7.1) and (6.1), we can see that in this case $W_t = [W_{1t}\ W_{2t}]^{\mathrm T}$ and

$$a(t,x) = a_0 + a_1x + a_3x^3, \qquad B(t,x) = [b_0 + b_1x\ \ 0],$$
$$c(t,x) = c_1x + c_2x^2, \qquad D(t,x) = [0\ \ d_0 + d_1x], \tag{7.2}$$

which make the noise intensity matrices degenerate into the following scalars:

$$Q(t,x) = (b_0 + b_1x)^2, \qquad R(t,x) = (d_0 + d_1x)^2, \qquad S(t,x) = 0. \tag{7.3}$$

Moreover, since the plant is one-dimensional, there can be only one variant of complete informativity, $n' = n = 1$, i.e., OSF (6.4) can also be only scalar. Therefore, in the formulas given above for its structural functions, one needs to remove the stroke symbol ($a' = a$, $S' = S$) and remove the function $v(\cdot)$ from the arguments of its approximations, since here its dimension is zero, so that $\mu(t,z) = z$.

Since original nonlinearities (7.2) are differentiable with respect to $x$, we can obtain expressions for the linearized approximations to the optimal drift and diffusion functions. We find from (6.11)

$$G^L(t,z) = \Gamma_tc_z(t,z)R^{-1}(t,z), \qquad f^L(t,z) = a(t,z) - G^L(t,z)\,c(t,z).$$

Therefore, in this example, equation (6.13) of the linearized OSF (LOSF) has the form

$$\dot Z_t = a(t,Z_t) + \Gamma_tc_z(t,Z_t)[\dot Y_t - c(t,Z_t)]/R(t,Z_t), \tag{7.4}$$

where the derivative $c_z(t,z) = c_1 + 2c_2z$ according to (7.2).

For the Gaussian approximation to the OSF, which can be constructed without the nonlinearities of the observed system being smooth, we obtain from formulas (6.9)

$$G^N(t,z) = \Gamma_t\hat c_z(t,z,\Gamma_t)\hat R^{-1}(t,z,\Gamma_t), \qquad f^N(t,z) = \hat a(t,z,\Gamma_t) - G^N(t,z)\,\hat c(t,z,\Gamma_t).$$

The Gaussian means $\hat a$, $\hat c$, and $\hat R$ are now found, according to (6.10), by the formula $\hat\varphi(t,z,\Gamma) = \mathrm M_N[\varphi(t,X)\,|\,z,\Gamma]$. Substituting (7.2), (7.3) into it and using the known expressions for the initial moments of the Gaussian distribution [22, 29], we have

$$\hat a(t,z,\Gamma) = a(t,z) + 3a_3\Gamma z, \qquad \hat c(t,z,\Gamma) = c(t,z) + c_2\Gamma, \qquad \hat R(t,z,\Gamma) = R(t,z) + d_1^2\Gamma,$$

so that $\hat c_z(t,z,\Gamma) = c_z(t,z)$. The second terms in these expressions emphasize the differences of the respective elements of the structural functions of the Gaussian filter from the linearized one. Finally, equation (6.12) of the Gaussian OSF (GOSF) takes the form

$$\dot Z_t = \hat a(t,Z_t,\Gamma_t) + \Gamma_t\hat c_z(t,Z_t,\Gamma_t)[\dot Y_t - \hat c(t,Z_t,\Gamma_t)]/\hat R(t,Z_t,\Gamma_t). \tag{7.5}$$

The initial condition for equations of filters (7.4) and (7.5) is $Z_0 = m_0^x$, while they themselves include the parameter $\Gamma_t$ found by (6.8). The algorithm to find it is given in (4.14).
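For this scalar example the two filters are a few lines each; the sketch below advances (7.4) and (7.5) by one Euler step for a measurement increment $dy \approx \dot Y_t\,dt$ (the discretization itself is the only thing we add to the formulas).

```python
def losf_step(z, dy, dt, Gamma, a0, a1, a3, c1, c2, d0, d1):
    """One Euler step of LOSF (7.4) for system (7.1)."""
    a  = a0 + a1 * z + a3 * z**3              # a(t, z)
    cc = c1 * z + c2 * z**2                   # c(t, z)
    cz = c1 + 2.0 * c2 * z                    # c_z(t, z)
    R  = (d0 + d1 * z)**2                     # R(t, z)
    return z + a * dt + Gamma * cz * (dy - cc * dt) / R

def gosf_step(z, dy, dt, Gamma, a0, a1, a3, c1, c2, d0, d1):
    """One Euler step of GOSF (7.5): the same structure with the Gaussian
    corrections a + 3*a3*Gamma*z, c + c2*Gamma, R + d1**2*Gamma."""
    a_hat = a0 + a1 * z + a3 * z**3 + 3.0 * a3 * Gamma * z
    c_hat = c1 * z + c2 * z**2 + c2 * Gamma
    cz    = c1 + 2.0 * c2 * z                 # c_hat_z coincides with c_z
    R_hat = (d0 + d1 * z)**2 + d1**2 * Gamma
    return z + a_hat * dt + Gamma * cz * (dy - c_hat * dt) / R_hat
```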


To compare the filters involved with the known ones, we give the equations of the respective approximations to AOF (6.2). Due to restriction (1.4), they, strictly speaking, hold here only if $d_1 = 0$, when $R(t,x) = d_0^2$, i.e., the measurement noise is only additive. The first one is EKF (5.4), its equations now having the form

$$\dot Z_t = a(t,Z_t) + \frac{P_t}{d_0^2}c_z(t,Z_t)[\dot Y_t - c(t,Z_t)], \qquad \dot P_t = 2P_ta_z(t,Z_t) + Q(t,Z_t) - \frac{P_t^2}{d_0^2}c_z^2(t,Z_t), \tag{7.6}$$

where, by (7.2), the derivative $a_z(t,z) = a_1 + 3a_3z^2$. The second one is NAF (4.11), (4.12); it is described here by the following equations with the common argument $V_t = (Z_t, P_t)$:

$$\dot Z_t = \hat a(t,V_t) + P_t\hat c_z(t,V_t)[\dot Y_t - \hat c(t,V_t)]/d_0^2,$$
$$\dot P_t = 2P_t\hat a_z(t,V_t) + \hat Q(t,V_t) + \{\hat\Xi(t,V_t)[\dot Y_t - \hat c(t,V_t)] - P_t^2\hat c_z^2(t,V_t)\}/d_0^2, \tag{7.7}$$

where $\hat Q(t,z,\Gamma) = Q(t,z) + b_1^2\Gamma$ and $\hat\Xi(t,z,P) = \mathrm M_N[(X - z)^2(c(t,X) - \hat c(t,z,P))\,|\,z,P]$. The latter Gaussian mean is calculated using the known [29] property

$$\mathrm M_N[(U - m)(U - m)^{\mathrm T}\varphi(U)\,|\,m,D] = D\,\hat\varphi(m,D) + D\,\mathrm M_N[\nabla\nabla^{\mathrm T}\varphi(U)\,|\,m,D]\,D,$$

where $\varphi(u)$ is a scalar function. Therefore, we have $\hat\Xi(t,z,P) = P^2\hat c_{zz}(t,z,P) = 2c_2P^2$.

The two moments of the initial state $X_0$, namely $Z_0 = m_0^x$ and $P_0 = D_0^x$, are the initial conditions for systems of equations (7.6) and (7.7). For the multiplicative measurement noise, $d_1 \ne 0$, to compare EKF and NAF with LOSF and GOSF, we replaced the denominator $d_0^2$ by $R(t,Z_t) = (d_0 + d_1Z_t)^2$ in (7.6) and by $\hat R(t,Z_t,P_t)$ in (7.7).

We consider construction of the scalar version of Pugachev COF (6.3):

$$\dot Z_t = \alpha_t\xi(t,Z_t) + \beta_t\Theta(t,Z_t)\dot Y_t + \gamma_t. \tag{7.8}$$

Here, $\xi(t,z)$, $\Theta(t,z)$ are vector functions of any dimensions $p$ and $q$, whereas $\alpha_t$ and $\beta_t$ are row matrices of the respective orders and $\gamma_t$ is a scalar. We choose the structure of this filter, as recommended in [7, 19], based on comparison of its equation with the first of EKF equations (7.6). Taking into account the form of the nonlinearities $a(t,x)$ and $c(t,x)$ in it, we can see from (7.2) that the coefficient $\beta_t\Theta(t,z)$ of the derivative $\dot Y_t$ of the measurement should be chosen as a linear function of the variable $z$, whereas the other terms $\alpha_t\xi(t,z) + \gamma_t$ from (7.8) should form third-degree polynomials. Therefore, we take $\xi(t,z) = [z\ z^2\ z^3]^{\mathrm T}$ and $\Theta(t,z) = [1\ z]^{\mathrm T}$; hence, we have $\alpha_t = [\alpha_1(t)\ \alpha_2(t)\ \alpha_3(t)]$ and $\beta_t = [\beta_1(t)\ \beta_2(t)]$. We give the initial condition for filter (7.8) as usual, $Z_0 = m_0^x$. We do not impose restriction (1.4).

To find the parameters $\alpha_t$, $\beta_t$, and $\gamma_t$ of filter (7.8), we solve definite matrix equations [7, 18]. The first one is with respect to $\beta_t$ and has the form of a simple linear matrix equation:

$$\beta_t\mathrm M[\Theta(t,Z_t)R(t,X_t)\Theta^{\mathrm T}(t,Z_t)] = \mathrm M[(X_t - Z_t)c(t,X_t)\Theta^{\mathrm T}(t,Z_t)]. \tag{7.9}$$

By adding another artificial element to the column $\xi(t,z)$, we managed [22] to write the cumbersome system of two equations for $\alpha_t$ and $\gamma_t$ of the filter, which is scalar in this case, and of the stochastic measurements as a single system of linear equations

$$A(t)[\alpha_t\ \gamma_t]^{\mathrm T} = b(t). \tag{7.10}$$
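In practice, the expectations in (7.9) are replaced by ensemble averages over the simulated realizations; a sketch for the scalar example (basis $\Theta(z) = [1\ z]^{\mathrm T}$) follows, and the pair $\alpha_t$, $\gamma_t$ is then obtained the same way by assembling (7.10), (7.11) from sample averages and solving the $(p+1)$-dimensional linear system. The function name is ours.

```python
import numpy as np

def cof_beta(X, Z, d0, d1, c1, c2):
    """Sample solution of equation (7.9) for beta_t in the scalar example:
    X, Z are (N,) ensembles of plant states and COF estimates, the basis is
    Theta(z) = [1, z]^T, R(x) = (d0 + d1*x)**2, c(x) = c1*x + c2*x**2."""
    Th = np.stack([np.ones_like(Z), Z])            # (2, N) values of Theta
    R  = (d0 + d1 * X)**2
    A  = (R * Th) @ Th.T / X.size                  # M[Theta R Theta^T], (2, 2)
    b  = Th @ ((X - Z) * (c1 * X + c2 * X**2)) / X.size   # M[(X-Z) c Theta^T]
    return np.linalg.solve(A, b)                   # row matrix beta_t
```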


Here, the $(p+1)$-dimensional quadratic matrix of coefficients $A$ and the column of free terms $b$ are given by the following formulas for their elements ($i,j = 1,\dots,p+1$):

$$A_{ij}(t) = \mathrm M\{[\xi_i(t,Z_t) - E_t\,\partial\xi_i(t,Z_t)/\partial z]\,\xi_j(t,Z_t)\},$$
$$b_i(t) = \mathrm M\Bigl\{[a(t,X_t) - h(t,Z_t)c(t,X_t)]\,\xi_i(t,Z_t) + [E_tc(t,X_t) - h(t,Z_t)R(t,X_t)]\,h(t,Z_t)\,\frac{\partial\xi_i(t,Z_t)}{\partial z}$$
$$+\ \frac12E_th^2(t,Z_t)R(t,X_t)\,\frac{\partial^2\xi_i(t,Z_t)}{\partial z^2} + E_t\,\frac{\partial\xi_i(t,Z_t)}{\partial t}\Bigr\}. \tag{7.11}$$

In these expressions, $E_t = X_t - Z_t$ is the estimation error, $h(t,z) = \beta_t\Theta(t,z)$ is the known function, $\xi_i(t,z)$, $i = 1,\dots,p$, are the actual elements of the vector function $\xi(t,z)$, and its artificial element is $\xi_{p+1}(t,z) = 1$.

To calculate the parameters of the filters considered above and analyze their accuracy, we applied the Euler method to perform statistical simulation of the stochastic differential equations of observable system (7.1) and of all the filters to be compared. Doing that, we could easily calculate the parameter $\Gamma_t$, the mathematical expectations in (7.9) and (7.11), and the accuracy characteristics $m_t^{\varepsilon}$, $D_t^{\varepsilon}$ of each filter by stepwise Monte Carlo method (4.14). We solved systems of linear equations (7.9), (7.10) by the standard procedure of Gaussian elimination.

We give the results of statistical simulation of these equations with the step $\Delta t = 0.005$ on the segment $t \in [0;1]$ for the following parameters of system (7.1): $a_0 = 0.0$, $a_1 = -0.5$, $a_3 = -1.0$, $b_0 = 1.5$, $b_1 = 0.0$, $c_1 = 1.0$, $c_2 = 0.5$, $d_0 = 1.0$, $d_1 = 0.1$, its initial state $m_0^x = 0.5$, $D_0^x = 0.36$, and the weight coefficient of the mean square error $L_t = 1$. We compared all filters on the same batch of 1000 realizations.

To better demonstrate the differences in the accuracy of each of the five filters despite the fluctuations ("chatter") of the respective curves of the sample value of the mean square error $I_t$, the figure shows the graphs of its smoother value that is instantaneously mean with respect to time,

$$J(t) = \frac1t\int_0^tI_\tau\,d\tau, \qquad J(0) = I_0,$$

and uses the filter abbreviations introduced above.

[Figure: $J(t)$, $t \in [0, 1]$, for the filters COF, EKF, LOSF, GOSF, and NAF. Values of the mean square error that are instantaneously mean with respect to time, for various filters.]

Analyzing these graphs, one can see that in this example, among the one-dimensional filters, LOSF and GOSF turned out to be significantly more accurate than COF. The one-dimensional LOSF performed better than its similar, two-dimensional EKF, while the one-


dimensional GOSF and the two-dimensional NAF, which are similar to each other, yielded almost the same accuracy, the best among the estimation algorithms involved.

The calculations are consistent with theory (1.17): in the example considered, even the suboptimal approximations to the proposed OSF, despite having half the order of the similar approximations to the AOF, ensured at least as high accuracy (the Gaussian one) or even higher (the linearized one), and proved better than the COF of the same order with the structure recommended in the literature.
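For completeness, the smoothed accuracy curve of the figure can be reproduced from the stepwise sample values $I_k$ as a running time average (the trapezoid quadrature below is our choice):

```python
import numpy as np

def smoothed_mse(I, dt):
    """Running time average J(t) of the sample mean square error I_t over
    [0, t], with J(0) = I_0; I holds the stepwise values I_k on the grid."""
    I = np.asarray(I, dtype=float)
    J = np.empty_like(I)
    J[0] = I[0]
    t = np.arange(1, I.size) * dt
    J[1:] = np.cumsum(0.5 * (I[1:] + I[:-1]) * dt) / t    # trapezoid rule
    return J
```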

CONCLUSIONS

As an alternative to the classical approach, we proposed a way of synthesizing the fast nonlinear estimation algorithm (1.7) or (6.4) of the current state of observable system (1.1) or (6.1), respectively, that can be implemented on a cheap computer in real time and possesses the highest accuracy in its class of low-memory filters. To do this, we considered justification of the local optimality criterion (Theorem 1). For stochastic system with incomplete measurements (1.1), we found the sufficient conditions for the proposed estimate to be unbiased and locally optimal (Theorems 2 and 3); the latter allowed us to find the diffusion function of this finite-dimensional filter. They require knowing the conditional probability distribution of the state to be estimated, for which the new partial differential equation is obtained (Theorem 4), which in turn was used to obtain the expression for the drift function of the unbiased filter (Theorem 5). We proposed the numerical algorithm for finding the nonlinearities of the exact filter by statistical simulation (Proposition 1). We proved that in various linear Gaussian cases such a nonlinear filter of the full order degenerates into the respective versions of the linear Kalman–Bucy filter (Theorem 6 and its corollary). We also obtained analytical expressions for the structural functions of the Gaussian and linearized approximations to the exact filter (Assertions 1 and 2). We found the form of the numerical parameters of these suboptimal filters and proposed the way to find them by the Monte Carlo method (Proposition 2). All these results are obtained for observable system with inaccurate measurements (6.1) as well. We gave a simulation example.

However, both linearized and Gaussian suboptimal filters can turn out to be too rough. One can obtain more accurate approximations to OSF (1.7) similarly to how it is done for AOF (1.5), using finer approximations of both the nonlinearities of the observable system (by taking into account the higher terms of the Taylor formula or by spline interpolation applied to them) and the probability distribution (by applying Gaussian Gram–Charlier or Edgeworth power series, by using poly-Gaussian approximation, etc.). This would make the form of the structural functions of such a quasi-optimal filter more complicated, with its measurement processing speed decreased. For a simpler way, which can be quite efficient in terms of increasing the accuracy of the linearized or Gaussian approximations to OSF, one can turn to secondary optimization of only their parameters [22] performed by the method of conditionally optimal Pugachev filtering. The meaning and efficiency of this procedure for discrete time is shown in [24].

APPENDIX

P r o o f o f T h e o r e m 1. Assuming that derivative (1.12) exists and is finite, we use the Taylor formula for the function $I_{t+\Delta t}$:

$$I_{t+\Delta t}[u_0^t, u_t^{t+\Delta t}] = I_t[u_0^t] + \frac{dI_t[u_0^t]}{dt}\Delta t + o(\Delta t, u_t^{t+\Delta t}).$$

Then, taking into account independence (1.10), the optimal $\tilde u_t^{t+\Delta t} = \arg\min_{u\in\Phi_t}I_{t+\Delta t}$ at the future segment is given by the inequality

$$I_t[u_0^t] + \frac{dI_t[u_0^t,\tilde u_t]}{dt}\Delta t + o(\Delta t,\tilde u_t^{t+\Delta t}) \le I_t[u_0^t] + \frac{dI_t[u_0^t,u_t]}{dt}\Delta t + o(\Delta t,u_t^{t+\Delta t}) \quad \forall u_t^{t+\Delta t}\in\Phi_t.$$

We divide it by $\Delta t > 0$ and pass to the limit as $\Delta t \downarrow 0$. Since

$$\lim_{\Delta t\downarrow0}\frac{1}{\Delta t}\,o(\Delta t, u_t^{t+\Delta t}) = 0,$$

we have the required inequality for derivative (1.13). Theorem 1 is proved.


P r o o f o f C o r o l l a r y 1. We integrate, with respect to $t$, inequality (1.13) for the derivative of the criterion and take into account the obvious relation of this derivative with the criterion itself:

$$I_t[u_0^t] = I_0 + \int_0^t\frac{dI_\tau[u_0^\tau]}{d\tau}\,d\tau.$$

Since the integration operation is monotone, we do obtain inequality (1.16). Corollary 1 is proved.

P r o o f o f L e m m a 1. The quadratic loss function $\psi(t,x',z)$ from (1.3) satisfies the conditions for validity of Ito formula (2.3), and by assumptions (1) and (3) its mean exists. Therefore, we have from (2.5) that the derivative of the mean square error $I_t$ also exists and has the form

$$\frac{dI_t}{dt} = \Bigl\langle\frac{\partial\psi(t,x',z)}{\partial t} + K^*[\psi(t,x',z)], r_t\Bigr\rangle.$$

Using operator (2.4) in the following expanded form

$$K^* = a^{\mathrm T}\nabla_x + c^{\mathrm T}\nabla_y + (f + Gc)^{\mathrm T}\nabla_z + \frac12\mathrm{tr}[Q\nabla_x\nabla_x^{\mathrm T} + R\nabla_y\nabla_y^{\mathrm T} + GRG^{\mathrm T}\nabla_z\nabla_z^{\mathrm T} + 2S\nabla_y\nabla_x^{\mathrm T} + 2SG^{\mathrm T}\nabla_z\nabla_x^{\mathrm T} + 2RG^{\mathrm T}\nabla_z\nabla_y^{\mathrm T}] \tag{A.1}$$

and given that $\psi(t,x',z)$ is independent of $y$, we have the more detailed expression

$$\frac{dI_t}{dt} = \langle(f + Gc)^{\mathrm T}\psi_z + \mathrm{tr}[(SG^{\mathrm T})^{\mathrm T}\psi_{zx} + \tfrac12GRG^{\mathrm T}\psi_{zz}], r_t\rangle + C(t),$$

where

$$C(t) = \Bigl\langle\frac{\partial\psi}{\partial t} + a^{\mathrm T}\psi_x + \frac12\mathrm{tr}[Q\psi_{xx}], r_t\Bigr\rangle.$$

Here, for the sake of brevity, we use the subscripts $x$ and $z$ to designate the partial derivatives of the function $\psi(\cdot)$ with respect to the corresponding variables, for instance, $\psi_x = \nabla_x\psi$, $\psi_{zx} = \nabla_z\nabla_x^{\mathrm T}\psi$.

At the same time, by the independence condition that follows from assumption (2) and by equality (2.2), for any fixed $t$ the distribution law of the random variable $V_t$ is independent of the form of the section $u_t(\cdot)$. This means that so is the entire last term $C(t)$ of this derivative, which does not include its components $f_t(\cdot)$, $G_t(\cdot)$. If we substitute the expressions for the derivatives of the loss function, $\psi_z = -2L_t(x' - z)$, $\psi_{zz} = 2L_t$, $\psi_{zx} = -[2L_t\ O]$, where $O$ is the zero matrix, into its first term, we have (2.7). Lemma 1 is proved.

P r o o f o f T h e o r e m 2. We write unbiasedness requirement (1.2) for the estimate via the joint probability and move to the partial and conditional probabilities. Using (2.13), we obtain

$$\langle x' - z, r_t\rangle = \langle\langle x' - z, \rho_t\rangle, s_t\rangle = 0$$

or, taking into account that the subtrahend $z$ is independent of the integration variable $x$ in the conditional mean, $\langle\langle x', \rho_t\rangle - z, s_t\rangle = 0$. Since $s_t(\cdot)$ is arbitrary, this leads to (2.14), where the more detailed designation (2.12) of the conditional mean is used. Theorem 2 is proved.

P r o o f o f C o r o l l a r y 2. The moments in equality (2.15) exist by the hypotheses. Using (2.13), (2.14), we have for the initial moment

$$\mathrm M[X_t'Z_t^{\mathrm T}] = \langle x'z^{\mathrm T}, r_t\rangle = \langle\langle x', \rho_t\rangle z^{\mathrm T}, s_t\rangle = \langle zz^{\mathrm T}, s_t\rangle = \mathrm M[Z_tZ_t^{\mathrm T}],$$

hence $\mathrm M[E_tZ_t^{\mathrm T}] = O_{n'\times n'}$. Then, the relation $\mathrm M[E_tZ_t^{\mathrm T}] = \mathrm{cov}[E_t, Z_t] + \mathrm M[E_t]\mathrm M[Z_t]^{\mathrm T}$ between the initial moment and the central one, together with unbiasedness $\mathrm M[E_t] = 0$, leads to the first property from (2.15). The second property can be proved similarly. Corollary 2 is proved.

P r o o f o f T h e o r e m 3. We take into account that the sections $s_t(\cdot)$ and $\rho_t(\cdot)$ of the partial and conditional probabilities used in functional (2.17) are specified, by (2.8) and (2.9), only using the section $r_t(\cdot)$ of the joint probability that, by Lemma 1, is independent of the sought function $G_t(\cdot)$ from $\Phi_t$. Therefore, to


minimize it, it is sufficient to find the partial minimum of the function $\xi_2(t,y,z,H)$ of four arguments only with respect to $H$:

$$G_t^{\circ}(y,z) = \arg\min_H\xi_2(t,y,z,H) \quad \forall t > 0, \ \forall y\in\mathbb R^m, \ \forall z\in\mathbb R^{n'}.$$

Indeed, the latter relation is equivalent to the inequality

$$\xi_2(t,y,z,G_t^{\circ}(y,z)) \le \xi_2(t,y,z,G_t(y,z))$$

that holds for any function of comparison $G_t(\cdot)$ from $\Phi_t$ and any values of the variables $y$ and $z$. Integrating it with respect to these variables with the measure $s_t(\cdot) \ge 0$, we obtain the required inequality for the functional (2.17) to be minimized: $\tilde J_t[G_t^{\circ}(\cdot)] \le \tilde J_t[G_t(\cdot)]$. In what follows, we omit the superscript $\circ$ that stands for the optimal.

Now, we consider the extremum of the function $\xi_2(\cdot)$ given by expression (2.16) with respect to one of its arguments, $H$. Using the known formulas of matrix differentiation

$$\frac{\partial}{\partial U}\mathrm{tr}(UV) = V^{\mathrm T}, \qquad \frac{\partial}{\partial V}\mathrm{tr}(UVTV^{\mathrm T}) = U^{\mathrm T}VT^{\mathrm T} + UVT,$$

where $U$, $V$, and $T$ are matrices of consistent orders, and taking into account the symmetry of the matrices $L$ and $R$, we find the necessary extremum condition

$$\frac{\partial\xi_2}{\partial H} = 2L_t\langle HR(t,x,y) - (x' - z)c^{\mathrm T}(t,x,y) - S'(t,x,y), \rho_t\rangle = O_{n'\times m}.$$

Hence, replacing $H$ by $G_t(y,z)$ and taking into account the nonsingularity of the quadratic matrix $L_t$, we obtain required relation (2.18). We now need to prove the sufficient minimum condition. From the previous expression, we obtain the obvious equality $\partial^2\xi_2/\partial H^2 = 2L_t\otimes\langle R(t,x,y), \rho_t\rangle$, where $\otimes$ is the symbol of the direct (external) product of matrices. Then, since the matrices $L_t$ and $\langle R(t,x,y), \rho_t\rangle$ are positive definite, $\partial^2\xi_2/\partial H^2 > 0$. The latter ensures that the function $\xi_2(\cdot)$ attains the sought partial minimum on the solutions of resulting equation (2.18). Theorem 3 is proved.

P r o o f o f L e m m a 2. In identity (2.5), we choose the function $\varphi(\cdot)$ so that it is independent of the variable $x$: $\varphi(t,x,y,z) = \xi(t,y,z)$. For it, by (2.12), we have $\tilde\xi(t,y,z) = \xi(t,y,z)$. Then, it follows from (2.13) that $\langle\xi(t,y,z), r_t\rangle = \langle\xi(t,y,z), s_t\rangle$, which makes identity (2.5) take the form

$$\frac{d}{dt}\langle\xi(t,y,z), s_t\rangle = \Bigl\langle\frac{\partial\xi(t,y,z)}{\partial t} + K^*[\xi(t,y,z)], r_t\Bigr\rangle.$$

Replacing the operator $K^*$ by its expression (A.1) and using the fact that $\xi(\cdot)$ is independent of $x$, we have

$$K^*[\xi] = c^{\mathrm T}\nabla_y\xi + (f + Gc)^{\mathrm T}\nabla_z\xi + \frac12\mathrm{tr}[R\nabla_y\nabla_y^{\mathrm T}\xi + 2RG^{\mathrm T}\nabla_z\nabla_y^{\mathrm T}\xi + GRG^{\mathrm T}\nabla_z\nabla_z^{\mathrm T}\xi].$$

Here, only the functions $c(t,x,y)$ and $R(t,x,y)$ depend on the variable $x$. Therefore, by (2.13), we have $\langle K^*[\xi], r_t\rangle = \langle\langle K^*[\xi], \rho_t\rangle, s_t\rangle = \langle L_{yz}^*[\xi], s_t\rangle$, where $L_{yz}^*$ is operator (2.20). As a result, we have identity (2.19). Its initial condition is the trivial modification of condition (2.6). Lemma 2 is proved.

P r o o f o f T h e o r e m 4. Now, in identity (2.5), we single out the partial probability $s_t$. Using (2.13), we find

$$\frac{d}{dt}\langle\langle\varphi(t,x,y,z), \rho_t\rangle, s_t\rangle = \Bigl\langle\Bigl\langle\frac{\partial\varphi(t,x,y,z)}{\partial t} + K^*[\varphi(t,x,y,z)], \rho_t\Bigr\rangle, s_t\Bigr\rangle$$

or, taking into account (2.19) on the left and the differentiability of $\rho_t(A|y,z)$ with respect to the variables $y$ and $z$,

$$\Bigl\langle\Bigl\langle\varphi(t,x,y,z), \frac{\partial\rho_t}{\partial t}\Bigr\rangle + L_{yz}^*[\langle\varphi(t,x,y,z), \rho_t\rangle] - \langle K^*[\varphi(t,x,y,z)], \rho_t\rangle, s_t\Bigr\rangle = 0.$$


Hence, since $s(\cdot)$ is arbitrary, we have identity (2.21). Its initial condition follows from the fact that the conditional probability $\eta_0(A|y)$ of the initial value $X_0$, which is independent of the initial state of filter (1.7), is given. Theorem 4 is proved.

P r o o f o f C o r o l l a r y 3. In identity (2.21), we put $\varphi(t,x,y,z) = \delta(x - \chi)$, where $\delta$ is the Dirac function that is understood here in the sense of the limit of a sequence of ordinary functions [36] and $\chi$ is the arbitrary point of its application. The delta function is infinitely differentiable in the known sense, and its mathematical expectation exists and equals the value of the respective smooth density at the point $\chi$; therefore,

$$\langle\delta(x - \chi), \rho_t\rangle = \tilde\rho(t,\chi|y,z).$$

Then, we have $L_{yz}^*[\langle\delta(x - \chi), \rho_t\rangle] = L_{yz}^*[\tilde\rho(t,\chi|y,z)]$, and since $\delta(x - \chi)$ is independent of the variables $t$, $y$, and $z$, we have $\partial\delta(x - \chi)/\partial t = 0$ and $K^*[\delta(x - \chi)] = M_x^*[\delta(x - \chi)]$, where the new conjugate operator has the form

$$M_x^* = a^{\mathrm T}\nabla_x + \frac12\mathrm{tr}[Q\nabla_x\nabla_x^{\mathrm T}].$$

As a result, we find instead of (2.21)

$$\frac{\partial\tilde\rho(t,\chi|y,z)}{\partial t} = \int M_x^*[\delta(x - \chi)]\,\tilde\rho(t,x|y,z)\,dx - L_{yz}^*[\tilde\rho(t,\chi|y,z)].$$

Using the known properties of the derivatives of the delta function [37] and taking into account the required smoothness, with respect to the variable $x$, of the functions $a(t,x,y)$ and $B(t,x,y)$ from (1.1), we can represent the first summand via the direct operator $M_x$ from (2.22):

$$\int M_x^*[\delta(x - \chi)]\,\tilde\rho(t,x|y,z)\,dx = \int\delta(x - \chi)\,M_x[\tilde\rho(t,x|y,z)]\,dx = M_\chi[\tilde\rho(t,\chi|y,z)].$$

Finally, replacing $\chi$ by $x$, we have equation (2.22). Corollary 3 is proved.

P r o o f o f T h e o r e m 5. We differentiate equality (2.14) with respect to $t$. We obtain $\partial\langle x', \rho_t\rangle/\partial t = 0$ and use identity (2.21) to find

$$\frac{\partial}{\partial t}\langle x', \rho_t\rangle = \langle K^*[x'], \rho_t\rangle - L_{yz}^*[\langle x', \rho_t\rangle].$$

Taking into account forms (A.1) and (2.20) of the operators $K^*$ and $L_{yz}^*$, we have $K^*[x'] = a'(t,x,y)$ and

$$L_{yz}^*[\langle x', \rho_t\rangle] = L_{yz}^*[z] = f(t,y,z) + G(t,y,z)\,\tilde c(t,y,z).$$

Finally, we have $\langle a'(t,x,y), \rho_t\rangle - f(t,y,z) - G(t,y,z)\,\tilde c(t,y,z) = 0$, which leads to formula (2.25). Theorem 5 is proved.

P r o o f o f T h e o r e m 6. Given $n' = n$, we have $X_t' = X_t$, $x' = x$, $a' = a$, and $S' = S$. Therefore, substituting (3.3) into (2.23), we easily obtain $h(y) = m_0(y)$. Comparing (3.1), (3.2) with (1.1), we can see that the functions of the observable system have the form $a(t,x,y) = A(t,y)x + e(t,y)$, $B(t,x,y) = B(t,y)$, $c(t,x,y) = C(t,y)x + g(t,y)$, $D(t,x,y) = D(t,y)$. Then, from (2.25), taking into account designation (2.12) and optimality condition (2.14), we find

$$f(t,y,z) = A(t,y)z + e(t,y) - G(t,y,z)[C(t,y)z + g(t,y)],$$

while (2.24) directly leads to

$$G(t,y,z) = [\Gamma(t,y,z)C^{\mathrm T}(t,y) + S(t,y)]\,R^{-1}(t,y), \tag{A.2}$$

where the function $\Gamma(t,y,z)$ is the conditional mean of type (2.12)

$$\Gamma(t,y,z) = \langle(x - z)x^{\mathrm T}, \rho_t\rangle = \mathrm M[(X_t - Z_t)X_t^{\mathrm T}\,|\,Y_t = y, Z_t = z]. \tag{A.3}$$


Therefore, the only undefined element in the structure of filter (1.7), whose equation now has the form

$$dZ_t = [A(t,Y_t)Z_t + e(t,Y_t)]\,dt + G(t,Y_t,Z_t)\{dY_t - [C(t,Y_t)Z_t + g(t,Y_t)]\,dt\}, \tag{A.4}$$

is the matrix $\Gamma$ from $G$, which, by (2.14), has the sense of the conditional covariance of the error $E_t = X_t - Z_t$ for $t > 0$:

$$\Gamma(t,y,z) = \langle(x - z)(x - z)^{\mathrm T}, \rho_t\rangle = \mathrm{cov}[E_t\,|\,Y_t = y, Z_t = z]. \tag{A.5}$$

Using this fact and the linearity of equation (3.1) with respect to $X_t$, we can find $\Gamma$ directly, without determining the measure $\rho(\cdot)$, using the equation for the covariance matrix of the linear system (the Pugachev–Duncan method of moments).

Thus, subtracting (A.4), with (3.2) substituted into it, from (3.1), we have the equation

$$dE_t = [A(t,Y_t) - G(t,Y_t,Z_t)C(t,Y_t)]E_t\,dt + [B(t,Y_t) - G(t,Y_t,Z_t)D(t,Y_t)]\,dW_t,$$

the right-hand side of which is linear with respect to the error $E_t$. Therefore, for conditional covariance (A.5) the following equation of the method of moments holds [7, 26]:

$$d\Gamma = \{[A - GC]\Gamma + \Gamma[A - GC]^{\mathrm T} + [B - GD][B - GD]^{\mathrm T}\}\,dt.$$

Arranging its terms with respect to whether the matrix $G$ is in them, we have

$$d\Gamma = [A\Gamma + \Gamma A^{\mathrm T} + Q - G(C\Gamma + S^{\mathrm T}) - (C\Gamma + S^{\mathrm T})^{\mathrm T}G^{\mathrm T} + GRG^{\mathrm T}]\,dt.$$

Using (A.2), one can easily see that the latter three terms here represent the same like terms that equal $-(C\Gamma + S^{\mathrm T})^{\mathrm T}R^{-1}(C\Gamma + S^{\mathrm T})$. Finally, we have the matrix equation of Riccati type with the parameters $y$ and $z$:

$$d\Gamma(t,y,z) = \{A(t,y)\Gamma(t,y,z) + \Gamma(t,y,z)A^{\mathrm T}(t,y) + Q(t,y) - [C(t,y)\Gamma(t,y,z) + S^{\mathrm T}(t,y)]^{\mathrm T}R^{-1}(t,y)[C(t,y)\Gamma(t,y,z) + S^{\mathrm T}(t,y)]\}\,dt. \tag{A.6}$$

We find the initial condition for this equation. Since the equalities $\rho_0(A|y,z) = \eta_0(A|y)$ and $Z_0 = h(Y_0)$ hold, then, given (2.23) and (3.3), conditional mean (A.3) for $t = 0$ coincides with the initial covariance:

$$\Gamma(0,y,z) = \mathrm M[(X_0 - Z_0)X_0^{\mathrm T}\,|\,Y_0 = y] = \mathrm M[X_0X_0^{\mathrm T}\,|\,Y_0 = y] - h(y)h^{\mathrm T}(y) = D_0(y). \tag{A.7}$$

Since all the coefficients and free terms of Riccati equation (A.6), like its initial condition (A.7), are independent of the variable $z$, so is its solution, i.e., $\Gamma(t,y,z) = \Gamma(t,y) = \mathrm{inv}(z)$. Substituting the latter into (A.2), we find required expression (3.6) for the matrix $G$, which also turns out to be independent of $z$. Finally, with expressions (A.4), (1.7), we can write (3.4), while (3.5) follows from (A.6), (3.6), (A.7). Theorem 6 is proved.

P r o o f o f C o r o l l a r y 4. Comparing (3.7) to (3.1) and (3.2), we have the special case of Theorem 6 when $A(t,y) = A(t)$, $e(t,y) = K(t)y + i(t)$, $B(t,y) = B(t)$, $C(t,y) = C(t)$, $g(t,y) = L(t)y + j(t)$, $D(t,y) = D(t)$. Moreover, it follows from (3.8) by the theorem on normal correlation [30, p. 323] that conditional density (3.3) with the parameters

$$m_0(y) = m_0^x + D_0^{xy}(D_0^y)^{\oplus}[y - m_0^y], \qquad D_0(y) = D_0^x - D_0^{xy}(D_0^y)^{\oplus}(D_0^{xy})^{\mathrm T} = \mathrm{inv}(y)$$

is Gaussian. Substituting these expressions into (3.4)–(3.6), we immediately obtain (3.9)–(3.11), where the fact that the matrix $\Gamma_t$ and, hence, the matrix $G(t)$ are independent of $Y_t$ follows from the fact that the right-hand side of the Riccati equation is independent of it and its initial condition is independent of $Y_0$. Corollary 4 is proved.

P r o o f o f L e m m a 3. Approximation (4.2) directly follows from (4.1) by the known property of the Gaussian distribution, whereas its parameters $\mu$ and $\Gamma$, by the theorem on normal correlation mentioned above, have form (4.4) and

$$\mu(t,y,z) = m_t^x + \Delta_t[(y - m_t^y)^{\mathrm T}\ (z - m_t^z)^{\mathrm T}]^{\mathrm T}, \tag{A.8}$$

respectively, and the matrix $\Delta_t$ of the order $n\times(m + n')$ can be found by (4.5). The exact conditional mean $\mu(t,y,z) = \langle x, \tilde\rho_t\rangle$, according to the decomposition $x = [x'^{\mathrm T}\ x''^{\mathrm T}]^{\mathrm T}$ and condition (2.14), does have the form of the first formula from (4.3), where $v(t,y,z) = \langle x'', \tilde\rho_t\rangle$ is the $(n - n')$-dimensional vector function of the conditional mathematical expectation of the non-estimated components $X_t''$ of the vector $X_t$. To find the Gaussian approximation to the latter, we decompose the matrices $\mu$ and $\Delta_t$ into the blocks $\mu = [\mu'^{\mathrm T}\ \mu''^{\mathrm T}]^{\mathrm T}$


and $\Delta_t = [\Delta_t'^{\mathrm T}\ \Delta_t''^{\mathrm T}]^{\mathrm T}$, respectively. Then, from approximation (A.8) for $\mu'' = v$, we obtain the second formula (4.3) with the parameters (4.6). Lemma 3 is proved.

P r o o f o f A s s e r t i o n 1. First, we find the Gaussian approximations for the conditional means $\tilde a'$, $\tilde c$, $\tilde S'$, and $\tilde R$ of type (2.12) that are required to find exact structural functions (2.24), (2.25), which we do using Lemma 3. Substituting (4.2) into (2.12) and taking into account designation (4.7), we can easily obtain the approximations

$$\tilde a'(t,y,z) \approx \hat a'(o), \quad \tilde c(t,y,z) \approx \hat c(o), \quad \tilde S'(t,y,z) \approx \hat S'(o), \quad \tilde R(t,y,z) \approx \hat R(o) \tag{A.9}$$

with the bulky repetitive argument $o = (t, y, \mu(t,y,z), \Gamma_t)$.

Similarly, for the conditional mean $\langle(x' - z)c^{\mathrm T}, \rho_t\rangle$ remaining in (2.24), we have the approximation

$$\langle(x' - z)c^{\mathrm T}(t,x,y), \rho_t\rangle \approx \mathrm M_N[(X' - z)c^{\mathrm T}(t,X,y)\,|\,\mu(t,y,z), \Gamma_t].$$

Using the Gaussian property that holds for some vector function $\eta(u)$ [29],

$$\mathrm M_N[(U - m)\eta^{\mathrm T}(U)\,|\,m,D] = D\,\mathrm M_N[\nabla\eta^{\mathrm T}(U)\,|\,m,D] = D\,\nabla_m\hat\eta^{\mathrm T}(m,D),$$

where $\hat\eta_m(m,D) = (\nabla_m\hat\eta^{\mathrm T}(m,D))^{\mathrm T}$ is the Jacobi matrix, we can rewrite the latter as

$$\mathrm M_N[(X' - z)c^{\mathrm T}(t,X,y)\,|\,\mu,\Gamma] = \Gamma'\nabla_\mu\hat c^{\mathrm T}(t,y,\mu,\Gamma) = \Gamma'\hat c_\mu^{\mathrm T}(t,y,\mu,\Gamma). \tag{A.10}$$

As a result, substituting (A.9), (A.10) into (2.24), we have required approximation (4.8) to the optimal diffusion function $G(\cdot)$. To find the similar approximation to the drift function $f(\cdot)$, we substitute (A.9) into (2.25) and replace $G(\cdot)$ by $G^N(\cdot)$. As a result, we easily obtain (4.9). Assertion 1 is proved.

ACKNOWLEDGMENTS

This study was financially supported by the Russian Foundation for Basic Research (grant no. 13-08-00323-a).

REFERENCES

1. Markov Estimation Theory in Radio Engineering, Ed. by M. S. Yarlykov (Radiotekhnika, Moscow, 2004) [in Russian].

2. O. A. Stepanov, Applying Nonlinear Filtering Theory to Navigation Information Processing Problems (TsNII Elektropribor, St. Petersburg, 2003) [in Russian].

3. A. E. Brockwell, A. L. Rojas, and R. E. Kass, "Recursive Bayesian decoding of motor cortical signals by particle filtering," J. Neurophysiol. 91, 1899–1907 (2004).

4. J. Brocker and U. Parlitz, "Analyzing communication schemes using methods from nonlinear filtering," Chaos 13, 195–208 (2003).

5. A. N. Shiryaev, Fundamentals of Stochastic Financial Mathematics, Vol. 1 (FAZIS, Moscow, 1998) [in Russian].

6. R. Sh. Liptser and A. N. Shiryaev, Statistics of Random Processes. I. General Theory. II. Applications (Nauka, Moscow, 1974; Springer-Verlag, New York, 1977, 1978).

7. I. N. Sinitsyn, Kalman and Pugachev Filters (Logos, Moscow, 2007) [in Russian].

8. B. I. Shakhtarin, Nonlinear Optimal Filtering in Examples and Problems (Gelios ARV, Moscow, 2008) [in Russian].

9. The Oxford Handbook of Nonlinear Filtering, Eds. D. Crisan and B. Rozovskii (University Press, Oxford, 2011).

10. S. S.-T. Yau, "New algorithms in real time solution of the nonlinear filtering problem," Communications in Information and Systems 8 (3), 303–332 (2008).

11. A. Budhiraja, L. Chen, and C. Lee, "A survey of numerical methods for nonlinear filtering problems," Physica D: Nonlinear Phenomena 230 (1–2), 27–36 (2007).

12. K. A. Rybakov, "Approximate solution of the optimal nonlinear filtering problem for stochastic differential systems by the method of stochastic tests," Sib. Zh. Vychisl. Mat. 16 (4), 377–391 (2013).

13. F. E. Daum, "Exact finite dimensional nonlinear filters," IEEE Trans. Autom. Control 31, 616–622 (1986).

14. V. V. Khutortsev, "On terminal optimal linear filtering of a convolution of the state vector of an information process," J. Comput. Syst. Sci. Int. 48 (4), 554–558 (2009).

15. I. E. Kazakov and M. A. Makarov, "Quasi-optimal nonlinear filters," Comput. Syst. Sci. Int. 35 (3), 374–378 (1996).



16. O. L. Perov and L. E. Shirokov, "Nonlinear asymptotically unbiased minimal dimensional filters," Izv. Akad. Nauk SSSR, Tekh. Kibern., No. 3, 161–167 (1976).

17. A. V. Panteleev, "Synthesis of suboptimal nonlinear controlled minimal dimensional filters," Izv. Vyssh. Uchebn. Zaved., Priborostr., No. 11, 31–37 (1986).

18. V. S. Pugachev, "Estimating the state and parameters of continuous nonlinear systems," Avtom. Telemekh., No. 6, 63–79 (1979).

19. M. L. Dashevskii, "Synthesis of conditionally optimal filters based on equations of optimal nonlinear filtering," Avtom. Telemekh., No. 10, 109–118 (1983).

20. E. A. Rudenko, "Synthesis of optimal structure of continuous nonlinear filters of the order of the estimated plant," in Methods for Restoring and Analyzing Dynamics of Controlled Processes (MO, Moscow, 1988), pp. 113–123.

21. E. A. Rudenko, "Optimal structure of nonlinear finite-order filters," Preprint (Moscow Institute of Aviation, Moscow, 1989).

22. A. V. Panteleev, E. A. Rudenko, and A. S. Bortakovskii, Nonlinear Control Systems: Description, Analysis, Synthesis (Vuzovskaya kniga, Moscow, 2008) [in Russian].

23. E. A. Rudenko, "Optimal structure of discrete nonlinear low-order filters," Avtom. Telemekh., No. 9, 58–71 (1999).

24. E. A. Rudenko, "Optimal discrete nonlinear filters of the object's order and their Gaussian approximations," Autom. Remote Control 71 (2), 320–338 (2010).

25. V. S. Korolyuk, N. I. Portenko, A. V. Skorokhod, and A. F. Turbin, Handbook on Probability Theory and Mathematical Statistics (Nauka, Moscow, 1985) [in Russian].

26. I. E. Kazakov, Statistical Theory of Control Systems in State Space (Nauka, Moscow, 1975) [in Russian].

27. G. L. Degtyarev and T. K. Sirazetdinov, Theoretical Foundations of Optimal Control of Elastic Spacecraft (Mashinostroenie, Moscow, 1986) [in Russian].

28. O. A. Ladyzhenskaya, V. A. Solonnikov, and N. N. Ural'tseva, Linear and Quasilinear Equations of Parabolic Type (Nauka, Moscow, 1967) [in Russian].

29. A. N. Malakhov, Cumulant Analysis of Non-Gaussian Random Processes and Their Transformations (Sov. Radio, Moscow, 1978) [in Russian].

30. A. N. Shiryaev, Probability (Nauka, Moscow, 1980; Springer, 1995).

31. W. Feller, An Introduction to Probability Theory and Its Applications, Vol. 2 (Wiley, New York, 1968).

32. D. F. Kuznetsov, Stochastic Differential Equations: Theory and Practice of Numerical Solution (Izd-vo Politekhn. un-ta, St. Petersburg, 2010) [in Russian].

33. M. B. Nevel'son and R. Z. Khas'minskii, Stochastic Approximation and Recurrent Estimation (Nauka, Moscow, 1972) [in Russian].

34. K. Brammer and G. Siffling, Kalman–Bucy Filters (Artech House, Norwood, MA, 1989).

35. D. F. Liang and G. S. Christensen, "Exact and approximate state estimation for nonlinear dynamic systems," Automatica 11, 603–612 (1975).

36. P. Antosik, J. Mikusinski, and R. Sikorski, Theory of Distributions. The Sequential Approach (Elsevier–PWN, Warsaw, 1973).

37. V. I. Tikhonov, Statistical Radio Engineering (Radio i svyaz', Moscow, 1982) [in Russian].

Translated by M. Talacheva