stochastic filtering by projection

46
Stochastic Filtering by Projection Stochastic Filtering by Projection The Example of the Quadratic Sensor John Armstrong (King’s College London) collaboration with Damiano Brigo (Imperial College) GSI2013

Upload: others

Post on 08-Nov-2021

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Stochastic Filtering by ProjectionThe Example of the Quadratic Sensor

John Armstrong (King’s College London)collaboration with

Damiano Brigo (Imperial College)

GSI2013

Page 2: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Stochastic Filtering

Motivation

Estimate the current state of a stochastic system from imperfectmeasurements

I Estimate the position of a car

I Estimate the volatility of a stock from option prices

I Applications in weather forecasting, oil extraction ...

The calculation should be performed online.

Page 3: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Stochastic Filtering

Motivation

Estimate the current state of a stochastic system from imperfectmeasurements

I Estimate the position of a car

I Estimate the volatility of a stock from option prices

I Applications in weather forecasting, oil extraction ...

The calculation should be performed online.

Page 4: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Stochastic Filtering

Motivation

Estimate the current state of a stochastic system from imperfectmeasurements

I Estimate the position of a car

I Estimate the volatility of a stock from option prices

I Applications in weather forecasting, oil extraction ...

The calculation should be performed online.

Page 5: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Stochastic Filtering

Motivation

Estimate the current state of a stochastic system from imperfectmeasurements

I Estimate the position of a car

I Estimate the volatility of a stock from option prices

I Applications in weather forecasting, oil extraction ...

The calculation should be performed online.

Page 6: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Stochastic Filtering

Motivation

Estimate the current state of a stochastic system from imperfectmeasurements

I Estimate the position of a car

I Estimate the volatility of a stock from option prices

I Applications in weather forecasting, oil extraction ...

The calculation should be performed online.

Page 7: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Stochastic Filtering

Mathematical formulation

dXt = ft(Xt) dt + σt(Xt) dWt , X0,

dYt = bt(Xt) dt + dVt , Y0 = 0 .

I Xt is a process representing the state.

I Yt is a process representing the measurement.

I Wt and Vt are independent Wiener processes.

QuestionWhat is the probability distribution for Xt given the values of Yt

up to time t?

Page 8: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Stochastic Filtering

Mathematical formulation

dXt = ft(Xt) dt + σt(Xt) dWt , X0,

dYt = bt(Xt) dt + dVt , Y0 = 0 .

I Xt is a process representing the state.

I Yt is a process representing the measurement.

I Wt and Vt are independent Wiener processes.

QuestionWhat is the probability distribution for Xt given the values of Yt

up to time t?

Page 9: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Stochastic Filtering

The Kushner–Stratonovich equationWith sufficient regularity and bounds, one can show that theprobability density pt satisfies:

dpt = L∗t ptdt + pt [bt − Ept{bt}][dYt − Ept{bt}dt] .

where:

I

L∗ = −ft∂

∂x+

1

2at

∂x2

is the backward diffusion operator

I aTt a = σ and a is a square root of σ.

I Ept denotes expectation with respect to pt .

QuestionHow can we efficiently approximate solutions to the infinitedimensional Kushner–Stratonovich equation?

Page 10: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Stochastic Filtering

The Kushner–Stratonovich equationWith sufficient regularity and bounds, one can show that theprobability density pt satisfies:

dpt = L∗t ptdt + pt [bt − Ept{bt}][dYt − Ept{bt}dt] .

where:

I

L∗ = −ft∂

∂x+

1

2at

∂x2

is the backward diffusion operator

I aTt a = σ and a is a square root of σ.

I Ept denotes expectation with respect to pt .

QuestionHow can we efficiently approximate solutions to the infinitedimensional Kushner–Stratonovich equation?

Page 11: Stochastic Filtering by Projection

Stochastic Filtering by Projection

The geometric idea

The geometric idea

I Choose a submanifold of the space of probability distributionsso that points in the manifold can approximate pt well.

I View the partial differential equation as defining a stochasticvector field.

I Use projection to restrict the vector field to the tangent space.

I Solve the resulting finite dimensional stochastic differentialequation.

Page 12: Stochastic Filtering by Projection

Stochastic Filtering by Projection

The geometric idea

The geometric idea

I Choose a submanifold of the space of probability distributionsso that points in the manifold can approximate pt well.

I View the partial differential equation as defining a stochasticvector field.

I Use projection to restrict the vector field to the tangent space.

I Solve the resulting finite dimensional stochastic differentialequation.

Page 13: Stochastic Filtering by Projection

Stochastic Filtering by Projection

The geometric idea

The geometric idea

I Choose a submanifold of the space of probability distributionsso that points in the manifold can approximate pt well.

I View the partial differential equation as defining a stochasticvector field.

I Use projection to restrict the vector field to the tangent space.

I Solve the resulting finite dimensional stochastic differentialequation.

Page 14: Stochastic Filtering by Projection

Stochastic Filtering by Projection

The geometric idea

The geometric idea

I Choose a submanifold of the space of probability distributionsso that points in the manifold can approximate pt well.

I View the partial differential equation as defining a stochasticvector field.

I Use projection to restrict the vector field to the tangent space.

I Solve the resulting finite dimensional stochastic differentialequation.

Page 15: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Choosing the submanifold

The linear problem

If:

I the coefficient functions a, b and f in the problem are all linear

I p0, which represents the prior probability distribution for thestate, is a Gaussian

then

I pt is always a Gaussian

I The mean and standard deviation of pt follow a finitedimensional SDE.

This is called the Kalman filter.One can linearize any filtering problem at each point in time toobtain the Extended Kalman filter.

Page 16: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Choosing the submanifold

The linear problem

If:

I the coefficient functions a, b and f in the problem are all linear

I p0, which represents the prior probability distribution for thestate, is a Gaussian

then

I pt is always a Gaussian

I The mean and standard deviation of pt follow a finitedimensional SDE.

This is called the Kalman filter.

One can linearize any filtering problem at each point in time toobtain the Extended Kalman filter.

Page 17: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Choosing the submanifold

The linear problem

If:

I the coefficient functions a, b and f in the problem are all linear

I p0, which represents the prior probability distribution for thestate, is a Gaussian

then

I pt is always a Gaussian

I The mean and standard deviation of pt follow a finitedimensional SDE.

This is called the Kalman filter.One can linearize any filtering problem at each point in time toobtain the Extended Kalman filter.

Page 18: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Choosing the submanifold

Two important familiesFor multi modal problems, project onto one of the followingfamilies:

I A mixture of m Gaussian distributions:

pt(x) =∑i

λie(x−µi )/2σ2

i

I λi ≥ 0.∑

i λi = 1.I Gives rise to a 3m − 1 dimensional family.

I The exponential family

pt(x) = exp(a0 + a1x + a2x2 + . . . a2nx2n)

I a2n < 0I Gives rise to a 2n dimensional family.

Page 19: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Choosing the submanifold

Two important familiesFor multi modal problems, project onto one of the followingfamilies:

I A mixture of m Gaussian distributions:

pt(x) =∑i

λie(x−µi )/2σ2

i

I λi ≥ 0.∑

i λi = 1.I Gives rise to a 3m − 1 dimensional family.

I The exponential family

pt(x) = exp(a0 + a1x + a2x2 + . . . a2nx2n)

I a2n < 0I Gives rise to a 2n dimensional family.

Page 20: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Choosing the submanifold

Two important familiesFor multi modal problems, project onto one of the followingfamilies:

I A mixture of m Gaussian distributions:

pt(x) =∑i

λie(x−µi )/2σ2

i

I λi ≥ 0.∑

i λi = 1.I Gives rise to a 3m − 1 dimensional family.

I The exponential family

pt(x) = exp(a0 + a1x + a2x2 + . . . a2nx2n)

I a2n < 0I Gives rise to a 2n dimensional family.

Page 21: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Projecting the equations

The choice of metric

Choice of metric for the projection

Need to choose a Hilbert space structure on the space ofprobability distributions (more precisely some enveloping space).

I The Hellinger metric.I Theoretical advantage of coordinate independenceI Works well with exponential families (Brigo)I Meaningful for problems where density p does not exist.I Requires numerical approximation of integrals to implement.

I The direct L2 metric.I Works well with mixture families.I All integrals that occur can be calculated analytically.

Page 22: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Projecting the equations

The choice of metric

Choice of metric for the projection

Need to choose a Hilbert space structure on the space ofprobability distributions (more precisely some enveloping space).

I The Hellinger metric.I Theoretical advantage of coordinate independenceI Works well with exponential families (Brigo)I Meaningful for problems where density p does not exist.I Requires numerical approximation of integrals to implement.

I The direct L2 metric.I Works well with mixture families.I All integrals that occur can be calculated analytically.

Page 23: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Projecting the equations

The choice of metric

Choice of metric for the projection

Need to choose a Hilbert space structure on the space ofprobability distributions (more precisely some enveloping space).

I The Hellinger metric.I Theoretical advantage of coordinate independenceI Works well with exponential families (Brigo)I Meaningful for problems where density p does not exist.I Requires numerical approximation of integrals to implement.

I The direct L2 metric.I Works well with mixture families.I All integrals that occur can be calculated analytically.

Page 24: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Projecting the equations

Projecting SDE’s

Understanding stochastic differential equations

A stochastic differential equation such as:

dXt = ft(Xt) dt + σt(Xt) dWt

is shorthand for an integral equation such as:

XT =

∫ T

0ft(Xt) dt +

∫ T

0σt(Xt) dWt

where the right hand integral is defined by the Ito integral:∫ T

0f (t) dWt = lim

n→∞

∞∑i=1

f (ti )(Wti+1 −Wti ).

Page 25: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Projecting the equations

Projecting SDE’s

The Stratonovich integralI Take the Ito integral:∫ T

0f (t) dWt = lim

n→∞

∞∑i=1

f (ti )(Wti+1 −Wti ).

and change the point where you evaluate the integrand∫ T

0f (t) ◦ dWt = lim

n→∞

∞∑i=1

f (ti + ti+1

2)(Wti+1 −Wti ).

to get the Stratonvich integral. Hence you can defineStratonovich SDE’s.

I The difference between the two integrals is an ordinaryintegral. This allows you to convert between the twoformulations.

I Ito SDE’s model causality more naturallyI Stratonovich SDE’s transform like vector fields.

Page 26: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Projecting the equations

Projecting SDE’s

The Stratonovich integralI Take the Ito integral:∫ T

0f (t) dWt = lim

n→∞

∞∑i=1

f (ti )(Wti+1 −Wti ).

and change the point where you evaluate the integrand∫ T

0f (t) ◦ dWt = lim

n→∞

∞∑i=1

f (ti + ti+1

2)(Wti+1 −Wti ).

to get the Stratonvich integral. Hence you can defineStratonovich SDE’s.

I The difference between the two integrals is an ordinaryintegral. This allows you to convert between the twoformulations.

I Ito SDE’s model causality more naturallyI Stratonovich SDE’s transform like vector fields.

Page 27: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Projecting the equations

Projecting SDE’s

A recipe for projecting SDE’s

To project an SDE onto a submanifold parameterized byθ = (θ1, θ2, . . . , θn):

I Write the SDE as an SDE with vector coefficients inStratonovich form.

I Project all the coefficients onto the tangent space.

I Equate both sides of the projected equations to get an SDEfor the θi .

Since Stratonovich SDE’s transform like vector fields, this recipe isinvariant of the parameterization.

Page 28: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Projecting the equations

Projecting SDE’s

A recipe for projecting SDE’s

To project an SDE onto a submanifold parameterized byθ = (θ1, θ2, . . . , θn):

I Write the SDE as an SDE with vector coefficients inStratonovich form.

I Project all the coefficients onto the tangent space.

I Equate both sides of the projected equations to get an SDEfor the θi .

Since Stratonovich SDE’s transform like vector fields, this recipe isinvariant of the parameterization.

Page 29: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Projecting the equations

Projecting SDE’s

The projected equations

The end result for the case of L2 projection is:

dθi =m∑j=1

hij{〈p(θ),Lvj〉dt − 〈γ0(p(θ)), vj〉dt + 〈γ1(p(θ)), vj〉 ◦ dY

}.

Where:

I The vj = ∂p∂θj

give a basis for the tangent space

I hij and hij are the Riemannian metric tensor 〈vi , vj〉.I γ0t (p) := 1

2 [|bt |2 − Ep{|bt |2}]I γ1t (p) := [bt − Ep{bt}]pI 〈·, ·〉 is the L2 inner product.

Note that the inner products and expectations give rise to integrals.We can compute these analytically for the normal mixture family.

Page 30: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Solving the SDE’s

Solving the finite system of SDE’s

I Approximate the differential equation as a difference equationand solve numerically.

I This is more delicate for stochastic equations than ordinaryones. See Kloeden and Platen. We use theStratonovich–Heun cheme.

I Note that the resulting difference equation will depend uponthe choice of parameterization of the submanifold. Choosecoordinates φ : Rn −→M so that φ is defined on all of Rn.

Page 31: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Numerical example

The quadratic sensor

dXt = dWt

dYt = X 2 + dVt .

I We do not receive any information on the sign of X .

I We expect that once X has hit the origin, p will beapproximately symmetrical.

I We expect a bimodal distribution

Page 32: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Numerical example

The quadratic sensor

dXt = dWt

dYt = X 2 + dVt .

I We do not receive any information on the sign of X .

I We expect that once X has hit the origin, p will beapproximately symmetrical.

I We expect a bimodal distribution

Page 33: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Numerical example

Simulation for the Quadratic Sensor

0

0.2

0.4

0.6

0.8

1

-8 -6 -4 -2 0 2 4 6 8

X

Distribution at time 0

ProjectionExact

Extended KalmanExponential

Page 34: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Numerical example

Simulation for the Quadratic Sensor

0

0.2

0.4

0.6

0.8

1

-8 -6 -4 -2 0 2 4 6 8

X

Distribution at time 1

ProjectionExact

Extended KalmanExponential

Page 35: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Numerical example

Simulation for the Quadratic Sensor

0

0.2

0.4

0.6

0.8

1

-8 -6 -4 -2 0 2 4 6 8

X

Distribution at time 2

ProjectionExact

Extended KalmanExponential

Page 36: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Numerical example

Simulation for the Quadratic Sensor

0

0.2

0.4

0.6

0.8

1

-8 -6 -4 -2 0 2 4 6 8

X

Distribution at time 3

ProjectionExact

Extended KalmanExponential

Page 37: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Numerical example

Simulation for the Quadratic Sensor

0

0.2

0.4

0.6

0.8

1

-8 -6 -4 -2 0 2 4 6 8

X

Distribution at time 4

ProjectionExact

Extended KalmanExponential

Page 38: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Numerical example

Simulation for the Quadratic Sensor

0

0.2

0.4

0.6

0.8

1

-8 -6 -4 -2 0 2 4 6 8

X

Distribution at time 5

ProjectionExact

Extended KalmanExponential

Page 39: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Numerical example

Simulation for the Quadratic Sensor

0

0.2

0.4

0.6

0.8

1

-8 -6 -4 -2 0 2 4 6 8

X

Distribution at time 6

ProjectionExact

Extended KalmanExponential

Page 40: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Numerical example

Simulation for the Quadratic Sensor

0

0.2

0.4

0.6

0.8

1

-8 -6 -4 -2 0 2 4 6 8

X

Distribution at time 7

ProjectionExact

Extended KalmanExponential

Page 41: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Numerical example

Simulation for the Quadratic Sensor

0

0.2

0.4

0.6

0.8

1

-8 -6 -4 -2 0 2 4 6 8

X

Distribution at time 8

ProjectionExact

Extended KalmanExponential

Page 42: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Numerical example

Simulation for the Quadratic Sensor

0

0.2

0.4

0.6

0.8

1

-8 -6 -4 -2 0 2 4 6 8

X

Distribution at time 9

ProjectionExact

Extended KalmanExponential

Page 43: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Numerical example

Simulation for the Quadratic Sensor

0

0.2

0.4

0.6

0.8

1

-8 -6 -4 -2 0 2 4 6 8

X

Distribution at time 10

ProjectionExact

Extended KalmanExponential

Page 44: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Numerical example

L2 residuals for the quadratic sensor

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0 2 4 6 8 10

Time

Residuals

Projection Residual (L2 norm)Extended Kalman Residual (L2 norm)

Hellinger Projection Residual (L2 norm)

Page 45: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Numerical example

Levy residuals for the quadratic sensor

0

0.02

0.04

0.06

0.08

0.1

0.12

0.14

0 1 2 3 4 5 6 7 8 9 10

Time

ProkhorovResiduals

Prokhorov Residual (L2NM)Prokhorov Residual (HE)

Best possible residual (3Deltas)

Page 46: Stochastic Filtering by Projection

Stochastic Filtering by Projection

Conclusions

Conclusions

I Projection methods allow us to approximate the solution tononlinear problems with surprising accuracy using only lowdimensional manifolds.

I This conclusion holds for a variety of projection metrics andmanifolds.

I L2 projection of normal mixtures is particularly promising sinceall integrals can be computed analytically.