
Mathematics of Data Assimilation

Melina Freitag

Department of Mathematical Sciences

University of Bath

Numerical Analysis Seminar, 8th February 2008

Outline

1 Introduction

2 Basic concepts

3 Variational Data Assimilation (Least-squares estimation, Kalman Filter)

4 Problems

5 Plan

1 Introduction

What is Data Assimilation?

Loose definition

Estimation and prediction (analysis) of an unknown, true state by combining observations and system dynamics (model output).

Some examples

Navigation

Medical imaging

Numerical weather prediction

Data Assimilation in NWP

Estimate the state of the atmosphere xi.

A priori information xB

background state (usually the previous forecast)

Models

a model of how the atmosphere evolves in time (imperfect):

xi+1 = M(xi)

a function linking model space and observation space (imperfect):

yi = H(xi)

Observations y

Satellites

Ships and buoys

Surface stations

Aeroplanes

Assimilation algorithms

used to find an (approximate) state of the atmosphere xi at times i (usually i = 0)

using this state, a forecast for future states of the atmosphere can be obtained

xA: Analysis (estimate of the true state after the DA)

2 Basic concepts

Schematics of DA

Figure: Background state xB

Figure: Observations y

Figure: Analysis xA (consistent with observations and model dynamics)

Data Assimilation in NWP

Underdeterminacy

Size of the state vector x: 432 × 320 × 50 × 7 = O(10^7)

Number of observations (size of y): O(10^5 - 10^6)

Operator H (nonlinear!) maps from state space into observation space: y = H(x)

Some easy schemes

Cressman analysis

Figure (Copyright: ECMWF)

At each time step i:

x_A(k) = x_B(k) + [ Σ_{l=1}^n w_{lk} (y(l) − x_B(l)) ] / [ Σ_{l=1}^n w_{lk} ]

w_{lk} = max( 0, (R² − d_{lk}²) / (R² + d_{lk}²) )

d_{lk} measures the distance between points l and k.
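As an illustration, here is a minimal Python sketch of one Cressman pass. All inputs (grid points, observation locations, the background values interpolated to the observation points, and the influence radius R) are hypothetical; this is not code from the talk.

```python
import numpy as np

def cressman_update(x_b, grid_pts, y, obs_pts, x_b_at_obs, R):
    """One Cressman pass:
    x_A(k) = x_B(k) + sum_l w_lk (y(l) - x_B(l)) / sum_l w_lk,
    with weights w_lk = max(0, (R^2 - d_lk^2) / (R^2 + d_lk^2))."""
    x_a = x_b.copy()
    for k, pk in enumerate(grid_pts):
        d2 = np.sum((obs_pts - pk) ** 2, axis=1)        # squared distances d_lk^2
        w = np.maximum(0.0, (R**2 - d2) / (R**2 + d2))  # Cressman weights
        if w.sum() > 0.0:
            x_a[k] += np.dot(w, y - x_b_at_obs) / w.sum()
    return x_a
```

Successive-correction schemes typically repeat such a pass at each time step, often with a decreasing radius R.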

3 Variational Data Assimilation (Least-squares estimation, Kalman Filter)

Data Assimilation in NWP

Estimate the state of the atmosphere xi. The same ingredients as before, but now all of them carry errors:

A priori information xB: the background state (usually the previous forecast) has errors!

Models:

a model of how the atmosphere evolves in time (imperfect): xi+1 = M(xi) + error

a function linking model space and observation space (imperfect): yi = H(xi) + error

Observations y (satellites, ships and buoys, surface stations, aeroplanes) have errors!

Assimilation algorithms are used to find an (approximate) state of the atmosphere xi at times i (usually i = 0); from this state a forecast of future states of the atmosphere can be obtained. xA: Analysis (estimate of the true state after the DA).

Error variables

Modelling the errors (an overbar denotes the expected value):

background error ε_B = x_B − x_Truth, with mean ε̄_B and covariance B = mean[ (ε_B − ε̄_B)(ε_B − ε̄_B)^T ]

observation error ε_O = y − H(x_Truth), with mean ε̄_O and covariance R = mean[ (ε_O − ε̄_O)(ε_O − ε̄_O)^T ]

analysis error ε_A = x_A − x_Truth, with mean ε̄_A and covariance A = mean[ (ε_A − ε̄_A)(ε_A − ε̄_A)^T ]

measure of the analysis error that we want to minimise: tr(A) = mean[ ‖ε_A − ε̄_A‖² ]
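These covariances are expectations over the error distributions; in practice they can only be estimated, for instance from a sample of error realisations. A small numpy sketch with a made-up error sample (purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
eps_B = rng.normal(size=(1000, 3))    # hypothetical sample of background errors

mean_B = eps_B.mean(axis=0)           # sample estimate of the mean error
dev = eps_B - mean_B
B = dev.T @ dev / (len(eps_B) - 1)    # sample covariance estimate of B
print(np.trace(B))                    # trace: total error variance
```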

Assumptions

Linearised observation operator: H(x) − H(x_B) = H (x − x_B), where H on the right is the linearisation of the operator H

Non-trivial errors: B and R are positive definite

Unbiased errors: the means of x_B − x_Truth and y − H(x_Truth) are zero (ε̄_B = ε̄_O = 0)

Uncorrelated errors: the mean of (x_B − x_Truth)(y − H(x_Truth))^T is zero

Optimal least-squares estimator

Cost function

Solution of the variational optimisation problem x_A = arg min J(x), where

J(x) = (x − x_B)^T B^{−1} (x − x_B) + (y − H(x))^T R^{−1} (y − H(x)) = J_B(x) + J_O(x)

Interpolation equations

x_A = x_B + K(y − H(x_B)), where

K = B H^T (H B H^T + R)^{−1} . . . gain matrix

Analysis error covariance

A = (I − KH) B (I − KH)^T + K R K^T

If K is the optimal least-squares gain:

A = (I − KH) B
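For a linear observation operator these formulas can be evaluated directly. A minimal numpy sketch with small dense matrices (purely illustrative; not how operational systems compute K):

```python
import numpy as np

def blue_analysis(x_b, y, H, B, R):
    """Optimal least-squares (BLUE) analysis:
    K = B H^T (H B H^T + R)^{-1}, x_A = x_B + K(y - H x_B), A = (I - KH) B."""
    S = H @ B @ H.T + R                    # innovation covariance
    K = B @ H.T @ np.linalg.inv(S)         # gain matrix
    x_a = x_b + K @ (y - H @ x_b)          # analysis state
    A = (np.eye(len(x_b)) - K @ H) @ B     # analysis error covariance (optimal K)
    return x_a, A
```

For numerical robustness one would solve a linear system with S rather than form its inverse explicitly.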

Conditional probabilities

Non-Gaussian PDFs (probability density functions)

P(x) is the a priori PDF (background)

P(y|x) is the observation PDF (likelihood of the observations given background x)

P(x|y) is the conditional probability of the model state given the observations; by Bayes' theorem:

arg max_x P(x|y) = arg max_x P(y|x) P(x) / P(y)

Gaussian PDFs

P(x|y) = c_1 exp( −(x − x_B)^T B^{−1} (x − x_B) ) · c_2 exp( −(y − H(x))^T R^{−1} (y − H(x)) )

x_A is the maximum a posteriori estimator of x_Truth. Maximising P(x|y) is equivalent to minimising J(x).
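The equivalence can be made explicit by taking negative logarithms; constants independent of x drop out (following the slide's convention, which absorbs the usual factor 1/2 into the exponents):

```latex
-\log P(x \mid y) = (x - x_B)^T B^{-1} (x - x_B)
                  + (y - H(x))^T R^{-1} (y - H(x)) + \mathrm{const}
                  = J(x) + \mathrm{const}
```

so the maximiser of P(x|y) is exactly the minimiser of J(x).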

A simple scalar illustration

Room temperature

T_O: observation with standard deviation σ_O

T_B: background with standard deviation σ_B

T_A = T_B + k(T_O − T_B), with error variance σ_A² = (1 − k)² σ_B² + k² σ_O²

The optimal k, which minimises the error variance, is

k = σ_B² / (σ_B² + σ_O²)

This is equivalent to minimising

J(T) = (T − T_B)² / σ_B² + (T − T_O)² / σ_O²

and then 1/σ_A² = 1/σ_B² + 1/σ_O².
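A quick numerical check of these scalar formulas, with illustrative values σ_B = 2, σ_O = 1:

```python
sigma_B, sigma_O = 2.0, 1.0        # background and observation standard deviations
T_B, T_O = 18.0, 21.0              # hypothetical background and observed temperature

k = sigma_B**2 / (sigma_B**2 + sigma_O**2)           # optimal weight: 0.8
T_A = T_B + k * (T_O - T_B)                          # analysis: 20.4
var_A = (1 - k)**2 * sigma_B**2 + k**2 * sigma_O**2  # analysis variance: 0.8
assert abs(1/var_A - (1/sigma_B**2 + 1/sigma_O**2)) < 1e-12
```

As expected, the analysis sits closer to the more accurate observation, and its variance is smaller than either input variance.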

Optimal interpolation

Computation of

x_A = x_B + K(y − H(x_B))

K = B H^T (H B H^T + R)^{−1} . . . gain matrix

expensive!

Three-dimensional variational assimilation (3D-Var)

Minimisation of

J(x) = (x − x_B)^T B^{−1} (x − x_B) + (y − H(x))^T R^{−1} (y − H(x)) = J_B(x) + J_O(x)

avoids the computation of K by using a descent algorithm (a sketch follows below)

works in control variable space
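A sketch of the descent approach in Python, assuming for simplicity a linear observation operator H on a tiny made-up problem (real systems use matrix-free operators and work in the transformed control variable space):

```python
import numpy as np
from scipy.optimize import minimize

def make_3dvar(x_b, y, H, B_inv, R_inv):
    """Return J and grad J for the 3D-Var cost function with linear H."""
    def J(x):
        db, do = x - x_b, y - H @ x
        return db @ B_inv @ db + do @ R_inv @ do
    def gradJ(x):
        return 2 * B_inv @ (x - x_b) - 2 * H.T @ R_inv @ (y - H @ x)
    return J, gradJ

# hypothetical small problem
rng = np.random.default_rng(1)
n, m = 5, 3
x_b = rng.normal(size=n)
H = rng.normal(size=(m, n))
y = H @ x_b + 0.1 * rng.normal(size=m)

J, gradJ = make_3dvar(x_b, y, H, np.eye(n), np.eye(m))
res = minimize(J, x_b, jac=gradJ, method="L-BFGS-B")   # descent: K never formed
x_a = res.x
```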

Four-dimensional variational assimilation (4D-Var)

Minimise the cost function

J(x_0) = (x_0 − x_B)^T B^{−1} (x_0 − x_B) + Σ_{i=0}^n (y_i − H_i(x_i))^T R_i^{−1} (y_i − H_i(x_i))

subject to the model dynamics x_i = M_{0→i} x_0

Figure (Copyright: ECMWF)

4D-Var analysis

Model dynamics

Strong constraint: the model states x_i are subject to

x_i = M_{0→i}(x_0)

a nonlinear constrained optimisation problem (hard!)

Simplifications

Causality (the forecast can be expressed as a product of intermediate forecast steps):

x_i = M_{i,i−1} M_{i−1,i−2} . . . M_{1,0} x_0

Tangent linear hypothesis (H and M can be linearised):

y_i − H_i(x_i) = y_i − H_i(M_{0→i} x_0) ≈ y_i − H_i(M_{0→i} x_0^B) − H_i M_{0→i} (x_0 − x_0^B)

where M is the tangent linear model.

This turns the problem into an unconstrained quadratic optimisation problem.

Minimisation of the 4D-Var cost function

Efficient implementation of J and ∇J:

forecast state x_i = M_{i,i−1} M_{i−1,i−2} . . . M_{1,0} x_0

normalised departures d_i = R_i^{−1} (y_i − H_i(x_i))

cost function J_O^i = (y_i − H_i(x_i))^T d_i

∇J is calculated by

−(1/2) ∇J_O = −(1/2) Σ_{i=0}^n ∇J_O^i = Σ_{i=0}^n M_{1,0}^T . . . M_{i,i−1}^T H_i^T d_i
            = H_0^T d_0 + M_{1,0}^T [ H_1^T d_1 + M_{2,1}^T [ H_2^T d_2 + . . . + M_{n,n−1}^T H_n^T d_n ] . . . ]

initialise the adjoint variable x̃_n = 0, and then x̃_{i−1} = M_{i,i−1}^T (x̃_i + H_i^T d_i), etc., down to x̃_0 = −(1/2) ∇J_O (a code sketch follows after this slide)

Further simplifications

preconditioning with B = L L^T (transform into control variable space), so that x̂ = L^{−1} x

Incremental 4D-Var
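The forward and backward recursions translate directly into code. A schematic Python sketch, assuming explicit matrices for the tangent linear model and the linearised observation operators (purely illustrative; operational systems apply adjoint code rather than stored matrices):

```python
import numpy as np

def grad_JO(x0, M_nl, M_tl, H_nl, H_tl, R_inv, y):
    """Adjoint computation of grad J_O for strong-constraint 4D-Var.
    M_nl, M_tl: lists of length n+1; entry i (i >= 1) is the nonlinear map /
                tangent linear matrix for step i-1 -> i (entry 0 is unused).
    H_nl, H_tl: nonlinear / linearised observation operators at times i = 0..n."""
    n = len(y) - 1
    # forward sweep: trajectory x_i and normalised departures d_i = R_i^{-1}(y_i - H_i(x_i))
    x = [x0]
    for i in range(1, n + 1):
        x.append(M_nl[i](x[i - 1]))
    d = [R_inv[i] @ (y[i] - H_nl[i](x[i])) for i in range(n + 1)]
    # backward (adjoint) sweep: x~_n = 0, then x~_{i-1} = M_{i,i-1}^T (x~_i + H_i^T d_i)
    xt = np.zeros_like(x0)
    for i in range(n, 0, -1):
        xt = M_tl[i].T @ (xt + H_tl[i].T @ d[i])
    xt = xt + H_tl[0].T @ d[0]     # include the i = 0 term: x~_0 = -(1/2) grad J_O
    return -2.0 * xt
```

One forward model run plus one backward adjoint sweep thus yields the full gradient, regardless of the number of observation times.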

Example - Three-Body Problem

Motion of three bodies in a plane, with two position (q) and two momentum (p) coordinates for each body α = 1, 2, 3.

Equations of motion

H(q, p) = (1/2) Σ_α |p_α|² − Σ_{α<β} m_α m_β / |q_α − q_β|

dq_α/dt = ∂H/∂p_α,  dp_α/dt = −∂H/∂q_α
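A direct transcription of these equations into Python (the state layout and masses are illustrative assumptions; the talk's kinetic term (1/2) Σ_α |p_α|² is used as written):

```python
import numpy as np

def three_body_rhs(q, p, m):
    """Planar three-body equations of motion:
    dq_a/dt = dH/dp_a = p_a,  dp_a/dt = -dH/dq_a.
    q, p: arrays of shape (3, 2); m: array of the three masses."""
    dq = p.copy()                      # from the kinetic term (1/2) sum_a |p_a|^2
    dp = np.zeros_like(p)
    for a in range(3):
        for b in range(3):
            if b != a:
                r = q[a] - q[b]
                dp[a] -= m[a] * m[b] * r / np.linalg.norm(r) ** 3
    return dq, dp
```

This separable Hamiltonian structure is exactly the setting where the partitioned Runge-Kutta integrator used below is natural.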

Example - Three-Body Problem

solver: a partitioned Runge-Kutta scheme with time step h = 0.001

observations are generated by adding noise to the truth trajectory

the background is given by a perturbed initial condition

the assimilation window is 300 time steps

minimisation of the cost function J using a Gauss-Newton method (neglecting all second derivatives): solve ∇J(x_0) = 0 via

∇∇J(x_0^j) Δx_0^j = −∇J(x_0^j),  x_0^{j+1} = x_0^j + Δx_0^j

the subsequent forecast is taken over 5000 time steps
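A schematic Gauss-Newton loop matching these iteration formulas (grad_J and gn_hess_J stand in for the adjoint-based gradient and the Gauss-Newton approximation of ∇∇J; both are assumed supplied):

```python
import numpy as np

def gauss_newton(x0, grad_J, gn_hess_J, tol=1e-8, max_iter=50):
    """Solve grad J(x_0) = 0 by iterating
    gn_hess_J(x^j) dx^j = -grad_J(x^j),  x^{j+1} = x^j + dx^j."""
    x = x0.copy()
    for _ in range(max_iter):
        g = grad_J(x)
        if np.linalg.norm(g) < tol:
            break
        dx = np.linalg.solve(gn_hess_J(x), -g)   # Gauss-Newton step
        x = x + dx
    return x
```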

Example - Three-Body Problem

Figure: Truth trajectory

Figure: Truth trajectory with observations and background (x position of the sun and of the moon vs. time step)

Figure: Analysis x (after EKF), with truth, observations and background (x position of the sun and of the moon vs. time step)

Figure: RMS error before and after assimilation (vs. time step)

Figure: RMS error before and after assimilation, shown separately for the sun, planet and moon

Example - Three-Body Problem - Cycled 4D-Var

Figure: First cycle (RMS error before and after assimilation)

Figure: Second cycle

Figure: Third cycle

The Kalman Filter Algorithm

Sequential data assimilation: the background is provided by the forecast that starts from the previous analysis

covariance matrices B^F, B^A

forecast/model error: x_{i+1}^Truth = M_{i+1,i} x_i^Truth + η_i, where η_i ∼ N(0, Q_i), assumed to be uncorrelated with the analysis error of the previous forecast

State and error covariance forecast

State forecast: x_{i+1}^F = M_{i+1,i} x_i^A

Error covariance forecast: B_{i+1}^F = M_{i+1,i} B_i^A M_{i+1,i}^T + Q_i

State and error covariance analysis

Kalman gain: K_i = B_i^F H_i^T (H_i B_i^F H_i^T + R_i)^{−1}

State analysis: x_i^A = x_i^F + K_i (y_i − H_i x_i^F)

Error covariance of the analysis: B_i^A = (I − K_i H_i) B_i^F
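One forecast-analysis cycle of these equations in numpy (dense, explicit matrices; feasible only for small systems, which foreshadows the cost issues discussed later):

```python
import numpy as np

def kalman_cycle(x_a, B_a, M, Q, H, R, y):
    """One Kalman filter cycle with linear model M and observation operator H."""
    # forecast step
    x_f = M @ x_a                                       # state forecast
    B_f = M @ B_a @ M.T + Q                             # error covariance forecast
    # analysis step
    K = B_f @ H.T @ np.linalg.inv(H @ B_f @ H.T + R)    # Kalman gain
    x_a_new = x_f + K @ (y - H @ x_f)                   # state analysis
    B_a_new = (np.eye(len(x_f)) - K @ H) @ B_f          # analysis error covariance
    return x_a_new, B_a_new
```

Cycling this function propagates both the state and its error covariance, giving the flow-dependent B used in the comparison below.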

Example - Three-Body Problem

same setup as before (partitioned Runge-Kutta solver with h = 0.001; noisy observations of the truth trajectory; perturbed background; 300-step assimilation window; Gauss-Newton minimisation of J)

application of 4D-Var

compare using B = I with a flow-dependent matrix B generated by a Kalman Filter before the assimilation starts (see G. Inverarity (2007))

Example - Three-Body Problem

Figure: RMS error, 4D-Var with B = I (before and after assimilation)

Figure: RMS error, 4D-Var with B = P^A, the flow-dependent covariance from the Kalman Filter (before and after assimilation)

4 Problems

Problems with Data Assimilation

DA is computationally very expensive; one cycle is much more expensive than the actual forecast

estimation and storage of the B-matrix is hard:

in operational DA, B is about 10^7 × 10^7

B should be flow-dependent, but in practice it is often static

B needs to be modelled and diagonalised, since B^{−1} is too expensive to compute ("control variable transform")

many assumptions are not valid:

errors are non-Gaussian, data have biases

the forward model operator M is not exact and is nonlinear, and the system dynamics are chaotic

minimisation of the cost function needs a close initial guess and a small assimilation window

model error is not included

5 Plan

Model error and perturbation error

Figure: One assimilation window (6 hours)

x_1^F − x_1^Truth = M_appr(x_0^A) − M(x_0^Truth)
                  = [ M_appr(x_0^A) − M_appr(x_0^Truth) ]   (perturbation error)
                  + [ M_appr(x_0^Truth) − M(x_0^Truth) ]    (model error)

Model error and perturbation error

Figure: Model error and perturbation error

x_j − x_C → perturbation error (after 6/12 hours)

x_j − x_A → model error

Model error

Figure: Perturbation error after 7 hours (Copyright: Met Office)

Figure: Perturbation error after 12 hours (Copyright: Met Office)

Figure: Model error after 12 hours (Copyright: Met Office)

Met Office research and plans

design a simple chaotic model of reduced order (Lorenz model)

include several time scales (to model the atmosphere)

identify and analyse model error, and analyse the influence of this model error on the DA scheme

analyse the influence of the error made by the numerical approximation (part of the model error) on the error in the DA scheme

compare assimilation algorithms and optimisation strategies to reduce existing errors

improve the representation of multiscale behaviour in the atmosphere in existing DA methods

improve the forecast of small-scale features (like convective storms)
