inverse modeling of atmospheric composition data daniel j. jacob see my web site under...

INVERSE MODELING OF ATMOSPHERIC COMPOSITION DATAINVERSE MODELING OF ATMOSPHERIC COMPOSITION DATA

Daniel J. Jacob

• See my web site under “educational materials” for lectures on • inverse modeling• atmospheric transport and chemistry models

• Notation and terminology in this lecture follows that of Rodgers (2000)

THE INVERSE MODELING PROBLEMTHE INVERSE MODELING PROBLEMOptimize values of an ensemble of variables (state vector x) using observations:

THREE MAIN APPLICATIONS FOR ATMOSPHERIC COMPOSITION:

1. Retrieve atmospheric concentrations (x) from observed atmospheric

radiances (y) using a radiative transfer model as forward model

2. Invert sources (x) from observed atmospheric concentrations (y) using a CTM as forward model

3. Construct a continuous field of concentrations (x) by assimilation of sparse

observations (y) using a forecast model (initial-value CTM) as forward model

a priori estimate

xa + a

observation vector yforward model

y = F(x) +

“MAP solution”“optimal estimate”

“retrieval”

ˆ ˆx + εBayes’theorem

BAYES’ THEOREM: FOUNDATION FOR INVERSE MODELSBAYES’ THEOREM: FOUNDATION FOR INVERSE MODELS

( ) ( )( )

( )

P PP

P

y | x xx | y

y

P(x) = probability distribution function (pdf) of xP(x,y) = pdf of (x,y)P(y|x) = pdf of y given x

a priori pdfobservation pdf

normalizing factor (unimportant)

a posteriori pdf

Maximum a posteriori (MAP) solution for x given y is defined by

( | )P x x y 0

max( ( | ))P x y

solve for

P(x,y)dxdy

( ) ( )

( ) ( )

P d P d

P d P d

x x y | x y

y y x | y x

Bayes’theorem

SIMPLE LINEAR INVERSE PROBLEM FOR A SCALARSIMPLE LINEAR INVERSE PROBLEM FOR A SCALARuse single measurement used to optimize a single sourceuse single measurement used to optimize a single source

a priori bottom-up estimate

xa a

Monitoring site measures

concentration yForward model gives y = kx

“Observational error”

• instrument• fwd model

y = kx

22

2 2

( )( )ln ( | ) ln ( | ) ln ( ) a

a

x xy kxP x y P y x P x

Max of P(x|y) is given by minimum of cost function 22

2 2

( )( )( ) a

a

x xy kxJ x

Solution: ˆ ( )a ax x g y kx where g is a gain factor

2

2 2 2a

a

kg

k

22 2 1( ( / ) )a k

Alternate expression of solution: (1 ) ay kx x ax a x g

where a = gk is an averaging kernel

solve for / 0dJ dx

Assume random Gaussian errors, let x be the true value. Bayes’ theorem:

Variance of solution:

GENERALIZATION:GENERALIZATION:CONSTRAINING CONSTRAINING nn SOURCES WITH SOURCES WITH mm OBSERVATIONS OBSERVATIONS

1

n

j ij ii

y k x

Linear forward model:

A cost function defined as , 1

1 2 21 1, ,

( )( )

( ,... )

n

j ij in mi a i i

ni ja i j

y k xx x

J x x

is generally not adequate because it does not account for correlation between sources or between observations. Need vector-matrix formalism:

1 1( ,... ) ( ,... ) T Tn mx x y y x y y Kx ε

Jacobian matrix

JACOBIAN MATRIX FOR FORWARD MODELJACOBIAN MATRIX FOR FORWARD MODEL

Consider a non-linear forward model y = F(x)

Use of vector-matrix formalism requires linearization of forward model

and linearize it about xa:

( ) ( ) (+ higher-order terms) a ay F x K x x

x

yK = F =

xis the Jacobian of F evaluated at xa

iij

j

yk

x

with elements

If forward model is non-linear, K must be recalculated iteratively for successive solutions

1 1 1

1

/ /

/ /

n

m m n

y x y x

y x y x

K

KT is the adjoint of the forward model (to be discussed later)

Construct Jacobian numerically column by column: perturb xa by xi,

run forward model to get corresponding y

GAUSSIAN PDFs FOR VECTORSGAUSSIAN PDFs FOR VECTORSA priori pdf for x:

Scalar x Vector

2

2

( )1( ) exp[ ]

22a

aa

x xP x

1 ,1 1 ,1 ,

1 ,1 , ,

var( ) cov( , )

cov( , ) var( )

a a n a n

a n a n n a n

x x x x x x

x x x x x x

aS

-112ln ( ) ( ) ( )TP c a a ax x x S x x

1/ 2/ 2

1 1( ) exp[

2(2 )T

na

P

-1a a ax (x - x ) S (x - x )]

S

1( ,... )Tnx xx

where Sa is the a priori error covariance matrix describing error statistics on (x-xa)

In log space:

OBSERVATIONAL ERROR COVARIANCE MATRIXOBSERVATIONAL ERROR COVARIANCE MATRIX

( ) i my F x ε + ε

observationtrue value

instrument error

fwd model error observational error

i mε = ε + ε

Observational error covariance matrix

1 1

1

var( ) cov( , )

cov( , ) var( )

n

n n

S

is the sum of the instrument and fwd model error covariance matrices:

i mε ε εS = S + S

How well can the observing system constrain the true value of x ?

122ln ( ) ( ) ( )TP c

y | x y Kx S y KxCorresponding pdf, in log space:

MAXIMUM A POSTERIORI (MAP) SOLUTIONMAXIMUM A POSTERIORI (MAP) SOLUTION-1

12ln ( ) ( ) ( )TP c a a ax x x S x x1

22ln ( ) ( ) ( )TP c y | x y Kx S y Kx

-1 132ln ( ) ( ) ( ) ( ) ( )T TP c

a a ax | y x x S x x y Kx S y Kx

Bayes’ theorem:

MAP solution: ( | ) P x x y 0 miminize cost function J:

-1 1( ) ( ) ( ) ( ) ( )T TJ a a ax x x S x x y Kx S y Kx

Solve for -1 1( ) 2 ( ) 2 ( )TJ x a ax S x x K S Kx - y 0

ˆ ( ) a ax x G y Kx

Analytical solution:

1( )T T

a aG S K KS K S

1ˆ ( )T 1 1

aS K S K S

with gain matrix

bottom-up constraint top-down constraint

PARALLEL BETWEEN VECTOR-MATRIX AND SCALAR SOLUTIONS:PARALLEL BETWEEN VECTOR-MATRIX AND SCALAR SOLUTIONS:

Scalar problem Vector-matrix problem

2

2 2 2

2 22 1

ˆ ( )

( / )

ˆ (1 )

a a

a

a

a

a

x x g y kx

kg

k

k

x ax a x g

a gk

MAP solution:

1

1

ˆ ( )

( )

ˆ ( )

ˆ ( )

T T

T

a a

a a

1 1a

n a

x x G y Kx

G S K KS K S

S K S K S

x Ax I A x Gε

A GK

Gain factor:

A posteriori error:

Averaging kernel:

Jacobian matrix

ˆ G x/ y

K = y/ x sensitivity of observations to true state

Gain matrix sensitivity of retrieval to observations

Averaging kernel matrix ˆ A x/ x sensitivity of retrieval to true state

A LITTLE MORE ON THE AVERAGING KERNEL MATRIXA LITTLE MORE ON THE AVERAGING KERNEL MATRIX

A describes the sensitivity of the retrieval to the true state

1 1 1

1

ˆ ˆ/ /ˆ

ˆ ˆ/ /

n

n n n

x x x x

x x x x

xA

x

ˆ ( ) n ax Ax I A x Gεand hence the smoothing of the solution:

smoothing error retrieval error

MAP retrieval gives A as part of the retrieval:

1( )T T

a aA = GK = S K KS K S K

Other retrieval methods (e.g., neural network, adjoint method) do not provide A

# pieces of info in a retrieval = degrees of freedom for signal (DOFS) = trace(A)

APPLICATION TO SATELLITE RETRIEVALSAPPLICATION TO SATELLITE RETRIEVALS

Here y is the vector of wavelength-dependent radiances (radiance spectrum);

x is the state vector of concentrations;

forward model y = F(x) is the radiative transfer model

Illustrative MOPITT averaging kernel matrix for CO retrieval

MOPITT retrieves concentrations at 7 pressure levels; lines are the corresponding columns of the averaging kernel matrix

trace(A) = 1.5 in this case;1.5 pieces of information

INVERSE ANALYSIS OF MOPITT AND TRACE-P (AIRCRAFT) DATA INVERSE ANALYSIS OF MOPITT AND TRACE-P (AIRCRAFT) DATA TO CONSTRAIN ASIAN SOURCES OF COTO CONSTRAIN ASIAN SOURCES OF CO

TRACE-P CO DATA(G.W. Sachse)Bottom-up

emissions(customized for TRACE-P)

Fossil and biofuel Daily biomass burning(satellite fire counts)

GEOS-Chem CTM

MOPITT CO

Inverse analysis

validation

chemicalforecasts

top-downconstraints

OPTIMIZATION OF SOURCES

Streets et al. [2003] Heald et al. [2003a]

COMPARE TRACE-P OBSERVATIONS COMPARE TRACE-P OBSERVATIONS WITH CTM RESULTS USING WITH CTM RESULTS USING A PRIORI A PRIORI SOURCESSOURCES

Model is low north of 30oN: suggests Chinese source is lowModel is high in free trop. south of 30oN: suggests biomass burning source is high

• Assume that Relative Residual Error (RRE) after bias is removed describes the observational error variance (20-30%)

• Assume that the difference between successive GEOS-Chem CO forecasts during TRACE-P (to+48h and to + 24 h) describes the covariant error structure (“NMC method”)

Palmer et al. [2003], Jones et al. [2003]

CHARACTERIZING THE OBSERVATIONAL ERROR CHARACTERIZING THE OBSERVATIONAL ERROR COVARIANCE MATRIX FOR MOPITT CO COLUMNSCOVARIANCE MATRIX FOR MOPITT CO COLUMNS

Diagonal elements (error variances) obtained by residual relative error method

Add covariant structure from NMC method

Heald et al. [2004]

COMPARATIVE INVERSE ANALYSIS OF ASIAN CO SOURCES COMPARATIVE INVERSE ANALYSIS OF ASIAN CO SOURCES USING DAILY MOPITT AND TRACE-P DATA USING DAILY MOPITT AND TRACE-P DATA

• MOPITT has higher information content than TRACE-P because it observes source regions and Indian outflow

• Don’t trust a posteriori error covariance matrix; ensemble modeling indicates 10-40% error on a posteriori sources

Heald et al. [2004]

CO observations from Spring 2001, GEOS-Chem CTM as forward model

TRACE-P Aircraft CO MOPITT CO Columns

4 degreesof freedom

10 degreesof freedom

(from validation)

“Ensemble modeling”: repeat inversion with different values of forward model and covariance parameters to span uncertainty range

Analytical solution to inverse problemAnalytical solution to inverse problem

1ˆ ( ) ( )T T

a a a ax x S K KS K S y Kx

requires (iterative) numerical construction of the Jacobian matrix K and matrix operations of dimension (mxn); this limits the size of n, i.e., the number of variables that you can optimize

Address this limitation with Kalman filter (for time-dependent x) or with adjoint method

-1 1( ) 2 ( ) 2 ( )TJ x a ax S x x K S Kx - y 0

BASIC KALMAN FILTERBASIC KALMAN FILTERto optimize time-dependent state vectorto optimize time-dependent state vector

a priori xa,0 ± Sa,0

to

time observations state vector

y0ˆˆ 0 0x S

Evolution model M for [t0, t1]:

ˆ ˆ

ˆ T

a,1 0 M

a,1 0 M

x = Mx ± S

S = MS M S

t1 y11 1

ˆˆ x S

Apply evolution model for [t1, t2]…etc.

ADJOINT INVERSION (4-D VAR)ADJOINT INVERSION (4-D VAR)

°

°

°

°

a

2

1

3

x2

x1

x3

xa

Minimum of cost function J

-1 1( ) 2 ( ) 2 ( ( ) )TJ x a ax S x x K S F x - y 0

Solve

numerically rather than analytically

1. Starting from a priori xa, calculate ( )Jx ax

2. Using an optimization algorithm (BFGS),

get next guess x1

3. Calculate ( )Jx 1x , get next guess x2

4. Iterate until convergence

NUMERICAL CALCULATION OF COST FUNCTION GRADIENTNUMERICAL CALCULATION OF COST FUNCTION GRADIENT

-1 1( ) 2 ( ) 2 ( ( ) ))TJ x a ax S x x K S F x - y

Adjoint model is applied to error-weighted difference between model and obs

…but we want to avoid explicit construction of K

( ) ( ) ( -1) (1) (0)

(0) ( -1) ( -2) (0) (0)

...i i i

i i

y y y y y

Kx y y y x

( ) ( -1) (1) (0) (0) (1) ( -1) ( )

( -1) ( -2) (0) (0) (0) (0) ( -2) ( -1)

... ...

T T T T T

i i n nT

i i n n

y y y y y y y yK

y y y x x y y y

and since (AB)T = BTAT,

Apply transpose of tangent linear model to the adjoint forcings; for time interval [t0, tn], start from observations at tn and work backward in time until t0, picking up new observations (adjoint forcings) along the way.

Construct tangent linear model ( ) ( 1)/i i y ydescribing evolution of concentration field over time interval [ti-1, ti]

Sensitivity of y(i) to x(0) at time t0 can then be written

of forward model

adjoint “adjoint forcing”

APPLICATION OF GEOS-Chem ADJOINT TO CONSTRAIN ASIAN SOURCES APPLICATION OF GEOS-Chem ADJOINT TO CONSTRAIN ASIAN SOURCES WITH HIGH RESOLUTION USING MOPITT DATAWITH HIGH RESOLUTION USING MOPITT DATA

MOPITT daily CO columns (Mar-Apr 2001)

Correction factors to a priori emissions with 2ox2.5o resolution

A priori knowledge of emissions

Kopacz et al. [2007]

GEOS-Chem adjointwith 2ox2.5o resolution

Fossil fuel

Biomass burning

20000

22000

24000

26000

28000

30000

0 5 10 15 20iteration number (n )

cost

fu

nct

ion

J(xn)

number of observations

Iterative approachto minimum of cost function

COMPARE ADJOINT AND ANALYTICAL INVERSIONSCOMPARE ADJOINT AND ANALYTICAL INVERSIONSOF SAME MOPITT CO DATAOF SAME MOPITT CO DATA

Adjoint solution reveals aggregation errors, checkerboard pattern in analytical solution

[Heald et al., 2004]

Kopacz et al. [2007]

inverse modeling of atmospheric composition data daniel j. jacob see my web site under...

Documents