An introduction to data assimilation for the geosciences
Ross Bannister, Amos Lawless, Alison Fowler
National Centre for Earth Observation
School of Mathematics and Physical Sciences
University of Reading
(A) Introductory lecture
(B) Variational intro + practical
(C) Kalman filter + practical
DA ‘surgery’
NCEO Early Career Science Conference 16th – 18th April 2012 Introduction to data assimilation Page 2 of 20
What is data assimilation?
What is the temperature, T, of the fluid inside each jar as a function of time, t?
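Measurement at t = 0:
• thermometer (in-situ): yA(0)
• radiometer (remotely sensed): yB(0)
Model:
$$T_A(t) = T_{\mathrm{env}} + (T_A(0) - T_{\mathrm{env}})\,\exp(-\alpha_A t)$$
$$T_B(t) = T_{\mathrm{env}} + (T_B(0) - T_{\mathrm{env}})\,\exp(-\alpha_B t)$$
Measurement at time t:
• thermometer: yA(t)
• radiometer: yB(t)
[Figure: two jars of fluid, A and B.]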
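The jar model above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `jar_temperature` and all numerical values (temperatures, cooling rates, times) are hypothetical and do not come from the slides.

```python
import math

def jar_temperature(t, T0, T_env, alpha):
    """Model from the slide: T(t) = T_env + (T0 - T_env) * exp(-alpha * t)."""
    return T_env + (T0 - T_env) * math.exp(-alpha * t)

# Hypothetical values (assumptions): temperatures in deg C, time in minutes.
T_env = 20.0
T_A0, alpha_A = 80.0, 0.05   # jar A: initial temperature and cooling rate
T_B0, alpha_B = 60.0, 0.10   # jar B: initial temperature and cooling rate

for t in (0.0, 10.0, 30.0):
    T_A = jar_temperature(t, T_A0, T_env, alpha_A)
    T_B = jar_temperature(t, T_B0, T_env, alpha_B)
    print(f"t = {t:5.1f}  T_A = {T_A:6.2f}  T_B = {T_B:6.2f}")
```

In a data assimilation setting this function plays the role of the forecast model: it carries an estimate of T from one measurement time to the next.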
What is data assimilation?
Data assimilation is concerned with how we combine these pieces of information to obtain the best possible knowledge of the system as a function of time.
Observations + gauge of uncertainty
Model estimates + gauge of uncertainty
Data assimilation → Combined estimate + gauge of uncertainty
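Note on uncertainty: each value (observed or modelled) is taken to have a Gaussian uncertainty with std dev. $\sigma = \sqrt{\langle \varepsilon^2 \rangle}$.
[Figure: Gaussian curve of probability against possible value (observed or modelled).]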
“All models are wrong …” (George Box)
“All models are wrong and all observations are inaccurate” (a data assimilator)
What is data assimilation?
[Figure: state of the system against time, showing the observations, the unknown truth xtrue(t), the forecasts xf(t1), xf(t2), xf(t3) and the analyses xa(t1), xa(t2).]
This is an example of a ‘filter’
Data assimilation has:
• prediction stages (xf = ‘forecast’, ‘prior’, ‘background’) (extrapolation)
• analysis stages (xa) (interpolation)
What is data assimilation?
“[The atmosphere] is a chaotic system in which errors introduced into the system can grow with time … As a consequence, data assimilation is a struggle between chaotic destruction of knowledge and its restoration by new observations.”
Leith (1993)
Outline and references
What is data assimilation?
Applications of data assimilation in the geosciences
A prototype data assimilation system
Indirect observations and prior knowledge
Errors
Leading data assimilation methods
Essential mathematics
Challenges, subtleties, caveats, …
References:
• Kalnay, 2003, Atmospheric Modeling, Data Assimilation and Predictability.
• Daley, 1991, Atmospheric Data Analysis.
• Lorenc, 2003, The potential of the ensemble Kalman filter for NWP – a comparison with 4D-Var, QJRMS 129, 3183-3203.
• van Leeuwen, Particle filtering in geophysical systems.
• Rodgers, 2000, Inverse methods for atmospheric sounding, theory and practice, World Scientific, Singapore.
• Wang X., Snyder C., Hamill T.M., 2007, On the theoretical equivalence of differently proposed ensemble-3D-Var hybrid analysis schemes, Mon. Wea. Rev. 135, pp. 222-227.
Applications of data assimilation in the geosciences
• Atmospheric retrievals
• Atmospheric dynamics / NWP
• Inverse modelling for sources/sinks
• Reanalysis
• Atmospheric chemistry
• Hydrological cycle
• Carbon cycle
• Oceanography
• Parameter estimation (e.g. α, β, γ)
A prototype data assimilation problem
Consider two sources of information (e.g. measurements), x1 ± σ1 and x2 ± σ2 that each estimate x (assume Gaussian statistics)
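$$p_1(x_1|x) = \frac{1}{\sqrt{2\pi}\,\sigma_1}\exp\left[-\frac{(x_1 - x)^2}{2\sigma_1^2}\right]$$
$$p_2(x_2|x) = \frac{1}{\sqrt{2\pi}\,\sigma_2}\exp\left[-\frac{(x_2 - x)^2}{2\sigma_2^2}\right]$$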
pn(xn|x) δxn :“the probability that the data xn lies between xn and xn+δxn given that the ‘true’ value is x”
The joint probability is p1(x1|x) δx1 p2(x2|x) δx2(“the probability that x1 is … and x2 is … given x”)
$$p(x_1, x_2|x) = p_1(x_1|x)\,p_2(x_2|x) = \frac{1}{2\pi\sigma_1\sigma_2}\exp\left[-\frac{(x_1 - x)^2}{2\sigma_1^2} - \frac{(x_2 - x)^2}{2\sigma_2^2}\right]$$
In the above theory, x is known and x1 and x2 are unknown. Now introduce actual information x1 and x2: now x1 and x2 are known and x is unknown.
What x maximizes p(x1, x2|x)?
Combining imperfect data
A prototype data assimilation problem: combining imperfect data
$$p(x_1, x_2|x) = p_1(x_1|x)\,p_2(x_2|x) = \frac{1}{2\pi\sigma_1\sigma_2}\exp\left[-\frac{(x_1 - x)^2}{2\sigma_1^2} - \frac{(x_2 - x)^2}{2\sigma_2^2}\right]$$
What x maximizes p(x1, x2|x)? The same x that minimizes the ‘cost function’
$$I(x) = \frac{1}{2}\left[\frac{(x_1 - x)^2}{\sigma_1^2} + \frac{(x_2 - x)^2}{\sigma_2^2}\right]$$
To minimize, look for stationary values of I:
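$$\left.\frac{\partial I}{\partial x}\right|_{x = x_{\mathrm{e}}} = 0 \quad\Rightarrow\quad x_{\mathrm{e}} = \frac{x_1/\sigma_1^2 + x_2/\sigma_2^2}{1/\sigma_1^2 + 1/\sigma_2^2}$$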
If information source 1 is much more accurate than information source 2, then σ1 << σ2:
$$x_{\mathrm{e}} = \frac{x_1 + (\sigma_1^2/\sigma_2^2)\,x_2}{1 + \sigma_1^2/\sigma_2^2} \approx x_1$$
If information source 2 is much more accurate than information source 1, then σ2 << σ1:
$$x_{\mathrm{e}} = \frac{(\sigma_2^2/\sigma_1^2)\,x_1 + x_2}{\sigma_2^2/\sigma_1^2 + 1} \approx x_2$$
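The combination of two imperfect estimates can be tried numerically. The sketch below implements the inverse-variance weighted mean for xe; the function name `combine` and the example numbers are hypothetical. It also returns the standard deviation of the combined estimate, 1/σe² = 1/σ1² + 1/σ2², which is not shown on the slide but follows from the curvature of I(x).

```python
def combine(x1, sigma1, x2, sigma2):
    """Combine two Gaussian estimates of the same quantity x.

    Returns the estimate x_e that minimizes
    I(x) = 0.5 * [(x1 - x)^2 / sigma1^2 + (x2 - x)^2 / sigma2^2]
    together with the standard deviation of the combined estimate.
    """
    w1 = 1.0 / sigma1**2          # weight (inverse variance) of source 1
    w2 = 1.0 / sigma2**2          # weight (inverse variance) of source 2
    x_e = (w1 * x1 + w2 * x2) / (w1 + w2)
    sigma_e = (w1 + w2) ** -0.5   # assumption noted above: 1/sigma_e^2 = w1 + w2
    return x_e, sigma_e

# Hypothetical example: source 1 is three times more accurate than source 2.
x_e, sigma_e = combine(10.0, 1.0, 14.0, 3.0)
print(x_e, sigma_e)

# Limiting case sigma1 << sigma2: the result is pulled almost entirely to x1.
print(combine(10.0, 0.01, 14.0, 3.0)[0])
```

The combined standard deviation is never larger than the smaller of σ1 and σ2, so even a poor second source adds some information.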
Indirect observations and prior information
If x1 and x2 were measurements, they are direct measurements of x. Many observations are indirect. E.g.
Interested in (x) … | Have observations of (y) …
Atmospheric T, O3, q, ρx | Infrared radiances from satellite
Atmospheric T, q | Time delays from GPS satellite
Sources of trace gases | Trace gas measurements
Leaf area index | Optical reflectance from satellite
Sea surface temperature | Infrared or microwave radiances from satellite
Precipitation | Radar reflectivity
Generalise:
• x is the state vector (n elements)
• ymo is the model’s version of the observations (mo = “model observations”) (p elements)
• h is the forward model or observation operator (input n elements, output p elements)
• y is the observation vector (p elements)
Strategy: what x gives the best fit between y and ymo?
$$\mathbf{y}^{\mathrm{mo}} = \mathbf{h}(\mathbf{x})$$
Indirect observations and prior information
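$$\mathbf{y}^{\mathrm{mo}} = \mathbf{h}(\mathbf{x})$$
$$\mathbf{x} = \begin{pmatrix} \text{model field information} \\ \text{parameters used in model} \end{pmatrix}$$
The structure of the state vector: for the example of meteorological fields, u, v, θ, p, q are 3-D fields, and λ, φ and ℓ are longitude, latitude and vertical level. Model parameters may be appended. There are n elements in total.
$$\mathbf{y},\ \mathbf{y}^{\mathrm{mo}} = \begin{pmatrix} \text{observation } 1 \\ \vdots \\ \text{observation } p \end{pmatrix}$$
The observation vector – comprising each observation made. There are p observations.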
Indirect observations and prior information
$$\mathbf{y}^{\mathrm{mo}} = \mathbf{h}(\mathbf{x})$$
Examples of h:
• For in-situ observations, h is an interpolation function.
• For radiance observations, h is a radiative transfer operator.
• For observations at a later time than that of x, h includes a forecast model.
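As an illustration of the first example, here is a minimal sketch of an observation operator that linearly interpolates a 1-D model state to the observation locations. The function name `h_interp`, the grid, and all values are hypothetical; a real system would interpolate in two or three dimensions.

```python
def h_interp(x, grid, obs_locations):
    """Observation operator for in-situ observations on a 1-D grid.

    x             : state values at the grid points (n elements)
    grid          : sorted grid-point coordinates (n elements)
    obs_locations : coordinates of the observations (p elements)
    Returns y_mo, the model's version of the observations (p elements).
    """
    y_mo = []
    for loc in obs_locations:
        if not grid[0] <= loc <= grid[-1]:
            raise ValueError(f"observation at {loc} lies outside the grid")
        # Find the grid interval that contains the observation location.
        for k in range(len(grid) - 1):
            if grid[k] <= loc <= grid[k + 1]:
                w = (loc - grid[k]) / (grid[k + 1] - grid[k])
                y_mo.append((1.0 - w) * x[k] + w * x[k + 1])
                break
    return y_mo

# Hypothetical example: n = 4 grid points, p = 2 observations.
grid = [0.0, 1.0, 2.0, 3.0]
x = [280.0, 282.0, 286.0, 284.0]      # e.g. temperatures in K
y_mo = h_interp(x, grid, [0.5, 2.25])
print(y_mo)
```

This h is linear in x, so it could equally be written as a p × n matrix H whose rows hold the interpolation weights.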
Prior information:
• Often the observations are insufficient to determine x.
• Introduce prior information (a-priori, background, first guess, forecast), xf.
One strategy (variational assimilation) for solving the assimilation problem is to ask:
“What x (called xa [in an earlier slide this was called xe]) gives:
• ymo that is the closest possible to y and
• x that is the closest possible to xf?”
Construct a cost functional and minimize w.r.t. x (a generalized least-squares problem).
$$J(\mathbf{x}) \sim \|\mathbf{x} - \mathbf{x}^{\mathrm{f}}\|^2 + \|\mathbf{y} - \mathbf{h}(\mathbf{x})\|^2$$
Indirect observations and prior information
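$$J(\mathbf{x}) \sim \|\mathbf{x} - \mathbf{x}^{\mathrm{f}}\|^2 + \|\mathbf{y} - \mathbf{h}(\mathbf{x})\|^2$$
$$J(\mathbf{x}) = \frac{1}{2}(\mathbf{x} - \mathbf{x}^{\mathrm{f}})^{\mathrm{T}}(\mathbf{P}^{\mathrm{f}})^{-1}(\mathbf{x} - \mathbf{x}^{\mathrm{f}}) + \frac{1}{2}(\mathbf{y} - \mathbf{h}(\mathbf{x}))^{\mathrm{T}}\mathbf{R}^{-1}(\mathbf{y} - \mathbf{h}(\mathbf{x}))$$
Each term is the square of the length of a vector.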
Error covariance matrices define the norm (these respect the uncertainty of xf and y and are important!):
• Pf: forecast (or background) error covariance matrix (n × n matrix). Sometimes called B.
• R: observation error covariance matrix (p × p matrix).
This cost function:
• can be derived from Bayes’ Theorem by assuming forecast and obs errors obey Gaussian stats,
• has argument x (think of it as a control variable),
• may be extended to include a fit to other unknowns in the system (e.g. the fact that h is imperfect), including model parameters.
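$$J(\mathbf{x}, \mathbf{p}, \boldsymbol{\eta}) = \frac{1}{2}(\mathbf{x} - \mathbf{x}^{\mathrm{f}})^{\mathrm{T}}(\mathbf{P}^{\mathrm{f}})^{-1}(\mathbf{x} - \mathbf{x}^{\mathrm{f}}) + \frac{1}{2}(\mathbf{y} - [\mathbf{h}(\mathbf{x}, \mathbf{p}) + \boldsymbol{\eta}])^{\mathrm{T}}\mathbf{R}^{-1}(\mathbf{y} - [\mathbf{h}(\mathbf{x}, \mathbf{p}) + \boldsymbol{\eta}]) + \frac{1}{2}\boldsymbol{\eta}^{\mathrm{T}}\mathbf{Q}^{-1}\boldsymbol{\eta}$$
Here p are the model parameters and η is the error in h, with error covariance matrix Q.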
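A short worked step may help connect the cost function to the update equations used by Kalman-type methods. The sketch below applies to the basic J(x) (without the extra terms). It assumes h is differentiable with Jacobian H = ∂h/∂x and, for the closed-form solution, that h is linear, h(x) = Hx. The symbol K for the gain matrix is introduced here for illustration and does not appear on the slide.

```latex
\begin{align}
\nabla_{\mathbf{x}} J &= (\mathbf{P}^{\mathrm{f}})^{-1}(\mathbf{x} - \mathbf{x}^{\mathrm{f}})
  - \mathbf{H}^{\mathrm{T}}\mathbf{R}^{-1}(\mathbf{y} - \mathbf{h}(\mathbf{x})) \\
% Linear h: set the gradient to zero at x = x^a and collect terms in (x^a - x^f)
\left[(\mathbf{P}^{\mathrm{f}})^{-1} + \mathbf{H}^{\mathrm{T}}\mathbf{R}^{-1}\mathbf{H}\right]
  (\mathbf{x}^{\mathrm{a}} - \mathbf{x}^{\mathrm{f}})
  &= \mathbf{H}^{\mathrm{T}}\mathbf{R}^{-1}(\mathbf{y} - \mathbf{H}\mathbf{x}^{\mathrm{f}}) \\
\mathbf{x}^{\mathrm{a}} &= \mathbf{x}^{\mathrm{f}} + \mathbf{K}(\mathbf{y} - \mathbf{H}\mathbf{x}^{\mathrm{f}}), \\
\mathbf{K} &= \left[(\mathbf{P}^{\mathrm{f}})^{-1} + \mathbf{H}^{\mathrm{T}}\mathbf{R}^{-1}\mathbf{H}\right]^{-1}
  \mathbf{H}^{\mathrm{T}}\mathbf{R}^{-1}
  = \mathbf{P}^{\mathrm{f}}\mathbf{H}^{\mathrm{T}}
  (\mathbf{R} + \mathbf{H}\mathbf{P}^{\mathrm{f}}\mathbf{H}^{\mathrm{T}})^{-1}
\end{align}
```

The last equality is the Sherman-Morrison-Woodbury identity. For non-linear h there is in general no closed form, and J is minimized iteratively using the gradient in the first line.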
Errors everywhere
Random errors:
• background (a-priori) errors
• observation errors
• model errors
• representivity errors
Systematic errors:
• biases in background
• biases in observations
• biases in model
All significant sources of uncertainty should be accounted for in data assimilation
Example 1 – repeated observations of air temperature
[Figure: spread of repeated temperature observations, y, about the truth for an unbiased thermometer and for a biased thermometer.]
Example 2 – representivity errors due to model grid
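Example 1 can be mimicked with simulated observations. The sketch below assumes Gaussian random errors and a constant bias; the truth, error standard deviation, bias, and sample size are all hypothetical. It shows that averaging many observations reduces the random error but leaves the systematic error untouched.

```python
import random

def mean(values):
    return sum(values) / len(values)

rng = random.Random(0)   # fixed seed so the run is repeatable
truth = 15.0             # hypothetical true air temperature (deg C)
sigma = 0.5              # hypothetical random error std dev (deg C)
bias = 0.8               # hypothetical systematic error of the biased thermometer (deg C)
n_obs = 10000

# Repeated observations: truth + random error (+ bias for the biased thermometer).
unbiased = [truth + rng.gauss(0.0, sigma) for _ in range(n_obs)]
biased = [truth + bias + rng.gauss(0.0, sigma) for _ in range(n_obs)]

print("mean error, unbiased thermometer:", mean(unbiased) - truth)
print("mean error, biased thermometer:  ", mean(biased) - truth)
```

The first mean error shrinks towards zero as the number of observations grows, while the second stays close to the bias. This is why biases need separate treatment in data assimilation.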
Leading methods of solving the DA problem
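Variational-type approach: find the xa that minimizes J(x),
$$J(\mathbf{x}) = \frac{1}{2}(\mathbf{x} - \mathbf{x}^{\mathrm{f}})^{\mathrm{T}}(\mathbf{P}^{\mathrm{f}})^{-1}(\mathbf{x} - \mathbf{x}^{\mathrm{f}}) + \frac{1}{2}(\mathbf{y} - \mathbf{h}(\mathbf{x}))^{\mathrm{T}}\mathbf{R}^{-1}(\mathbf{y} - \mathbf{h}(\mathbf{x}))$$
Kalman filter-type approach (linear obs operator, Ht xt = ht(xt)):
$$\mathbf{x}^{\mathrm{a}}_t = \mathbf{x}^{\mathrm{f}}_t + \mathbf{P}^{\mathrm{f}}_t\mathbf{H}_t^{\mathrm{T}}(\mathbf{R}_t + \mathbf{H}_t\mathbf{P}^{\mathrm{f}}_t\mathbf{H}_t^{\mathrm{T}})^{-1}(\mathbf{y}_t - \mathbf{H}_t\mathbf{x}^{\mathrm{f}}_t)$$
← analysis update at time t
$$\mathbf{P}^{\mathrm{a}}_t = [\mathbf{I} - \mathbf{P}^{\mathrm{f}}_t\mathbf{H}_t^{\mathrm{T}}(\mathbf{R}_t + \mathbf{H}_t\mathbf{P}^{\mathrm{f}}_t\mathbf{H}_t^{\mathrm{T}})^{-1}\mathbf{H}_t]\,\mathbf{P}^{\mathrm{f}}_t$$
← analysis error covariance
$$\mathbf{x}^{\mathrm{f}}_{t+1} = \mathbf{M}_{t \to t+1}\,\mathbf{x}^{\mathrm{a}}_t$$
← forecast (M is the linear forecast model)
$$\mathbf{P}^{\mathrm{f}}_{t+1} = \mathbf{M}_{t \to t+1}\,\mathbf{P}^{\mathrm{a}}_t\,\mathbf{M}_{t \to t+1}^{\mathrm{T}} + \mathbf{Q}_t$$
← forecast error covariance (Q is the model error covariance matrix)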
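For a single state variable the Kalman filter equations reduce to scalar arithmetic, which makes the cycle easy to follow. The sketch below is a scalar illustration only (n = p = 1), with hypothetical function names and numbers; every matrix in the equations above becomes a plain number.

```python
def kf_analysis(x_f, P_f, y, R, H=1.0):
    """Scalar Kalman filter analysis step.

    Returns the analysis x_a and the analysis error variance P_a.
    """
    K = P_f * H / (R + H * P_f * H)   # gain: P_f H^T (R + H P_f H^T)^-1
    x_a = x_f + K * (y - H * x_f)     # analysis update
    P_a = (1.0 - K * H) * P_f         # analysis error variance
    return x_a, P_a

def kf_forecast(x_a, P_a, M, Q):
    """Scalar Kalman filter forecast step with linear model M and model error variance Q."""
    x_f = M * x_a
    P_f = M * P_a * M + Q
    return x_f, P_f

# Hypothetical example: forecast 10 (variance 1), observation 14 (variance 9).
x_a, P_a = kf_analysis(x_f=10.0, P_f=1.0, y=14.0, R=9.0)
print(x_a, P_a)

# Propagate to the next observation time with a hypothetical linear model.
x_f, P_f = kf_forecast(x_a, P_a, M=0.5, Q=0.1)
print(x_f, P_f)
```

With H = 1 the analysis step is the same inverse-variance weighting as in the prototype problem of combining two imperfect estimates.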
Leading methods of solving the DA problem
Ensemble Kalman filter-type approach
Have N ensemble members (index i, 1 ≤ i ≤ N). Differences between them represent uncertainty.
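Approximate the forecast error covariance matrix with an ensemble to make the Kalman update equation manageable for n << p:
$$\mathbf{P}^{\mathrm{f}}_t \approx \frac{1}{N-1}\sum_{i=1}^{N}(\mathbf{x}^{\mathrm{f}}_{i,t} - \bar{\mathbf{x}}^{\mathrm{f}}_t)(\mathbf{x}^{\mathrm{f}}_{i,t} - \bar{\mathbf{x}}^{\mathrm{f}}_t)^{\mathrm{T}} \quad (n \times n)$$
$$\mathbf{x}^{\mathrm{a}}_{i,t} = \mathbf{x}^{\mathrm{f}}_{i,t} + \mathbf{P}^{\mathrm{f}}_t\mathbf{H}_t^{\mathrm{T}}(\mathbf{R}_t + \mathbf{H}_t\mathbf{P}^{\mathrm{f}}_t\mathbf{H}_t^{\mathrm{T}})^{-1}(\mathbf{y}_t - \mathbf{H}_t\mathbf{x}^{\mathrm{f}}_{i,t})$$
$$\mathbf{P}^{\mathrm{f}}_t\mathbf{H}_t^{\mathrm{T}} \approx \frac{1}{N-1}\sum_{i=1}^{N}(\mathbf{x}^{\mathrm{f}}_{i,t} - \bar{\mathbf{x}}^{\mathrm{f}}_t)(\mathbf{x}^{\mathrm{f}}_{i,t} - \bar{\mathbf{x}}^{\mathrm{f}}_t)^{\mathrm{T}}\mathbf{H}_t^{\mathrm{T}} = \frac{1}{N-1}\sum_{i=1}^{N}(\mathbf{x}^{\mathrm{f}}_{i,t} - \bar{\mathbf{x}}^{\mathrm{f}}_t)[\mathbf{H}_t(\mathbf{x}^{\mathrm{f}}_{i,t} - \bar{\mathbf{x}}^{\mathrm{f}}_t)]^{\mathrm{T}} \quad (n \times p)$$
$$\mathbf{H}_t\mathbf{P}^{\mathrm{f}}_t\mathbf{H}_t^{\mathrm{T}} \approx \frac{1}{N-1}\sum_{i=1}^{N}\mathbf{H}_t(\mathbf{x}^{\mathrm{f}}_{i,t} - \bar{\mathbf{x}}^{\mathrm{f}}_t)[\mathbf{H}_t(\mathbf{x}^{\mathrm{f}}_{i,t} - \bar{\mathbf{x}}^{\mathrm{f}}_t)]^{\mathrm{T}} \quad (p \times p)$$
Here the overbar denotes the ensemble mean.
Let
$$\delta\mathbf{y}_{i,t} = \mathbf{H}_t(\mathbf{x}^{\mathrm{f}}_{i,t} - \bar{\mathbf{x}}^{\mathrm{f}}_t)$$
$$\mathbf{d}_{i,t} = \mathbf{y}_t - \mathbf{H}_t\mathbf{x}^{\mathrm{f}}_{i,t}$$
$$\mathbf{S}_t = \mathbf{H}_t\mathbf{P}^{\mathrm{f}}_t\mathbf{H}_t^{\mathrm{T}} + \mathbf{R}_t \approx \frac{1}{N-1}\sum_{i=1}^{N}\delta\mathbf{y}_{i,t}\,\delta\mathbf{y}_{i,t}^{\mathrm{T}} + \mathbf{R}_t$$
Then
$$\mathbf{x}^{\mathrm{a}}_{i,t} = \mathbf{x}^{\mathrm{f}}_{i,t} + \frac{1}{N-1}\sum_{j=1}^{N}(\mathbf{x}^{\mathrm{f}}_{j,t} - \bar{\mathbf{x}}^{\mathrm{f}}_t)\,\delta\mathbf{y}_{j,t}^{\mathrm{T}}\,\mathbf{S}_t^{-1}\,\mathbf{d}_{i,t}$$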
A superposition of ensemble members
But beware ...
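The ensemble update can be sketched without any matrix library if there is a single observation (p = 1), because S is then a scalar. The code below is an illustration under that assumption. The function name `enkf_update`, the three-member ensemble, and the numbers are hypothetical, and for simplicity every member is updated with the same observation value. Practical schemes typically perturb the observations or use a square-root formulation to keep the analysis spread consistent.

```python
def enkf_update(ensemble, y, R, H):
    """Ensemble Kalman update for a single observation (p = 1).

    ensemble : list of N state vectors, each a list of n floats
    y        : the observation value
    R        : observation error variance
    H        : linear observation operator, a list of n floats (one row)
    """
    N, n = len(ensemble), len(ensemble[0])
    mean = [sum(member[j] for member in ensemble) / N for j in range(n)]
    perts = [[member[j] - mean[j] for j in range(n)] for member in ensemble]
    # Perturbations in observation space: H (x_i - mean)
    dy = [sum(H[j] * pert[j] for j in range(n)) for pert in perts]
    # Ensemble estimates of H P H^T (scalar) and P H^T (n elements)
    HPHt = sum(v * v for v in dy) / (N - 1)
    PHt = [sum(perts[i][j] * dy[i] for i in range(N)) / (N - 1) for j in range(n)]
    S = HPHt + R
    analysis = []
    for member in ensemble:
        d = y - sum(H[j] * member[j] for j in range(n))   # innovation for this member
        analysis.append([member[j] + PHt[j] * d / S for j in range(n)])
    return analysis

# Hypothetical example: n = 2, N = 3, only the first state element is observed.
ensemble = [[9.0, 1.0], [10.0, 2.0], [11.0, 3.0]]
analysis = enkf_update(ensemble, y=14.0, R=9.0, H=[1.0, 0.0])
for member in analysis:
    print(member)
```

The second state element is not observed, yet it is still updated, because the ensemble implies a covariance between the two elements. With only N = 3 members that covariance is a very rough estimate, which is one reason to beware.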
Leading methods of solving the DA problem
Each method is listed with its description, pros and cons.

A. Data insertion
Description: Set grid points to observation values.
Pros:
1. Easy to do
Cons:
1. No respect of uncertainty
2. What about observation voids?
3. Can’t deal with indirect observations

B. Variational data assimilation
Description: Minimize a cost function. Many flavours: 3D, 4D, weak/strong constraint.
Pros:
1. Respect of data uncertainty
2. Direct and indirect observations
3. Pf gives smooth and balanced fields
4. Efficient
5. Can deal with (weakly) non-linear h
Cons:
1. Pf is difficult to know, often static and suboptimal
2. High development costs
3. h: need tangent linear, H, and adjoint, HT
4. Gaussian pdf

C. Kalman filtering
Description: Evaluate KF equations.
Pros:
1. As B.1, B.2, B.3
2. Pf adapts with the state
Cons:
1. As B.3, B.4
2. Difficult to use with non-linear h
3. Prohibitively expensive for large n

D. Ensemble Kalman filtering
Description: Approximate KF equations with ensemble of N model runs. Many flavours.
Pros:
1. As B.1, B.2, B.4, B.5, C.2
2. h: do not need H and HT
3. Have measure of analysis spread
Cons:
1. As B.4
2. Serious sampling issues when N << n
3. Need ensemble inflation and localization schemes to overcome D.2

E. Hybrid
Description: Cross between C/D.
Pros:
1. As B.1, B.2, B.3, B.4, B.5, C.2
Cons:
1. As D.2

F. Particle filter
Description: Assign weights to ensemble members to represent any pdf.
Pros:
1. As B.1, B.2
2. Can deal with non-linear h
3. Can deal with non-Gaussian pdf
4. Have measure of analysis spread
Cons:
1. As D.2
2. Inefficient – members often become redundant
3. Need special techniques to overcome F.2
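The weighting step of a particle filter (method F) and the redundancy problem (F.2) can be illustrated with a few lines of Python. This sketch assumes a scalar state and a Gaussian observation likelihood purely for simplicity; the method itself does not require Gaussianity. The function names, the identity default for h, and the numbers are hypothetical. The effective sample size 1/Σw² is a common diagnostic that is not mentioned on the slide.

```python
import math

def particle_weights(particles, y, sigma_o, h=lambda x: x):
    """Normalized weights proportional to the likelihood of y given each particle.

    Assumes (for illustration) a Gaussian observation error with std dev sigma_o.
    """
    likelihoods = [math.exp(-0.5 * ((y - h(x)) / sigma_o) ** 2) for x in particles]
    total = sum(likelihoods)
    if total == 0.0:
        raise ValueError("all particles have zero likelihood (numerical underflow)")
    return [like / total for like in likelihoods]

def effective_sample_size(weights):
    """Rough count of particles that carry non-negligible weight."""
    return 1.0 / sum(w * w for w in weights)

# Hypothetical example: four particles, one of them far from the observation.
particles = [9.0, 10.0, 11.0, 20.0]
weights = particle_weights(particles, y=10.5, sigma_o=1.0)
print(weights)
print(effective_sample_size(weights))
```

The particle at 20 receives a negligible weight and is effectively redundant. In high-dimensional systems most particles tend to end up like this, which is why special techniques are needed.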
Mathematics required
• Vector representation of fields
• Matrix algebra
• Linear vector spaces
• Matrix inversion
• Vector derivative
• Generalized chain rule
• Jacobians
• Eigenvectors/eigenvalues
• Singular vectors/values
• Variances, covariances, correlations
• Matrix rank
• Lagrange multipliers
www.met.reading.ac.uk/~ross/MTMD02/MathTools.pdf
Summary of basic principles
• DA is concerned with estimating the state of a system given:
  • observations (direct [e.g. in-situ] and indirect [e.g. remotely sensed]),
  • forecast models (to provide a-priori data, given too-few obs),
  • observation operators (to connect model state with obs).
• All data have uncertainties, which must be quantified.
  • DA estimates are sensitive to uncertainty characteristics, which are often poorly known.
  • Many observations and models have systematic as well as random errors.
  • DA should take into account all sources of error in the system.
• DA theory is suited mostly to errors that are Gaussian distributed.
  • Most errors are non-Gaussian, and non-linearity is synonymous with non-Gaussianity.
• DA problems are computationally expensive and require intensive development effort.
Some subtleties and caveats of DA
• DA estimates are not the ‘truth’ and can be problematic for some kinds of analyses:
  • A good fit to observations does not guarantee that the analysis is correct! E.g. if the h-operator has inadequacies that are not accounted for, or if the error covariance matrices are poor.
  • Unobserved parts of the system may be poor. E.g. in meteorology, horizontal winds may be constrained well by obs, but the implied vertical wind may be poor.
• Assimilated fields may be subject to other constraints:
  • E.g. certain balance constraints.
• Be careful with error covariance matrices:
  • Pf and R need to be tuned for variational DA; Pf is subject to sampling problems for ensemble DA.
• DA systems should be well tested before using real data:
  • Test h-operators (forecast models and obs. operators): which parts of x is ymo sensitive to?
  • Run adjoint tests of H and HT if using variational data assimilation.
  • Test the DA system with simulated obs. from a made-up truth (identical twin experiments).
  • For assimilation of real data, validate the analysis against independent obs. if possible.
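One common form of adjoint test checks the identity ⟨Hx, y⟩ = ⟨x, HTy⟩ for arbitrary vectors x and y. The sketch below uses a small explicit matrix so that it stays self-contained. In a real system H and HT are separate pieces of code (tangent linear and adjoint), and that separation is exactly why the test is useful. All names and numbers here are hypothetical.

```python
def apply_H(H, x):
    """Tangent linear operator: returns H x (p elements)."""
    return [sum(H[i][j] * x[j] for j in range(len(x))) for i in range(len(H))]

def apply_HT(H, y):
    """Adjoint operator: returns H^T y (n elements)."""
    return [sum(H[i][j] * y[i] for i in range(len(H))) for j in range(len(H[0]))]

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def adjoint_test(H, x, y):
    """Return both sides of <H x, y> = <x, H^T y>; they should agree to rounding error."""
    return dot(apply_H(H, x), y), dot(x, apply_HT(H, y))

# Hypothetical example with n = 3 state elements and p = 2 observations.
H = [[1.0, 2.0, 0.0],
     [0.0, 1.0, 3.0]]
x = [1.0, -2.0, 0.5]
y = [2.0, 4.0]
lhs, rhs = adjoint_test(H, x, y)
print(lhs, rhs)
```

If the two numbers differ by much more than rounding error, the adjoint code does not match the tangent linear code, and the gradient of the cost function would be wrong.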