NCAR Summer Colloquium 2003. Dee: Practice of Quality Control
Practice of Quality Control
Dick Dee
Global Modeling and Assimilation Office
NASA Goddard Space Flight Center
NCAR Summer Colloquium 2003
Outline
• Motivation
• QC procedures
• The background check
• The buddy check
• An adaptive buddy check algorithm
• The Bayesian framework
• Variational quality control
• Summary
QC Example 1: Rotated earth scenario
QC Example 2: Strange sat winds
QC Example 3: French Christmas Storm No. 2
Quality Control Procedures
• At the instrument site:
– E.g. radiation correction for rawinsonde temperatures
• During the retrieval process:
– E.g. cloud-track wind height assignment
• As part of preprocessing at the DAS site:
– E.g. aircraft wind checks
– E.g. hydrostatic checks for rawinsonde temperatures
• During the assimilation: Statistical quality control
• Background reading:
Some of the early papers in numerical weather map analysis: Bergthórsson and Döös 1955; Bedient and Cressman 1957
More recent papers with a good general discussion of QC: Lorenc and Hammon 1988; Collins and Gandin 1990
Statistical Quality Control
• Since this takes place late in the data assimilation process, a lot of information is at hand:
– Observations from various instruments
– A short-term forecast valid at the time of the observations
– Some information about expected errors
• Basic idea: check if each observed value is reasonable in view of all other available information
• Danger: rejecting good data / including bad data
This is clearly a problem in probability theory.
Background check
• Bergthórsson and Döös 1955; Bedient and Cressman 1957
• Compare each observation against its prediction based on first-guess fields (e.g. interpolated background)
• Flag or reject the observation if the difference is large (but what is large?)
Example: rawinsonde observed-minus-forecast temperature residuals
The background check as a hypothesis test
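Definitions: observations $y$
background $x^b$
data residuals $v = y - h(x^b)$
In terms of errors: with $y = h(x) + \varepsilon^o$ and $x^b = x + \varepsilon^b$ for the true state $x$, $v \approx \varepsilon^o - H\varepsilon^b$, where $H$ is the linearization of $h$ about $x^b$
Assumptions: errors for ‘good’ data $\varepsilon^o \sim N(0, R)$
background errors $\varepsilon^b \sim N(0, B)$, independent of $\varepsilon^o$
Therefore $v \sim N(0, HBH^T + R)$ in the absence of gross errors.
For each single residual, the null hypothesis is $H_0: v_i \sim N(0, \sigma_i^2)$, with $\sigma_i^2 = (HBH^T + R)_{ii}$
Reject the hypothesis if $|v_i| > \tau\sigma_i$ for some fixed tolerance $\tau$
Probability of false rejection: $P(|v_i| > \tau\sigma_i \mid H_0) = \operatorname{erfc}(\tau/\sqrt{2})$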
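The background check above can be sketched in a few lines of Python. This is a minimal sketch, not an operational implementation: it assumes each observation has its own scalar observation and background error standard deviations (`sigma_o`, `sigma_b`, both hypothetical names), so that under the null hypothesis a residual is Gaussian with variance `sigma_o**2 + sigma_b**2`. The function names and the sample temperatures are invented for illustration.

```python
import math
import numpy as np

def background_check(y, hxb, sigma_o, sigma_b, tau=3.0):
    """Flag observations whose residual y - h(x^b) exceeds tau expected std devs."""
    v = np.asarray(y, dtype=float) - np.asarray(hxb, dtype=float)
    # Expected residual spread when no gross error is present (assumed uncorrelated).
    sigma_v = np.sqrt(np.asarray(sigma_o, dtype=float) ** 2
                      + np.asarray(sigma_b, dtype=float) ** 2)
    return np.abs(v) > tau * sigma_v

def false_rejection_probability(tau):
    """P(|v| > tau * sigma_v) for a Gaussian residual, i.e. under the null hypothesis."""
    return math.erfc(tau / math.sqrt(2.0))

# Illustrative values only: observed and background temperatures in kelvin.
y = np.array([271.2, 268.9, 290.0])
hxb = np.array([270.5, 269.4, 270.1])
suspect = background_check(y, hxb, sigma_o=1.0, sigma_b=1.5, tau=3.0)
print(suspect, false_rejection_probability(3.0))
```

The tolerance answers the question "what is large?" only relative to the assumed error statistics: a smaller `tau` rejects more good data, a larger one lets more bad data through.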
Traditional buddy check
• Identify a suspect observation (e.g. using a background check)
• Define a set of buddies (e.g. based on distance, data type)
• Predict the suspect from the buddies (e.g. using local OI)
• Reject the suspect observation if it is too far from the predicted value (based on error statistics)
• See: Lorenc 1981
The buddy check as a hypothesis test
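Null hypothesis $H_0$: $v \sim N(0, S)$, with $S$ the covariance of the data residuals $v$ ($S = HBH^T + R$ for background and observation error covariances $B$ and $R$)
Divide into suspects and buddies: $v = (v_s, v_b)$, with $S$ partitioned into blocks $S_{ss}$, $S_{sb}$, $S_{bs}$, $S_{bb}$ (the subscript $b$ denotes buddies here, not background)
Given $H_0$, the conditional pdf of the suspects given the buddies is $p(v_s \mid v_b) = N(\hat{v}_s, \hat{S})$
where $\hat{v}_s = S_{sb} S_{bb}^{-1} v_b$ and $\hat{S} = S_{ss} - S_{sb} S_{bb}^{-1} S_{bs}$
Let $\hat{\sigma}_i^2 = \hat{S}_{ii}$
Reject the null hypothesis if $|v_i - \hat{v}_i| > \tau\hat{\sigma}_i$ for some fixed tolerance $\tau$
The choice of $\tau$ determines the significance level $\delta$ of the test, which bounds the probability of false rejection of the null hypothesis: $P(|v_i - \hat{v}_i| > \tau\hat{\sigma}_i \mid H_0) \le \delta$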
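A sketch of this test in Python, under stated assumptions: the residual covariance `S` is taken to be a Gaussian-shaped background covariance plus uncorrelated observation error on a one-dimensional line of stations, suspects are picked by a simple background check, and each suspect is predicted from the non-suspect buddies with the standard conditional-Gaussian formulas. The function name, the covariance model and all numbers are illustrative, not taken from the slides.

```python
import numpy as np

def buddy_check(v, S, suspect, tau=3.0):
    """Reject suspects that lie too far from their prediction by the buddies."""
    v = np.asarray(v, dtype=float)
    s = np.flatnonzero(suspect)       # indices of suspect residuals
    b = np.flatnonzero(~suspect)      # indices of buddies (all non-suspects)
    S_sb = S[np.ix_(s, b)]
    # gain = S_sb S_bb^{-1}, computed with a solve instead of an explicit inverse.
    gain = np.linalg.solve(S[np.ix_(b, b)], S_sb.T).T
    v_hat = gain @ v[b]                                   # predicted suspects
    sigma_hat = np.sqrt(np.diag(S[np.ix_(s, s)] - gain @ S_sb.T))
    reject = np.zeros(v.shape, dtype=bool)
    reject[s] = np.abs(v[s] - v_hat) > tau * sigma_hat
    return reject

# Five stations on a line; background variance 4, length scale 2, obs variance 1.
pos = np.arange(5.0)
dist = pos[:, None] - pos[None, :]
S = 4.0 * np.exp(-0.5 * (dist / 2.0) ** 2) + 1.0 * np.eye(5)

v = np.array([4.0, 5.0, 4.2, 0.3, -6.0])            # made-up residuals
suspect = np.abs(v) > 2.0 * np.sqrt(np.diag(S))      # background check, tau = 2
reject = buddy_check(v, S, suspect, tau=2.0)
print(suspect, reject)
```

In this example the residual of 5.0 is flagged by the background check but kept, because its neighbours show a similar departure from the background; the residual of -6.0 has no such support and is rejected.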
Illustration of the buddy check
An adaptive buddy check algorithm
Loop:
– identify suspects
– predict suspects from buddies
– compute prediction error covariances
– test the null hypothesis
– adjust the error estimates
End loop
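The loop can be sketched as follows. This is only a sketch under loud assumptions: the adjustment of the error estimates is replaced by a single global scale factor `alpha` (observed over expected variance of the buddy residuals), which is a simplified stand-in and not the local adjustment of Dee et al. (2001); the adjustment is applied before the test within each pass; and the loop stops when the suspect list no longer changes. Function and parameter names are hypothetical.

```python
import numpy as np

def adaptive_buddy_check(v, S, tau_bg=2.0, tau_buddy=2.0, max_iter=10):
    """Iterative buddy check; returns a mask of observations that stay rejected."""
    v = np.asarray(v, dtype=float)
    suspect = np.abs(v) > tau_bg * np.sqrt(np.diag(S))   # initial background check
    for _ in range(max_iter):
        s, b = np.flatnonzero(suspect), np.flatnonzero(~suspect)
        if s.size == 0 or b.size == 0:
            break                      # nothing to test, or no buddies available
        S_bs = S[np.ix_(b, s)]
        gain = np.linalg.solve(S[np.ix_(b, b)], S_bs).T   # S_sb S_bb^{-1}
        v_hat = gain @ v[b]                               # predict suspects
        var_hat = np.diag(S[np.ix_(s, s)] - gain @ S_bs)  # prediction error variances
        # Assumed adjustment: rescale by how active the buddies actually are.
        alpha = np.mean(v[b] ** 2) / np.mean(np.diag(S)[b])
        fails = np.abs(v[s] - v_hat) > tau_buddy * np.sqrt(alpha * var_hat)
        new_suspect = np.zeros_like(suspect)
        new_suspect[s[fails]] = True   # suspects that pass become buddies
        if np.array_equal(new_suspect, suspect):
            break
        suspect = new_suspect
    return suspect

pos = np.arange(5.0)
dist = pos[:, None] - pos[None, :]
S = 4.0 * np.exp(-0.5 * (dist / 2.0) ** 2) + 1.0 * np.eye(5)   # illustrative covariance
v = np.array([4.0, 5.0, 4.2, 0.3, -8.0])                       # made-up residuals
rejected = adaptive_buddy_check(v, S)
print(rejected)
```

Because re-accepted observations join the buddy set, later passes test the remaining suspects against more data. With no buddies at all the result is simply that of the background check.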
Illustration with fixed tolerances
[Figure legend: true range (μ ± 2σ); expected range; suspect observations; predicted suspects; rejected observations; acceptable discrepancy]
Illustration with adaptive tolerances
[Figure legend: adjusted range]
Illustration with real data
[Figure panels: fixed tolerances; adaptive tolerances]
Some remarks on the adaptive buddy check
• Very little dependence on prescribed error statistics in densely observed regions, but reverts to a simple background check for isolated observations
• Cheap and simple to implement, although parallel implementation takes some care
• Not effective for detecting systematic gross errors (coherent batches of bad data)
• Does not incorporate prior information about instrument reliability, but that can be done, following Lorenc and Hammon (1988)
• The analysis is not a smooth function of the observations
• Quality control and analysis are treated as separate steps in the assimilation process
The Bayesian framework (1)
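We can formulate the analysis problem in terms of conditional probabilities: $p(x \mid y) = p(y \mid x)\,p(x) / p(y)$
For example, our earlier Gaussian error models: $\varepsilon^o = y - h(x) \sim N(0, R)$ and $\varepsilon^b = x^b - x \sim N(0, B)$
can also be written as $p(y \mid x) = N(h(x), R)$ and $p(x) = N(x^b, B)$
See: Lorenc 1986, Cohn 1997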
Example: Gaussian distributions
Lorenc and Hammon (1988)
The Bayesian framework (2)
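The Bayesian framework is not restricted to Gaussian distributions and/or linear operators.
Actually we’d be happy with just the mode of the conditional pdf: $x^a = \arg\max_x p(x \mid y) = \arg\min_x J(x)$, where $J(x) = -\log p(x \mid y)$
This represents the most likely state in view of the available information.
For Gaussian distributions, $J(x) = \frac{1}{2}(x - x^b)^T B^{-1}(x - x^b) + \frac{1}{2}(y - h(x))^T R^{-1}(y - h(x)) + \text{const}$
When $h(x)$ is linear, $h(x) = Hx$, $J(x)$ is quadratic and the solution is $x^a = x^b + K(y - Hx^b)$
with $K = BH^T(HBH^T + R)^{-1}$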
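A small numerical check of the linear Gaussian case, as a sketch under the usual assumptions (Gaussian background and observation errors with covariances `B` and `R`, linear observation operator `H`). The analysis is computed with the standard gain $K = BH^T(HBH^T + R)^{-1}$, and the gradient of the quadratic cost function is evaluated at the result to confirm that it vanishes there. The two-variable state and all numbers are invented for illustration.

```python
import numpy as np

def analysis(xb, y, H, B, R):
    """Mode of p(x|y) for Gaussian errors and a linear observation operator."""
    S = H @ B @ H.T + R
    K = np.linalg.solve(S, H @ B).T    # K = B H^T S^{-1} (B and S are symmetric)
    return xb + K @ (y - H @ xb)

def grad_J(x, xb, y, H, B, R):
    """Gradient of J(x) = 1/2 (x-xb)^T B^-1 (x-xb) + 1/2 (y-Hx)^T R^-1 (y-Hx)."""
    return np.linalg.solve(B, x - xb) - H.T @ np.linalg.solve(R, y - H @ x)

xb = np.array([1.0, 2.0])                    # background state
B = np.array([[1.0, 0.5], [0.5, 1.0]])       # background error covariance
H = np.array([[1.0, 0.0]])                   # observe the first variable only
R = np.array([[1.0]])                        # observation error covariance
y = np.array([3.0])

xa = analysis(xb, y, H, B, R)
print(xa, grad_J(xa, xb, y, H, B, R))
```

The second state variable is not observed, yet it is updated through the background error correlation. That same correlation is what allows neighbouring observations to vouch for each other in a buddy check.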
Error models that account for bad data
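Generalize the observation error model to account for possible gross errors:
If $G$ is the event that a gross error occurred, then: $p(y \mid x) = p(y \mid x, G)\,P(G) + p(y \mid x, \bar{G})\,[1 - P(G)]$
and $p(y \mid x, \bar{G}) = N(h(x), R)$ as before, while $p(y \mid x, G)$ is a much broader density (flat over a finite range, for example)
This is no longer a Gaussian pdf, so the cost function is no longer quadratic and the variational problem becomes non-linear.
See: Purser 1984, Lorenc and Hammon 1988.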
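To make the shape of such an error model concrete, here is a minimal sketch of a scalar observation error density built as a mixture: a Gaussian for good data plus a flat density for gross errors. The prior gross error probability `p_gross` and the half-width of the flat part (`half_width`, in units of `sigma_o`) are assumed values chosen for illustration, and the flat form is one common choice rather than the only one.

```python
import numpy as np

def obs_error_pdf(eps, sigma_o=1.0, p_gross=0.05, half_width=10.0):
    """Mixture density: (1 - p_gross) * Gaussian + p_gross * flat on +/- half_width*sigma_o."""
    eps = np.asarray(eps, dtype=float)
    good = np.exp(-0.5 * (eps / sigma_o) ** 2) / (sigma_o * np.sqrt(2.0 * np.pi))
    width = 2.0 * half_width * sigma_o
    gross = np.where(np.abs(eps) <= half_width * sigma_o, 1.0 / width, 0.0)
    return (1.0 - p_gross) * good + p_gross * gross

def gaussian_pdf(eps, sigma_o=1.0):
    """Pure Gaussian density, for comparison."""
    return np.exp(-0.5 * (eps / sigma_o) ** 2) / (sigma_o * np.sqrt(2.0 * np.pi))

eps = np.linspace(-10.0, 10.0, 20001)
pdf = obs_error_pdf(eps)
# Trapezoidal rule: the mixture should still integrate to (almost exactly) one.
total = float(np.sum(0.5 * (pdf[1:] + pdf[:-1]) * np.diff(eps)))
tail_ratio = float(obs_error_pdf(5.0) / gaussian_pdf(5.0))
print(total, tail_ratio)
```

Near zero the mixture is almost indistinguishable from the Gaussian, but five standard deviations out it is orders of magnitude larger. An outlier is therefore penalized far less than a purely Gaussian model would penalize it.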
Example: Non-Gaussian observation errors
Lorenc and Hammon (1988)
Variational Quality Control at ECMWF (1)
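Minimize cost function $J(x) = -\log p(x) - \log p(y \mid x)$
After modification of $p(y \mid x)$ to account for gross errors we have instead $J^{QC}(x) = -\log p(x) - \log p^{QC}(y \mid x)$
Assuming independent Gaussian errors, the contribution of a single observation is
(cost) $J_o^{QC} = -\ln\left[\dfrac{\gamma + \exp(-J_o^N)}{\gamma + 1}\right]$
(gradient) $\nabla J_o^{QC} = \nabla J_o^N\,(1 - P)$
where $J_o^N = \frac{1}{2}\left[(y - h(x))/\sigma_o\right]^2$ is the cost without quality control, $\gamma = \dfrac{A\sqrt{2\pi}}{(1 - A)\,2d}$ and $P = \dfrac{\gamma}{\gamma + \exp(-J_o^N)}$, with $A$ the prior probability of gross error and gross errors assumed uniformly distributed over $\pm d\,\sigma_o$ (Andersson and Järvinen 1999)
It turns out that $P$ is the a posteriori probability of gross error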
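A sketch of the single-observation terms, following the formulation in Andersson and Järvinen (1999): a Gaussian for good data mixed with a flat density of half-width `d` standard deviations and prior gross error probability `A`. The values of `A` and `d` below are assumed for illustration, the function name is hypothetical, and the gradient is taken with respect to the model equivalent `hx` rather than the full state.

```python
import numpy as np

def varqc_terms(y, hx, sigma_o, A=0.05, d=5.0):
    """Return (cost, d cost / d hx, posterior probability of gross error)."""
    z = (y - hx) / sigma_o
    J_n = 0.5 * z ** 2                       # Gaussian cost without QC
    dJ_n = -z / sigma_o                      # its derivative with respect to hx
    gamma = A * np.sqrt(2.0 * np.pi) / ((1.0 - A) * 2.0 * d)
    P = gamma / (gamma + np.exp(-J_n))       # a posteriori probability of gross error
    J_qc = -np.log((gamma + np.exp(-J_n)) / (gamma + 1.0))
    return J_qc, dJ_n * (1.0 - P), P

# The weight (1 - P) given to an observation falls as its departure grows.
for departure in (0.0, 2.0, 4.0, 6.0):
    J_qc, grad, P = varqc_terms(departure, 0.0, 1.0)
    print(departure, float(J_qc), float(grad), float(P))
```

The gradient is the Gaussian gradient scaled by `1 - P`, so an observation with a large departure is gradually switched off rather than rejected outright. Its weight can recover in later iterations if the analysis moves toward it.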
Example: Impact of an observation in VarQC
Andersson and Järvinen (1999)
Some remarks on variational QC
• Strong dependence on prescribed error statistics
• Implementation for observations with correlated errors is much more complicated
• Not effective for detecting systematic gross errors (coherent batches of bad data)
• Incorporates prior information about instrument reliability
• In principle, the analysis is a smooth function of the observations, but in practice it is not, because the cost function can have multiple minima
• Quality control and analysis are done simultaneously: each can take advantage of iterative improvement during the optimization
• Requires a relatively strict background check to avoid convergence issues
Summary
Literature
• Andersson, E., and H. Järvinen, 1999: Variational quality control. Quart. J. Royal Meteor. Soc., 125, 697-722.
• Bedient, H. A., and G. P. Cressman, 1957: An experiment in automatic data processing. Mon. Wea. Rev., 85, 333-340.
• Bergthórsson, P., and B. R. Döös, 1955: Numerical weather map analysis. Tellus, 7, 329-340.
• Collins, W. G., 1998: Complex quality control of significant level rawinsonde temperatures. J. Atmos. Ocean. Tech., 15, 69-79.
• Collins, W. G., and L. S. Gandin, 1990: Comprehensive hydrostatic quality control at the National Meteorological Center. Mon. Wea. Rev., 118, 2752-2767.
• Dee, D. P., L. Rukhovets, R. Todling, A. M. da Silva, and J. W. Larson, 2001: An adaptive buddy check for observational quality control. Quart. J. Royal Meteor. Soc., 127, 2451-2471.
• Dharssi, I., A. C. Lorenc, and N. B. Ingleby, 1992: Treatment of gross errors using maximum probability theory. Quart. J. Royal Meteor. Soc., 118, 1017-1036.
• Gandin, L. S., 1988: Complex quality control of meteorological observations. Mon. Wea. Rev., 116, 1137-1156.
• Ingleby, N. B., and A. C. Lorenc, 1993: Bayesian quality control using multivariate normal distributions. Quart. J. Royal Meteor. Soc., 119, 1195-1225.
• Lorenc, A. C., 1981: A global three-dimensional multivariate statistical interpolation scheme. Mon. Wea. Rev., 109, 701-721.
• Lorenc, A. C., and O. Hammon, 1988: Objective quality control of observations using Bayesian methods: Theory, and a practical implementation. Quart. J. Royal Meteor. Soc., 114, 515-543.
• Purser, R. J., 1984: A new approach to the optimal assimilation of meteorological data by iterative Bayesian analysis. Proceedings of 10th Conf. on Weather Forecasting and Analysis, American Meteorological Society, Boston, 102-105.