Evolved and Random Perturbation Methods for Calculating Model Sensitivities and Covariances William J. Martin * Center for Analysis and Prediction of Storms, University of Oklahoma, Norman, Oklahoma Ming Xue Center for Analysis and Prediction of Storms and School of Meteorology, University of Oklahoma, Norman, Oklahoma Submitted to Monthly Weather Review December, 2006 Revised May, 2007 * Corresponding author address: Dr. William Martin Center for Analysis and Prediction of Storms National Weather Center, Suite 2500 120 David L. Boren Blvd Norman, OK 73072. Phone: (405) 325-0402 E-mail: [email protected]


Evolved and Random Perturbation Methods for Calculating Model Sensitivities and Covariances

William J. Martin* Center for Analysis and Prediction of Storms, University of Oklahoma, Norman, Oklahoma

Ming Xue

Center for Analysis and Prediction of Storms and School of Meteorology, University of Oklahoma, Norman, Oklahoma

Submitted to Monthly Weather Review December, 2006

Revised May, 2007

*Corresponding author address: Dr. William Martin

Center for Analysis and Prediction of Storms National Weather Center, Suite 2500

120 David L. Boren Blvd Norman, OK 73072.

Phone: (405) 325-0402 E-mail: [email protected]


ABSTRACT

Different ways of perturbing the initial condition of an ensemble of forecasts for the purpose

of calculating sensitivity or covariance fields of model variables are examined. The three

methods considered are: random perturbations at each grid-point, smoothed random

perturbations, and perturbations that are evolved by the model through time from an earlier set of

perturbations. A very large ensemble of model runs using spatially discrete perturbations is also

compared for validation purposes. An ensemble size of 2000 members is used so as to reduce

the noise in sensitivity fields. Covariances found from the three methods are highly accurate and

nearly identical for any perturbation method. The calculation of sensitivity fields, however, is

more dependent on the perturbation method. For the cases of evolved or smoothed perturbations,

the spatial correlation of the perturbations leads to an inherent smoothing of the sensitivity fields.

Sensitivity structures of scales smaller than the perturbation correlation distance cannot be

found. This is a particular problem for the evolved perturbations in the boundary layer.

Furthermore, the spatial correlation of initial perturbations makes the calculation of sensitivity

values inaccurate unless the complicated problem of separating the combined effects of

correlated perturbations on the forecast is dealt with. Consequently, mathematically correct

sensitivity values are only found by using initial perturbation fields that are spatially completely

random.


1. Introduction

In recent years, ensemble forecasts have gained increasing importance. Small ensembles

of order 10 members are used to provide probabilistic forecast information and the forecast

uncertainty (Buizza 2000; Toth and Kalnay 1997); moderate-sized ensembles of order 10 to 100

members are being used with data assimilation for various forms of ensemble Kalman filtering

(EnKF) (Evensen 1994). Moderately sized to very large sized ensembles have also been used to

calculate sensitivities statistically in a manner analogous to the calculation of covariances in an

EnKF method (Beare et al. 2003; Hakim and Torn 2005; Martin and Xue 2006, 2007). Hamill et

al. (2003) have used ensembles with up to 1600 members for the approximate calculation of

singular vectors as part of an ensemble data assimilation system. In addition to its value in data

assimilation and forecasting, information derived from ensembles can potentially be used for the

targeting of observations (Bishop et al. 2001), for possible weather modification, and for

discovering physical connections between different processes in the atmospheric model.

The uncertainty in a forecast depends on the error in the analysis (for a perfect model). If

perturbations smaller than analysis errors are used (as is the case in this work), then the forecast

variance is expected to be smaller and not useful as an estimate of forecast skill, but covariance

and sensitivity values can still be meaningful by themselves. Indeed, the sensitivity gradients

found by an adjoint model are exactly those that would be found with infinitesimal perturbations.

In situations where the variance in the initial conditions of an ensemble is lower than analysis

errors, covariance inflation (Anderson and Anderson 1999) might be used in some cases to

empirically increase the covariance for some sets of perturbations for use in an EnKF (these are


then modified by the assimilation cycle of an EnKF). This is not the purpose of our current

study, however.

Initial perturbations are generally either random or Monte Carlo in form, where

perturbation fields are created using random numbers (Mullen and Baumhefner 1994); or are

physically based, where fields are derived from the model itself (singular vectors, Lyapunov

vectors, and bred vectors). Other methods include lagged average forecasting (Hoffman and

Kalnay 1983) in which an ensemble is formed from runs of the same model initialized at

different times; multimodel ensemble techniques (Krishnamurti et al. 2000), which form an

ensemble from different prediction models; and methods that involve perturbing the

observations, model parameters, and boundary conditions (Houtekamer et al. 1996a,b;

Houtekamer and Mitchell 1998).

Singular vectors correspond to the fastest growing error modes of a model (in a tangent-linear

framework), and their use as initial perturbations is thought to be optimal for the

generation of accurate forecast error covariances with the smallest possible ensemble size

(Ehrendorfer and Tribbia 1997). In order for the forecast ensemble to be unbiased, theoretically,

the selection of initial perturbations should also be unbiased, and therefore should be selected in

some appropriate manner from the distribution of possible analysis states (Ehrendorfer and

Tribbia 1997). A review of various methods of ensemble initialization, for the purpose of

ensemble forecasting, can be found in Houtekamer and Mitchell (1998).

Unlike most studies that deal with the optimal ensemble perturbations for an

ensemble forecast or for ensemble-based data assimilation, where the mean and ensemble spread

are the most important, we examine in this paper different ways of initializing an ensemble for

the purpose of accurately calculating the forecast sensitivities and covariances. Here the


sensitivity is defined as the gradient of a model output quantity, usually a scalar, with respect to

individual elements in the model input or initial condition state vector (Martin and Xue 2006;

2007). Since the sensitivity is a gradient field, the magnitude of the initial perturbations of the

ensemble does not matter so long as the perturbations are large enough for the machine

truncation error to be insignificant and the perturbations do not cause unrealizable physical

behavior. In this case, it is the ability of the initial perturbations to delineate the structural

details of the sensitivity fields that is important. In the case of covariance calculation, the

magnitude of the initial ensemble perturbations will affect the variance, i.e., the diagonal components of the

covariance matrix, but will affect the correlation structure to a much smaller extent, if at all (a

covariance matrix B can be written as $\mathbf{B} \equiv \mathbf{D}^{1/2}\mathbf{C}\mathbf{D}^{1/2}$, where C is the symmetric correlation

matrix and $\mathbf{D}^{1/2}$ is a diagonal matrix made up of the square roots of the variances) (Kalnay 2002).
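
The decomposition of a covariance matrix into variances and a correlation matrix can be illustrated with a short NumPy sketch. The three-variable, 500-member ensemble below is a made-up placeholder, not data from this study; the function name `split_covariance` is likewise hypothetical.

```python
import numpy as np

def split_covariance(B):
    """Return (D_half, C) such that B = D_half @ C @ D_half."""
    std = np.sqrt(np.diag(B))       # square roots of the variances
    D_half = np.diag(std)           # diagonal matrix D^(1/2)
    C = B / np.outer(std, std)      # symmetric correlation matrix, unit diagonal
    return D_half, C

rng = np.random.default_rng(0)
samples = rng.normal(size=(500, 3))     # placeholder ensemble: 500 members, 3 variables
B = np.cov(samples, rowvar=False)

D_half, C = split_covariance(B)
# Doubling the perturbation amplitude quadruples the variances,
# but the correlation matrix is unchanged.
_, C_scaled = split_covariance(np.cov(2.0 * samples, rowvar=False))
```

For a purely linear rescaling, as here, the correlation matrix is exactly invariant; in a nonlinear model the invariance would only be approximate.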

For the calculation of initial condition sensitivity fields, it would be ideal to perturb every

degree-of-freedom (DOF) of the model each in an independent model run, as was done by

Lorenz (1968) for a 28 parameter model. This is not currently practical because of the large

number of DOFs in a typical modern, atmospheric prediction model. Each field variable value at

each grid point constitutes a DOF. For the sample case considered here, there are six variables

(not counting microphysical variables) defined at each of 1 million grid points. The approach

taken by Martin and Xue (2006) was to reduce the number of DOFs to consider by defining

perturbations at a 2-D array of multiple grid-point patches. While highly accurate, this method

could only be applied to the problem of finding 2-D sensitivity fields, and still required over

2000 forward model runs. Martin and Xue (2007) approached this problem by perturbing every

DOF of the model simultaneously and randomly and then using statistics (linear regression

analysis) to determine the desired relations among variables. Adjoint models (Hall and Cacuci


1982, 1983; Errico 1992; 1997) have also been used for calculating initial condition sensitivity

fields. Such models, however, are difficult to implement and cannot estimate nonlinear

sensitivity; though once developed, they are computationally more economical.

By any perturbation method, the perturbed initial conditions lead to perturbed forecasts.

These perturbed forecasts can be related to the perturbed initial conditions in various ways, as

described in section 4 below, to produce forecast sensitivities (as either partial derivatives or

time-lagged covariances), as well as covariances between variables at the same time (Martin and

Xue 2007).

In this paper, several methods of perturbing the model initial conditions for the purpose

of calculating covariances and sensitivities are compared, including the random method of

Martin and Xue (2007, referred to as method “RAND”), the smoothed Gaussian method of Tong

and Xue (2006, referred to as method “SGAU”), and using perturbations evolved by the model

itself from a previous ensemble, similar to the bred vector method (referred to as method

“EVOL”). Because initial condition sensitivity fields are relatively noisy when calculated by

these methods, we employ ensembles of 2000 members.

2. The ARPS model and case used for study

a. The ARPS model

The prediction model used in this study is the Advanced Regional Prediction System

(ARPS, Xue et al. 2000, 2001, 2003). The configuration of the ARPS model used for this study

is the same as that in Martin and Xue (2006, 2007), which used a 9 km horizontal resolution grid

and 135×135×53 points centered over the IHOP (International H2O Project) study area in the

southern Great Plains of the United States (Weckwerth et al. 2004). The lateral boundaries were

forced by the 1800 UTC 24 May, 2002 NCEP ETA model forecast on the 40 km grid. Further


details of the model configuration are not repeated here; details of the model itself can be found

in the model description papers cited above.

A problem with using forced lateral boundary conditions with this study is that the lateral

boundary conditions are the same for all the perturbed model runs (unless a technique is

employed to perturb the boundaries, which we have not implemented). This means that

forecast field values near the boundaries will be largely determined by the boundary values

(rather than perturbations), with a consequent low variance in the forecast near the boundaries.

Consequently, statistical inferences near the boundaries may be inaccurate. This issue is

discussed further in Martin and Xue (2007). In that paper, it is shown that the forecast fields

after 6 hours of integration in the interior of the domain within 100 km to 200 km of the lateral

boundaries have little variance. Most of the interior of the domain, however, is not affected.

b. Case used for study

The case used here is the same as that from Martin and Xue (2006). This was a

convective initiation case from the 2002 IHOP field program, studied at high-resolution by Xue

and Martin (2006a,b). In this case, a dryline-cold front triple-point was in place in the eastern

Texas panhandle at 1800 UTC, 24 May, 2002. This triple-point moved southward during the

afternoon, reaching just southwest of the southwest corner of Oklahoma by 0000 UTC, 25 May.

The observed convection initiated south of the triple-point along the dryline by 2100 UTC, 24

May. In the ARPS 9 km resolution simulation, convection also occurred along the cold front in

northwest Oklahoma and southern Kansas, though this was not observed. Figure 1a shows the

three hour forecast of water vapor and wind 10 m above the ground, valid at 2100 UTC, 24 May,

and Fig. 1b shows the six hour forecast of total accumulated precipitation, together with surface

winds. For most of the analyses in this paper, perturbations were applied at 2100 UTC and the


ensemble integrated for 3 hours. This is to facilitate the comparison with evolved perturbations

from forecasts that started at 1800 UTC, a time when initial perturbations were examined in

Martin and Xue (2006, 2007). Because of the relatively coarse resolution used for the very large

ensembles and the absence of data assimilation cycles that can include high-resolution local data,

the model forecast is not as good as those of Xue and Martin (2006a,b). Getting an accurate

forecast is not the primary goal of this study, however, as long as the model prediction is

physically realistic.

3. Perturbation Methods

Four methods are used in this study for perturbing the initial conditions of an ensemble of

model runs. For convenience, these will be abbreviated RAND (random perturbations at

gridpoints), SGAU (smoothed Gaussian perturbations), EVOL (evolved perturbations), and VLE

(a very large ensemble of discrete perturbations as used in Martin and Xue 2006).

a. Random perturbations at model gridpoints (RAND)

This is the method used by Martin and Xue (2007). In this method, a uniformly

distributed, quasi-random number between 0 and 1 is calculated at each grid point and for each

variable that is to be perturbed. A perturbation that is a constant fraction of the unperturbed

variable is then added to the variable at each grid point. If the random number is greater than or

equal to 0.5, the perturbation is added, while if the value is less than 0.5, it is subtracted. These

perturbations are thus binary, with only two possibilities at each grid point. This is perhaps the

simplest method for perturbing a model and has the advantage of independently perturbing each

degree of freedom of the model. An alternative and even simpler method is to add or subtract a

fixed amount (rather than a percentage of the unperturbed variable value) to each variable at each

grid point depending on the random coin flip. This is problematic as a fixed magnitude


perturbation may be too large in some parts of the model domain and too small in others for

certain variables. For example, a perturbation of ±1 g kg-1 in water vapor is reasonable in the

planetary boundary layer, but is too large at upper levels where such perturbations can lead to

unrealistic behavior, such as the instant formation of super-saturated clouds. It is therefore

desirable to perturb model variables by an amount that is some small fraction of their normal

variance. Using perturbation amounts equal to a percentage of the unperturbed variable only

achieves this if the percentage magnitude is chosen from experience as the variance is not

directly reflected by the magnitude of the unperturbed variable. In our experience (Martin and

Xue 2007), perturbations of 2-10% seem to work well for water vapor, while 0.2%

seems to work well for potential temperature (in kelvins), in that these perturbations

were found not to be so small as to be lost by round-off error nor too large to lead to obviously

unrealistic nonlinearities.
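
The coin-flip perturbation described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the uniform placeholder fields, the grid ordering, and the function name `rand_perturbation` are hypothetical and are not taken from the ARPS code; the 5% value is one choice within the 2-10% range quoted above.

```python
import numpy as np

def rand_perturbation(field, fraction, rng):
    """Binary (coin-flip) perturbation: +/- fraction * field at every grid point."""
    coin = rng.random(field.shape)             # uniform random numbers in [0, 1)
    sign = np.where(coin >= 0.5, 1.0, -1.0)    # >= 0.5 adds, < 0.5 subtracts
    return sign * fraction * field

rng = np.random.default_rng(42)
qv = np.full((53, 135, 135), 8.0)        # placeholder water vapor field (g/kg)
theta = np.full((53, 135, 135), 300.0)   # placeholder potential temperature (K)

dqv = rand_perturbation(qv, 0.05, rng)         # 5% of the unperturbed value
dtheta = rand_perturbation(theta, 0.002, rng)  # 0.2% of the unperturbed value
```

Because every grid point and variable gets its own coin flip, each degree of freedom is perturbed independently of all the others.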

The idea of the RAND method is to determine the response to perturbing each degree of

freedom (DOF) of the model independently, while actually perturbing all the DOFs

simultaneously and randomly. The response of the model to a perturbation of a degree of

freedom is the signal being sought, even if this response is diffusion or model adjustment. The

fact that the random perturbations are rapidly affected by dissipation during the model

integration does not change the fact that the net effect is the sum of all the individual DOF

perturbations. In this Monte Carlo approach, the model response to a perturbation of a particular

DOF is found by a statistical method in which the effects of the random perturbations in all the

other DOFs cancel out. This is approximately the result that would be found by perturbing just

that one DOF.
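
The cancellation argument can be checked on a toy problem. The sketch below assumes a purely linear stand-in "model", J = g . x, with a known gradient g; it is not the ARPS model, and all names and sizes are hypothetical. All DOFs are perturbed at once with coin flips, and the per-DOF regression slope is used to recover the gradient.

```python
import numpy as np

rng = np.random.default_rng(1)
n_dof, n_members = 200, 2000

# Hypothetical linear response: J = g . x, with only two nonzero sensitivities.
g_true = np.zeros(n_dof)
g_true[50] = 3.0
g_true[120] = -1.5

amp = 0.1
# Every DOF of every member is perturbed simultaneously and independently.
dx0 = amp * rng.choice([-1.0, 1.0], size=(n_members, n_dof))
J = dx0 @ g_true

# Regression slope for each DOF: cov(J, dx_l) / var(dx_l).
dJ = J - J.mean()
dx = dx0 - dx0.mean(axis=0)
g_est = (dJ @ dx) / (dx ** 2).sum(axis=0)
```

In this toy setting the estimate at each DOF differs from the true gradient by sampling noise that shrinks roughly as the inverse square root of the ensemble size, which is one way to see why a large ensemble is needed.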


b. Spatially smoothed, Gaussian random perturbations (SGAU)

This method is used by Tong and Xue (2007) to initialize their ensemble for EnKF data

assimilation. For each variable at each grid point with indices (l,m,n), a spatially smoothed

random Gaussian perturbation is added of an amount:

$$\delta(l,m,n) = h \sum_{(i,j,k) \in S} r(i,j,k)\, w(i,j,k), \tag{1}$$

where r(i,j,k) is a random number at the grid point (i,j,k) sampled independently from a Gaussian

distribution with zero mean and unit standard deviation; w(i,j,k) is a three-dimensional distance-

dependent weighting function; and h is a scaling parameter. The fifth order correlation function

of Gaspari and Cohn (1999, Eqn. (4.10)) is chosen for w. The summation is over all grid points

within a specified radius from (l,m,n), which is chosen to be 27 km here (chosen to be between

the length scale of the perturbations typical of the RAND and EVOL methods). The end result

of (1) is a spatially random field with spatial correlation scales related to the spatial scale of the

correlation function w. The scaling factor h is chosen to obtain a random field with a desirable

variance or standard deviation.
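
A two-dimensional sketch of the smoothed-perturbation construction is given below. Several choices here are assumptions made for illustration and are not stated in the text: the field is 2-D rather than 3-D, the domain is treated as periodic (via `np.roll`), the 27 km radius is taken to be the full support of the Gaspari and Cohn function (so its half-width parameter c is 13.5 km), and h is set by normalizing with the sample standard deviation.

```python
import numpy as np

def gaspari_cohn(dist, c):
    """Fifth-order piecewise rational function of Gaspari and Cohn (1999).

    Equals 1 at zero distance and vanishes for distances >= 2c.
    """
    r = np.abs(np.asarray(dist, dtype=float)) / c
    w = np.zeros_like(r)
    inner = r <= 1.0
    outer = (r > 1.0) & (r < 2.0)
    ri, ro = r[inner], r[outer]
    w[inner] = (-0.25 * ri**5 + 0.5 * ri**4 + 0.625 * ri**3
                - (5.0 / 3.0) * ri**2 + 1.0)
    w[outer] = (ro**5 / 12.0 - 0.5 * ro**4 + 0.625 * ro**3
                + (5.0 / 3.0) * ro**2 - 5.0 * ro + 4.0 - 2.0 / (3.0 * ro))
    return w

def sgau_perturbation(shape, dx_km, radius_km, target_std, rng):
    """Weighted sum of Gaussian random numbers over neighbors within radius_km."""
    r = rng.standard_normal(shape)       # zero mean, unit standard deviation
    n = int(radius_km // dx_km)          # neighbor search range in grid points
    offsets = [(di, dj) for di in range(-n, n + 1) for dj in range(-n, n + 1)]
    dist = dx_km * np.array([np.hypot(di, dj) for di, dj in offsets])
    weights = gaspari_cohn(dist, radius_km / 2.0)   # assumed: support = radius
    out = np.zeros(shape)
    for (di, dj), wgt in zip(offsets, weights):
        if wgt > 0.0:
            out += wgt * np.roll(r, (di, dj), axis=(0, 1))  # periodic (assumed)
    h = target_std / out.std()           # scaling parameter h
    return h * out

rng = np.random.default_rng(3)
field = sgau_perturbation((64, 64), dx_km=9.0, radius_km=27.0,
                          target_std=0.5, rng=rng)
```

Neighboring grid points in `field` are strongly correlated, unlike in a field of independent coin flips, which is the property that matters for the sensitivity results discussed later.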

c. Evolved perturbations from an ensemble (EVOL)

For this method, an ensemble is numerically integrated from initial states that are

randomly perturbed (in this case by the RAND method above). This integration is carried up to

the time when perturbed initial conditions are desired. The difference between the model fields

of each member of the ensemble and the unperturbed model run then can function as

perturbations for use in subsequent perturbation runs. These evolved perturbations will tend to

have much larger spatial scales than either grid point perturbations or smoothed perturbations.

This is due to the diffusion in the model acting throughout the integration, serving to spread and


smooth initial grid-scale randomness (the model employed fourth-order advection in the

horizontal and vertical, a 1.5 order TKE based subgrid turbulence and PBL model, as well as a

small amount of fourth order numerical diffusion). Other physical processes, such as surface

insolation, evapotranspiration, or convection can also affect the nature of these perturbations.

This method will also remove any gross imbalances introduced into the model by random

perturbations. The length scale of the perturbations from this method will depend on the length

of time integration from the initial state and the manner in which the model was originally

perturbed. This is similar to the bred vector method of Toth and Kalnay (1997) except that the

evolved perturbations are not rescaled. Such perturbations are also similar to those found in the

EnKF data assimilation system cycles prior to scaling and the assimilation of new observations

during an analysis cycle. Unlike the SGAU and RAND methods, these perturbations are

potentially biased. They are also not necessarily Gaussian.

For using the EVOL method here, the model is initialized at 1800 UTC using the RAND

method and integrated for 3 hours producing a set of evolved perturbations. The ensemble is

then integrated another 3 hours using these evolved perturbations. The final forecasts at 0000

UTC are no different from the straight forecast from 1800 UTC starting with the initial random

perturbations; the difference is that the evolved perturbations at 2100 UTC are recorded and

therefore known.
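
The mechanics of the evolved-perturbation procedure can be sketched with a toy model. The explicit 1-D diffusion scheme below is only a stand-in for the forecast model (an assumption made for illustration), and the sizes, step count, and 2% amplitude are arbitrary; the point is the bookkeeping: evolved perturbations are the member-minus-control differences at the intermediate time, with no rescaling.

```python
import numpy as np

def integrate(state, n_steps, nu=0.2):
    """Toy stand-in for the forecast model: explicit 1-D diffusion, periodic domain."""
    x = state.copy()
    for _ in range(n_steps):
        x = x + nu * (np.roll(x, 1) - 2.0 * x + np.roll(x, -1))
    return x

def lag1_corr(p):
    """Correlation between perturbations at adjacent grid points."""
    return np.corrcoef(p[:, :-1].ravel(), p[:, 1:].ravel())[0, 1]

rng = np.random.default_rng(7)
n_points, n_members = 256, 50
control0 = 10.0 + np.sin(np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False))

# RAND-style initial perturbations at the earlier time (2% coin flips).
dx_rand = 0.02 * control0 * rng.choice([-1.0, 1.0], size=(n_members, n_points))

# Integrate the control and every member to the intermediate time.
control_mid = integrate(control0, 100)
members_mid = np.array([integrate(control0 + d, 100) for d in dx_rand])

# Evolved perturbations: member minus control, recorded but not rescaled.
dx_evol = members_mid - control_mid
```

Even in this toy setting the evolved perturbations are much weaker and much smoother than the initial coin flips, which mirrors the qualitative behavior described for the EVOL method.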

d. A very large ensemble of model runs with discrete perturbations (VLE)

This method is examined in Martin and Xue (2006). A similar method was used by

Beare et al. (2003), who referred to it as “sensitivity mapping”. It is used in this paper for

comparison with the methods described above. It is a direct perturbation method and does not

make use of random perturbations. For this method, a specific perturbation is applied at one area


in the model domain and in one variable. For example, a perturbation in boundary layer

moisture might be made at one location in a patch 3x3 grid points in size. The model is then

integrated forward in time and the difference between the perturbed and unperturbed model runs

provides the exact impact of that perturbation on the forecast. By using a Very Large Ensemble

(VLE) in which each member of the ensemble has the discrete perturbation at a different

location, sensitivity fields can be derived as the model response to individual perturbations, if

these generally non-overlapping perturbations fill the 2-D domain for which the sensitivity map

is to be constructed.

4. Calculation of covariances and sensitivities

Sample perturbation fields from each of the four methods described in section 3 are

shown in Fig. 2. Figures 2a,b,c show a single realization of perturbed boundary layer water

vapor which is arrived at by adding together the perturbations from each of the 9 vertical levels

in the lowest 1 km of the model. In each subfigure, the 10 g kg-1 isopleth of moisture is drawn

for reference. This line is close to the location of the dryline and cold front. The perturbation

magnitudes in Figs. 2a,b,c are larger to the south and east of this line where humidity is higher,

because the perturbations are defined as a percentage of the unperturbed moisture field. The

VLE perturbation sample shown in Fig. 2d is a single perturbation 27 km by 27 km wide by 9

km deep. In principle, a separate model run is made with this perturbation at a different one of

the possible 2025 different non-overlapping locations required for this perturbation to tile the

entire 2-D horizontal domain.
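
The count of 2025 runs follows from tiling the 135×135 horizontal grid with 3×3 grid-point patches (3 × 9 km = 27 km on a side). A minimal sketch of the bookkeeping, with hypothetical variable names:

```python
nx = ny = 135     # horizontal grid points (section 2a)
patch = 3         # patch width in grid points
dx_km = 9         # horizontal grid spacing in km

patch_width_km = patch * dx_km

# Lower-left corner of each non-overlapping patch; one model run per entry.
patch_origins = [(i, j) for j in range(0, ny, patch) for i in range(0, nx, patch)]
n_runs = len(patch_origins)   # 45 patches in each direction
```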

It is noteworthy that the effective length scale of the random perturbation methods varies,

with the RAND method (Fig. 2a) being the smallest, and the EVOL method (Fig. 2c) being the

largest. For the SGAU method (Fig. 2b), the mean length scale is selectable, and was selected


here to be a 27 km radius. It is also noted that the perturbations along the boundary of the EVOL

method are much weaker than in the interior of the domain. This is due to the effect of the

boundary conditions. The boundaries of the model run are externally forced by a larger scale

model run (the ETA model) which was not perturbed. Finally, we note that the magnitude of the

EVOL perturbations is much smaller than either the RAND or SGAU perturbations. The EVOL

perturbations were generated by first perturbing the ensemble 3 hours prior to the time of Fig. 2c

using perturbations identical to those of the RAND method. After the 3 hours of model

integration, these RAND perturbations have spread in space by primarily advection and

diffusion, processes which have reduced the perturbation magnitudes by an order of magnitude.

Covariance and sensitivity values can both be derived from the same ensemble of runs

(Martin and Xue 2007). To find sensitivity values from an ensemble of M members, two sets of

scalars need to be defined: the response function values, $J_i$, which are the scalar forecast

quantities of interest (one value from each member of the ensemble, with i being the index for

ensemble members); and the perturbation quantities of an initial DOF of the model, $\Delta x^{0}_{il}$ (also

one value for each member of the ensemble). The subscript l refers to one element of the model

state vector, $\mathbf{x}$, and the superscript 0 refers to the initial time. The statistical relation between

these M ordered-pairs of values can then be found by either a covariance sum or by statistical

linear regression. Sensitivity fields can be found simply by considering each component of the

vector or field individually.

Figures 3a, b, and c show sample scatter plots for which the statistical relation might be

sought. For the example of Fig. 3, the response function is the total rain which fell in the three

hour forecast (from 2100 UTC to 0000 UTC) in the box drawn southwest of Oklahoma in Fig.

6a. The perturbation quantity is the amount the boundary layer moisture was initially perturbed


at a particular grid location we knew to have a strong relation to the selected response function

(which was at the local correlation field maximum in Fig. 6a).

The banded structure of Fig. 3a is an artifact of using a binary coin-flip for choosing

either a positive or negative perturbation at each grid point. The boundary layer perturbations

are arrived at by averaging the perturbation at each of the lower 9 model levels together. If these

levels had the exact same value of humidity, then there would be only 10 possible values of the

perturbation, depending on the 9 coin flips; with the least likely being either all positive or all

negative perturbations. The breadth of the vertical lines is caused by the fact that all of the 9

levels do not have the same humidity, though they are close, as these points are in the (well-

mixed) convective boundary layer at 2100 UTC.
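
The count of 10 possible values, and the rarity of the two extremes, follows from the binomial distribution of 9 fair coin flips. The sketch below assumes, as in the idealized case above, that all 9 levels have the same humidity, and expresses the layer-mean perturbation in units of the single-level perturbation amplitude.

```python
from math import comb

n_levels = 9

# k = number of levels with a positive perturbation.
# The layer mean is (2k - n) / n times the single-level amplitude.
values = [(2 * k - n_levels) / n_levels for k in range(n_levels + 1)]
probs = [comb(n_levels, k) / 2 ** n_levels for k in range(n_levels + 1)]
```

The two extreme values (all negative or all positive) each occur with probability 1/512, while the two central values each occur with probability 126/512.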

Since Figs. 3a, b, and c are all plotted on the same scale, it can directly be seen that the

sensitivities implied by these plots differ with perturbation method, with the EVOL method

having the steepest slope, and the RAND method the shallowest. Because sensitivity is a

dimensional physical quantity, it might have been expected that all three of these relations would

have been the same, in the absence of noise. The reason they are different will be discussed in

section 5.

a. Covariance and correlation coefficient

As discussed in more detail in Martin and Xue (2007), the covariance between a forecast

scalar quantity, J, and the forecast state vector (which includes all the model fields) at time t, $\mathbf{x}^{t}$,

is calculated for each element, $x^{t}_{l}$, of $\mathbf{x}^{t}$ from an ensemble of randomly different model runs of size

M as:

$$\sigma_{J x^{t}_{l}} = \frac{1}{M} \sum_{i=1}^{M} (J_i - \bar{J})(x^{t}_{il} - \bar{x}^{t}_{l}). \tag{2}$$


J itself can be defined as any scalar function of $\mathbf{x}^{t}$. The non-dimensional form of covariance is

the correlation coefficient, $\rho$:

$$\rho_{J x^{t}_{l}} = \frac{\sigma_{J x^{t}_{l}}}{\sigma_{J}\,\sigma_{x^{t}_{l}}}. \tag{3}$$
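
The ensemble covariance and correlation coefficient between a scalar response J and each element of the state vector can be computed as in the following sketch. It assumes a 1/M normalization of the covariance sum and uses a synthetic four-element "state" in which only the first element influences J; the function name and the test data are hypothetical.

```python
import numpy as np

def cov_and_corr(J, x):
    """Covariance and correlation of J with each column of x.

    J: (M,) response function values, one per ensemble member.
    x: (M, n) state vector elements, one row per ensemble member.
    """
    M = J.shape[0]
    dJ = J - J.mean()
    dx = x - x.mean(axis=0)
    cov = dJ @ dx / M                           # covariance sum, 1/M normalization
    sigma_J = np.sqrt((dJ ** 2).mean())
    sigma_x = np.sqrt((dx ** 2).mean(axis=0))
    corr = cov / (sigma_J * sigma_x)            # non-dimensional correlation
    return cov, corr

rng = np.random.default_rng(0)
x = rng.normal(size=(2000, 4))                  # synthetic ensemble, M = 2000
J = 2.0 * x[:, 0] + 0.1 * rng.normal(size=2000)
cov, corr = cov_and_corr(J, x)
```

Applying the same function with the initial-time perturbations in place of the forecast-time state gives the time-lagged (sensitivity) form of the same statistics.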

In Fig. 4a,b,c we plot the correlation coefficient between the response function, J, defined

as the total rain which fell along the dryline between 2100 UTC and 0000 UTC, and the field of

10-m potential temperature at 0000 UTC. This example was chosen because it has a fairly

complex correlation structure, with a negative local minimum in the center, surrounded by a ring

of positive correlation, surrounded by areas of both positive and negative correlation in a

complex pattern. From all three perturbation methods, subtle details of the correlation structure

have been almost identically reproduced. From Fig. 1b it is seen that precipitation fell in and

around the central minimum in the box drawn in Figs. 4a,b,c. We interpret the negative

correlation between this rainfall and the temperature field at the forecast time as being due to

evaporative cooling associated with the rainfall and downdraft. The ring of positive correlation

around this central minimum is likely related to the positive correlation that should exist between the

rainfall amount and the boundary layer moisture in the environment not directly affected by the

downdraft. Other areas of positive and negative correlation around this ring are more difficult to

interpret and may be related to gravity wave oscillations triggered by the convection; they are

nonetheless well-defined. For each of these ensembles, the ensemble size, M, was 2000, which

is apparently more than needed for finding accurate covariance fields between variables at the

same forecast time.

For the fields of Fig. 4, a signal-to-noise ratio (SNR) has been calculated. The signal is

defined as the peak magnitude of the sensitivity field, and the noise level is defined as the root-


mean-square value of the sensitivity field at all locations 20 or more grid points (180 km) away

from the center of the response function box. The SNR in dB is then:

$$\mathrm{SNR} = 10 \log_{10}\!\left(\frac{\text{signal}}{\text{noise level}}\right). \tag{4}$$

The SNR of the sensitivity fields of Figs. 4a,b, and c are practically identical, being 9.9,

10.0, and 10.1 dB, respectively. The slightly higher noise level from the EVOL method is

possibly due to an increase in round-off error from the smaller perturbation magnitudes. Clearly,

the method of perturbation has made little difference so far as the calculation of the correlation

between variables at the forecast time is concerned, and this was true for many other fields we

examined but do not show.
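
The signal-to-noise calculation described above (peak magnitude over the root-mean-square value at points 20 or more grid points from the response box center, expressed as 10 log10 of the ratio) can be sketched as follows. The synthetic field and the function name `snr_db` are placeholders for illustration only.

```python
import numpy as np

def snr_db(field, center, exclude_radius=20):
    """SNR in dB: peak magnitude over far-field root-mean-square value."""
    jj, ii = np.indices(field.shape)
    dist = np.hypot(jj - center[0], ii - center[1])   # distance in grid points
    signal = np.abs(field).max()
    noise = np.sqrt(np.mean(field[dist >= exclude_radius] ** 2))
    return 10.0 * np.log10(signal / noise)

# Synthetic example: uniform background of 0.1 with a single peak of 1.0.
field = np.full((135, 135), 0.1)
field[67, 67] = 1.0
snr = snr_db(field, center=(67, 67))
```

In this synthetic case the ratio of signal to noise level is 10, so the function returns 10 dB.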

b. Sensitivity

Sensitivity can be defined in terms of either a partial derivative (a gradient) or a

sensitivity covariance. As discussed in Martin and Xue (2007), the two are closely related.

When defined as a covariance or correlation coefficient, sensitivity is calculated by simply

replacing $\mathbf{x}^{t}$ with the state vector at the initial time, $\mathbf{x}^{0}$, in (2) and (3) above.

Figure 5a plots the correlation coefficient between the forecast (at 0000 UTC) boundary

layer water vapor in the small box drawn south of the Oklahoma-Texas border (which is 3x3 grid

cells in size), and the field of initial (2100 UTC) boundary layer water vapor, for the RAND

case. This region of the domain did not have precipitation and the sensitivity is due largely to

advection and diffusion effects (Martin and Xue 2006). Because the initial fields were perturbed

randomly at each grid point, Fig. 5a exhibits grid point noise. Consequently, it is desirable to

apply smoothing to the sensitivity field. Figure 5b is the same field as Fig. 5a after a 2-D nine-

point smoother has been applied three times in the horizontal. This is to be compared with the

same field derived from the ensemble using the SGAU perturbations, shown in Fig. 5c, and the


same field derived from the ensemble using EVOL perturbations (Fig. 5d), both unsmoothed.

For the calculation of these sensitivity correlation coefficients, the ensemble size of 2000 was

necessary to reduce the noise to a tolerable level, though there remains a significant level of

noise. Because the noise level can be controlled by smoothing and the different methods used

different amounts of smoothing (explicit or implicit), the SNR is not a useful measure here. By

comparing Fig. 5a with Fig. 2a, Fig. 5c with Fig. 2b, and Fig. 5d with Fig. 2c, it is clear that the

spatial scale of the noise scales directly with the spatial scale of the initial perturbations.

Smoothing of the result from method RAND increases the length scale of the noise and produces

a result (Fig. 5b) which has a noise level comparable to that of the SGAU and EVOL methods.

All three methods produce well-defined central maxima as would be expected from the effects of

advection and diffusion. One difference is the value of maximum correlation coefficient, which

is 0.56 for the SGAU method, 0.95 for the EVOL method, and 0.19 for the RAND method

(before smoothing). This is due to the reduced amount of uncorrelated noise in the EVOL

method relative to the other two; a result which might have been anticipated from the scatter

plots of Fig. 3, which shows that the sample EVOL scatter plot has much less noise than the

other two methods.
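
The explicit smoothing applied to the RAND result can be sketched as below. The text specifies only a 2-D nine-point smoother applied three times; the 1-2-1 weights and the edge-replicating boundary treatment used here are assumptions (a common choice), not necessarily those used for Fig. 5b.

```python
import numpy as np

def nine_point_smooth(field, passes=3):
    """Apply a 2-D nine-point smoother repeatedly (assumed 1-2-1 weights)."""
    kernel = np.array([[1.0, 2.0, 1.0],
                       [2.0, 4.0, 2.0],
                       [1.0, 2.0, 1.0]]) / 16.0
    out = np.asarray(field, dtype=float)
    ny, nx = out.shape
    for _ in range(passes):
        padded = np.pad(out, 1, mode="edge")    # replicate edge values (assumed)
        out = sum(kernel[a, b] * padded[a:a + ny, b:b + nx]
                  for a in range(3) for b in range(3))
    return out

rng = np.random.default_rng(5)
noise = rng.standard_normal((100, 100))         # grid-point noise, as from RAND
smoothed = nine_point_smooth(noise, passes=3)
```

Each pass lowers the amplitude of grid-point noise and lengthens its spatial scale, at the cost of blurring any genuine structure of comparable size.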

Another difference is the size of the central maximum. The EVOL method has produced

a region of sensitivity that is physically larger than either the SGAU or RAND methods. This is

due to the larger length scales of the initial perturbations. Initial perturbations at individual grid

points for the EVOL method (Fig. 2c) are correlated with perturbations at other grid points

within approximately 100 km, as compared with 27 km for the SGAU method and no correlation

for the RAND method. This spatial correlation of initial perturbations leads to a blurring of the

calculated sensitivity correlation coefficients. The spatially uncorrelated perturbations from the

RAND method allow for the greatest possible spatial precision (or resolution) in calculated

sensitivity fields, though the need for smoothing reduces this precision. There is a direct trade-

off between the smoothness of the sensitivity field and its spatial resolution.

The importance of spatial precision and the value of the RAND method are further

illustrated by sensitivities in vertical cross section presented in Fig. 6. Figure 6a plots the 3-hour

sensitivity correlation coefficient between the total rain which fell in the box drawn, from 2100

UTC to 0000 UTC, and the initial (2100 UTC) 10-m (not boundary layer) water vapor perturbations, as

calculated from the RAND method. As expected, this shows a central maximum near the rainfall

maximum along the dryline. Figures 6b,c,d show vertical cross-section plots of the same

sensitivity field with the cross-sections taken along the line A-B drawn in Fig. 6a. The vertical

cross-section plots do show significant differences, with only the RAND method producing a

sensitivity field which shows significant vertical structure. The vertical length scale of the

perturbations for the SGAU method is selectable and was chosen in this case to be larger than the

depth of the boundary layer. Consequently, no vertical resolution in sensitivities can be found.

Boundary-layer vertical mixing produces a similar effect for the EVOL method. This mixing

vertically homogenizes the boundary layer perturbations. Only the RAND method, in which the

perturbations are completely uncorrelated in the vertical, is capable of discerning the vertical

structure of the sensitivity pattern.

It is interesting that the maximum in sensitivity correlation coefficient shown by Fig. 6c

is approximately 1 km above the surface. It might have been anticipated that the maximum in

sensitivity would be near the surface where the moisture advection is greatest. One possible

explanation for this is that moisture near the capping inversion might be relatively important in

the process of breaking the cap for convective initiation.

5. Forecast sensitivity analysis from correlated perturbation fields

In Fig. 3 above, it was noted that the actual dimensional sensitivity values (the best fit

slopes for Fig. 3) arrived at by a regression of a response function against a perturbation differed

by perturbation method. To show this in detail, Fig. 7 is presented. Figures 7a,b,c show the

gradient (partial derivative) sensitivity of the boundary layer moisture in the small box indicated

in each figure at the forecast time to the initial boundary-layer moisture field as calculated from

each of the three methods. The gradient sensitivity is calculated as the least-squares regression

slope between the cost function and the initial grid point perturbations (the best fit straight-line

slopes of scatter plots like Fig. 3). This approximates the partial derivatives:

$$\frac{\partial J}{\partial x_l^0} \approx \left\langle \frac{\Delta J}{\Delta x_l^0} \right\rangle \qquad (5)$$

where the angle brackets denote the expected value. To make the comparison as direct as

possible, all fields in Fig. 7 were smoothed by 3 passes through a 9-point filter. For each

subfigure of Fig. 7, there is an increase in noise on the dry side of the dryline. This effect

disappears when such plots are non-dimensionalized (as in Fig. 5). This is caused by the fact

that the regression is inaccurate in areas with no (or weak) signal. As non-dimensionalization of

(5) involves multiplication by the standard deviation of $x_l^0$, non-dimensionalization eliminates

this effect where the standard deviation of the perturbations is small, which is the case west of

the dryline.
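
The relation between the dimensional regression slope and the non-dimensional correlation coefficient can be written out explicitly. The notation below ($b_l$, $r_l$, $\sigma_J$, $\sigma_{x_l^0}$, and the ensemble size $N$) is introduced here for illustration and does not appear in the original text; the first line is the standard identity for simple linear regression, and the second line gives rough sampling-error scalings that hold when the true correlation is weak.

```latex
b_l = \frac{\operatorname{cov}(J, x_l^0)}{\operatorname{var}(x_l^0)}, \qquad
r_l = \frac{\operatorname{cov}(J, x_l^0)}{\sigma_J \, \sigma_{x_l^0}}
    = b_l \, \frac{\sigma_{x_l^0}}{\sigma_J}

\operatorname{err}(b_l) \sim \frac{\sigma_J}{\sigma_{x_l^0} \sqrt{N}}, \qquad
\operatorname{err}(r_l) \sim \frac{1}{\sqrt{N}}
```

Under these assumptions, the error in the slope grows without bound as $\sigma_{x_l^0}$ becomes small, while the error in the correlation coefficient does not, which is consistent with the noise on the dry side of the dryline appearing only in the dimensional plots.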

From Fig. 7, we find that the dimensional sensitivity is an order of magnitude larger from

the EVOL method (Fig. 7c) than it is from the RAND method (Fig. 7a), with the SGAU method

(Fig. 7b) in between. The reason for this is that (5) is only correct for the RAND method. More

generally, each of the $\Delta J_i$ values from the ensemble represents, to a first-order approximation, the

total differential of J, which by the chain rule is:

$$\Delta J \approx dJ = \nabla J \cdot d\mathbf{x}_0 , \qquad (6)$$

so that each $\Delta J_i$ value depends on every partial derivative of J times the perturbation in each

DOF of the model. If the perturbations of each DOF are uncorrelated (as for the RAND

method), then

$$\langle dJ \rangle = \frac{\partial J}{\partial x_l^0} \, \langle dx_l^0 \rangle . \qquad (7)$$

When dJ is regressed against a particular $dx_l^0$, (5) is recovered. If, on the other hand, $dx_l^0$ is

correlated with other members of $d\mathbf{x}_0$ (as for the EVOL and SGAU methods), then dJ is larger

(for positive correlations) than that given by (7) and any results calculated from (5) will be larger

than the desired sensitivity gradient. Because the effects of all the correlated perturbations are

added together in (6), it is not a simple matter to separate them so as to obtain the sensitivity

gradient at a specific grid point. Equation (6) represents a set of simultaneous equations which in principle

can be solved for the components of $\nabla J$. In practice, this solution is a difficult and possibly

ill-conditioned inversion problem.
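
The argument of this paragraph can be checked on a toy problem. The Python sketch below assumes a strictly linear response J = g · dx0 in 30 degrees of freedom with a hypothetical gradient g, and compares point-by-point regression, as in (5), for uncorrelated perturbations and for spatially correlated ones. The correlated perturbations are produced by two passes of a 1-2-1 filter, which is only a stand-in for the SGAU or EVOL perturbations; all names and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_members, n_dof = 2000, 30

# Hypothetical "true" gradient of a linear response J = g . dx0
g = np.exp(-0.5 * ((np.arange(n_dof) - 15) / 3.0) ** 2)

def pointwise_slopes(dx, dJ):
    """Regress dJ on each degree of freedom separately, as in (5)."""
    dxa = dx - dx.mean(axis=0)
    dJa = dJ - dJ.mean()
    return (dxa * dJa[:, None]).sum(axis=0) / (dxa ** 2).sum(axis=0)

# Uncorrelated perturbations (RAND-like)
dx_rand = rng.standard_normal((n_members, n_dof))

# Spatially correlated perturbations: two passes of a 1-2-1 filter (assumed)
S = 0.5 * np.eye(n_dof) + 0.25 * (np.eye(n_dof, k=1) + np.eye(n_dof, k=-1))
dx_corr = dx_rand @ S @ S

slopes_rand = pointwise_slopes(dx_rand, dx_rand @ g)
slopes_corr = pointwise_slopes(dx_corr, dx_corr @ g)

# Solving the simultaneous equations (6) for the whole gradient at once
grad_solved, *_ = np.linalg.lstsq(dx_corr, dx_corr @ g, rcond=None)

print("true g[15]:           ", g[15])
print("RAND-like slope:      ", slopes_rand[15])
print("correlated slope:     ", slopes_corr[15])
print("solved from (6):      ", grad_solved[15])
```

In this toy setting the correlated-perturbation slope at the central point is several times the true gradient, while the uncorrelated slope is close to it. The full least-squares solve recovers g here only because the problem is noise-free, linear, well conditioned, and has far more members than degrees of freedom; none of these conditions can be assumed for a real forecast model.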

For the EVOL method, perturbations at a point are correlated with nearby perturbations.

The spatial scale for this correlation would be expected to increase with the integration time used

for the production of the evolved perturbations because of diffusion effects. Consequently, the

longer the evolution time, the steeper the slope of a plot like Fig. 3c would be.

To explore the accuracy of the RAND technique and to provide some validation for it,

Fig. 8 is presented that compares the dimensional sensitivity derived from the RAND method

(Fig. 8a) with that from the VLE method (Fig. 8b). Both sensitivity fields were smoothed by 2

passes through a 9-point smoother. For Fig. 8, the sensitivity is that of the forecast 10-meter

water vapor in the box drawn to the initial field of boundary layer moisture. For the VLE

method, the perturbations are 1-km deep columns of moisture at the surface, 27 × 27 km in

horizontal size. In order to accomplish this direct comparison, these same perturbations are

synthesized for the RAND method by adding together the initial grid point perturbations in this

27 × 27 × 1 km region. These perturbations are then regressed against the $J_i$ values as before. The

maximum from the RAND method is found to be 0.017, while that from the VLE method is

0.014, a difference of about 20%. It is possible that this difference is due to nonlinearity because

the perturbations from the VLE method are effectively much larger in magnitude than those from

the RAND method. It is also possible that the RAND and/or VLE result is contaminated by the

sweeping of unperturbed boundary information, as the region of sensitivity is less than 200 km

from the southern boundary.
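
The synthesis step described above, summing grid-point perturbations over a block and regressing the response against the sum, can be sketched as follows. This is a two-dimensional illustration on synthetic data with a linear response; the function name `block_sensitivity`, the grid size, and the gradient field are hypothetical, and any normalization applied in the study before comparing with the VLE result is not specified in the text and is not reproduced here.

```python
import numpy as np

def block_sensitivity(dx0, J, ys, xs):
    """Least-squares slope of J against the sum of the grid-point
    perturbations inside the block dx0[:, ys, xs]."""
    block = dx0[:, ys, xs].sum(axis=(1, 2))
    ba = block - block.mean()
    Ja = J - J.mean()
    return float(ba @ Ja / (ba @ ba))

# Synthetic ensemble (all values hypothetical).
rng = np.random.default_rng(1)
n, ny, nx = 2000, 12, 12
dx0 = rng.choice([-1.0, 1.0], size=(n, ny, nx))      # uncorrelated binary perturbations
yy, xx = np.mgrid[0:ny, 0:nx]
g = np.exp(-((yy - 5) ** 2 + (xx - 5) ** 2) / 8.0)   # assumed grid-point gradient
J = (dx0 * g).sum(axis=(1, 2))                       # linear response

ys, xs = slice(4, 7), slice(4, 7)                    # a 3 x 3 block of grid points
slope = block_sensitivity(dx0, J, ys, xs)
block_mean_gradient = float(g[ys, xs].mean())
print("block regression slope:", slope)
print("block-mean gradient:   ", block_mean_gradient)
```

For uncorrelated perturbations of equal variance and a linear response, this slope estimates the average of the grid-point gradient over the block, up to sampling noise from the finite ensemble.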

6. Summary and Discussion

This study compared the covariance between forecast scalars and fields at a forecast time,

and compared the sensitivity of forecast scalars to initial fields, as calculated from ensemble

forecasts starting from initial conditions perturbed with three different random or quasi-random

methods. The ensemble size was very large at 2000 members, which was necessary because of

the need to reduce the noise in sensitivity fields. The three methods included random,

uncorrelated perturbations at each model grid point (RAND); smoothed Gaussian, random

perturbation fields (SGAU); and perturbations evolved from a previous ensemble of randomly

perturbed runs (EVOL).

For the calculation of the covariance between variables at the same forecast time, the

initial perturbation method did not make any material difference, despite large differences in the

spatial scale and magnitude of the perturbations. Of the methods used, the RAND method was

the simplest to apply. This similarity of covariance for the different perturbation methods may

be due partly to the fact that the perturbations were relatively small. From Fig. 2, the

perturbations of the EVOL method were typically 0.05 g kg-1. Those of the SGAU and RAND

methods were larger, but more localized. The effective perturbation magnitude of large length

scales from these latter two methods would be reduced due to cancellation of the small randomly

perturbed regions. If small perturbations are used, then the model response is more likely to be

linear, and the response to different kinds of perturbations more similar. Using perturbations

larger in magnitude for the SGAU or RAND methods is problematic because strong local, non-

linear pathologies occur (which are not realized in real cases). Also, using effectively small

perturbations may be undesirable when implementing an EnKF because the model variance is

combined with measurement uncertainty to produce the Kalman gain, so that an EnKF would

underweight observations as the forecast variance is related to the initial condition variance.

However, it might be possible to either inflate the model variance or deflate the measurement

uncertainty, if an ensemble based on small perturbations were used. One advantage of small

perturbations is that results for sensitivity similar to an adjoint are achieved, without the

difficulty of implementing an adjoint.
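
The remark in this paragraph about an EnKF underweighting observations can be made concrete with the scalar form of the Kalman gain. This is the textbook single-variable case with a directly observed state, offered only as an illustration; the symbols $\sigma_f^2$ (forecast variance), $\sigma_o^2$ (observation error variance), and the inflation factor $\lambda$ are notation introduced here.

```latex
K = \frac{\sigma_f^2}{\sigma_f^2 + \sigma_o^2}, \qquad
K_{\lambda} = \frac{\lambda^2 \sigma_f^2}{\lambda^2 \sigma_f^2 + \sigma_o^2}
            = \frac{\sigma_f^2}{\sigma_f^2 + \sigma_o^2 / \lambda^2}
```

As $\sigma_f^2$ becomes small relative to $\sigma_o^2$, $K$ tends to zero and the observation receives little weight. The second expression shows that, in this scalar case, inflating the forecast standard deviation by $\lambda$ gives the same gain as deflating the observation error standard deviation by $\lambda$.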

The EVOL method, surprisingly, had several shortcomings for use in sensitivity

calculations. Because the length scale of the EVOL perturbations was not controlled, some fine

details of sensitivity fields cannot be found. For example, the sensitivity of the forecast to the

vertical moisture profile of the initial boundary layer could not be found for the case explored

here because the evolved perturbations did not have sufficient vertical structure in the convective

boundary layer due to turbulent mixing. Evolved perturbations are also affected by the lateral

boundary conditions. If the boundary conditions are not perturbed in some manner as was the

case here, then the evolved perturbations will not have structure near the boundaries. This would

affect results of both correlations between the forecast and the initial conditions, and between

different quantities at the forecast time, if these relations are desired near a lateral boundary.

There is an inherent trade-off between the smoothness of calculated sensitivity fields and

the spatial resolution. The RAND method inherently produces the best spatial resolution for

sensitivity fields; however, the fields contain considerable grid-scale noise. This noise can be

reduced by smoothing, but then some spatial resolution is lost. The SGAU and EVOL methods

have smooth and spatially correlated initial perturbations. This produces smoothing of the

calculated sensitivity fields as well.

Finally, both the SGAU and EVOL methods present particular difficulty in calculating

dimensional sensitivity values. Because the initial perturbations at each grid point are correlated

with perturbations at other grid points, it becomes a difficult inversion problem to derive

sensitivity fields. The RAND method does not have this problem because all the initial

perturbations are uncorrelated. The RAND method is the only method that can easily determine

sensitivity fields which are correct in magnitude.

Acknowledgements

This work was primarily supported by NSF grants ATM-0129892 and ATM-0530814.

The second author was further supported by NSF grants EEC-0313747, ATM-0331594,

ATM-0331756, and ATM-0608168. The computations were performed on the National Science

Foundation Terascale Computing System at the Pittsburgh Supercomputing Center. Thomas

Hamill, Robert Fovell, and an anonymous reviewer helped to improve this paper.

REFERENCES

Anderson, J. L., and S. L. Anderson, 1999: A Monte Carlo implementation of the nonlinear

filtering problem to produce ensemble assimilations and forecasts. Mon. Wea. Rev., 127,

2741-2758.

Bishop, C. H., B. J. Etherton, and S. J. Majumdar, 2001: Adaptive sampling with the ensemble

transform Kalman filter. Mon. Wea. Rev., 129, 420-36.

Beare, R. J., A. J. Thorpe, and A. A. White, 2003: The predictability of extratropical cyclones:

Nonlinear sensitivity to localized potential-vorticity perturbations. Quart. J. Roy. Meteor.

Soc., 129, 219-37.

Buizza, R., 2000: Skill and economic value of the ECMWF ensemble prediction system. Quart.

J. Roy. Meteor. Soc., 126, 649-68.

Evensen, G., 1994: Sequential data assimilation with a nonlinear quasigeostrophic model using

Monte Carlo methods to forecast error statistics. J. Geophys. Res., 99, 10143-62.

Ehrendorfer, M. and J. J. Tribbia, 1997: Optimal prediction of forecast error covariances through

singular vectors. J. Atmos. Sci., 54, 286-313.

Errico, R. M., 1997: What is an adjoint model? Bull. Amer. Meteor. Soc., 78, 2577-91.

Errico, R. M., 2003: The workshop on applications of adjoint models in dynamic meteorology.

Bull. Amer. Meteor. Soc., 84, 795-8.

Errico, R. M., and T. Vukicevic, 1992: Sensitivity analysis using an adjoint of the PSU-NCAR

mesoscale model. Mon. Wea. Rev., 120, 1644-60.

Gaspari, G., and S. E. Cohn, 1999: Construction of correlation functions in two and three

dimensions. Quart. J. Roy. Meteor. Soc., 125, 723-57.

Hakim, G. J. and R. D. Torn, 2006: Ensemble Synoptic Analysis. Fred Sanders AMS

Monograph.

Hall, M. C. G. and D. G. Cacuci, 1982: Sensitivity analysis of a radiative-convective model by

the adjoint method. J. Atmos. Sci., 39, 2038-50.

Hall, M. C. G. and D. G. Cacuci, 1983: Physical interpretation of the adjoint functions for

sensitivity analysis of atmospheric models. J. Atmos. Sci., 40, 2537-46.

Hamill, T. M., C. Snyder, and J. S. Whitaker, 2003: Ensemble forecasts and the properties of

flow-dependent analysis error covariance singular vectors. Mon. Wea. Rev., 131, 1741-58.

Hamill, T. M. and J. S. Whitaker, 2005: Accounting for the error due to unresolved scales in

ensemble data assimilation: a comparison of different approaches. Mon. Wea. Rev., 133,

3132-47.

Hoffman, R. N., and E. Kalnay, 1983: Lagged average forecasting, an alternative to Monte

Carlo forecasting. Tellus, 35A, 100-18.

Houtekamer, P. L., L. Lefaivre, J. Derome, H. Ritchie, and H. L. Mitchell, 1996a: A system

simulation approach to ensemble prediction. Mon. Wea. Rev., 124, 1225-1242.

Houtekamer, P. L., L. Lefaivre, and J. Derome, 1996b: The RPN ensemble prediction system.

Proc. of the ECMWF Seminar on Predictability, Vol. 2, 121-146.

Houtekamer, P. L., and H. L. Mitchell, 1998: Data assimilation using an ensemble Kalman filter

technique. Mon. Wea. Rev., 126, 796-811.

Kalnay, E., 2002: Atmospheric modeling, data assimilation, and predictability. Cambridge

University Press, 341 pp.

Krishnamurti, T. N., C. M. Kishtawal, Z. Zhang, T. LaRow, D. Bachiochi, E. Williford, S.

Gadgil, and S. Surendran, 2000: Multimodel ensemble forecasts for weather and seasonal

climate. J. Climate, 13, 4196-216.

Lorenz, E. N., 1968: The predictability of a flow which possesses many scales of motion.

Tellus, 17, 321-33.

Martin, W. J. and M. Xue, 2006: Sensitivity Analysis of Convection of the 24 May 2002 IHOP

Case Using Very Large Ensembles. Mon. Wea. Rev., 134, 192-207.

Martin, W. J. and M. Xue, 2007: Determining sensitivities, impacts, and covariances from very

large ensembles with randomly perturbed initial conditions. Submitted to Mon. Wea. Rev.

Mullen, S. L. and D. P. Baumhefner, 1994: Monte Carlo simulations of explosive

cyclogenesis. Mon. Wea. Rev., 122, 1548-67.

Tong, M., and M. Xue, 2007: Simultaneous estimation of microphysical parameters and

atmospheric state with radar data and ensemble square-root Kalman filter. Part I:

Sensitivity analysis and parameter identifiability. Mon. Wea. Rev., Conditionally accepted.

Toth, Z. and E. Kalnay, 1997: Ensemble forecasting at NCEP: the breeding method. Mon. Wea.

Rev., 125, 3297-318.

Weckwerth, T. M., D. B. Parsons, S. E. Koch, J. A. Moore, M. A. LeMone, B. B. Demoz, C.

Flamant, B. Geerts, J. Wang, W. F. Feltz, 2004: An overview of the International H2O

Project (IHOP 2002) and some preliminary highlights. Bull. Amer. Meteor. Soc., 85, 253-

77.

Xue, M., K. K. Droegemeier, and V. Wong, 2000: The Advanced Regional Prediction System

(ARPS) – a multiscale nonhydrostatic atmospheric simulation and prediction tool. Part I:

Model dynamics and verification. Meteor. Atmos. Phys., 75, 161-93.

Xue, M., K. K. Droegemeier, V. Wong, A. Shapiro, K. Brewster, F. Carr, D. Weber, Y. Liu, and

D.-H. Wang, 2001: The Advanced Regional Prediction System (ARPS) – a multiscale

nonhydrostatic atmospheric simulation and prediction tool. Part II: Model physics and

applications. Meteor. Atmos. Phys., 76, 143-65.

Xue, M., D. Wang, J. Gao, K. Brewster, and K. Droegemeier, 2003: The Advanced Regional

Prediction System (ARPS), storm-scale numerical weather prediction and data

assimilation. Meteor. Atmos. Phys., 82, 139-70.

Xue, M. and W. J. Martin, 2006a: A high-resolution modeling study of the 24 May 2002 case

during IHOP. Part I: Numerical simulation and general evolution of the dryline and

convection. Mon. Wea. Rev., 134, 149–171.

Xue, M. and W. J. Martin, 2006b: A high-resolution modeling study of the 24 May 2002 case

during IHOP. Part II: Horizontal convective rolls and convective initiation. Mon. Wea.

Rev., 134, 172–191.

List of figures

Fig. 1. Fields of (a) 10-m water vapor mixing ratio and wind vectors at 2100 UTC 24 May 2002

from a 3-hour forecast, and (b) total accumulated rainfall and 10-m wind vectors at 0000

UTC 25 May 2002 from a 6-hour forecast. Water vapor contour increment is 0.5 g kg-1

and rainfall contour increment is 10 mm. Length of 10.0 m s-1 wind vector is indicated at

the lower left corner of plots.

Fig. 2. Perturbed initial fields of boundary layer water vapor from four methods: (a) random

binary perturbations at each grid point, (b) spatially smoothed Gaussian perturbations, (c)

evolved perturbations, and (d) one example of a specific perturbation at one spatial

location. The 10 g kg-1 isopleth of moisture is drawn for reference. Contour increments

vary and are (a) 0.125 g kg-1, (b) 0.25 g kg-1, (c) 0.025 g kg-1, and (d) 0.1 g kg-1.

Fig. 3. Scatter plots of a response function versus initial perturbations at a point for three

methods of initial perturbation: (a) method RAND; (b) method SGAU; and (c) method

EVOL.

Fig. 4. Correlation coefficient between the total rain that fell in the rectangular box drawn in each

figure (from 21 UTC to 0 UTC) and the field of forecast 10-m potential temperature from

three different perturbation methods: (a) random grid-point, (b) smoothed Gaussian, and

(c) evolved. A portion of the political outline of Oklahoma is also drawn in the upper-

right corner of each figure. Contour increment is 0.1.

Fig. 5. Correlation coefficient fields between the total boundary layer water vapor in the drawn

box and the field of initial boundary layer water vapor perturbations, obtained for

different initial perturbation methods as indicated in the plots. Panel b shows a smoothed

version of panel a. Local maxima are (a) 0.195, (b) 0.143, (c) 0.558, and (d) 0.946.

Contour increments are 0.025 for (a), 0.0125 for (b), and 0.05 for (c) and (d).

Fig. 6. (a) Correlation coefficient between total rain which fell in the box drawn over 3 hours to

initial boundary-layer water vapor as found from method RAND. (b) same as (a), but in

vertical cross-section along the line A-B at y = 436.5 km. (c) same as (b), but from the

SGAU method. (d) same as (b), but from the EVOL method. Contour increments are

0.0125 for (a), 0.01 for (b) and 0.05 for (c) and (d). (a) and (b) have been smoothed by

two passes through a 9-point smoother. The 10 g kg-1 isopleth of 10-meter moisture has

been drawn in each figure for reference.

Fig. 7. Dimensional sensitivity of the boundary layer moisture in the small box indicated in each

figure at the 3-hr forecast time to the initial field of boundary-layer moisture

perturbations. From three perturbation methods: (a) random, (b) smoothed Gaussian, and

(c) evolved. Contour increments are, respectively, 0.1, 0.2, and 1.0 g kg-1 per g kg-1 for

(a), (b), and (c). The local maxima southeast of the response function box are: 0.42, 1.0,

and 5.4 for (a), (b), and (c). Fields for all subfigures have been smoothed by two passes

through a nine-point smoother.

Fig. 8. Dimensional sensitivity of 10-m moisture in small black box in each figure to initial field

of boundary layer moisture from (a) random initial perturbations at each grid point, and

(b) a very large ensemble of discrete perturbations. Contour increment was 0.0025 for (a)

and (b). Maximum in (a) is 0.017 and for (b) is 0.014.

[Figure 1: two panels, (a) and (b); both axes in km, 0 to 1000.]

Fig. 1. Fields of (a) 10-m water vapor mixing ratio and wind vectors at 2100 UTC 24 May 2002 from a 3-hour forecast, and (b) total accumulated rainfall and 10-m wind vectors at 0000 UTC 25 May 2002 from a 6-hour forecast. Water vapor contour increment is 0.5 g kg-1 and rainfall contour increment is 10 mm. Length of 10.0 m s-1 wind vector is indicated at the lower left corner of plots.

[Figure 2: four panels, (a) to (d); axes x (km) and y (km), 0 to 1000.]

Fig. 2. Perturbed initial fields of boundary layer water vapor from four methods: (a) random binary perturbations at each grid point, (b) spatially smoothed Gaussian perturbations, (c) evolved perturbations, and (d) one example of a specific perturbation at one spatial location. The 10 g kg-1 isopleth of moisture is drawn for reference. Contour increments vary and are (a) 0.125 g kg-1, (b) 0.25 g kg-1, (c) 0.025 g kg-1, and (d) 0.1 g kg-1.

[Figure 3: three scatter-plot panels, (a) to (c); horizontal axis BL QV PERT (g/kg), -1.0 to 1.0; vertical axis J (mm rain), 500 to 850.]

Fig. 3. Scatter plots of a response function versus initial perturbations at a point for three methods of initial perturbation: (a) method RAND; (b) method SGAU; and (c) method EVOL.

[Figure 4: three panels; axes x (km) and y (km), 300 to 700. Panel labels: (a) RAND, SNR = 9.9 dB, MIN = -0.86; (b) SGAU, SNR = 10.0 dB, MIN = -0.86; (c) EVOL, SNR = 10.1 dB, MIN = -0.85.]

Fig. 4. Correlation coefficient between the total rain that fell in the rectangular box drawn in each figure (from 21 UTC to 0 UTC) and the field of forecast 10-m potential temperature from three different perturbation methods: (a) random grid-point, (b) smoothed Gaussian, and (c) evolved. A portion of the political outline of Oklahoma is also drawn in the upper-right corner of each figure. Contour increment is 0.1.

[Figure 5: four panels; axes x (km) and y (km), 0 to 1000. Panel titles: (a) RAND, (b) RAND SMOOTH, (c) SGAU, (d) EVOL.]

Fig. 5. Correlation coefficient fields between the total boundary layer water vapor in the drawn box and the field of initial boundary layer water vapor perturbations, obtained for different initial perturbation methods as indicated in the plots. Panel b shows a smoothed version of panel a. Local maxima are (a) 0.195, (b) 0.143, (c) 0.558, and (d) 0.946. Contour increments are 0.025 for (a), 0.0125 for (b), and 0.05 for (c) and (d).

[Figure 6: four panels. Panel (a): plan view with the cross-section line A-B; axes x (km) and y (km), 0 to 1000. Panels (b), (c), and (d): vertical cross sections; axes x (km), 200 to 800, and z (km), 0 to 5. Panel titles: RAND, SGAU, EVOL.]

Fig. 6. (a) Correlation coefficient between total rain which fell in the box drawn over 3 hours to initial boundary-layer water vapor as found from method RAND. (b) same as (a), but in vertical cross-section along the line A-B at y = 436.5 km. (c) same as (b), but from the SGAU method. (d) same as (b), but from the EVOL method. Contour increments are 0.0125 for (a), 0.01 for (b) and 0.05 for (c) and (d). (a) and (b) have been smoothed by two passes through a 9-point smoother. The 10 g kg-1 isopleth of 10-meter moisture has been drawn in each figure for reference.

[Figure 7: three panels; axes x (km) and y (km), 0 to 1000. Panel labels: (a) RAND, MAX = 0.42; (b) SGAU, MAX = 1.03; (c) EVOL, MAX = 5.42.]

Fig. 7. Dimensional sensitivity of the boundary layer moisture in the small box indicated in each figure at the 3-hr forecast time to the initial field of boundary-layer moisture perturbations. From three perturbation methods: (a) random, (b) smoothed Gaussian, and (c) evolved. Contour increments are, respectively, 0.1, 0.2, and 1.0 g kg-1 per g kg-1 for (a), (b), and (c). The local maxima southeast of the response function box are: 0.42, 1.0, and 5.4 for (a), (b), and (c). Fields for all subfigures have been smoothed by two passes through a nine-point smoother.

[Figure 8: two panels, (a) and (b); axes x (km) and y (km), 0 to 1000.]

Fig. 8. Dimensional sensitivity of 10-m moisture in small black box in each figure to initial field of boundary layer moisture from (a) random initial perturbations at each grid point, and (b) a very large ensemble of discrete perturbations. Contour increment was 0.0025 for (a) and (b). Maximum in (a) is 0.017 and for (b) is 0.014.