reduced models of atmospheric low-frequency variability: parameter estimation and comparative...

22
Physica D 239 (2010) 145–166 Contents lists available at ScienceDirect Physica D journal homepage: www.elsevier.com/locate/physd Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance K. Strounine a , S. Kravtsov b,*,1 , D. Kondrashov a , M. Ghil a,2 a Department of Atmospheric and Oceanic Sciences and Institute of Geophysics and Planetary Physics, University of California-Los Angeles, Los Angeles, CA 90095, United States b Department of Mathematical Sciences, University of Wisconsin-Milwaukee, P.O. Box 413, Milwaukee, WI 53201, United States article info Article history: Received 4 November 2008 Received in revised form 21 August 2009 Accepted 19 October 2009 Available online 29 October 2009 Communicated by H.A. Dijkstra Keywords: Coarse-graining Climate dynamics Mesoscopic models Stochastic-dynamic models abstract Low-frequency variability (LFV) of the atmosphere refers to its behavior on time scales of 10–100 days, longer than the life cycle of a mid-latitude cyclone but shorter than a season. This behavior is still poorly understood and hard to predict. The present study compares various model reduction strategies that help in deriving simplified models of LFV. Three distinct strategies are applied here to reduce a fairly realistic, high-dimensional, quasi- geostrophic, 3-level (QG3) atmospheric model to lower dimensions: (i) an empirical–dynamical method, which retains only a few components in the projection of the full QG3 model equations onto a specified basis, and finds the linear deterministic and the stochastic corrections empirically as in Selten (1995) [5]; (ii) a purely dynamics-based technique, employing the stochastic mode reduction strategy of Majda et al. (2001) [62]; and (iii) a purely empirical, multi-level regression procedure, which specifies the functional form of the reduced model and finds the model coefficients by multiple polynomial regression as in Kravtsov et al. (2005) [3]. The empirical–dynamical and dynamical reduced models were further improved by sequential parameter estimation and benchmarked against multi-level regression models; the extended Kalman filter was used for the parameter estimation. Overall, the reduced models perform better when more statistical information is used in the model construction. Thus, the purely empirical stochastic models with quadratic nonlinearity and additive noise reproduce very well the linear properties of the full QG3 model’s LFV, i.e. its autocorrelations and spectra, as well as the nonlinear properties, i.e. the persistent flow regimes that induce non-Gaussian features in the model’s probability density function. The empirical–dynamical models capture the basic statistical properties of the full model’s LFV, such as the variance and integral correlation time scales of the leading LFV modes, as well as some of the regime behavior features, but fail to reproduce the detailed structure of autocorrelations and distort the statistics of the regimes. Dynamical models that use data assimilation corrections do capture the linear statistics to a degree comparable with that of empirical–dynamical models, but do much less well on the full QG3 model’s nonlinear dynamics. These results are discussed in terms of their implications for a better understanding and prediction of LFV. © 2009 Elsevier B.V. All rights reserved. 1. Introduction We study low-frequency variability [LFV] in the extratropical atmosphere by constructing and analyzing dynamically consistent low-order models thereof. For this purpose, we make use of a rel- atively high-resolution, three-layer quasi-geostrophic model [1] * Corresponding author. Tel.: +1 414 229 4863; fax: +1 414 229 4907. E-mail address: [email protected] (S. Kravtsov). 1 Additional affiliation: Department of Atmospheric and Oceanic Sciences and Institute of Geophysics and Planetary Physics, University of California-Los Angeles, Los Angeles, CA 90095, United States. 2 Additional affiliation: Geosciences Department and Laboratoire de Météorolo- gie Dynamique (CNRS and IPSL), Ecole Normale Supérieure, Paris. known to produce realistic low-frequency behavior. We consider three approaches to constructing our low-order models. In the purely empirical approach, we specify the mathematical form of the model and find the model parameters based solely on the out- put from the full-model simulation [2,3]. In the semi-empirical ap- proach we use the bare truncation of the full model, with additive and multiplicative noise terms, along with a simple determinis- tic parameterization of the unresolved noise dynamics [4,5]. In the purely dynamical approach, we apply the stochastic mode reduc- tion methodology of [6] [hereafter MTV]. The latter two classes of models are further enhanced by applying Kalman filter methodol- ogy to estimate the reduced-model parameters. A comprehensive analysis of such reduced models may improve our understanding of potentially predictable LFV modes. 0167-2789/$ – see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.physd.2009.10.013

Upload: k-strounine

Post on 26-Jun-2016

213 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

Physica D 239 (2010) 145–166

Contents lists available at ScienceDirect

Physica D

journal homepage: www.elsevier.com/locate/physd

Reduced models of atmospheric low-frequency variability: Parameter estimationand comparative performance

K. Strounine a, S. Kravtsov b,∗,1, D. Kondrashov a, M. Ghil a,2a Department of Atmospheric and Oceanic Sciences and Institute of Geophysics and Planetary Physics, University of California-Los Angeles, Los Angeles, CA 90095, United Statesb Department of Mathematical Sciences, University of Wisconsin-Milwaukee, P.O. Box 413, Milwaukee, WI 53201, United States

a r t i c l e i n f o

Article history:Received 4 November 2008Received in revised form21 August 2009Accepted 19 October 2009Available online 29 October 2009Communicated by H.A. Dijkstra

Keywords:Coarse-grainingClimate dynamicsMesoscopic modelsStochastic-dynamic models

a b s t r a c t

Low-frequency variability (LFV) of the atmosphere refers to its behavior on time scales of 10–100 days,longer than the life cycle of a mid-latitude cyclone but shorter than a season. This behavior is still poorlyunderstood and hard to predict. The present study compares variousmodel reduction strategies that helpin deriving simplified models of LFV.Three distinct strategies are applied here to reduce a fairly realistic, high-dimensional, quasi-

geostrophic, 3-level (QG3) atmospheric model to lower dimensions: (i) an empirical–dynamical method,which retains only a few components in the projection of the full QG3 model equations onto a specifiedbasis, and finds the linear deterministic and the stochastic corrections empirically as in Selten (1995)[5]; (ii) a purely dynamics-based technique, employing the stochastic mode reduction strategy of Majdaet al. (2001) [62]; and (iii) a purely empirical, multi-level regression procedure, which specifies thefunctional form of the reduced model and finds the model coefficients by multiple polynomial regressionas in Kravtsov et al. (2005) [3]. The empirical–dynamical and dynamical reduced models were furtherimproved by sequential parameter estimation and benchmarked against multi-level regression models;the extended Kalman filter was used for the parameter estimation.Overall, the reduced models perform better when more statistical information is used in the model

construction. Thus, the purely empirical stochastic models with quadratic nonlinearity and additive noisereproduce very well the linear properties of the full QG3model’s LFV, i.e. its autocorrelations and spectra,as well as the nonlinear properties, i.e. the persistent flow regimes that induce non-Gaussian features inthe model’s probability density function. The empirical–dynamical models capture the basic statisticalproperties of the full model’s LFV, such as the variance and integral correlation time scales of the leadingLFVmodes, as well as some of the regime behavior features, but fail to reproduce the detailed structure ofautocorrelations and distort the statistics of the regimes. Dynamical models that use data assimilationcorrections do capture the linear statistics to a degree comparable with that of empirical–dynamicalmodels, but do much less well on the full QG3model’s nonlinear dynamics. These results are discussed interms of their implications for a better understanding and prediction of LFV.

© 2009 Elsevier B.V. All rights reserved.

1. Introduction

We study low-frequency variability [LFV] in the extratropicalatmosphere by constructing and analyzing dynamically consistentlow-order models thereof. For this purpose, we make use of a rel-atively high-resolution, three-layer quasi-geostrophic model [1]

∗ Corresponding author. Tel.: +1 414 229 4863; fax: +1 414 229 4907.E-mail address: [email protected] (S. Kravtsov).

1 Additional affiliation: Department of Atmospheric and Oceanic Sciences andInstitute of Geophysics and Planetary Physics, University of California-Los Angeles,Los Angeles, CA 90095, United States.2 Additional affiliation: Geosciences Department and Laboratoire de Météorolo-gie Dynamique (CNRS and IPSL), Ecole Normale Supérieure, Paris.

0167-2789/$ – see front matter© 2009 Elsevier B.V. All rights reserved.doi:10.1016/j.physd.2009.10.013

known to produce realistic low-frequency behavior. We considerthree approaches to constructing our low-order models. In thepurely empirical approach, we specify the mathematical form ofthe model and find the model parameters based solely on the out-put from the full-model simulation [2,3]. In the semi-empirical ap-proach we use the bare truncation of the full model, with additiveand multiplicative noise terms, along with a simple determinis-tic parameterization of the unresolved noise dynamics [4,5]. In thepurely dynamical approach, we apply the stochastic mode reduc-tion methodology of [6] [hereafter MTV]. The latter two classes ofmodels are further enhanced by applying Kalman filter methodol-ogy to estimate the reduced-model parameters. A comprehensiveanalysis of such reduced models may improve our understandingof potentially predictable LFV modes.

Page 2: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

146 K. Strounine et al. / Physica D 239 (2010) 145–166

1.1. Extratropical low-frequency variability (LFV)

Extratropical atmospheric behavior is characterized by variabil-ity on a wide range of spatial and temporal scales. Major weatherphenomena are associatedwith fast baroclinicwaveswhose break-ing forms synoptic eddies with time scales on the order of aweek or shorter. Atmospheric LFV is predominantly equivalentbarotropic [7] and has time scales from ten days to severalmonths;it has been shown to possess both stationary and traveling wavepatterns with very low zonal wavenumber [8–11]. These patternshave been described in episodic, intermittent terms, via multipleregimes [12–14], as well as in quasi-periodic, oscillatory terms, viaintraseasonal oscillations [15,16].Most of the Northern Hemisphere’s mid-latitude LFV can be

described in terms of a few recurrent and persistent teleconnectionpatterns, such as the North Atlantic Oscillation [NAO [17]], theArctic Oscillation [AO], possibly related to the NAO [18,19], andPacific-North American [PNA] teleconnection pattern [20]. Thesepatterns exhibit either stationary or propagating characteristics, ora combination of both [21]. An important task of climate researchis to establish dynamical laws that govern the maintenance andevolution of major low-frequency patterns. This task is renderedmore difficult by the nonlinear nature of climate behavior: the low-frequency signals of interestmodify, and are in turn beingmodifiedby the interaction with higher-frequency transients, whose effectmust be understood and accounted for in any dynamical model oflow-frequency flow.The nonlinear interaction between storm tracks and low-

frequency patterns in the extra-tropics is a long-standing topic ofinvestigation. Synoptic eddies can be shown to maintain LFV pat-terns andmay, in turn, be modified by LFV [22–31]. A feedback be-tween the twomight involvemodifications of the storm tracks dueto LFV, which in turn amplifies the LFV pattern, leading to a self-maintained structure. Synoptic-eddy feedback is at the heart of theoriginal concept ofweather regimes [32]; the key idea is that inter-action between the synoptic eddies and low-frequency flow fea-tures results in the occurrence of flow patterns that persist longerthan the lifetime of an individual synoptic eddy.Another possibility is that LFV is a separate dynamical entity

that involves nonlinear, self-sustained low-frequency oscillationscoexisting with multiple quasi-stationary states [33–36]. Thispossibility has been explored in the framework of dynamicalsystems theory and shown to give rise to persistent planetary flowregimes [37], Ch.6; [36].Our goal here is to construct an explicit, reduced model of

Northern Hemisphere LFV that resolves behavior associated witheither weather regimes or planetary flow regimes or both, whiletreating synoptic eddies as stochastic forcing.

1.2. Reduced models

Purely empirical approach. Purely empirical, inverse stochas-tic models rely almost entirely on the data set’s informationcontent, while making only minimal assumptions about the un-derlying dynamics. The simplest type of inverse stochasticmodel isthe so-called linear inversemodel [LIM: [38–40]]. In constructing aLIM, it is assumed that the evolution of the system can be capturedby linear deterministic terms involving the resolved modes, plusstochastic terms in the form of a spatially correlated noise processthat iswhite in time; all the parameters that enter either determin-istic or stochastic terms are obtained here by regression fitting. Toreduce the number of parameters to be estimated, the climate statevector is usually represented in terms of a small number of lead-ing empirical orthogonal functions. LIMs have shown some successin predicting certain aspects of El Niño/Southern Oscillation phe-nomena [41,42], tropical Atlantic sea-surface temperature variabil-ity [43], as well as extratropical atmospheric variability [44].

Kravtsov et al. [3] considered generalizations of LIMs that useadditional statistical information to relax the assumptions of lin-earity and white noise. They applied their generalized methodto the analysis of Marshall and Molteni’s [1] three-layer quasi-geostrophic [QG3] model and to Northern Hemisphere geopo-tential height anomalies. For both QG3-generated and observedheight data, their optimized inverse model captured well the non-Gaussian structure of the probability density function [PDF], aswellas the detailed power spectra of the data. Kravtsov et al. [45] re-cently reviewed other applications of this approach.Despite the fact that such parametric empirical models can be

quite successful in reproducing key aspects of observed climatevariability, they suffer from a conceptual as well as a practicaldrawback: (i) the parametric form of the model is pre-specifiedin an ad hoc fashion; and (ii) when estimating a large numberof parameters from relatively small samples of data, estimationprocess can become ill-conditioned.

Semi-empirical approach. Let us assume for a moment that anentirely complete and correct dynamical model of LFV were avail-able, in the sense in which Maxwell’s equations govern the behav-ior of electromagnetic fields in a vacuum, and that the integrationof such a model were feasible for long enough and with sufficientspatial resolution. If so, one might be able to derive closed formsof the reduced model using a statistical–dynamical approach, bycombining statistical properties of themodeled data and the equa-tions that govern the flow dynamics. For example, the full govern-ing equations can be linearized about the numerically computedclimatologicalmean state,while the effects of high-frequency tran-sients are accounted for in the form of flow-independent, spatiallycorrelated noise that is white in time [46,47].One may also add information about climate variability to the

knowledge about the time-mean climate. Oneway of doing so is byprojecting the full-model equations onto an appropriate subset ofbasis functions representing the system’s low-frequency behavior.Typical choices of this subset are the leading empirical orthogonalfunctions [EOFs: [4,48,49,5]] or, alternatively, principal interactionpatterns [PIPs: [50–53]]. The nonlinear reduced dynamics modelcan be obtained by rewriting the full system of equations in thetruncated EOF basis [54,55,49,5,56,57], while treating the residualforcing as random. Alternatively, one can develop a deterministic,flow-dependent parameterization of unresolved processes, basedon the library of differences between the tendency of the full andtruncated models [58,59].Truncated EOF models of the former type, however, show

the so-called climate drift due to neglected interactions with theunresolved modes, i.e. long-term averages do not stabilize afterinitial transients to a statistical equilibrium but keep ‘drifting’slowly. Selten [5] and Achatz and Branstator [4] parameterizedthese neglected interactions by linear damping, whose strength isdetermined empirically. A more powerful tool to extract the slowdynamical features of the system under consideration is based onusing PIPs as a basis for truncation [51–53]. The calculation of PIPstakes into account the dynamics of the full model for which onetries to find an optimal basis. This procedure also involves an adhoc closure through linear damping and certain assumptions aboutnonlinear interactions.Crommelin and Majda [60] compared different optimal bases.

They found thatmodels based on PIPs are superior tomodels basedon EOFs and the so-called optimal persistence patterns of Del-Sole [61], which are related to rotation of EOF bases to concentrateon the lowest-frequency component of variability. On the otherhand, they also pointed out that the determination of PIPs may besensitive to certain features of the numerical algorithm, at least forsome low-order atmospheric dynamical systemswith regime tran-sitions.In summary, the properties of reduced dynamical models

are sensitive to the choice of the function basis in which this

Page 3: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

K. Strounine et al. / Physica D 239 (2010) 145–166 147

construction is carried out. Once the basis is chosen, the truncatedmodels typically exhibit climate drift and thus have to include adhoc parameterizations to model the effect of unresolved processeson the resolved modes.

MTV strategy. From the theoretical standpoint, a commonweakness of the two approaches to model reduction described sofar – whether data-driven, fully empirical or model-driven, semi-empirical – is that the interactions between the resolved and un-resolved modes are parameterized by statistical corrections to thedynamical operator that act directly on the resolvedmodes. The al-ternative approach of Majda and colleagues [6,62–64] has the ad-vantage of assuming that the self-interactions of the unresolvedmodes can be modeled by a stochastic term. The equations of mo-tion for the unresolved modes are then eliminated using standardprojection methods for stochastic differential equations [65–67].An attractive feature of this approach is that it predicts rather thanassumes the structure of the stochastic terms in the equations forthe resolved modes. Moreover, this procedure is mathematicallyrigorous in the limit of significant time scale separation betweenthe resolvedmodes and unresolvedmodes; onemay hope, further-more, that it still works even in the absence of such separation, asis often the case in applying mathematically rigorous methods tophysical problems in which the original assumptions are satisfiedonly partially or not at all.The stochastic mode reduction strategy was dubbed MTV by

the authors of the first four papers on this approach [6,62–64];it has been applied and tested on a variety of simplified modelsand examples, including some models that possess simple, uniqueequilibrium climates, in the sense of statistical wavemechanics [6,62], as well as some that exhibit periodic orbits or multiple modesand satisfy the truncated Burgers–Hopf equation [68,64]. The MTVstrategy worked fairly well in these examples even when there islittle separation of time scales between resolved and unresolvedmodes.Franzke et al. [69] have recently applied this approach to a

barotropic flow model on the sphere, with a few hundred spher-ical harmonics, while Franzke and Majda [70] have applied it tothe QG3 baroclinic model [1]. In the latter two examples, the EOFsof a longmodel integration were used as the basis for the reduced-model construction. The reduced models so obtained do capturesome important features of the full model’s behavior, such as ge-ographical distribution of streamfunction standard deviation andeddy forcing, as well as autocorrelation functions of the resolvedmodes. Still, there are significant quantitative differences betweenthe full and reduced model in both cases, including climate drift inthe reduced model.There are two possible reasons for these discrepancies. First,

the EOFs might indeed not be an optimal basis for carrying out theMTV procedure [60]. Second, it turns out that the EOF modes’ timescales exhibit no significant time scale separation. To overcome thelatter difficulty, one may introduce into the MTV formalism one ormore free parameters, including a time scale separation parameter,as well as scaling parameters that represent the magnitudes ofvarious terms in the full-model equations. Indeed, Franzke andMajda [70] have shown, by choosing an optimal parameter set in atrial-and-error fashion, that the reduced model’s behavior can beimproved relative to that of its uncorrected version.In summary, the MTV strategy predicts the structure of the re-

duced stochastic model equations and estimates all model param-eters with minimal regression fitting. The still limited success ofthis methodology when applied to the analysis of more realisticclimate models thus far may be due to (i) a suboptimal dynami-cal basis used for mode reduction; and (ii) the approximately redcharacter of the continuous component of the spectral density ofclimate fields. MTV-reduced climate models may potentially beimproved by introducing into them a few additional parametersin order to account for the latter property of relatively weak or no

time scale separation, and estimating these parameters optimallyby other procedures.

1.3. Parameter estimation using data assimilation

A novel aspect of our work is combining MTV methodology, aswell as the semi-empirical approach [4,5] with data assimilationmethods to estimate an optimal set of reduced-model parameters.Two major classes of advanced data assimilation methods aresequential estimation methods [71–73] and variational methodsbased on minimization of a cost functional [74–76]. The twoclasses of methods are equivalent for linear, finite-dimensionalproblems [77,78] and involve comparable computational expenseif the optimal four-dimensional space–time state of the system, aswell as the associated error distribution are both to be estimated.The variational methods, however, do not necessarily involve theestimation of error covariances, which is computationally quiteexpensive but central in constructing the sequential filters. Theformer are thusmore amenable to the quick assimilation of variousdata types into a high-resolution atmospheric, oceanic or coupledmodel, at the expense of lacking appropriate error bars.Major computational expense and convergence issues arise in

applying the Kalman [72] filter to nonlinear geophysical systems,due to the necessity of updating sequentially the forecast error co-variance matrix. Ghil [77] discusses various suboptimal, approx-imate schemes, which allow to speed up this computation andavoid filter divergence problems. A standard generalization of theKalman filter for use in nonlinear systems is the so-called extendedKalman filter [EKF], where the third and higher moments are ne-glected in integrating the error covariance equation [79–81]. Forlow-dimensional systems where computational cost is not high,the EKF is a perfectly valid choice for trying to estimate an opti-mal set of reduced-model parameters.

1.4. This study

In the present work, we build upon past progress in derivingan optimal low-order description of atmospheric LFV to introduceand test a novel model reduction methodology. This methodol-ogy should satisfy the following criteria: (i) be based on rigor-ous mathematical theories for systems that are characterized bya pronounced time scale separation between a few low-frequencymodes and the rest; (ii) contain a method for correcting biases inthe reduced-model coefficients when there is little or no time scaleseparation; and (iii) be practically feasible for realistic systems in-volving a large number of degrees of freedom. In particular, wewill utilize simulations of moderate duration [on the order of 5000days] using a fairly sophisticated, high-dimensional atmosphericmodel and Kalman filter estimation to help determine the param-eters of an MTV-reduced model. The reduced model obtained bythismodifiedMTV strategywill be comparedwith the results of thepurely empirical and semi-empirical approach for obtaining the re-duced models, by assessing the models’ relative performance.An important novel feature of our analysis is that the reduced-

model parameters will be estimated here objectively via Kalmanfiltering, rather than in an ad hoc fashion, as in previous studies.Using Kalman filtering for parameter estimation in reducedmodelsis an appealing approach to climatic data assimilation: first, the re-ducedmodel is low-dimensional, thus reducing the computationaleffort, and, second, if chosen correctly, it is likely to capture mostof the potentially predictable climate signals; keeping the higher-ordermodeswould just complicate the analysiswithout improvingforecast skill.The paper is organized as follows. Section 2 briefly describes

the high-dimensional model we will use, and introduces ourmethodology for obtaining the basis functions. Section 3 provides a

Page 4: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

148 K. Strounine et al. / Physica D 239 (2010) 145–166

description of the three types of reduced models we construct,including a brief account of Kalman filtering and of the MTV ap-proach; amore detailed description of the latter two topics appearsin Appendices A and B, respectively. Section 4 compares the perfor-mance of the three types of reduced models, using several quanti-tative criteria. A summary anddiscussion of the results is containedin Section 5.

2. The high-dimensional QG3 model

2.1. Model formulation

Ideally, one would like to use model reduction methodologiesto capture the most interesting and important features of a givensystem under study. We used here a spectral, three-level, quasi-geostrophic (QG3)model in spherical geometry developed byMar-shall andMolteni [1]. This model represents a reasonable trade-offbetween capturing themajor features of mid-latitude atmosphericLFV and allowing a fairly satisfactory study of the mechanisms andparameter dependence. It has been widely used, therefore, in the-oretical LFV studies, including several previous studies of modelreduction methodologies [70,2,3].The spherical harmonics expansion representing the horizontal

fields has a triangular truncation at total wavenumber 21, com-monly referred to as T21. The model is governed by prognosticequations for quasi-geostrophic potential vorticity (PV) at 200 hPa(level 1), 500 hPa (level 2), and 800 hPa (level 3):

q1 = −J(ψ1, q1 + f )− D1(ψ1, ψ2)+ S1,

q2 = −J(ψ2, q2 + f )− D1(ψ1, ψ2, ψ3)+ S2, (1)q3 = −J(ψ3, q3 + f )− J(ψ3, fh/H0)− D(ψ2, ψ3)+ S3;

the indices i = 1, 2, 3 refer to the pressure level, f = 2Ω sinφ isthe Coriolis parameter, h is the topographic height, andH0 is a scaleheight set to 9 km. Layer PVs are defined as

q1 = ∇2ψ1 − R−21 (ψ1 − ψ2),

q2 = ∇2ψ2 + R−21 (ψ1 − ψ2)− R−22 (ψ2 − ψ3), (2)

q3 = ∇2ψ3 + R−22 (ψ2 − ψ3).

Here,∇ is the gradient operator and∇2 is the Laplace–Beltramioperator on the sphere, while R1 = 700 km and R2 = 450 km areRossby radii of deformation appropriate for the 200–500 hPa andthe 500–800 hPa layers, respectively. The linear operators D1, D2,D3 represent the effects of Newtonian relaxation of temperature,linear drag on the 800 hPa wind, and horizontal diffusion ofvorticity and temperature.The terms S1, S2, and S3 are time-independent but spatially

varying PV sources. In order to estimate these sources, we requirethat the time-mean PV tendencies computed by substituting theobserved atmospheric streamfunction fields into (1) vanish onaverage. Let the overbar represent the time mean, and the primethe deviation from it; in meteorology and climate studies, thisdeviation is also called the anomaly. We thus require that

q = −J(ψ, q)− J(ψ ′, q′)− D(ψ)+ S = 0. (3)

Eq. (3) was used to calculate the first-guess forcing S0 =

J(ψ, q) + J(ψ ′, q′) + D(ψ). This first-guess forcing was theniteratively corrected by making twenty runs, each 2000-day-long,and calculating the updated forcing Sn for each new iteration by

Sn = Sn−1 +1τ(qobs − qn−1); (4)

here qobs is the observational mean, qn the time mean of the nthrun, and τ = 50 days. This procedure ensures that the observedand simulated time-mean states are approximately the same.

In the above model tuning procedure, we used averaged dailywind data (u, v) from the National Centers for Environmental Pre-diction/National Center for Atmospheric Research [NCEP/NCAR]Reanalysis [82] to obtain the time series of the streamfunctionfieldsψ by inverting the Poisson equation∇2ψ = ξ on the sphere,where ξ is the relative vorticity. We analyzed the ψ-data sub-set for Northern Hemisphere winter, defined as an interval of 90days starting on 1 December of each year, from December 1948 toFebruary 2005; the daily time series thus contained 57 × 90 =5130 days. The analysis was carried out on a 2.5 × 2.5 lati-tude–longitude grid, using 4 pressure levels: 850, 700, 500, and200 hPa; the 800 hPa streamfunction ψ800 was obtained by av-eraging the streamfunction at the 700 hPa and 850 hPa levels asψ800 = (2ψ850 + ψ700)/3.First, the seasonal cycle was removed from the data. This

cycle was estimated by applying a 5-day running mean to dailystreamfunction data and averaging over all years. Next, in orderto concentrate on the mid-latitude dynamics, rather than on thebehavior associated with tropical–extratropical teleconnections,the El Niño/Southern Oscillation signal was linearly removed fromall fields as in [83]. This signal was found by regressing thestreamfunction field onto centered and normalized time series ofthe Niño-3 index, that is the sea-surface temperature anomalyaveraged over the box (5 S–5 N, 150–90 W). The time seriesof the daily Niño-3 index was obtained by applying a cubic splineto its monthly time series. We then utilized the data sets of dailyanomaliesψ ′ so obtained, alongwith the observed streamfunctionclimatology ψ , to tune our dynamical model.The dynamical model so tuned does a fairly decent job in simu-

lating the observed Northern Hemisphere’s time-mean climate, aswell as the spatial pattern of the variance distribution (not shown);see also similar results in [58,59,84,2,1]. This high-dimensionalQG3 model will be subjected in Section 4 to various systematicmode reduction methods, to be described in Section 3.

2.2. Principal component (PC) analysis

We use principal component (PC) analysis to define the basisupon which wewill project our model equations. PC analysis com-putes the spatial patterns called empirical orthogonal functions[EOFs]; the time dependence associated with each EOF is givenby the corresponding PC. In engineering fluid dynamics and tur-bulence literature, this statistical methodology is known as properorthogonal decomposition [85].Themodel state vector9(t), represented here at each time t by

the J = 693 spectral coefficients of the streamfunction field in theNorthern Hemisphere, can thus be written in a unique way as

9(t) =J∑j=1

xj(t)ej. (5)

Each PC xj(t) is given by the projection of 9(t) onto the corre-sponding EOF ej:

xj(t) = 〈9(t), ej〉p, (6)

where the symbol 〈, 〉p denotes a particular kind of inner product,and the ordering 1 ≤ j ≤ J is by the variance captured by eachEOF–PC pair.We compute the EOFs in spectral space rather then in physical

space [49,5]. The elements of our state vector will thus be repre-sented by the real and imaginary parts of the coefficients in thespectral expansion of the streamfunction or PV into spherical har-monics on three levels.To calculate the EOFs, a particular inner product has to be

chosen. The most widely used choice is associated with the L2norm of the streamfunction field, ‖ψ‖20 =

∫ψ2dA, where

Page 5: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

K. Strounine et al. / Physica D 239 (2010) 145–166 149

Fig. 1. QG3-simulated 500 hPa streamfunction EOFs, using the energy norm. The EOF index j = 1, 2, 3, or 4 is shown in the panel caption, while contour intervals, in m2 s−1 ,are listed in each panel’s legend. Negative contours are light, and zero contours are omitted.

dA is the infinitesimal area element on the sphere. This choice,however, is not suitable for our purposes, because the truncatedsystem written in the streamfunction norm EOF basis would notconserve any quadratic invariants of the unforced equations in theabsence of dissipation, unless additional approximations are madeto project model fields into the subspace of a few leading EOFs[55,49,5]. Franzke andMajda [70] proposed to use the total-energynorm instead of the streamfunction norm; the spectral form of theenergy-norm representation was derived for the QG3 model byEhrendorfer [86]. This choice ensures that the projected equations,in the absence of forcing and dissipation, will conserve the totalenergy at any truncation.We use this energy norm and the associated inner product, as

well as the potential-enstrophy norm in Eq. (6). The latter normis the L2 norm of the PV, ‖ψ‖22 = ‖q‖

20 =

∫q2dA; it is thus

associated with the inner product 〈ψ,ψ〉2 = 〈q, q〉0. Projectingthe equations onto such a basis will conserve potential enstrophyat any truncation. Vorticity-based norms have been argued to bepoorly suited for construction of reduced models in the barotropiccase [5], since theymight tend to emphasize smaller-scale featuresrather than the low-frequency, large-scale patterns of interest, butthis turned out not to be the case in the present baroclinic model.The energy EOFs of the QG3 model are shown in Fig. 1. These

EOFs, in fact, resemble closely streamfunction norm and potential-enstrophy-norm EOFs [see [87], not shown here]; some of themexhibit features similar to thewell-known teleconnection patternsthat emerge from observational data sets [18,17,19]: an AO [EOF-1] that is zonally symmetric by-and-large, as well as combinationsof NAO and PNA [the Atlantic portion of EOF-2 and EOFs 4–7(5–7 not shown)]. Other EOFs, such as EOFs 2–3 and 8–9 [thelatter not shown] represent higher-wavenumber wavetrains [87]that are associated with synoptic eddies; note again that EOF-2 is a mixture of an NAO-type pattern over the Atlantic with a

wavetrain over the Pacific. All of the results reported below arerobust with respect to the choice of norm used for computingEOFs; i.e. the absolute and relative performance of the reducedmodels constructed in the streamfunction, energy and potential-enstrophy bases is qualitatively and quantitatively similar [87]. Forthis reason, we will only show here the results obtained for thereducedmodels constructed in the EOF basis derived in the energynorm.To summarize, in this work we use EOF bases compatible with

the total-energy norm and the potential-enstrophy norm, becauseof their conservation properties. The leading modes for either ba-sis are similar, and analogous to the streamfunction norm EOFs; asubset of these modes is associated with well-known teleconnec-tion patterns: AO, NAO, and PNA. Note that the streamfunction, en-ergy and potential-enstrophy norms are essentially equivalent tothe Sobolev norms H0, H1 and H2 for the solutions of the governingequations (1) [88]. It turns out that the performance of the reducedmodels constructed in these two bases is also very similar. Below,we illustrate our analyses by showing the results using the energynorm; the qualitative results in the other basis are analogous [87].

3. Reduced models

3.1. Empirical models

We use multiple polynomial regression to construct reducedmodels in the EOF bases discussed in Section 2.2. Previous work[89,2,3] showed that a quadratic model of the general form

dxi = (xTAix+ b(0)i x+ ci(0))dt + dri(0) 1 ≤ i ≤ I, (7)

where x = xi is the state vector of dimension I , is well suitedfor this task. The matrices Ai, the rows b

(0)i of the matrix B

(0) and

Page 6: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

150 K. Strounine et al. / Physica D 239 (2010) 145–166

the components ci(0) of the vector c(0), as well as the componentsri(0) of the residual forcing r(0), are determined by least-squares.If the inverse model (7) contains a large number of variables, thestatistical distribution of r(0) at a given instant is nearly Gaussian,according to the central limit theorem [90].We are interested, however, in obtaining a reduced model

with a fairly small number of variables, I J , and hence thestochastic forcing r(0) in Eq. (7) typically involves serial correlationsand might also depend on the modeled process x. We include,therefore, an additionalmodel level to express the time incrementsdr(0) as a linear function of an extended state vector [x, r(0)] ≡(xT, r(0)T)T, and estimate this level’s residual forcing r(1); the lineardependence is used at this second and at all subsequent levels sincethe non-Gaussian statistics of the data has already been capturedby the first, nonlinear level. Additional linear levels are added in thesameway, until the (L+1)-st level’s residual r(L+1) becomes whitein time, and its lag-0 correlation matrix converges to a constantmatrix:

dxi = (xTAix+ b(0)i x+ c(0)i ) dt+ r(0)i dt,

dri(0) = b(1)i [x, r(0)]dt + ri(1) dt,

dri(1) = b(2)i [x, r(0), r(1)]dt + ri(2) dt, (8)

· · ·

dri(L) = b(L)i [x, r(0), r(1), . . . , r(L)]dt + dri(L+1); 1 ≤ i ≤ I.

The convergence of this procedure is guaranteed since, with eachadditional level l ≥ 1, we are accounting for additional time-lag information, thereby squeezing out any time correlations fromthe residual forcing. The reduced model (8) effectively uses time-delayed information to improve the reconstruction of the fullmodel’s attractor, as proposed by Mañé [91] and Takens [92].In practice, we approximate the infinitesimal increments dxi,

dri(l) by finite differences

1xi = xk+1i − xki , 1ri(l) = ri(l),k+1 − ri(l),k, 1 ≤ l ≤ L, (9)

where k is the time index, while 1t is assumed to be equal tothe data set’s sampling interval; without loss of generality, weuse 1t = 1, in these nondimensional units. To test the residualnoise r(l) for whiteness, we computed lag-0 and lag-1 covariancematrices of 1r(l) at each level, until the lag-1 covariance vanishesand the spatial covariance does not change when adding morelevels. The covariance matrix of the last-level residual 1ri(L+1) isestimated directly from its multivariate time series; in subsequentintegrations of the inverse model, this forcing is approximated as aspatially correlated white noise. It turns out for our reducedmodelthat the third-level residual forcing is indeed white.A practical difficulty in constructing the empirical model (8) is

the large number of regression coefficients to be estimated using alimited amount of data; doing so, and still obtaining a statisticallystable model requires utilization of regularization techniques suchas principal component regression (PCR) or partial least-squares(PLS); see [3] for further details.This type of empirically reduced models performs very well in

capturing the essential statistics of the full model’s variability [2,3]. The downside of thesemodels, however, is that they are difficultto interpret in terms of the dynamical processes responsible for thesimulated behavior and thus, ultimately, for the observed behavior.Furthermore, the purely statistical procedure does not take intoaccount the system’s conservation properties, which can result inthe model parameters producing unstable climates. One possibleway of dealing with this problem is to constrain the regressioncoefficients of model (8) in such a way as to conserve energy [52];this approach is one topic of our ongoing research.

The highest-order explicit nonlinearities in the QG3 model’sgoverning Eq. (1) are quadratic. Still, the mode reduction method-ology of Majda et al. [6,62,63] yields cubic terms in the reduced-model equations as well. Franzke et al. [69] have shown that thesecubic terms introduce only slight modifications into the dynamicsof a barotropic reduced model, but Franzke and Majda [70] havefound them to be much more important in their reduced modelswhen trying to capture the QG3model’s dynamics. We have there-fore attempted to construct the empirical reducedmodels that useonly quadratic, as well as some that use cubic polynomials in themain model level.Our empirical model works equally well for both the H1 and

H2 bases of choice. We show here only the results produced byempirical models with quadratic and cubic nonlinearities at thehighest order, with respect to an energy-norm EOF basis. Figs. 2and 4 show just howwell the quadratic nonlinear regressionmodelcaptures the PDFs and lagged autocorrelation functions of thenine leading EOFs in this basis. In particular, the full and reducedmodel PDFs are almost indistinguishable: the main differences inthe autocorrelation function are still quite small and involve asomewhat shorter time scale of the reduced models EOF-1 and aspurious damped oscillationwith a period of around 20 days in thereduced models EOF-4.Fig. 3 shows the same results for the PDFs of the empiricalmodel

with the highest-order terms being cubic. The fact that Figs. 2 and4 here vs. Fig. 3 here and Fig. 3.4 in [87] are pairwise almost in-distinguishable from each other shows that the reduced modelwith quadratic nonlinearities only is a more parsimonious, and anequally accurate approximation of the full QG3 model as the onethat contains cubic nonlinearities as well. It thus turns out that thequadratic empiricalmodel [Figs. 2 and 4 here] performs just aswellas the cubic one [Fig. 3 here and Fig. 3.4 in [87]]. Using cubic nonlin-earities in empirical regressionmodeling does not seem, therefore,to be justified, and does introduce an additional computationalburden. Given this empirical evidence that the underlying system’sdynamics is dominated by quadratic nonlinearities, we will lateralso neglect cubic terms in our MTV-based reduced models.The excellent matches produced by the purely empirical re-

duced models above provide a benchmark low-order statisticalfit to the climate of the full QG3 model. Uniform goodness-of-fitby these reduced models is puzzling and raises questions as tothe dynamical meaning of these results. Unfortunately, the em-pirical models, by construction, are not designed to answer thistype of questions. Our next goal is to use dynamical and empiri-cal–dynamical approaches to construct reduced models similar inperformance to the empirical models, and thus gain more insightinto the dynamics of LFV, as simulated by the full model.

3.2. Empirical–dynamical models

We will construct semi-empirical, or empirical–dynamicalmodels by taking a bare truncation of the full QG3model, projectedon the basis of choice and adding linear and additive noise terms.The coefficients of the noise terms are estimated here by usingthe EKF approach. The construction of such empirical–dynamicalmodels is conducted in a truncated EOF basis of 10–20 modes.Truncation is necessary in order to reduce the number of regressioncoefficients and still achieve stable climates of the reduced model,while retaining the most important low-frequency climate modes.Note that because the coefficients of the nonlinear terms in the

reduced model are determined in a way that is consistent withthe QG3 model’s nonlinearity, the former model is guaranteedto produce a stable climate, which is not always the case in thepurely empirical approach of Section 3.1 [2,3]. This is so sincethe empirical–dynamical models nonlinear terms conserve energy

Page 7: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

K. Strounine et al. / Physica D 239 (2010) 145–166 151

Fig. 2. Individual probability density functions (PDFs) of the first 9 energy-norm EOFs of the ten-component reduced empirical model (light solid, red), compared to thoseof the full QG3 model (heavy solid, blue).

Fig. 3. Same as Fig. 2 but for empirical model with cubic nonlinearities.

at any truncation, provided one uses the energy-norm EOF basis,while the additional damping terms are negative definite.

The full model’s equations are projected on the truncated EOFsbasis. One can write the equations for time derivatives of the PCs

Page 8: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

152 K. Strounine et al. / Physica D 239 (2010) 145–166

Fig. 4. Individual autocorrelations of the first 9 energy-norm EOFs of the ten-component reduced empirical model (light solid, red), compared to those of the full QG3model(heavy solid, blue).

xj(t) given by Eq. (6) in the following symbolic form

x = Lx+ N(x)+ S, (10)

where x is a vector of PCs, L is a linear operator, N(x) is a nonlinearoperator, and S is the forcing. The dynamic equation for theretained PCs xj(t), j = 1, 2, . . . ,N , N being the number of theretained components, with N J = 693, has the form similarto (10), with additional forcing, linear damping, and additive noiseterms [note again that while in (10) the symbol x represents allvariables, in the Eq. (11), x refers to resolved variables only]:

dx = (Lx+ N(x)+ S+ f+ µx)dt + σdξ, (11)

where the operators L and N(x) and the vector S are obtained di-rectly from the equations of the QG3 model by projecting themonto the EOF basis of choice, while the additional forcing fdt ,damping µxdt and noise σdξ represent the parameterized inter-actions between the resolved and unresolved modes. Here f is anN-dimensional deterministic-forcing vector, µ is an N × N diago-nal matrix, σ is an N×N matrix, and ξ is a centered random vectorwith unit variance:

Eξ = 0, EξξT = 1.

The noise coefficients σ and drift coefficients µ are estimated us-ing parameter estimation methodology based on the EKF (see Ap-pendix A). We refer to the state estimator as the primary filterand to the parameter estimator as the secondary filter. The lin-earized, discrete-time dynamic operator9(k) used by the EKF (seeAppendix A) is given by

9(k) = I+(L+

∂N∂x+ µ

)dt, (12)

where I is a N × N identity matrix; see Eqs. (A.6) and (A.8), whilethe noise covariance matrix Q is given by

Q = σTσ. (13)

In Eq. (A.9) we followed Dee et al. [93] by introducing three pa-rameters – α1, α2, and α3 – to estimate the additive noise covari-ance Q: α1 multiplies the main diagonal of the covariance matrix,α2 multiplies the elements on its subdiagonal and superdiagonal,and α3 multiplies all elements of the covariance matrix that do notlie on one of these three diagonals. This choice of the noisemodel ismotivated by the structure ofQ, in which the three diagonals beingmultiplied by α1 and α2 contain the largest values. The first-guesscovariancematrixQ0 is obtained from the differences between thetendencies of the full and truncatedmodels; inmeteorological par-lance, the tendencies refer to the finite-difference formsof the statevector’s time derivatives, i.e., to the right-hand sides of Eqs. (1), (8)or (10), as the case may be. To accumulate a library of differencesbetween tendencies, we run the full and truncated models in par-allel for 5000 days, starting the truncated model every day fromthe state of the full model projected on the retained EOFs. Once wehave a library of these differences dZ in the retained EOFs, Eq. (11),we approximate Q by its sample counterpart,

Q0 = EdZdZT, (14)

where E stands for the expectation operator. In practice, dZ = 1Zand E is calculated as the sample average. The parameters α1, α2,and α3 are then estimated by the secondary filter starting fromtheir initial values α1(0) = α2(0) = α3(0) = 1.The parameters of the additional drift terms fj and µj, j =

1, 2, . . . ,N , in Eq. (11) are estimated by the primary filter.Unknown parameters, assumed constant in time, can be easily in-corporated into the scheme as additional state variables. The di-mension of the system increases by the number of parameters tobe estimated, which in our case is 2N . The dynamical equations forthese additional state variables are assumed to be

µ(k) = µ(k− 1),f(k) = f(k− 1).The matrices9(k), B and H used in the primary Kalman filter (A.6)

Page 9: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

K. Strounine et al. / Physica D 239 (2010) 145–166 153

have to bemodified according to the above assumptions, while theavailable observations are still the full model’s simulated time se-ries of the original state vector.The initial values of the parameters fj of the constant drift

term in Eq. (11), which represents additional forcing provided byunresolvedmodes, are obtained by taking themean of each columnof the time sequence dZ,

f0 = EdZ. (15)

The initial values of the parametersµj in Eq. (11), which dependlinearly on the state vector and represent additional damping of theresolved modes by unresolved ones, are taken to be zero.

3.3. Dynamical model

Stochastic mode reduction formalism. The models describedabove can all be written in the following symbolic form:

∂q∂t= F+ Lq+ B(q, q), (16)

where q denotes the state vector with the basic state subtracted.Basic state effects are included in the definitions of F and L. Theset of PV anomalies q is decomposed into orthogonal subsets qand q′, which are characterized by ‘‘slow’’ and ‘‘fast’’ time scales,respectively, following the MTV approach. Thus, q denotes theevolving ‘climate’ of the system, which changes slowly comparedto the ‘weather’ variables q′; the latter evolve more rapidly in timeand are not resolved in detail in the stochastic climate model. Thedecomposition q = q+q′ can be written in terms of the potential-energy-norm EOFs ei as

q =N∑i=1

xiei =R∑i=1

αiei +N∑

j=R+1

βjej, (17)

where R is the number of resolved modes, xi denote the PCs in theenergy-norm EOF basis, while αi and βj are the expansion coeffi-cients of the resolved and unresolved modes, respectively.By introducing the EOF expansion (17) into the original equa-

tions (16) and using the orthogonality of the EOFs, we get the fol-lowing two sets of equations for the resolved αi and unresolved βjmodes:

αi(t) = Hαi +∑j

Lααij αj(t)+1ε

∑j

Lαβij βj(t)

+

∑jk

Bαααijk αj(t)αk(t)+2ε

∑jk

Bααβijk αj(t)βk(t)

+1ε

∑jk

Bαββijk βj(t)βk(t), (18)

βi(t) = Hβ

i +1ε

∑j

Lβαij αj(t)+1ε

∑j

Lββij βj(t)

+1ε

∑jk

Bβααijk αj(t)αk(t)+2ε

∑jk

Bβαβijk αj(t)βk(t)

+1ε2

∑jk

Bβββijk βj(t)βk(t) (19)

where the nonlinear operators have been symmetrized, i.e. Bijk =Bikj. The upper indicesα andβ indicate the respective subsets of thefull operators in (16), so that the notation Lαβij , for example, meansthat we use index ranges 1 ≤ i ≤ R and R+ 1 ≤ j ≤ N .In the equations above, following themodifiedMTVapproach of

[69,70,63], ε is a small positive parameter which controls the timescale separation between slow and fast modes; it measures the

ratio of the integral correlation time of the slowest of unresolvedmodes q′j∗ to the fastest of resolved modes qi∗ , where i

∗ and j∗are not necessarily equal to R and R + 1, respectively. We havealso assumed that the forcing terms Hα , Hβ , as well as the slowlinear and nonlinear operators Lαα and Bααα act on a time scaleof order one, while all other operators act on a shorter time scaleof ε−1, except for the one representing the self-interaction Bβββof the fast modes, which acts on an even shorter time scale on theorder of ε−2. Furthermore, it is assumed that the dynamical systemgoverning the unresolved modes only is ergodic and mixing andcan be represented by a stochastic process.With these assumptions, the stochastic mode reduction proce-

dure [69] predicts the following effective stochastic equations forthe resolved variables alone:

dαi(t) = Hαi +∑j

Lααij αj(t)dt +∑ij

Bαααijk αj(t)αk(t)dt

+ Hidt +∑j

Lijαj(t)dt

+

∑jk

Bijkαj(t)αk(t)dt +∑jkl

Mijklαj(t)αk(t)αl(t)dt

+√2∑j

σ(1)ij (α(t))dW

(1)j +

√2∑j

σ(2)ij dW

(2)j , (20)

whose deterministic part contains, in addition to the originalterms, constant, linear, and quadratic corrections to these terms,as well as an additional cubic term. The notation W(1)

=

(W (1)1 , . . . ,W (1)

m ) and W(2)= (W (2)

1 , . . . ,W (2)m ) denotes two in-

dependentm-dimensional Wiener processes, while the diagnosticequations for the covariancematrices of the noise are given by two‘‘detailed balance’’ relations [65] of Riccati type:

Q (1)ij +∑k

Uijkαk(t)+∑k

∑l

Vijklαk(t)αl(t)

=

∑k

σ(1)ik (α(t))σ

(1)jk (α(t)), (21)

Q (2)ij =∑k

σ(2)ik σ

(2)jk . (22)

In the above formulas, all constant tensors U and V are computedbased on the lagged-covariance statistics of the unresolved modes(see Appendix B).The system (20) consists of the bare truncation, deterministic

correction terms, as well as additive andmultiplicative noises. Thecorrection terms and noises are predicted by the MTV stochasticmode reduction approach and account for the interaction betweenresolved and unresolved modes, as well as the self-interaction be-tween the unresolved modes; the only inputs from the unresolvedmodes are their variances and autocorrelation times.In order to study the relative importance of various terms in

(20), Franzke and Majda [70] grouped them according to theirphysical origin, and placed a control parameter λp in front of thepth group of terms. After regrouping, the stochastic model (20) canbe written as:

dαi = λB(Hαi +∑j

Lααij αj(t)dt +∑jk

Bαααijk αj(t)αk(t)dt)

+ λ2A

∑j

L(2)ij αj(t)dt + λA√2∑j

σ(2)ij dW

(2)j

+ λ2M

(∑j

L(3)ij αj(t)dt +∑jkl

Mijklαj(t)αk(t)αl(t)dt

)

+ λ2L

(∑j

L(1)ij αj(t)dt

)

Page 10: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

154 K. Strounine et al. / Physica D 239 (2010) 145–166

+ λMλL

(H(1)j dt +

∑jk

Bijkαj(t)αk(t)dt

)+ λAλF H

(2)j dt +

√2∑j

σ(1)ij (α(t))dW

(1)j . (23)

The noise covariance matrix Q(1) now satisfies the parameter-dependent Riccati equation

λ2LQ(1)ij + λLλM

∑k

Uijkαk(t)+ λ2M∑k

∑l

Vijklαk(t)αl(t)

=

∑k

σ(1)ik (α(t))σ

(1)jk (α(t)). (24)

Franzke and Majda [70] showed that the performance of theirreduced model can be improved by fine-tuning the coefficientsλp in (23) and (24). The dynamical motivation for such a tuningcomes from the fact that the observed or model generated vari-ability associated with LFV and other climate-and-weather relatedphenomena does not exhibit a significant time scale separation;hence the parameter ε in (18) is actually of order of one, ratherthan being arbitrarily small, as assumed in the derivation of the re-duced model (20). In fact, there is no natural way of choosing thenumber R of resolved modes and, when choosing R arbitrarily, thescaling assumptions in (18) do not necessarily hold. Formally, allof this means that the actual mode reduction problem has a num-ber of hidden free parameters. Following the suggestions of Refer-ences [69,70], we will choose these parameters to be a set of λ’sthat enter Eqs. (23) and (24).

MTV methodology and data assimilation. Franzke and Majda[70] fine-tuned their reduced model’s coefficients by trial-and-error and managed to moderately improve the model’s perfor-mance. These results underscore the need for a more rigorousprocedure to improve the estimates of reduced-model parame-ters. In addition, while the mathematical expressions for the MTVmodel coefficients above are known, an accurate numerical com-putation of these integrals requires very long and frequently sam-pled libraries of the model’s evolution: for example, Franzke andMajda [70] used a 1000000-day model simulation, sampled witha 1/2-day interval. Thismakes the data requirements for a success-ful application of theMTV strategymuchmore stringent than thoseof the purely empirical or empirical–dynamicalmodel discussed inSections 3.1 and 3.2, respectively. It is desirable, therefore, to beable to use estimates of the correlation integrals for the MTV coef-ficients that require fewer data, and then correct for possible errorsintroduced thereby. To address both of these issues, we apply se-quential estimation via Kalman filtering.The reduced model (23), (24) has several features that render

the EKF attractive for estimating its parameters: (i) the model’slow-dimensionality; (ii) a specified law for the error covariancematrix evolution (24), which guarantees the EKF’s stability; and(iii) a relatively small number of parameters that need to be es-timated.The time series of the resolvedmodes that have been generated

by the full dynamical model, like QG3, or obtained from observa-tions, as in [3], are assimilated into our low-order model equationsto optimally estimate the set of five parameters λA, λB, λF , λL, andλM . The observational record can be made arbitrarily long for theQG3-generated data, but we used 5000-day-long simulated obser-vations tomake it comparablewith the observed atmospheric datathat were available to tune the QG3 model itself [5130 days; [82]].This data set length makes estimating the five parameters of inter-est readily feasible given the actual observed atmospheric data.The five unknown parameters enter our equations in quadratic

combinations; for example, the terms of the form λMλL. Further-more, these parameters enter into both the ‘‘deterministic’’ andthe ‘‘stochastic’’ part of our model. In order to make the system

(23), (24) suitable for our parameter estimation methodology, we‘‘break’’ the connection between the parameters and estimate alarger number of parameters that would enter the equations lin-early. We end up with 10 independent parameters λp, 1 ≤ p ≤ 10,defined as follows: λ1 = λB, λ2 = λ2A, λ3 = λA, λ4 = λ

2M , λ5 = λ

2L ,

λ6 = λMλL, λ7 = λAλF , λ8 = λ2L , λ9 = λLλM , and λ10 = λ2M ; the

parameters λ1 through λ7 enter Eq. (23), while λ8, λ9 and λ10 en-ter into Eq. (24). We also disregard the cubic nonlinearities, sincethe purely empirical models perform equally well with and with-out cubic nonlinear terms (see Section 3.1). Given the above as-sumptions, the effective equations for the resolved variables canbe rewritten in the following form:

dαi = λ1

(Hαi +

∑j

Lααij αj(t)dt +∑jk

Bαααijk αj(t)αk(t)dt

)+ λ2

∑j

L(2)ij αj(t)dt + λ3√2∑j

σ(2)ij dW

(2)j

+ λ4∑j

L(3)ij αj(t)dt + λ5

(∑j

L(1)ij αj(t)dt

)

+ λ6

(H(1)j dt +

∑jk

Bijkαj(t)αk(t)dt

)+ λ7H

(2)j dt +

√2∑j

σ(1)ij (α(t))dW

(1)j . (25)

The noise covariance matrix is given by

λ8Q(1)ij + λ9

∑k

Uijkαk(t)+ λ10∑k

∑l

Vijklαk(t)αl(t)

=

∑k

σ(1)ik (α(t))σ

(1)jk (α(t)). (26)

The estimation of the ten parameters in Eqs. (25) and (26) pro-ceeds now according to the sequential estimation approach dis-cussed in Section 3.2: parameters that multiply the deterministicparts of the dynamic operator in Eq. (25) are treated as additionalstate variables and are estimated by the primary filter, while thethree parameters that multiply stochastic terms are estimated bythe secondary filter (see Appendix A).

4. Performance of the reduced models

4.1. Evaluation methodology

In this section we discuss the results of applying approachesdescribed in Sections 3.2 and 3.3, respectively, to the QG3 model.We use the purely data-based, empirical reduced model (EMR)described in Section 3.1 as a benchmark for the performance ofthe other reduced models. Construction of the reduced modelswill proceed in the bases of total-energy-norm and potential-enstrophy-norm EOFs. In comparing various reduced models, weconsider how well the statistical properties of the reduced mod-els’ leadingmodesmatch the corresponding properties of the samemodes in the full-model and EMR model statistics. The propertiesconsideredwill include one-dimensional PDFs and lagged autocor-relation functions of the leading modes. We will also consider theresults ofmulti-dimensional PDF estimation via Gaussianmixtures[94,84,95].All models in this section retain 10 resolved components. This

number has been chosen by trial-and-error, based on the perfor-mance of the reduced models; it also appears to be optimal as atrade-off between model performance and the computational costinvolved in constructing the empirical–dynamical and MTV mod-els. Note that our benchmark EMRmodel of Section 3.1 has only 10

Page 11: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

K. Strounine et al. / Physica D 239 (2010) 145–166 155

resolved components, although there are a total of 30 model vari-ables. The 20 additional variables correspond to the linear levelstwo and three of the EMR model: these variables merely parame-terize the main-level equations forcing in terms of spatially coher-ent, 10-dimensional red noise. We think therefore that it is fair tocompare reducedmodels that possess an equal number of resolvedcomponents.Note also that our choice of these 10 resolved components was

not primarily motivated by decomposition of the dynamics intofast and slow components. While such a decomposition is notof the essence in the empirical or the empirical–dynamical ap-proaches, it is intrinsic to the MTV approach. In particular, theabsence of time scale separation between the resolved and un-resolved modes may be one of the main reasons for a relativelymediocre performance of MTV models reported below.For all the implementations of sequential parameter estimation

cases, we used output from a 5000-day simulation with the fullQG3model; the data produced by this simulationwere assimilatedinto the reduced model once a day. The results using longer QG3simulations (not shown) are similar to those using the 5000-dayruns. These encouraging results confirm the practical applicabilityof our parameter estimation approach if used in conjunctionwith the observed data or data produced by a high-resolution,comprehensive model. To assess the performance of the reducedmodels, we ran them for 30000 days, starting from randomlychosen initial states, and then computed PDFs and autocorrelationfunctions of the resolved modes.

4.2. Empirical–dynamical model results

In this subsection we show results obtained by applying themodel reduction methodology described in Section 3.2 to the QG3model. We will compare the performance of the semi-empiricalmodels so obtained by evaluating the properties of the leading PCsproduced by these models. We will use total-energy-norm, as wellas potential-enstrophy-norm EOF bases, but show only the resultsfrom the models constructed in the former basis; the enstrophynorm basis results are quite similar [87].Figs. 5 and 6 show the results produced by the reduced model

with additive noise, damping and sources constructed followingthe procedure outlined in Section 3.2. The PDF of the first EOF[the AO mode] in Fig. 5 is much better captured by the reducedmodelwith additive terms than by the bare-truncationmodelwiththe same ten components [not shown]. There is essentially noclimate drift and variances are reproduced well for most of theEOFs. The main discrepancies between the reduced model (lightsolid, red) and the full QG3 model (heavy solid, blue) include aslightly smaller variance,misplacement of the position of themode(i.e., the maximum of the PDF), a slight drift in EOF-1, a slight driftin EOF-5, and a smaller variance of EOF-7.The results of semi-empirical modeling are thus somewhat en-

couraging in reproducing the full model’s PDF structure. However,this ten-component model with additive terms and linear damp-ing performed considerably worse than the purely statistical EMRmodel in termsof these one-dimensional PDFs (comparewith Fig. 2for EMR), as well as in terms of the PCs’ autocorrelation functions(compare Figs. 4 and 6). In particular, the autocorrelation of firstEOF in the semi-empirical reduced model (Fig. 6) decays muchfaster than that of the full QG3model and also than that of the EMRmodel (Fig. 4). Pronounced spurious higher-frequency oscillationsare also present in the autocorrelation functions of EOFs 2, 3, 7,and 8 of the semi-empirical reduced model. Note, however, thatthe LFV EOFs representing the AO, NAO and PNA (EOFs 1, 4, and5, respectively) are less affected than the modes representing thehigh-wavenumber wavetrains (EOFs 2, 3, 8 and 9).

4.3. MTV model results

Here we describe the results of applying the MTV method-ology combined with the sequential parameter estimation viaKalman filtering to construct a reduced model. As mentioned inSection 3.3, [69] used 1000000-day simulations sampled every1/2 day to evaluate the correlation integrals used in their modelcoefficients (see also Appendix B here). In our work, we usedmuchshorter, 30 000-day simulations sampled daily. The parameter es-timation part of our methodology is designed to correct for modelimperfections introduced by the absence of the scale separationor other deviations from the complete MTV formalism. We con-structed reduced models with and without multiplicative noiseterms, either setting the coefficients λ9 and λ10 in Eq. (26) to zero,or estimating them by using Kalman filter methodology.All reduced models of this subsection, as in Section 4.2, retain

ten resolved components, as determined by the trade-off betweenmodel performance and the computational cost for estimatingits coefficients. For the sequential parameter estimation, we used5000-day-long runs to assimilate data produced by the full QG3model into the reduced model, once a day. We then ran the MTV-reduced models for 30000 days, starting from a randomly choseninitial state, to obtain PDFs and autocorrelation functions of thenine leading PCs. We also compared performance of the reducedmodels with and without multiplicative noises. We illustrate ourresults using the reduced model constructed in the energy-normEOF basis. The results for the potential-enstrophy basis are verysimilar [see [87]].

Model without multiplicative noise. As an initial step of themodel construction, we calculated the first-guess MTV coefficientsbased on a 30000-day QG3 simulation. Then, the model parame-ters were corrected using sequential estimation via Kalman filter-ing. Themultiplicative noise coefficientsλ9 andλ10 in Eq. (26)wereset to zero.Fig. 7 shows how the MTV procedure performs without the ad-

ditional, EKF-based parameter estimation. The variance capturedby EOF-1 is somewhat overestimated, while the variances associ-ated with EOFs 2, 3 and 8 are underestimated. EOFs 1, 3–6, and9 also exhibit a substantial climate drift (non-zero time mean ofthe corresponding PCs), while MTV-estimated PDF maxima (lightsolid, red) for EOFs 3–6 and 9 are displacedwith respect to the QG3model’s maxima (heavy solid, blue). The autocorrelation struc-ture of the full QG3 model is captured reasonably well (see Fig. 8)and does not exhibit spurious oscillations of the semi-empiricalmodel (Fig. 6); however, the correspondence between full-modelandMTVmodel autocorrelations ismuch less striking than that be-tween full and EMR models (Fig. 4).Fig. 9 shows the results based on the MTV model with the

EKF-based parameter estimation. The PDF structure of the EOFs2–9 is drastically improved compared to the MTV model withoutdata assimilation. There is still climate drift present in EOF-1, itsvariance is now underestimated, and its non-Gaussianity is notcaptured. Still, the PDFs of the resolved modes are reproducedbetter than in Fig. 7, showing substantial net improvement due tothe Kalman filtering correction of the model coefficients.The autocorrelation functions produced by the ten-component

MTV-reduced model with sequential parameter estimation areshown in Fig. 10. The autocorrelation of EOF-1 decays much toofast, EOFs 2 and 3 exhibit slight spurious oscillations in theirautocorrelations, while the decay of EOFs 4 and 5 is too slow; thesefeatures are similar to the behavior of theMTVmodel without dataassimilation (Fig. 8). Overall, the MTV model performance, afteran improved estimation of its parameters, is similar or slightlybetter compared to that of the empirical–dynamical model fromthe previous subsection.

Model including multiplicative noise.We also estimated themultiplicative noise coefficients λ9 and λ10 of Eq. (26) to see

Page 12: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

156 K. Strounine et al. / Physica D 239 (2010) 145–166

Fig. 5. Individual PDFs of the first 9 energy-norm EOFs of the ten-component semi-empirical model with linear deterministic and additive stochastic corrections (light solid,red) compared to those of the full QG3 model (heavy solid, blue).

Fig. 6. Individual autocorrelation functions of the first 9 energy-norm EOFs of the ten-component semi-empirical model with linear deterministic and additive stochasticcorrections (light solid, red) compared to those of the full QG3 model (heavy solid, blue).

if multiplicative noise plays an important role in the system’sdynamics. The result is shown in Fig. 11. The PDFs of the leading

modes look very similar to those simulated by the MTV modelwithout multiplicative noise (Fig. 9). This result, along with the

Page 13: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

K. Strounine et al. / Physica D 239 (2010) 145–166 157

Fig. 7. As in Fig. 5, but for the ten-component MTV model in an energy-norm basis (light solid, red) compared to those of the full QG3 model (heavy solid, blue).

Fig. 8. As in Fig. 6, but for the ten-component MTV model in an energy-norm basis (light solid, red) compared to those of the full QG3 model (heavy solid, blue).

fact that the pure empirical models of Section 3.1 perform verywell without multiplicative noise terms, shows that the impactof multiplicative noise on reduced-model behavior is not verysignificant and hence one can neglect it. The autocorrelation

functions are almost indistinguishable from those in Fig. 10 andare not shown.

Summary. The results reported in this section show thatboth semi-empirical [4,49,56] and MTV-type [69,70] reduced

Page 14: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

158 K. Strounine et al. / Physica D 239 (2010) 145–166

Fig. 9. As in Fig. 7, but with EKF-based parameter estimation and no multiplicative noise.

Fig. 10. As in Fig. 8, but for the ten-component MTV model in an energy-norm basis, with EKF-based parameter estimation (light solid, red) compared to those of the fullQG3 model (heavy solid, blue).

models perform reasonably well in reproducing key statisticalproperties of the data simulated by the full QG3 model. Kalman

filtering corrections of the first-guess MTV coefficients eliminatethe climate drifts inherent to MTV models in the absence of

Page 15: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

K. Strounine et al. / Physica D 239 (2010) 145–166 159

Fig. 11. Same as Fig. 9, but with multiplicative noise.

such corrections, and improve the match between the variancescaptured by the leading EOFs of the full and reduced models. Thelatter models, however, fail to reproduce non-Gaussian featuresof the full-model PDFs and greatly underestimate the time scaleof the leading, AO mode, while providing a fairly good match tothe temporal behavior of the other stationary modes, NAO andPNA. Finally, inclusion of multiplicative noise does not appear tohave any substantial influence on the performance of the semi-empirical and MTV-type reduced models, suggesting that theseeffects are secondary in affecting the dynamics of the slowestmodes of variability.When compared to the semi-empirical models of Section 3.2,

the MTV-type models have the advantage of not requiring anystochastic modeling on the level of the resolved components (seeSection 3.3). Overall, both these types of the reduced models re-produce the QG3 model’s statistics to a comparable degree, whilethe former models incur a much lower computational cost. Wewere not able to eliminate entirely, by our use of systematic pa-rameter estimation, errors introduced by truncation of the unre-solved modes. Compared to the purely empirical, EMR models ofSection 3.1 (Figs. 2–4), both semi-empirical and MTV-type mod-els (Sections 4.2 and 4.3, respectively) dynamical models failed toproduce goodness-of-fit to full QG3 statistics comparable to that ofthese EMR models.

4.4. Multi-dimensional PDF estimation

The preceding analyses were based on one-dimensional PDFestimation results. In order to study further how well the reducedmodels perform in capturing nonlinear features of low-frequencydynamics, we study behavior of these models in the phase spacespanned by its leading EOFs using PDF estimation via Gaussianmixtures [94,84,95,2]. We show the corresponding results for thereducedmodels constructed in an energy-normEOF basis only; theresults for the potential-enstrophy norm are very similar [87].

4.4.1. Gaussian mixture modelingGeneral. Gaussian mixtures [96] analysis corresponds to the

episodic description of LFV in terms of multiple flow regimes [97].The regimes correspond to the k clusters obtained by the Gaussianmixture modeling applied in the subspace of d leading EOFs.The optimal number k∗ of clusters is determined using a built-in criterion, based on cross-validated log-likelihood estimates.Each data point has a degree of membership in several clusters,depending on its position with respect to each cluster centroidand the weight of that cluster. Based on a combination of thesetwo criteria, one can associate each point with a unique clusterand obtain the composite pattern of anomalies associatedwith thiscluster.

QG3 and EMR model results. The Gaussian mixture PDF of thedata set produced by the QG3 model is shown in Fig. 12, in thesubspace of the QG3 model’s two leading EOFs. The clusters werefound using mixtures of k = 4 Gaussian components. The optimalnumber k∗ of clusters is k∗ = 4, as determined by the cross-validation procedure of Smyth et al. [95]; see also [84,2].The composites over the data points that belong to each of the

ellipses in Fig. 12 represent, in physical space, the patterns of fourplanetary flow regimes [12,37,94,98,99,36,14,95]. The streamfunc-tion anomalies associated with each regime centroid of the QG3model are plotted in Fig. 13. These regimes are associated with theopposite phases of theNorthernHemisphere annularmode, the so-called AO [18,19] and the associated sectorial NAO [17] pattern. InFig. 12, clusters 1, 2, and 3 (NAO+, AO+ andNAO−, respectively) arelocated around the global PDFmaximum, with the centroids of theAO+ and NAO+ lying very close to the global PDFmaximum, whilethe NAO− lies almost precisely under this maximum. Cluster 4(AO−) occupies a distinctive region on the PDF ridge that stretchesalong PC-1. It corresponds to the low-indexphase of theAO [18,19].These four regimes are not identical to but in fairly good agree-

mentwith the observational results [12,97,84,95]. The correspond-ing results for the EMRmodel of Section 3.1 are very similar to the

Page 16: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

160 K. Strounine et al. / Physica D 239 (2010) 145–166

Fig. 12. Mixture-model PDF and clusters of the four leading EOFs for the full QG3 model, as projected on the (EOF-1–EOF-2) plane. The semi-axes of the ellipses equal thestandard deviation σj in each principal direction ej; the singular value σj = µ

1/2j , where µj is the variance associated with the EOF ej (see Section 2.2).

Fig. 13. Mixture-model centroids for the full QG3 model, showing streamfunction anomaly maps at 500 hPa. Negative contours are light and zero contours are omitted,while contour intervals are shown in each panel’s legend.

corresponding EMR model results in [2,3], despite of the use of adifferent EOF basis here, namely energy norm vs. streamfunctionnorm in [2,3]. On the other hand, as it was the case in that previ-ouswork, the topology of the PDF and locations of the clusters cen-troids – and, therefore, the corresponding streamfunction anomaly

patterns – are very close to those in Figs. 12 and 13, respectively:these results are virtually identical to those shown in Fig. 6b of [3]and, therefore, are not shown here. Moreover, the results for theEMR model are quite similar when using the potential-enstrophybasis of EOFs.

Page 17: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

K. Strounine et al. / Physica D 239 (2010) 145–166 161

4.4.2. Gaussian mixture PDFs of the other reduced modelsWe apply now Gaussian mixture modeling to the multi-

dimensional PDFs produced by the semi-empirical and MTV-typereduced models constructed in the energy-norm EOF basis. Doingso will allow us to verify whether the non-Gaussianity of the fullQG3 model’s behavior is captured by any of them.

Semi-empirical model. Fig. 14 shows the centroids obtainedfor the output of the empirical–dynamical model of Section 3.2.Cluster AO− still occupies a distinctive region on the approximatelystraight PDF ridge that now stretches in a direction at an angleclose to the bisector of the +PC-1 and −PC-2 axes. This slope isthe major difference between this PDF, on the one hand, and thefull-model PDF shown in Fig. 13 and EMR model PDF [not shownhere; see, however Fig. 6b in [3]], on the other. Other differencesinclude the fact that the relative positions of the other three clus-ters are substantiallymodified and that all four centroids are closerto the global PDF maximum.These results show that the empirical–dynamicalmodel still re-

produces to a certain extent non-Gaussianity and multiple-regimebehavior of the full QG3model. The slope of the ridge in the multi-dimensional PDF of Fig. 14 makes the PDF projection on either ofthe EOF axis much more Gaussian-looking, explaining apparentnear-Gaussianity of the one-dimensional PDF seen, for example, inFig. 5 (light solid, red line).

MTV model. Fig. 15 illustrates the multi-dimensional PDF fortheMTV-type reducedmodel, withoutmultiplicative noise and theparameters estimated sequentially, as in Fig. 11; the results for thereduced model with multiplicative noise are quite similar [87]. Incontrast to the PDFs in Figs. 12 and 14, this PDF is very close toGaussian, with a fairly symmetric shape and all centroids locatedvery close to the global PDF maximum. This means that the dy-namical model constructed using MTV methodology, even whenimproved by our systematic parameter estimation approach (com-pare Figs. 7 and 9 of Section 4.3), fails to capture non-Gaussianity ofthe full QG3model’s PDF, aswell as the associated regimebehavior.

4.4.3. SummaryThe multi-dimensional PDF analysis of this section reveals that

the empirical–dynamical models of Section 3.2 capture the regimebehavior of the full QG3 model better than the reduced modelsbased on the MTV approach (Section 3.3). Once again, both thesemi-empirical and MTV-type models perform less well than thepurely empirical EMR model (Section 3.1), which reproduces boththe linear statistics of the full QG3 model and its regime behaviorquite well [2,3].

5. Concluding remarks

5.1. Summary

This study applied different mode reduction strategies to ahigh-dimensional atmospheric model, which produces a reason-ably realistic mean climate, as well as low-frequency variabil-ity (LFV) in Earth’s Northern Hemisphere. This quasi-geostrophic,three-level (QG3) model [1], with its J = 693 discrete variables,is a reasonable compromise between detailed simulation of mid-latitude LFV behavior and flexible use for theoretical studies [58,70,84,2,56].Low-dimensional, reduced models were constructed in sub-

spaces of leading empirical orthogonal functions (EOFs) obtainedusing inner products for the covariance matrices that were basedon either the energy norm or the potential-enstrophy norm ofthe flow field. The construction of the reduced models proceededin several distinct ways. The reduced models considered were

based mainly of three approaches: (i) a purely empirical, multi-level regression procedure [3]; (ii) a semi-empirical, or empiri-cal–dynamical approach, which retains only a few components inthe projection of the full-model equations onto a specified basis,and finds the linear deterministic and stochastic corrections em-pirically [5]; and (iii) a purely dynamics-based technique, employ-ing the stochastic mode reduction approach of [62], dubbed theMTV approach by its originators.The empirical–dynamical and MTV-type reduced models were

further improved by sequential parameter estimation using the ex-tended Kalman filter (EKF) [79–81]. The statistical characteristicsof the simulated data sets produced by the reduced models werecompared to those of the full QG3 model’s simulations. The purelyempirical model reduction (EMR) models provided a benchmarkfor the intercomparison of the other reduced models.The EMRmodels use only the broadest and most innocuous as-

sumptions about the system’s dynamics and apply multiple poly-nomial regression to obtain the coefficients of the deterministicdynamical operator, whose functional form is pre-specified to beof polynomial character. The empirical–dynamicalmodels uses thebare truncation of the full QG3 model for the deterministic dy-namical operator, plus additive noise, source and linear dampingterms. Finally, the MTV-type reduced models use standard projec-tion methods from the theory of stochastic differential equations[65–67] to derive both the deterministic and the stochastic part ofthe reduced dynamics.The EMR approach is not based in any way on the information

contained in a dynamical model that may have produced the data;instead, it applies a purely data-driven methodology by assuminga general functional form of the reduced model and reconstruct-ing its parameters by regression. It thus has the advantage of ap-plying equally well to constructing dynamical models by using ob-servational data alone [89], when the governing equations are notknown at all or are too cumbersome to use in a given application.EMRmodels have produced excellent goodness-of-fit results whencompared to the full QG3model [2], aswell as to observational sea-surface temperature data [89], and to several other problems [45].The EMR regression procedure parameterizes the systems dynam-ics in purely statistical terms and is thus very flexible; for example,Kondrashov et al. [2] fit EMR models using the QG3s time seriesof the 500-mb streamfunction field alone – vs. the full three-levelstreamfunction used here – and obtained essentially the same-quality match between the full- and reduced-model behavior aswe do in the present study. A detailed discussion of the reasons forthe more parsimonious, barotropic EMR models success goes be-yond the purview of this purely methodological paper.These models, however, do not automatically provide causal,

dynamical information on the statistical features of the data setthat they manage to capture. Moreover, EMRmodels may produceunstable climates since their nonlinearities do not conservequadratic invariants such as total energy or (potential) enstrophy.In an attempt to obtain deeper insights into the nature of mid-latitude atmospheric LFV, we merely used this type of reducedmodels to benchmark the empirical–dynamical and MTV-typemodels’ performance. To provide a level ‘‘playing field’’ for all threeapproaches, all reduced models retained R = 10 leading EOFs astheir resolved modes, while the other J−R = 693−10modes wereconsidered as unresolved and modeled stochastically.Empirical–dynamical reduced models were constructed by

specifying the nonlinear terms to be the same as in the full QG3model, while fitting the coefficients of the linear terms, sourcesand additive noise via EKFmethodology [93]. The full-model equa-tions were projected on the truncated EOF basis of choice; theorthogonality of the EOF basis with respect to the chosen in-ner product and conservation properties of the QG3 model’s non-linearities guarantee that the projected equations conserve the

Page 18: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

162 K. Strounine et al. / Physica D 239 (2010) 145–166

Fig. 14. Same as in Fig. 12, but for the semi-empirical model.

Fig. 15. Same as in Fig. 12, but for the MTV model with EKF-based parameter estimation.

appropriate quadratic invariant at any truncation. Interactionsbetween resolved and unresolved modes were then modeled bya simple Ornstein–Uhlenbeck stochastic process. The coefficientsof this stochastic process were estimated via sequential parameterestimation.The semi-empirical reduced models thus obtained reproduce

the statistics of theQG3model’s simulated data set reasonablywellfor both the EOF bases consistent with the energy and potential-enstrophy norm, respectively. The climate drift present in the lead-ing modes of the bare-truncation model – i.e., in the absence ofany EKF-based correction – was virtually eliminated in the cor-rected models, and the variance of the retained EOFs was wellcaptured. We found, however, that even the EKF-corrected em-pirical–dynamical models do not fully capture the non-Gaussian

features of the QG3 model’s LFV. Overall, the best semi-empiricalmodels failed to capture the full-model statistics we tested to thesame extent as the purely empirical, EMRmodels. A reason for thisshortfall might be that empirical–dynamical reduced models needmore degrees of freedom to reproduce the full model’s statisticalproperties as well as the EMR models did.In the MTV stochastic mode reduction approach, no assump-

tions about system dynamics – except for the one of quadraticnonlinearity – are made on the level of the resolved variables. Thisapproach, however, has two disadvantages which we attemptedto compensate for by using sequential parameter estimation. First,the main assumption of the MTV approach is that there is timescale separation between the resolved and unresolved modes; inother words, the parameter that measures the ratio of the integral

Page 19: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

K. Strounine et al. / Physica D 239 (2010) 145–166 163

correlation time of the slowest unresolved mode to the fastest re-solved mode is assumed to be close to zero. This assumption doesnot hold for the QG3 model used in the present study; nor does ithold for other realistic atmospheric models that we are aware of.Second, calculating the optimal coefficients for the resolvedmodeswould require an enormous computational effort for realistic high-dimensionalmodels. Franzke et al. [69] and Franzke andMajda [70]had to use million-day-long model simulations to properly deter-mine these coefficients.Since the typical number of days of observational data for at-

mospheric LFV is a few thousand, at best, and the simulation timeof high-resolution, complete-physics climate models is also quitelimited, we used the above-mentioned truncated EOF basis, as wellas shorter time series to estimate the correlation integrals ne-cessitated by the MTV approach. Sequential parameter estimationworked quite well in correcting the mean and variance of the re-solved modes, even with these much shorter QG3-simulated datasets. We also learned that including multiplicative noise in theeffective equations, as proposed by the MTV approach, does notshow any significant improvement in the reduced-model perfor-mance. Still, even when enhancing the MTV-type reduced modelsby sequential parameter estimation, these models failed to cap-ture the non-Gaussianity of the QG3 model’s data set: the modi-fied MTV models produced essentially Gaussian one-dimensionalPDFs, and could not identify the multiple-regime behavior presentin the full QG3 model and the observational data for the NorthernHemisphere.

5.2. Discussion

Overall, both the semi-empirical and the MTV-type reducedmodels failed to produce goodness-of-fit close to that of the EMRmodels for the statistical properties examined; these propertiesincluded one-dimensional PDFs and autocorrelations, as well asmulti-dimensional PDFs and multiple-regime behavior. The per-formance of these two types of models, while less impressive thanthat of the EMR models, was comparable between the two typesin terms of ‘‘linear’’ statistical characteristics, such as the one-dimensional autocorrelation functions, with somewhat better re-sults produced by MTV models. However, these MTV-type modelsperformed considerably worse than the semi-empirical models incapturing the multiple-regime behavior of the full QG3 model andthe atmospheric observations.Our examination of the three types of the reduced models

thus shows that their performance degrades as fewer statisticalcorrections are employed to construct the model: the purely em-pirical models perform best of all, the statistically enhanced bare-truncation models or EKF-corrected MTV models do worse, butstill capture some linear and, for the case of semi-empirical mod-els, nonlinear characteristics of LFV, while the theoretically mostpromising, dynamics-based MTV approach without data assimila-tion produces the worst results in capturing full-model behavior,including substantial climate drift and wrong variances of the sim-ulated EOF modes.Comparison of two distinct versions of the MTV-type models

does provide some insight into the dynamics behind atmosphericLFV, by suggesting that the interaction between fast and slowmodes – at least as modeled by ‘‘multiplicative noise’’ – is sec-ondary. This result might imply that the so-called synoptic-eddyfeedback [26,100,32,101] could be less important in determiningthe properties of atmospheric LFV than the interactions among theLFV modes themselves.We need to point out two apparent differences in our conclu-

sions regarding the ability of MTV-type models to capture the fullQG3 models behavior vs. the conclusions drawn by Franzke andMajda [69]. In particular, the latter authors appliedMTV analysis to

the hemispheric version of the QG3model, in contrast to the globalversion used in the present study; they then argued that the bestMTV model version is the one with downscaled bare-truncationterms and upscaled augmented linearity and multiplicative-triadterms.We think that the quantitative differences between theMTVmodel performance here and in [69] are due, at least in part, todifferent, global vs. hemispheric versions of the QG3 model be-ing studied. More importantly, we note that our MTV model us-ing EKF-based corrections performs as well or better than the bestmodel of [69] in capturing the probability density and autocorre-lation functions of the full QG3 model, despite our neglecting thecubic nonlinearity terms in the reducedmodels. Franzke andMajdadid not attempt to consider their reducedmodel behavior with cu-bic nonlinearities suppressed, so there is no factual contradictionbetween our results. Furthermore, the PDFs of the reduced modelsin [69] are all nearly Gaussian, which is also consistent with ourresults.Our overall conclusion is that a satisfactory theory of mid-

latitude LFV has to explain the surprisingly good performance oflow-order EMR models. The limitations of the two other types ofreduced models suggest that such a theory is still to be developed.In spite of the fact that EMR models do not automatically providesuch a theory — i.e., their quadratically nonlinear dynamics andmulti-level driving noise do not provide an easy way to ‘‘read’’ thehidden causality of the fullmodel’s complex behavior directly fromthemodel’s analytical form– they can be used for providing impor-tant insights.First of all, one can combine EMR methodology and the semi-

empirical approach by introducing the multi-level regressionmodeling of the residual tendencies into the empirical–dynamicalmodels. Doing so would still allow one to interpret the quadraticterms of such a semi-empirical models main level as representingenergy exchange between the resolved modes.Second, [2] showed how to use the EMR models’ much greater

flexibility in studying the dynamic and stochastic contributions toobtaining the multiple regimes and the intraseasonal oscillationspresent in the QG3 model and in atmospheric observations ofLFV; see also [34]. They applied standard tools from numericalbifurcation theory for deterministic dynamical systems to thequadratically nonlinear deterministic operator of their optimalEMR model, and used a continuation method on the variance ofthe multi-level noise process. This somewhat ad hoc combinationof continuation methods allowed them to move all the way fromfixed points of the deterministic operator to the multiple regimesof the complete EMR model, with its optimal parameter values.Finally, the relatively recent theory of random dynamical sys-

tems [102] might allow one to proceed from the ad hoc combina-tion of continuation methods in [2] to a more systematic way ofextracting the causal mechanisms of atmospheric LFV. For in-stance, Ghil et al. [103] applied concepts and methods from ran-dom dynamical systems to problems of natural and anthropogenicclimate change on interdecadal time scales. These authors showed,on the hand of simple examples, that the smoothing effect of thenoise may permit a more parsimonious coarse-grained classifica-tion of a model’s behavior in its phase-parameter space than in thepurely deterministic case. Such coarse-graining of behavior mightapply not only to the reduced model of interest, whether of EMR-type or of the other two classes we considered in this paper, butalso to the full model or the actual observational data sets under-lying the entire problem under study.

Acknowledgements

This paper is based on Kirill Strounine’s Ph.D. dissertation inUCLAs Atmospheric and Oceanic Sciences Department, under thejoint guidance of M. Ghil and S. Kravtsov. It is a pleasure to thank

Page 20: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

164 K. Strounine et al. / Physica D 239 (2010) 145–166

J.D. Neelin and J.C. McWilliams for useful comments and sugges-tions, and A. Majda for office facilities at New York UniversitysCourant Institute of Mathematical Sciences, while KS carried outhis work in the New York area. The comments and suggestions oftwo anonymous reviewers helped us to improve the manuscriptpresentation.We acknowledge the financial support byDOE grantsDE-FG-03-01ER63260 and DE-FG02-07ER64428 from the ClimateChange Prediction Program, NASA grant NNG-06-AG66G-1, andNSF grant ATM00-82131 from the Climate and Large-Scale Dynam-ics Program; SKwas also supported by UWMResearch Growth ini-tiative awards for 2006 and 2007.

Appendix A. Parameter estimation methodology

Herewe give a brief account of the parameter estimationmeth-ods used in this paper. The main idea behind this methodology isusing the higher-order statistics of a given time series to estimatethe unknown parameters of the noise covariancematrix; these pa-rameters are estimated by using a secondary Kalman filter. The pa-rameters of the deterministic part of the stochastic process x, aswell as the statistics used as an input for the secondary Kalman fil-ter are estimated by the primary EKF, along with the state vectorof the system.

Primary filter. Consider first a discrete nonlinear stochastic-dynamic system

wt(k) = 9(k− 1)wt(k− 1)+ B(k− 1)bt(k− 1) (A.1)

with observations

wo(k) = H(k)wt(k)+ bo(k), (A.2)

where the n-vector wt is the true state at the time k, wo(k) isthe p-vector of observations, bt(k) is the q-vector of system noiseand bo(k) is the p-vector of observational noise. Matrices 9(k),B(k), andH(k) are uniformly completely observable and uniformlycompletely controllable. The system noise bt and observationalnoise bo are centered, Gaussian, white in time, and uncorrelatedwith each other:

Ebt = 0, Ebt(k)bt(l)T = δklQ, (A.3)

Ebo = 0, Ebo(k)bo(l)T = δklR, (A.4)

Ebo(l)bt(l)T = 0. (A.5)

Under the stated assumptions, the Kalman filter is the optimaldata assimilation system, and it is given by:

wf (k) = 9(k− 1)wa(k− 1),Pf (k) = 9(k− 1)Pa(k− 1)9(k− 1)T

+ B(k− 1)Q(k− 1)B(k− 1)T,

K(k) = Pf (k)H(k)T(H(k)Pf(k)H(k)T + R(k))−1, (A.6)Pa(k)= (I− K(k)H(k))Pf (k),

wa(k) = wf (k)+ K(k)(wo(k)− H(k)wf (k)),

where K(k) is the Kalman gain matrix, while Pf (k) and Pa(k) areforecast and analysis error covariance matrices defined by

Pf ,a(k) = E(wf ,a(k)−wt(k))(wf ,a(k)−wt(k))T. (A.7)

In the present context, observational noise bo(k) is zero sinceour observations are in fact simulated data obtained by integratingthe QG3 model. Furthermore, matrices H and B are stationary.In the idealized situation above, when the dynamics of the sys-

tem, aswell as the observations, are linear in the state variables andare disturbed only by additive white Gaussian noise with knownfirst- and second-order statistics, the Kalman filter yields optimalstate estimates. In practice, even when the state and observation

models are linear, filters are generally suboptimal because insuffi-cient information about the noise statistics is available. If themodeldynamics includes nonlinearities, one of the simplest approximatenonlinear filters is the extended Kalman filter (EKF). In the EKF, thenow nonlinear dynamics operator 9(w(k)) is replaced by its lin-earization around the current analyzed state of the systemwa(k):

ψ(k) =∂9

∂w

∣∣∣∣w=wa(k)

. (A.8)

This replacement is only used in the covariance evolution equationfor Pf (k), while the state itself is still allowed to evolvewith the fullnonlinear operator9(w(k), k).We use methodology based on a secondary Kalman filter to

estimate the noise parameters [93]. This methodology invokesthe assumption that the noise covariance matrix is stationary anddepends on the parameters linearly, so that

Q =N∑i=1

αiQi, (A.9)

where N is a number of noise parameters. For the truly optimalKalman filter of Eqs. (A.1)–(A.7), the innovation sequence v(k) =wo(k) − Hwf (k) is a white noise: in this case, the filter extractsall the useful information from observations. It can be shown that,under the assumption (A.9), the lag-correlation function of theinnovations v(k) is also a linear function of these unknown noiseparameters [104] so that

Ev(k)v(k− l) =N∑i=1

αiFi(k, l)+ F(k, l), (A.10)

where the matrices Fi(k, l) are known and can be calculatedrecursively, while the residue F(k, l) is small and bounded for all k:

Ev(k)v(k− l) ≈N∑i=1

αiFi(k, l). (A.11)

It follows from (A.11) that

v(k)v(k− l) =N∑i=1

αiFi(k, l)+ Γ (k, l), (A.12)

where Γ (k, l) is a random matrix with EΓ (k, l) = 0. The samplelag-covariances on the left side of (A.12) and the matrices Fi(k, l)under the summation are both known and are obtained from theprimary filter output.

Secondary filter. The approach of Dee et al. [93] is to regard αas a secondary state vector, to be estimated via a secondary filter.Eq. (A.12) is the observation model for α, Fi(k, l) are blocks of theobservation matrix, and Γ (k, l) is the observational noise.Given this observational model one can derive a secondary

Kalman filter to estimate α. To do that we need to rewrite the ob-servationmodel (A.12) in amore convenient vector form. Let vec Adenote the p2-vector obtained by concatenating the columns ai ofany p× pmatrix A. Then we define

σ(k, l) = vec[v(k)v(l)T],G(k, l) = [vec F1(k, l), . . . , vec FN(k, l)], (A.13)η(k, l) = vec Γ (k, l),

where σ and η are p2-vectors and G is a p2×N matrix. We define acounter t = t(k, l) that corresponds to the order in which the sam-ple lag-covariances are generated.We now canwrite equations for

Page 21: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

K. Strounine et al. / Physica D 239 (2010) 145–166 165

the state and observation models for the secondary state vector α:α(t) = α(t − 1), (A.14)σ(t) = Γ (t)α(t)+ η(t). (A.15)Consider now the secondary observation noise η(t), which has

mean zero and covariance matrixW(t1, t2;α), so thatW(t1, t2;α) = E[σ(k1, l1)− Eσ(k1, l1)][σ(k2, l2)− Eσ(k2, l2)]T.

(A.16)According to (A.13),W is therefore a function of fourth-order mo-ments of the innovations v(k), which in turn depends on α. Theinherently nonlinear nature of parameter estimation, even in thepresence of the linear dynamics 9 and observations H, is man-ifested in this dependence of W on α. The problem is linearizedupon approximatingW(t1, t2;α) byWe(t1, t2) = W(t1, t2;αe). Ifαe = α, the primary filter would be optimal and, by the innovationproperty of the Kalman filter, the innovation sequence v(k)wouldbewhite and Gaussian. It then follows thatWe(t1, t2) = 0 if t1 6= t2and we only need to consider

We(t) = W(t1, t2;αe). (A.17)With this linearization, the Kalman filter for the secondary

model (A.14) isKα(t) = Θ(t − 1)Γ (t)T[Γ (t)Θ(t − 1)Γ (t)T +We(t)]−1,

αa(t) = αa(t − 1)+ Kα(t)[σ(t)− Γ (t)αa(t − 1)], (A.18)Θ(t) = [I− Kα(t)Γ (t)]Θ(t − 1),whereαa(t) denotes the estimate ofα(t) andΘ(t) is the estimatederror covariance. The filter is provided initial values αa(0) = αe

andΘ(0), the latter being an a priori estimate of E(αe−α)(αe−α)T.Dee, et al. [93] developed the methodology described above

for a linear problem. We extend it here to the nonlinear systemsof interest in the context of reduced LFV models. The successwith which noise covariances are estimated will depend on howwell a given nonlinear primary filter maintains ‖F(k, l)‖ small andbounded. As we saw in Sections 4.2 and 4.3, this was indeed thecase for the semi-empirical andMTV-type models under consider-ation.

Appendix B. Expressions for MTV coefficients

The constant tensors in Eqs. (21) and (22) are computed asfollows:

Q (1)ij =12

∑lk

∫∞

−∞

dtLαβil Lαβ

jk Blk(t), (B.1)

Q (2)ij =12

∑lk

∫∞

−∞

dtBαββikl Bαββ

jmn (Bkm(t)Bln(t)+Bkn(t)Blm(t)),

(B.2)

Uijk =∑ln

∫∞

−∞

dt(Lαβil Bααβ

jkn + Lαβ

jl Bααβ

ikn )Bln(t), (B.3)

Vjikl =∑mn

∫∞

−∞

dtBααβikm Bααβ

jln Bmn(t), (B.4)

where the quantityBij(s) is defined as

Bij(s) = Bji(−s) = limT→∞

1T

∫ T

0dtβi(t)βj(t + s). (B.5)

It can be shown that the left-hand sides in Eqs. (21) and (22) arepositive definite m × m tensors so σ (1)ij (α(t)) and σ

(2)ij exist and

can be calculated by Cholesky decomposition. Note that the noisecovariance matrix depends nonlinearly on the large-scale flow α.The quantities Lij, Bijk and Mijkl in (20) are given by expressions of

the type (B.1)–(B.4), involvingweighted integrals of the unresolvedprocesses’ autocorrelation (B.5).

References

[1] J. Marshall, F. Molteni, Toward a dynamical understanding of atmosphericweather regimes, J. Atmospheric Sci. 50 (1993) 1792–1818.

[2] D. Kondrashov, S. Kravtsov, M. Ghil, Empirical mode reduction in a modelof extratropical low-frequency variability, J. Atmospheric Sci. 63 (2006)1859–1877.

[3] S. Kravtsov, D. Kondrashov, M. Ghil, Multi-level regression modeling ofnonlinear processes: Derivation and applications to climatic variability, J.Climate 18 (2005) 4404–4424.

[4] U. Achatz, G.W. Branstator, A two-layer model with empirical linearcorrections and reduced order for studies of internal climate variability, J.Atmospheric Sci. 56 (1999) 3140–3160.

[5] F.M. Selten, An efficient description of the dynamics of the barotropic flow, J.Atmospheric Sci. 52 (1995) 915–936.

[6] A.J. Majda, I. Timofeyev, E. Vanden-Eijnden, Models for stochastic climateprediction, Proc. Natl. Acad. Sci. USA 96 (1999) 14687–14691.

[7] J.M. Wallace, The climatological mean stationary waves: Observationalevidence, in: B.J. Hoskins, R.P. Pearce (Eds.), Large-Scale Dynamical Processesin the Atmosphere, Academic Press, 1983, pp. 27–53.

[8] G. Branstator, A striking example of the atmospheric leading travelingpattern, J. Atmospheric Sci. 44 (1987) 2310–2323.

[9] M. Ghil, K.C. Mo, Intraseasonal oscillations in the global atmosphere. Part I:Northern Hemisphere and tropics, J. Atmospheric Sci. 48 (1991) 752–779.

[10] M. Ghil, K.C. Mo, Intraseasonal oscillations in the global atmosphere. Part II:Southern Hemisphere, J. Atmospheric Sci. 48 (1991) 780–790.

[11] Y. Kushnir, Retrograding wintertime low-frequency disturbances over theNorth Pacific ocean, J. Atmospheric Sci. 44 (1987) 2727–2742.

[12] X. Cheng, J.M. Wallace, Cluster analysis of the Northern Hemispherewintertime 500-hPa height field: Spatial patterns, J. Atmospheric Sci. 50(1993) 2674–2696.

[13] M. Ghil, Dynamics, statistics and predictability of planetary flow regimes,in: C. Nicolis, G. Nicolis (Eds.), Irreversible Phenomena and DynamicalSystems Analysis in the Geosciences, D. Reidel, Dordrecht, Boston, Lancaster,1987, pp. 241–283.

[14] K. Mo, M. Ghil, Statistics and dynamics of persistent anomalies, J.Atmospheric Sci. 44 (1987) 877–901.

[15] M. Ghil, A.W. Robertson, Solving problems with GCMs: General circulationmodels and their role in the climate modeling hierarchy, in: D. Randall (Ed.),General CirculationModel Development: Past, Present and Future, AcademicPress, 2000, pp. 285–325.

[16] G. Plaut, R. Vautard, Spells of low-frequency oscillations andweather regimesin the Northern Hemisphere, J. Atmospheric Sci. 51 (1994) 210–236.

[17] J.E. Hurrel, Decadal trends in the North Atlantic Oscillation: Regionaltemperatures and precipitation, Science 269 (1995) 676–679.

[18] C. Deser, On the teleconnectivity of the arctic oscillation, Geophys. Res. Lett.27 (2000) 779–782.

[19] J.M.Wallace, North Atlantic Oscillation/annularmode: Two paradigms—Onephenomenon, Q. J. R. Meteorol Soc. 126 (2000) 791–805.

[20] J.M. Wallace, D.S. Gutzler, Teleconnections in the geopotential height fieldduring the Northern Hemisphere Winter, Mon. Weather Rev. 109 (1981)784–812.

[21] A.W. Robertson, Influence of ocean–atmosphere interaction on the ArcticOscillation in two general circulation models, J. Climate 14 (2001)3240–3254.

[22] G. Branstator, Organization of storm track anomalies by recurring low-frequency circulation anomalies, J. Atmospheric Sci. 52 (1995) 207–226.

[23] T.J. Cuff, M. Cai, Interaction between the low- and high-frequency transientsin the Southern Hemisphere winter circulation, Tellus 47A (1995) 331–350.

[24] R.M. Dole, N.D. Gordon, Persistent anomalies of the extratropical NorthernHemisphere wintertime circulation—Geographical-distribution and regionalpersistence characteristics, Mon. Weather Rev. 111 (1983) 1567–1586.

[25] L. Illari, A diagnostic study of the potential vorticity in a warm blockinganticyclone, J. Atmospheric Sci. 41 (1984) 3518–3526.

[26] S. Koo, A.W. Robertson, M. Ghil, Multiple regimes and low-frequencyoscillations in the Southern Hemisphere’s zonal-mean flow, J. Geophys. Res.(2003) doi:10.1029/2001JD001353.

[27] J. Namias, Thirty-day forecasting: A review of a ten-year experiment,in: Meteor. Monogr., vol. 6, Amer. Meteor. Soc., 1953, pp. 1–83.

[28] A.W. Robertson, W. Metz, Three-dimensional instability of persistentanomalous large-scale flows, J. Atmospheric Sci. 46 (1989) 2783–2801.

[29] A.W. Robertson, W. Metz, Transient eddy feedbacks derived from lineartheory and observations, J. Atmospheric Sci. 47 (1990) 2743–2764.

[30] G.J. Shutts, The propagation of eddies in diffluent jet streams: Eddy vorticityforcing of blocking flow fields, Q. J. R. Meteorol Soc. 109 (1983) 737–761.

[31] J.-Y. Yu, D.L. Hartmann, Zonal flow vacillation and eddy forcing in a simpleGCM of the atmosphere, J. Atmospheric Sci. 50 (1993) 3244–3259.

[32] B.B. Reinhold, R.T. Pierrehumbert, Dynamics of weather regimes: Quasi-stationary waves and blocking, Mon. Weather Rev. 110 (1982) 1105–1145.

[33] E. Da Costa, R. Vautard, A qualitative realistic low-order model of theextratropical low-frequency variability built from long records of potentialvorticity, J. Atmospheric Sci. 54 (1997) 1064–1084.

[34] M. Ghil, A.W. Robertson,Waves vs. particles in the atmospheres phase space:A pathway to long-range forecasting? Proc. Natl. Acad. Sci. 99 (Suppl. 1)(2002) 2493–2500.

Page 22: Reduced models of atmospheric low-frequency variability: Parameter estimation and comparative performance

166 K. Strounine et al. / Physica D 239 (2010) 145–166

[35] H. Itoh, M. Kimoto, Weather regimes, low-frequency oscillations, andprincipal patterns of variability: A perspective of extratropical low-frequencyvariability, J. Atmospheric Sci. 56 (1999) 2684–2705.

[36] B. Legras, M. Ghil, Persistent anomalies, blocking and variations inatmospheric predictability, J. Atmospheric Sci. 42 (1985) 433–47l.

[37] M. Ghil, S. Childress, Topics in Geophysical Fluid Dynamics: AtmosphericDynamics, Dynamo Theory and Climate Dynamics, Springer-Verlag, NewYork, Berlin, London, Paris, Tokyo, 1987, 485 pp.

[38] C. Penland, Random forcing and forecasting using principal oscillationpattern analysis, Mon. Weather Rev. 117 (1989) 2165–2185.

[39] C. Penland, A stochastic model of Indo-Pacific sea-surface temperatureanomalies, Physica D 98 (1996) 534–558.

[40] C. Penland, M. Ghil, Forecasting Northern Hemisphere 700-mb geopotentialheight anomalies using empirical normal modes, Mon. Weather Rev. 121(1993) 2355–2372.

[41] S.D. Johnson, D.S. Battisti, E.S. Sarachik, Empirically derived Markov modelsand prediction of tropical Pacific sea surface temperature anomalies, J.Climate 13 (2000) 3–17.

[42] C. Penland, P.D. Sardeshmukh, The optimal growth of tropical sea-surfacetemperature anomalies, J. Climate 8 (1995) 1999–2024.

[43] C. Penland, L. Matrosova, Prediction of tropical Atlantic sea-surfacetemperatures using linear inverse modeling, J. Climate 11 (1998) 483–496.

[44] C.R. Winkler, M. Newman, P.D. Sardeshmukh, A linear model of wintertimelow-frequency variability. Part I: Formulation and forecast skill, J. Climate 14(2001) 4474–4494.

[45] S. Kravtsov, D. Kondrashov, M. Ghil, Empirical model reduction and themodelling hierarchy in climate dynamics and the geosciences, in: T. Palmer,P. Williams (Eds.), Stochastic Physics and Climate Modelling, CambridgeUniversity Press, 2009, pp. 35–72.

[46] B.F. Farrell, P.J. Ioannou, Stochastic forcing of the linearized Navier–Stokesequations, Phys. Fluids A 5 (1993) 2600–2609.

[47] B.F. Farrell, P.J. Ioannou, Stochastic dynamics of the midlatitude atmosphericjet, J. Atmospheric Sci. 52 (1995) 1642–1656.

[48] R.W. Preisendorfer, Principal Component Analysis in Meteorology andOceanography, Elsevier, New York, 1988, p. 425.

[49] S.D. Schubert, A statistical–dynamical study of empirically determinedmodes of atmospheric variability, J. Atmospheric Sci. 42 (1985) 3–17.

[50] K. Hasselman, PIPs and POPs: The reduction of complex dynamical systemsusing principal interaction and oscillation patterns, J. Geophys. Res. 93 (1988)11015–11021.

[51] F. Kwasniok, The reduction of complex dynamical systems using principalinteraction patterns, Physica D 92 (1996) 28–60.

[52] F. Kwasniok, Empirical low-order models of barotropic flow, J. AtmosphericSci. 61 (2004) 235–245.

[53] F. Kwasniok, Reduced atmospheric models using dynamically motivatedbasis functions, J. Atmospheric Sci. 64 (2007) 3452–3474.

[54] M.D.Mundt, J.E. Hart, Secondary instability, EOF reduction, and the transitionto baroclinic chaos, Physica D 78 (1994) 65–92.

[55] J. Rinne, V. Karhila, A spectral barotropic model in horizontal empiricalorthogonal functions, Quart. J. Roy. Meteor. Soc. 101 (1975) 365–382.

[56] F.M. Selten, Baroclinic empirical orthogonal functions as basis functions in anatmospheric model, J. Atmospheric Sci. 54 (1997) 2100–2114.

[57] L. Sirovich, J.D. Rodriguez, Coherent structures and chaos—Amodel problem,Phys. Lett. 120 (1987) 211–214.

[58] F. D’Andrea, Extratropical low-frequency variability as a low-dimensionalproblem. Part II: Stationarity and stability of large-scale equilibria, Q. J. R.Meteorol. Soc. 128 (2002) 1059–1073.

[59] F. D’Andrea, R. Vautard, Extratropical low-frequency variability as a low-dimensional problem. Part I: A simplified model, Q. J. R. Meteorol. Soc. 127(2001) 1357–1374.

[60] D.T. Crommelin, A.J. Majda, Strategies for model reduction: Comparingdifferent optimal bases, J. Atmospheric Sci. 61 (2004) 2206–2217.

[61] T. DelSole, Optimally persistent patterns in time-varying fields, J. Atmo-spheric Sci. 58 (2001) 1341–1356.

[62] A.J. Majda, I. Timofeyev, E. Vanden-Eijnden, A mathematical framework forstochastic climate models, Commun. Pure Appl. Math. 54 (2001) 891–974.

[63] A.J. Majda, I. Timofeyev, E. Vanden-Eijnden, A priori test of a stochastic modereduction strategy, Physica D 170 (2002) 206–252.

[64] A.J. Majda, I. Timofeyev, E. Vanden-Eijnden, Systematic strategies forstochastic mode reduction in climate, J. Atmospheric Sci. 60 (2003)1705–1722.

[65] C.W. Gardiner, Handbook of Stochastic Methods, Springer-Verlag, 1985, 442pp..

[66] R.Z. Khasminsky, Principle of averaging for parabolic and elliptic differentialequations and for Markov processes with small diffusion, Theory Probab.Appl. 8 (1963) 1–21.

[67] T.G. Kurtz, A limit theorem for perturbed operators semigroups withapplications to random evolution, J. Funct. Anal. 12 (1973) 55–67.

[68] A.J. Majda, I. Timofeyev, Low dimensional chaotic dynamics versus intrinsicstochastic chaos: A paradigm model, Physica D 199 (2004) 339–368.

[69] C. Franzke, A.J. Majda, E. Vanden-Eijnden, Low-order stochastic modereduction for a realistic barotropic model climate, J. Atmospheric Sci. 62(2005) 1722–1745.

[70] C. Franzke, A.J. Majda, Low order stochastic mode reduction for a prototypeatmospheric GCM, J. Atmospheric Sci. 63 (2006) 457–479.

[71] M. Ghil, S. Cohn, J. Tavantzis, K. Bube, E. Isaacson, Applications of estimationtheory to numerical weather prediction, in: L. Bengtsson, M. Ghil (Eds.),Dynamic Meteorology: Data Assimilation Methods, Springer Verlag, 1981,pp. 139–224.

[72] R.E. Kalman, A new approach to linear filtering and prediction problems,Trans. ASME 82D (1960) 35–45.

[73] R.N. Miller, M.A. Cane, A Kalman filter analysis of sea-level height in thetropical Pacific, J. Phys. Oceanogr. 19 (1989) 773–790.

[74] P. Courtier, O. Talagrand, Variational assimilation of meteorological observa-tions with the direct and adjoint shallow-water equations, Tellus 42A (1990)531–549.

[75] Y. Sasaki, Some basic formalisms in numerical variational analysis, Mon.Weather Rev. 98 (1970) 875–883.

[76] Z. Sirkes, E. Tziperman, W.C. Thacker, Combining data and a global primitiveequation ocean general circulation model using the adjoint method,in: P. Malanotte-Rizzoli (Ed.), Modern Approaches to Data Assimilation inOcean Modeling, Elsevier Sci., 1996, pp. 119–144.

[77] M. Ghil, Advances in sequential estimation for atmospheric and oceanicflows, J. Met. Soc. Japan 75 (1997) 289–304.

[78] M. Ghil, P. Malanotte-Rizzoli, Data assimilation in meteorology andoceanography, Adv. Geophys. 33 (1991) 141–266.

[79] K. Ide, M. Ghil, Extended Kalman filtering for vortex systems. Part I:Methodology and point vortices, Dyn. Atmos. Oceans 27 (1997) 301–332.

[80] K. Ide, M. Ghil, Extended Kalman filtering for vortex systems. Part II:Rankine vortices andobserving-systemdesign, Dyn. Atmos. Oceans 27 (1997)333–350.

[81] R.N. Miller, M. Ghil, F. Gauthiez, Advanced data assimilation in stronglynonlinear dynamical systems, J. Atmospheric Sci. 51 (1994) 1037–1056.

[82] E. Kalnay, et al., The NCEP/NCAR 40-year reanalysis project, Bull. Amer.Meteor. Soc. 77 (1996) 437–471.

[83] D.J. Lorenz, D.L. Hartmann, Eddy–zonal flow feedback in the NorthernHemisphere winter, J. Climate 16 (2003) 1212–1227.

[84] D. Kondrashov, K. Ide, M. Ghil, Weather regimes and preferred transitionpaths in a three-level quasigeostrophic model, J. Atmospheric Sci. 61 (2004)568–587.

[85] H. Tennekes, J.L. Lumley, A First Course in Turbulence, MIT Press, 1972.[86] M. Ehrendorfer, The total energy norm in a quasi-geostrophic model, J.

Atmospheric Sci. 57 (2000) 3443–3451.[87] K. Strounine, Reduced models of extratropical low-frequency variability.

Ph.D. Thesis, Department of Atmospheric Sciences, University of California,LA, 2007.

[88] S.L. Sobolev, Applications of Functional Analysis in Mathematical Physics,American Mathematical Society, Providence, RI, 1963.

[89] D. Kondrashov, S. Kravtsov, A.W. Robertson, M. Ghil, A hierarchy of data-based ENSO models, J. Climate 18 (2005) 4425–4444.

[90] R. Von Mises, Mathematical Theory of Probability and Statistics, AcademicPress, New York, 1964.

[91] R. Mañé, On the dimension of the compact invariant sets of certain nonlinearmaps, in: D.A. Rand, L.S. Young (Eds.), Dynamical Systems and Turbulence,Springer-Verlag, New York, 1981, pp. 230–242.

[92] F. Takens, Detecting strange attractors in turbulence, in: D.A. Rand, L.S. Young(Eds.), Dynamical Systems and Turbulence, Springer-Verlag, New York, 1981,pp. 366–381.

[93] D.P. Dee, S.E. Cohn, A. Dalcher, M. Ghil, An efficient algorithm for estimatingnoise covariances in distributed systems, IEEE Trans. Automat. Control AC-30(1985) 1057–1065.

[94] A. Hannachi, A. ONeill, Atmospheric multiple equilibria and non-Gaussianbehavior in model simulations, Q. J. R. Meteorol. Soc. 127 (2001) 939–958.

[95] P. Smyth, K. Ide, M. Ghil, Multiple regimes in Northern Hemisphere heightfields viamixturemodel clustering, J. Atmospheric Sci. 56 (1999) 3704–3723.

[96] D.M. Titterington, A.F.M. Smith, U.E. Makov, Statistical Analysis of FiniteMixture Distributions, John Wiley and Sons, 1985, 243 pp.

[97] M. Ghil, A.W. Robertson,Waves vs. particles in the atmospheres phase space:A pathway to long-range forecasting? Proc. Natl. Acad. Sci. 99 (Suppl. 1)(2002) 2493–2500.

[98] M. Kimoto, M. Ghil, Multiple flow regimes in the Northern Hemispherewinter. Part I: Methodology and hemispheric regimes, J. Atmospheric Sci. 50(1993) 2625–2643.

[99] M. Kimoto, M. Ghil, Multiple flow regimes in the Northern Hemispherewinter. Part II: Sectorial regimes and preferred transitions, J. Atmospheric Sci.50 (1993) 2645–2673.

[100] S. Kravtsov, A.W. Robertson, M. Ghil, Multiple regimes and low-frequencyoscillations in the Northern Hemisphere’s zonal-mean flow, J. AtmosphericSci. 63 (2006) 840–860.

[101] W. Robinson, A baroclinic mechanism for the eddy feedback on the zonalindex, J. Atmospheric Sci. 57 (2000) 415–422.

[102] L. Arnold, Random Dynamical Systems, Springer-Verlag, 1998, 616 pp.[103] M. Ghil, M.D. Checkroun, E. Simonnet, Climate dynamics and fluid

mechanics: Natural variability and related uncertainties, Physica D 237(2008) 2111–2126. doi:10.1016/j.physd.2008.03.036.

[104] P.R. Bélanger, Estimation of noise covariance matrices for a linear time-varying stochastic process, Automatica 10 (1974) 267–275.