9780230 240124 01 prexl - macmillanihe.com · list of illustrations vii notes on contributors xiv...


Contents

List of Illustrations

Notes on Contributors

Preface

Introduction

Part I Interest Rate Modelling and Forecasting

1 Combining Canadian Interest Rate Forecasts
David Jamieson Bolder and Yuliya Romanyuk

2 Updating the Yield Curve to Analyst’s Views
Leonardo M. Nogueira

3 A Spread-Risk Model for Strategic Fixed-Income Investors
Fernando Monar Lora and Ken Nyholm

4 Dynamic Management of Interest Rate Risk for Central Banks and Pension Funds
Arjan B. Berkelaar and Gabriel Petre

Part II Portfolio Optimization Techniques

5 A Strategic Asset Allocation Methodology Using Variable Time Horizon
Paulo Maurício F. de Cacella, Isabela Ribeiro Damaso and Antônio Francisco da Silva Jr.

6 Hidden Risks in Mean–Variance Optimization: An Integrated-Risk Asset Allocation Proposal
José Luiz Barros Fernandes and José Renato Haas Ornelas

7 Efficient Portfolio Optimization in the Wealth Creation and Maximum Drawdown Space
Alejandro Reveiz and Carlos León

8 Copulas and Risk Measures for Strategic Asset Allocation: A Case Study for Central Banks and Sovereign Wealth Funds
Cyril Caillault and Stéphane Monier

9 Practical Scenario-Dependent Portfolio Optimization: A Framework to Combine Investor Views and Quantitative Discipline into Acceptable Portfolio Decisions
Roberts L. Grava

10 Strategic Tilting around the SAA Benchmark
Aaron Drew, Richard Frogley, Tore Hayward and Rishab Sethi

11 Optimal Construction of a Fund of Funds
Petri Hilli, Matti Koivu and Teemu Pennanen

Part III Asset Class Modelling and Quantitative Techniques

12 Mortgage-Backed Securities in a Strategic Asset Allocation Framework
Myles Brennan and Adam Kobor

13 Quantitative Portfolio Strategy – Including US MBS in Global Treasury Portfolios
Lev Dynkin, Jay Hyman and Bruce Phelps

14 Volatility as an Asset Class for Long-Term Investors
Marie Brière, Alexander Burgues and Ombretta Signori

15 A Frequency Domain Methodology for Time Series Modelling
Hens Steehouwer

16 Estimating Mixed Frequency Data: Stochastic Interpolation with Preserved Covariance Structure
Tørres G. Trovik and Couro Kane-Janus

17 Statistical Inference for Sharpe Ratio
Friedrich Schmid and Rafael Schmidt

Index


Part I

Interest Rate Modelling and Forecasting

1 Combining Canadian Interest Rate Forecasts

David Jamieson Bolder and Yuliya Romanyuk

1.1 Introduction and motivation

Model risk is a real concern for financial economists using interest-rate forecasts for the purposes of monetary policy analysis, strategic portfolio allocations, or risk-management decisions. The issue is that one’s analysis is always conditional upon the model selected to describe the uncertainty in the future evolution of financial variables. Moreover, using an alternative model can, and does, lead to different results and possibly different decisions. Selecting a single model is challenging because different models generally perform in varying ways on alternative dimensions, and it is rare that a single model dominates along all possible dimensions.

One possible solution is the use of multiple models. This has the advantage of diversifying away, to a certain extent, the model risk inherent in one’s analysis. It does, however, have some drawbacks. First of all, it is time consuming insofar as one must repeat one’s analysis with each alternative model. In the event one uses a simulation-based algorithm, for example, this can also substantially increase one’s computational burden. A second drawback relates to the interpretation of the results in the context of multiple models. In the event that one employs n models, there will be n separate sets of results and a need to determine the appropriate weight to place on each of them. The combination of these two drawbacks reduces the appeal of employing a number of different models.

A better approach that has some theoretical and empirical support involves combining, or averaging, a number of alternative models to create a single combined model. This is not a new idea. The concept of model averaging has a relatively long history in the forecasting literature. Indeed, there is evidence dating back to Bates and Granger (1969) and Newbold and Granger (1974) suggesting that combination forecasts often outperform individual forecasts. Possible reasons for this are that the models may be incomplete, they may employ different information sets, and they may be biased. Combining forecasts, therefore, acts to offset this incompleteness, bias, and variation in information sets. Combined forecasts may also be enhanced by the covariances between individual forecasts. Thus, even if misspecified models are combined, the combination may, and often will, improve the forecasts (Kapetanios et al. 2006).

Another motivation for model averaging involves the combination of large sets of data. This application is particularly relevant in economics, where there is a literature describing management of large numbers of explanatory variables through factor modelling (see, for example, Moench 2006 and Stock and Watson 2002). We can also combine factor-based models to enrich the set of information used to generate forecasts, as suggested in Koop and Potter (2003) in a Bayesian framework. There is a vast literature on Bayesian model averaging; for a good tutorial, see Hoeting et al. (1999). Draper (1995) is also a useful reference. A number of papers investigate the predictive performance of models combined in a Bayesian setting and find that there are accuracy and economic gains from using combined forecasts (for example, Andersson and Karlsson 2007, Eklund and Karlsson 2007, Ravazzolo et al. 2007, and De Pooter et al. 2007).

However, model averaging is not confined to the Bayesian setting. For example, Diebold and Pauly (1987) and Hendry and Clements (2004) find that combining forecasts adds value in the presence of structural breaks in the frequentist setting. Kapetanios et al. (2005) use a frequentist information-theoretic approach for model combinations and show that it can be a powerful alternative to both Bayesian and factor-based methods. Likewise, in a series of experiments Swanson and Zeng (2001) find that combinations based on the Schwarz Information Criterion perform well relative to other combination methods. Simulation results in Li and Tkacz (2004) suggest that the general practice of combining forecasts, no matter what combination scheme is employed, can yield lower forecast errors on average.

It appears, therefore, that there is compelling evidence supporting the combination of multiple models as well as a rich literature describing alternative combination algorithms. This chapter attempts to explore the implications for the aforementioned financial economist working with multiple models of Canadian interest rates. We ask, and attempt to answer, a simple question: does model averaging work in this context and, if so, which approach works best and most consistently? While the model averaging literature finds its origins in Bayesian econometrics, our analysis considers both frequentist and Bayesian combination schemes. Moreover, the principal averaging criterion used in determining how the models should be combined is their out-of-sample forecasting performance. Simply put, we generally require that the weight on a given model should be larger for those models that forecast better out of sample. This is not uniformly true across the various forecasting algorithms, but it underpins the logic behind most of the nine combination algorithms examined in this chapter.

The rest of the chapter is organized in four main parts. In Section 1.2, we describe the underlying interest-rate models and review their out-of-sample forecasting performance. Next, in Section 1.3, we describe the alternative combination schemes. Section 1.4 evaluates the performance of the different model averaging approaches when applied to Canadian interest-rate data, and Section 1.5 concludes.

1.2 Models

The primary objective of this chapter is to investigate whether combined forecasts improve the accuracy of out-of-sample Canadian interest-rate forecasts. The first step in attaining this objective is to introduce, describe, and compare the individual interest-rate models that we will be combining. Min and Zellner (1993) point out that if models are biased, combined forecasts may perform worse than individual models. Consequently, it is critically important to appraise the models and their forecasts carefully before combining them. The models used in this work are empirically motivated from previous work in this area. In particular, Bolder (2007) and Bolder and Liu (2007) investigate a number of models, including affine models (see, for example, Dai and Singleton 2000, Duffie et al. 2003, Ang and Piazzesi 2003), in which pure-discount bond prices are exponential-affine functions¹ of the state variables, and empirically based models (such as those in Bolder and Gusba 2002 and the extension of the Nelson-Siegel model by Diebold and Li 2003). The results indicate that forecasts of affine term-structure models are inferior to those of empirically motivated models.

Out of these models, we choose those with the best predictive ability, in the hope that their combinations will further improve term-structure forecasts. The four models examined in this chapter, therefore, are the Nelson-Siegel (NS), Exponential Spline (ES), Fourier Series (FS) and a state-space approach (SS). It should be stressed that none of these models are arbitrage-free; in our experience, however, the probability of generating zero-coupon rate forecasts that admit arbitrage is very low.² An attractive feature of the selected models is that they allow us to easily incorporate macroeconomic factors into our analysis of the term structure, assuming a unidirectional effect from macroeconomic factors to the term structure. This has a documented effect of increasing forecasting efficiency. We do not model feedback between macroeconomic and yield factors, since Diebold et al. (2006) and Ang et al. (2007) find that the causality from macroeconomic factors to yields is much stronger than that from yields to macroeconomic factors.

The models have the following basic structure:

$$
\begin{aligned}
Z(t, \tau) &= G(t, \tau, Y_t), \\
Y_t &= C + \sum_{l=1}^{L} F_l Y_{t-l} + \nu_t, \qquad \nu_t \sim N(0, \Omega)
\end{aligned}
\tag{1}
$$


Here, $Z(t, \tau)$ denotes the zero-coupon rate at time $t$ for maturity $\tau$, $(\tau - t)$ the term to maturity, and $G$ the mapping from state variables (factors) $Y$ to zero-coupon rates. We model the vector $Y_t$ by a VAR($L$) with $L = 2$, which we find works best for our purposes. For the ES and FS models, $Z(t, \tau) = -\ln(P(t, \tau))/(\tau - t)$ and $P(t, \tau) = \sum_{k=1}^{n} Y_{k,t}\, g_k(\tau - t)$, where $P(t, \tau)$ is the price of a zero-coupon bond at $t$ for maturity $\tau$. In the ES model, $g_k(\tau - t)$ are orthogonalized exponential functions; in the FS model, they are trigonometric basis functions (see Bolder and Gusba 2002 for details).

For all models except SS, we find the factors $Y_t$ at each time $t$ by minimizing the squared distance between $P(t, \tau)$ above and the observed bond prices. We augment the factors with three macroeconomic variables (the output gap $x_t$, consumer price inflation $\pi_t$, and the overnight rate $r_t$) and collect these to form a time series. This procedure and the estimation of model-specific parameters for the NS, ES and FS models are given in Bolder and Liu (2007) and the references therein. In the SS model, we simply regress the vector of zero-coupon rates $Z_t$ on the first three principal components, extracted from the observed term structure up to time $t$, and the three contemporaneous macroeconomic variables. Note that only the SS model allows for a direct connection between the macroeconomic factors and the zero-coupon rates. In the other three models, only the term-structure factors determine the yields or bond prices: in the mapping from state variables to bond prices or zero-coupon rates, the coefficients for macroeconomic factors are set to zero.³
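The transition equation in (1) can be illustrated with a short Python sketch. It fits a VAR(2) by ordinary least squares and iterates the fitted recursion forward to obtain an h-step factor forecast. The function names, the OLS estimator and the array layout are assumptions made for illustration only; the chapter's own estimation details are in the references cited above.

```python
import numpy as np


def fit_var2(Y):
    """Least-squares fit of Y_t = C + F1 Y_{t-1} + F2 Y_{t-2} + nu_t.

    Y has shape (T, d), one row per month. Returns C (d,), F1 (d, d), F2 (d, d).
    """
    Y = np.asarray(Y, dtype=float)
    T, d = Y.shape
    # Regressors for t = 2, ..., T-1: a constant, the first lag and the second lag.
    X = np.hstack([np.ones((T - 2, 1)), Y[1:-1], Y[:-2]])
    B, *_ = np.linalg.lstsq(X, Y[2:], rcond=None)
    C = B[0]
    F1 = B[1:1 + d].T
    F2 = B[1 + d:].T
    return C, F1, F2


def forecast_var2(Y, C, F1, F2, h):
    """Iterate the recursion h steps ahead, starting from the last two rows of Y."""
    prev2 = np.asarray(Y[-2], dtype=float)
    prev1 = np.asarray(Y[-1], dtype=float)
    for _ in range(h):
        prev2, prev1 = prev1, C + F1 @ prev1 + F2 @ prev2
    return prev1
```

Mapping the forecasted factors back to zero-coupon rates would then go through the model-specific function $G$, which this sketch does not attempt to reproduce.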

1.2.1 A few words about Bayesian frameworks

The task of selecting appropriate parameters for the prior distributions is not a trivial one, and a number of papers discuss this issue (see, for instance, Litterman 1986, Kadiyala and Karlsson 1997, Raftery et al. 1997, Fernandez et al. 2001). We have tried a variety of specifications, including those in the references above as well as some calibrated ones. We have found that for our purposes, the g-prior (Zellner 1986) appears to produce the most satisfactory results. We estimate the parameters for the g-prior from the in-sample data. While this may not be the optimal way to estimate a prior distribution, and ideally we would like to set aside a part of our data just for this purpose, we are constrained by the length of the available time series. First, we have to forecast for relatively long horizons and thus set aside a large proportion of the time series for the out-of-sample testing. Second, we have to leave some part of the time series to train model combinations. Third, our models are multidimensional and require a sizeable portion of the data just for estimation. Finally, it is difficult to have a strong prior belief, independent of the observed data, about the behaviour of parameters in high-dimensional models. For these reasons, we estimate the g-prior and the posterior distribution using the same in-sample data.


While our models have the general structure of state-space models, there are differences. We assume that zero-coupon rates $Z$ in observation equations are observed without error for all models except the SS. To estimate the models in a full Bayesian setup, we could have introduced an error term in each of these equations and then we would have had to use a filter to extract the unobserved state variables $Y$. However, because FS and ES models are highly nonlinear (and the dimensions of the corresponding factors are high), such a procedure would be very computationally heavy and might not be optimal.⁴ Instead of this, we take the state variables as given (from Bolder 2007) and estimate the transition VAR(2) equations in the Bayesian framework for each of the models. This facilitates computations greatly, because we can use existing analytic results for VAR($L$) models (for details and derivations, please refer to the appendix in Bolder and Romanyuk 2008).

We use transition equations to determine weights for Bayesian model averaging schemes. For consistency with the other models, we compute the weights based on the transition equation of the SS model, even though the observation equation for the SS model is a regression with an error term. Technically speaking, this approach does not give proper Bayesian posterior model probabilities for the four models that are competing to explain the observed term structure, since the data $y$ has to be the same (with the same observed zero-coupon rates $Z$) and the explanatory variables different depending on the model $M_k$. In our case, the $y$ data differs for each transition equation: it is the NS, ES, FS or SS factors. So in effect we are assigning weights to each model in the forecast combination based on how well the transition equations capture the trends in the underlying factors of each model. In light of our assumption that observation equations do not contribute any new information since they have no error term,⁵ this approach appears reasonable.

1.2.2 Forecasts of individual models

In practice, we do not observe zero-coupon rates. We do not even observe prices of pure-discount bonds. We must use the observed prices of coupon-bearing bonds and some model for zero-coupon rates to extract the zero-coupon term structure. A number of alternative approaches for extracting zero-coupon rates from government bond prices are found in Bolder and Gusba (2002). Figure 1.1 shows the Canadian term structure of zero-coupon rates from January 1973 to August 2007. As in many industrialized economies, the Canadian term structure is characterized by periods of high and volatile rates in the late 1980s and the 1990s. Moreover, starting in 2005, the term structure becomes rather flat. Any single model will generally have difficulties describing and forecasting both volatile and stable periods equally well.

To evaluate the forecasts of the four models, we use monthly data for bond prices for different tenors and macroeconomic variables (output gap, consumer price inflation, and overnight rate) from January 1973 to August 2007. This constitutes 416 observations. We take the first 120 points as our initial in-sample estimation data. Once the models are estimated, we make out-of-sample interest rate forecasts for horizons h = 1, 12, 24, and 36 months at time T = 120 (the information set up to time T will be denoted by the filtration $\mathcal{F}_T$). Next, for each model $M_k$, k = 1, ..., 4, we evaluate the N × 1 vector of forecasted zero-coupon rates $\hat{Z}_{T+h}^{M_k} = E(Z_{T+h} \mid \mathcal{F}_T, M_k)$ against the N × 1 vector of actual zero-coupon rates $Z_{T+h}$, extracted from observed bond prices:

$$
e_{T+h}^{M_k} = \sqrt{\frac{\left(Z_{T+h} - \hat{Z}_{T+h}^{M_k}\right)'\left(Z_{T+h} - \hat{Z}_{T+h}^{M_k}\right)}{N}}
\tag{2}
$$

A schematic describing the various steps in the determination of these overlapping forecasts is found in Figure 1.2.
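The rolling evaluation can be sketched in a few lines of Python. The sketch computes the forecast error in (2) as a root mean squared deviation across tenors and loops over forecast origins; the function names, the array layout and the `forecaster` callback are hypothetical, and the random-walk forecaster simply returns the last observed curve.

```python
import numpy as np


def forecast_error(z_actual, z_forecast):
    """Equation (2): root of the mean squared deviation across the N tenors."""
    d = np.asarray(z_actual, dtype=float) - np.asarray(z_forecast, dtype=float)
    return float(np.sqrt(d @ d / d.size))


def rolling_errors(Z, forecaster, h, start=120):
    """Errors for forecast origins T = start, ..., len(Z) - h (1-based, as in the text).

    Z is a (num_months, N) array of zero-coupon rates. forecaster(history, h)
    must return an N-vector forecast built only from the rows in history.
    """
    errors = []
    for T in range(start, len(Z) - h + 1):
        history = Z[:T]          # observations 1, ..., T
        actual = Z[T + h - 1]    # observation T + h
        errors.append(forecast_error(actual, forecaster(history, h)))
    return np.array(errors)


def random_walk(history, h):
    """Random-walk benchmark: the forecast is the last observed curve."""
    return history[-1]
```

In this layout each of the four models would supply its own `forecaster`, re-estimated on `history` at every origin.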

Figure 1.1 Zero-coupon rates from January 1973 to August 2007. The rates are extracted from Government of Canada treasury bill and nominal bond prices using a nine-factor exponential spline model described in Bolder and Gusba (2002). [Surface plot; axes: Tenor (years), Time (years), Zero-coupon rates.]

We subsequently re-estimate each model for each T ∈ [121, 416 − h] in-sample points, calculating the corresponding forecast errors for each model. Figure 1.3 shows the root mean squared deviations between the actual and forecasted zero-coupon rates relative to the errors from random walk forecasts using a rolling window of 48 observations.⁶ We include the Root Mean Squared Error (RMSE) for the random walk model as a reference because, in the term-structure literature, it is frequently used as a benchmark model and it is not easy to beat, at least for affine models (see, for example, Duffee 2002 and Ang and Piazzesi 2003). Note that the forecasts of the random walk are just the last observed zero-coupon rates.

From Figure 1.3, we observe that for all horizons, there are periods when the models outperform the random walk, but none of the models seem to outperform the random walk on average (over the sample period). As one would expect, the forecasting performance of all four models deteriorates as the forecasting horizon increases. For horizons beyond one month, all models have difficulties predicting interest rates during the period of high interest rates in the early 1990s. The models also struggle to capture the flat term structure observed in the early 2000s; however, the FS and the ES models appear to be more successful at this than the NS and the SS models. While all models perform similarly for the short-term horizon, certain patterns emerge at longer horizons: the NS and SS models tend to move together, as do the FS and ES models.⁷ The heterogeneity between the models is a strong motivating factor for model averaging. In particular, it suggests that there is some potential for combining models to complement the information carried by each model and thereby produce superior forecasts.
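One plausible way to build the relative measure plotted in Figure 1.3 is sketched below in Python: a rolling RMSE of each model's forecast errors divided by the same quantity for the random walk. The exact construction used for the figure is not spelled out in the text, so the ratio of rolling RMSEs, the function name and the input layout are all assumptions.

```python
import numpy as np


def relative_rolling_rmse(model_errors, rw_errors, window=48):
    """Ratio of rolling RMSEs; values below 1 mean the model beat the random walk."""
    m = np.asarray(model_errors, dtype=float) ** 2
    r = np.asarray(rw_errors, dtype=float) ** 2
    kernel = np.ones(window) / window
    # mode="valid" keeps only windows that lie completely inside the sample,
    # so both inputs should contain at least `window` errors.
    m_rmse = np.sqrt(np.convolve(m, kernel, mode="valid"))
    r_rmse = np.sqrt(np.convolve(r, kernel, mode="valid"))
    return m_rmse / r_rmse
```

Under this convention the random walk itself plots as a flat line at 1, which matches its role as the reference series.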

Figure 1.2 Forecasting interest rates. This schematic describes the steps involved in generating rolling interest-rate forecasts, which, in this work, act as the principal input for the parametrization of our model-averaging schemes.

1. Set $i = s$ and $k = 1$.
2. Formulate $E_{M_k}(Z_{t_{i+h}} \mid \mathcal{F}_{t_i})$.
3. Observe $Z_{t_{i+h}}$.
4. Compute $e_{t_{i+h}}^{M_k} = Z_{t_{i+h}} - E_{M_k}(Z_{t_{i+h}} \mid \mathcal{F}_{t_i})$.
5. Repeat steps 1–3 for k = 2, ..., n models.
6. Repeat steps 1–4 for i = s + 1, ..., T − h observations.
7. Repeat steps 1–5 for h = 1, ..., H months.

Starting data: the data points $\{x_{t_1}, \ldots, x_{t_s}\}$ are used for the first forecasts. Rolling forecasts: we continue to update the data set and perform new forecasts at $t_{s+1}, t_{s+2}, \ldots, t_T$.

Figure 1.4 shows the performance of our models estimated in the Bayesian setting relative to the random walk. Comparing with Figure 1.3, we see that Bayesian forecasts are virtually identical to frequentist forecasts. We do not test whether the Bayesian forecasts are statistically significantly different from the frequentist ones, since we are not comparing frequentist versus Bayesian estimation methods. We estimate the models in the Bayesian setting only because we need Bayesian forecast distributions to obtain weights for Bayesian model averaging schemes.

1.3 Model combinations

In this work, we investigate nine alternative model combination schemes, which we denote C1 to C9. They are Equal Weights, Inverse Error, Simple OLS, Factor OLS, MARS, Predictive Likelihood, Marginal Model Likelihood, Log Predictive Likelihood, and Log Marginal Model Likelihood. We refer to the first five schemes as ad hoc, and the last four as Bayesian.⁸ Our goal is to calculate weights $w_{k,h}^{C_j}$ for each model $M_k$, horizon $h$, and combination $C_j$, with $k = 1, \ldots, 4$, $j = 1, \ldots, 9$, and $h = 1, 12, 24, 36$ months. Conceptually, therefore, different model averaging schemes merely amount to alternative methods for determining the amount of weight (i.e. the $w$’s) to place on each individual forecast.

Figure 1.3 Predictive performance for frequentist forecasts relative to random walk. [Four panels: 1-, 12-, 24- and 36-month horizons (BASE); series: RW, NS, ES, FS, SS; horizontal axis from 1990 to 2005.]

Models estimated in the frequentist setting produce point forecasts, whereas in the Bayesian setting we obtain forecast densities. There are two approaches to combine Bayesian forecasts: the first refers to averaging the individual densities directly (Mitchell and Hall 2005, Hall and Mitchell 2007, and Kapetanios et al. 2005), while the second refers to combining the moments of individual densities (Clyde and George 2004). For example, as indicated in the last article, a natural point prediction at time T for a zero-coupon rate vector h steps ahead is

$$
\hat{Z}_{T+h} = E(Z_{T+h} \mid \mathcal{F}_T) = \sum_{k=1}^{4} w_{k,h}^{C_j}\, E(Z_{T+h} \mid \mathcal{F}_T, M_k) = \sum_{k=1}^{4} w_{k,h}^{C_j}\, \hat{Z}_{T+h}^{M_k}
\tag{3}
$$

where $\hat{Z}_{T+h}^{M_k}$ are the means of individual forecast densities.

1.3.1 C1: Equal weights

This is the simplest possible combination scheme. Each individual forecast receives an equal weight as follows:

$$
w_{k,h}^{C_1} = \frac{1}{n}
\tag{4}
$$

Figure 1.4 Predictive performance for Bayesian forecasts relative to random walk. [Four panels: 1-, 12-, 24- and 36-month horizons; series: RW, NS, ES, FS, SS; horizontal axis from 1990 to 2005.]

While the Equal Weights combination is very simple, it is a standard benchmark for the evaluation of alternative model-averaging algorithms precisely because it performs quite well relative to individual forecasts and more complicated schemes (see, for example, Hendry and Clements 2004 and Timmermann 2006).
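Equations (3) and (4) together amount to a weighted sum of the individual forecast vectors. A minimal Python sketch follows; the function names and the convention that forecasts are stacked with one row per model are assumptions for illustration.

```python
import numpy as np


def equal_weights(n_models):
    """Equation (4): every model receives weight 1/n."""
    return np.full(n_models, 1.0 / n_models)


def combine(forecasts, weights):
    """Equation (3): weighted sum of the individual forecasts.

    forecasts has shape (n_models, N), one row per model and one column
    per tenor; weights has shape (n_models,). Returns an N-vector.
    """
    return np.asarray(weights, dtype=float) @ np.asarray(forecasts, dtype=float)
```

The same `combine` function applies to any linear scheme that uses one weight per model and horizon; only the rule producing `weights` changes.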

1.3.2 C2: Inverse error

In this combination scheme, we assign higher weights to models whose forecasts perform better out of sample. We set aside M points from our sample to evaluate the predictive performance of each model, and then we average the forecast errors over these M points. More specifically, we estimate the models using T = 120 initial points, make h-step forecasts and evaluate each model’s performance by calculating the forecast error in equation (2). Then we repeat these steps for each T ∈ [121, 120 + M − h]. This procedure yields M − h + 1 forecast errors, which we average. The resulting weights are given by

$$
w_{k,h}^{C_2} = \frac{\left[\dfrac{1}{M-h+1} \displaystyle\sum_{T=120}^{120+M-h} e_{T+h}^{M_k}\right]^{-1}}{\displaystyle\sum_{k'=1}^{4} \left[\dfrac{1}{M-h+1} \displaystyle\sum_{T=120}^{120+M-h} e_{T+h}^{M_{k'}}\right]^{-1}}
\tag{5}
$$

This combination scheme is also simple, but it differs from the Equal Weights approach in that it requires data. We use M observations to train the weights for this and all subsequent model combinations depending on the evaluation approach. Indeed, the Equal Weights combination is the only technique that does not require a training period.
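The inverse-error rule can be written compactly in Python. In this sketch, `errors` is a hypothetical array holding the training-period forecast errors for one horizon, with one row per forecast origin and one column per model.

```python
import numpy as np


def inverse_error_weights(errors):
    """Weights proportional to the inverse of each model's average forecast error.

    errors has shape (n_origins, n_models) and holds the forecast errors
    from equation (2) over the training period, for a single horizon h.
    """
    mean_error = np.asarray(errors, dtype=float).mean(axis=0)
    inverse = 1.0 / mean_error
    return inverse / inverse.sum()
```

By construction the weights are positive and sum to one, so a model with half the average error of another receives twice its weight.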

1.3.3 C3: Simple OLS

Here we combine the forecasts from individual models using simple OLS regression coefficients as weights. First, we estimate the models and make h-step forecasts for each T ∈ [120, 120 + M − h]. We treat these M − h + 1 forecasts $\hat{Z}_{T+h}^{M_k}$ as realizations of four predictor variables, and for each tenor i ∈ [1, N], we regress⁹ the actual zero-coupon rates $Z_{T+h}$ against these individual forecasts for the respective tenor i:

$$
Z_{T+h}(i) = \beta_{0,h}(i) + \sum_{k=1}^{4} \beta_{k,h}(i)\, \hat{Z}_{T+h}^{M_k}(i)
\tag{6}
$$

The weights for the simple OLS scheme are given by

$$
w_{k,h}^{C_3}(i) = \beta_{k,h}(i)
\tag{7}
$$

This type of combination scheme is very flexible, since the weights are unconstrained. What this implies is that one can place negative weights on certain forecasts and significant positive weights on other forecasts. As a consequence of this flexibility, this approach turns out to be our best-performing combination. Its flexibility is not, however, without a cost, since we find the approach can be sensitive to the training period. We return to these points later in the chapter.
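For a single tenor and horizon, the regression in (6) is an ordinary least-squares fit with an intercept. The Python sketch below uses `numpy.linalg.lstsq`; the function name and input layout are assumptions for illustration, and the regression would be repeated for each tenor and horizon.

```python
import numpy as np


def ols_weights(actual, forecasts):
    """Regress realized rates on the individual model forecasts (one tenor, one horizon).

    actual has shape (n_obs,); forecasts has shape (n_obs, n_models).
    Returns the intercept and the vector of unconstrained weights.
    """
    actual = np.asarray(actual, dtype=float)
    forecasts = np.asarray(forecasts, dtype=float)
    X = np.column_stack([np.ones(len(actual)), forecasts])
    beta, *_ = np.linalg.lstsq(X, actual, rcond=None)
    return beta[0], beta[1:]
```

Nothing forces the fitted weights to be positive or to sum to one, which is the flexibility, and the sensitivity to the training period, described above.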

1.3.4 C4: Factor OLS

A drawback of the simple OLS scheme is that we estimate the weights separately for a set of prespecified zero-coupon tenors and then interpolate for the remaining tenors. This leads to a fairly large number of regressions. To reduce the number of parameters, therefore, we construct a lower-dimensional alternative, which we term the factor OLS scheme.

First, we perform a basic decomposition of the zero-coupon term structure as follows:

$$
\begin{aligned}
Y_t(1) &= Z_{t,15y} && \text{(Level)} \\
Y_t(2) &= Z_{t,15y} - Z_{t,3m} && \text{(Slope)} \\
Y_t(3) &= 2 Z_{t,2y} - \left(Z_{t,3m} + Z_{t,15y}\right) && \text{(Curve)}
\end{aligned}
\tag{8}
$$

Here 3m, 2y and 15y refer to the 3-month bill, and 2- and 15-year bonds respectively. Clearly, this approach is motivated by the well-known notions of level, slope and curvature variables stemming from principal components analysis.

Now we have only three components from which we build the term structure of zero-coupon yields. To obtain the OLS weights, we regress¹⁰ the actual l-th factor $Y_{T+h}(l)$, l = 1, 2, 3, on the factors forecasted by each model, $\hat{Y}_{T+h}^{M_k}(l)$:

$$
Y_{T+h}(l) = \beta_{0,h}(l) + \sum_{k=1}^{4} \beta_{k,h}(l)\, \hat{Y}_{T+h}^{M_k}(l)
\tag{9}
$$

The weights for the factor OLS scheme are

$$
w_{k,h}^{C_4}(l) = \beta_{k,h}(l)
\tag{10}
$$

Once we have the combined forecasted factors $\hat{Y}_{T+h}(l)$, we invert the decomposition iteratively as follows:

$$
\begin{aligned}
Z_{t,15y} &= Y_t(1) \\
Z_{t,3m} &= Y_t(1) - Y_t(2) \\
Z_{t,2y} &= \frac{Y_t(3) + 2 Y_t(1) - Y_t(2)}{2}
\end{aligned}
\tag{11}
$$


The advantage of this approach is that it reduces the number of regressions and thus of estimated parameters. Its disadvantage is that we are now forced to interpolate the entire curve from only three points. In some cases, the error from such an approximation may be substantial.
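The decomposition in (8) and its inversion in (11) can be checked with a short round trip. This is only a sketch: the function names are hypothetical and the three rates are made-up values, not data from the chapter.

```python
import numpy as np

def to_factors(z_3m, z_2y, z_15y):
    """Level, slope and curvature factors as in equation (8)."""
    level = z_15y
    slope = z_15y - z_3m
    curve = 2.0 * z_2y - (z_3m + z_15y)
    return level, slope, curve

def from_factors(level, slope, curve):
    """Recover the three zero-coupon rates as in equation (11)."""
    z_15y = level
    z_3m = level - slope
    z_2y = (curve + 2.0 * level - slope) / 2.0
    return z_3m, z_2y, z_15y

rates = (0.030, 0.035, 0.050)      # illustrative 3m, 2y and 15y rates
factors = to_factors(*rates)
recovered = from_factors(*factors)
```

The round trip is exact for the three tenors; any approximation error arises only when the rest of the curve is interpolated from these three points.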

1.3.5 C5: MARS

The previous four schemes are relatively straightforward. For the purposes of comparison, however, we opted to include a more mathematically complex approach to combine the forecasts from individual models. The approach we selected is termed Multivariate Adaptive Regression Splines (MARS), which is a function-approximation technique based on the recursive-partitioning algorithm. The basic idea behind this technique is to define piecewise linear spline functions on an overlapping partition of the domain (Bolder and Rubin 2007 provide a detailed description of the MARS algorithm). As such, the MARS combination scheme can be considered an example of a mathematically complicated nonparametric, nonlinear aggregation of our four alternative models.

The combination is trained on a set of $M - h + 1$ realized zero-coupon rates $Z_{T+h}$ and their forecasts $\hat{Z}^{M_k}_{T+h}$, $T \in [120, 120 + M - h]$, for all tenors, horizons and models. Once trained, we combine the individual forecasts according to the MARS algorithm. Note that, unlike in the previous four schemes, we cannot write the combined forecast $\hat{Z}_{T+h}$ as a linear combination of weights $w^{C_5}_{k,h}$ and individual forecasts $\hat{Z}^{M_k}_{T+h}$, due to the nonlinearity and complexity of the MARS scheme.
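To give a feel for the piecewise linear spline functions mentioned above, the sketch below fits a pair of reflected hinge functions with a single, fixed knot by least squares. It is not the MARS algorithm itself, which searches over knots and variables and prunes terms; the knot location, function name and target are illustrative assumptions.

```python
import numpy as np

def hinge_basis(x, knot):
    """Columns: intercept, max(0, x - knot), max(0, knot - x)."""
    return np.column_stack([
        np.ones_like(x),
        np.maximum(0.0, x - knot),
        np.maximum(0.0, knot - x),
    ])

x = np.linspace(0.0, 2.0, 41)
y = np.where(x < 1.0, 1.0 - x, 2.0 * (x - 1.0))   # target with a kink at x = 1
B = hinge_basis(x, knot=1.0)
coef, *_ = np.linalg.lstsq(B, y, rcond=None)      # linear fit in the hinge basis
fitted = B @ coef
```

The fit is linear in the basis functions but nonlinear in x, which is why a MARS combination cannot be written as fixed weights on the individual forecasts.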

1.3.6 C6: Predictive likelihood

In our Bayesian model averaging schemes, the weights are some version of posterior model probabilities. Theoretically, the posterior model probabilities $P(M_k \mid Y)$ are

$$
P(M_k \mid Y) = \frac{p(Y, M_k)}{\sum_{j=1}^{4} p(Y, M_j)} = \frac{p(Y \mid M_k)\, P(M_k)}{\sum_{j=1}^{n} p(Y \mid M_j)\, P(M_j)} \tag{12}
$$

We regard all of the models as equally likely a priori, so we take prior model probabilities $P(M_k) = 1/n$ (here $n = 4$).
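As a hedged illustration of (12), posterior model probabilities can be computed from log marginal likelihoods as sketched below. Working in logs and subtracting the maximum before exponentiating is a numerical convenience assumed here, not something the chapter describes; the function name and the likelihood values are invented for the example.

```python
import numpy as np

def posterior_model_probs(log_marginal_liks, priors=None):
    """Posterior model probabilities from log marginal likelihoods."""
    log_ml = np.asarray(log_marginal_liks, dtype=float)
    n = log_ml.size
    if priors is None:
        priors = np.full(n, 1.0 / n)       # equal prior probabilities, 1/n
    log_post = log_ml + np.log(priors)
    log_post -= log_post.max()             # avoid underflow when exponentiating
    post = np.exp(log_post)
    return post / post.sum()

# Made-up log marginal likelihoods for four models.
probs = posterior_model_probs([-1050.0, -1052.0, -1049.0, -1060.0])
```

With equal priors the prior terms cancel, so the probabilities depend only on differences between the log likelihoods. Differences of a few log points are already enough to concentrate almost all of the weight on one model.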

The quantity $p(Y \mid M_k)$ is the marginal model likelihood for model $M_k$, which measures in-sample fit and fit to the prior distribution only. However, out-of-sample forecasting ability is our main criterion for selecting models and evaluating model combinations (Geweke and Whiteman 2006 indicate that ‘a model is as good as its predictions’). This and other recent papers (for example, Ravazzolo et al. 2007, Eklund and Karlsson 2007, and Andersson and Karlsson 2007) use predictive likelihood, which is the predictive density evaluated at the realized value(s), instead of the marginal model likelihood, to average models in a Bayesian setting (see note 11). Following this stream of literature to obtain the weights for combination C6, for each model $M_k$ and horizon $h$, we (a) formulate $E_{M_k}(Y_{T+h} \mid F_T) = \hat{Y}^{M_k}_{T+h}$, (b) formulate $p(Y_T \mid M_k, F_{T-h})$, (c) observe $Y_T$ and evaluate $p(Y_T \mid M_k, F_{T-h})$, and (d) use $p(Y_T \mid M_k, F_{T-h})$ to combine $E_{M_k}(Y_{T+h} \mid F_T)$.

Substituting the predictive likelihood into (12) in place of the marginal model likelihood, we obtain the weights for the predictive likelihood combination. As with the previous four combinations, we calculate the weights for each $T \in [120, 120 + M - h]$ and average the resulting $M - h + 1$ weights to get the fixed weights that will be used to evaluate model combinations out of sample:

$$
w^{C_6}_{k,h} = \frac{1}{M - h + 1} \sum_{T=120}^{120+M-h} \frac{p(Y_T \mid M_k, F_{T-h})}{\sum_{j=1}^{4} p(Y_T \mid M_j, F_{T-h})} \tag{13}
$$

Strictly speaking, such weights are not proper posterior model probabilities, but their advantage is measuring the out-of-sample predictive ability.
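The averaging in (13) amounts to normalizing the predictive likelihoods across models at each training date and then taking the mean over dates. A small sketch follows, with a hypothetical function name and made-up likelihood values for two training dates and four models.

```python
import numpy as np

def averaged_likelihood_weights(lik):
    """Average per-date normalized likelihoods over the training dates.

    lik : shape (n_dates, K); entry [t, k] is the predictive likelihood of
          model k evaluated at the value realized on training date t.
    """
    lik = np.asarray(lik, dtype=float)
    per_date = lik / lik.sum(axis=1, keepdims=True)   # normalize across models
    return per_date.mean(axis=0)                      # average over dates

lik = np.array([[0.2, 0.2, 0.4, 0.2],
                [0.1, 0.3, 0.3, 0.3]])
w_c6 = averaged_likelihood_weights(lik)
```

Because each row is normalized before averaging, the resulting weights are nonnegative and sum to one.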

1.3.7 C7: Marginal model likelihood

Even though marginal model likelihood evaluates in-sample fit only, we use it as one of our model combination schemes, since this is the classical Bayesian model averaging approach (see, for instance, Madigan and Raftery 1994 and Kass and Raftery 1995). To generate a combined forecast, we calculate the marginal model likelihood $p(Y_T \mid M_k)$ for model $M_k$ using $T$ in-sample data points. The weight for each model is its posterior probability. Then we average the weights for each $T \in [120, 120 + M - h]$, as with previous model combinations, to obtain the weights for the marginal model likelihood combination:

$$
w^{C_7}_{k} = \frac{1}{M - h + 1} \sum_{T=120}^{120+M-h} \frac{p(Y_T \mid M_k)}{\sum_{j=1}^{4} p(Y_T \mid M_j)} \tag{14}
$$

Unlike with weights based on the predictive likelihood, the weights based on the marginal model likelihood do not depend on the forecasting horizon h.


1.3.8 C8 and C9: Log likelihood weights

It turns out that in practice the weights based on marginal model likelihood and predictive likelihood vary significantly depending on the estimation period (see Bolder and Romanyuk 2008). To obtain a smoother set of weights based on the marginal model (or predictive) likelihood, we take the logarithms of the marginal model (predictive) likelihood values and transform them linearly into weights. We want these weights $w_k$, $k = 1, \ldots, 4$, to satisfy $w_k \in (0, 1)$ and $\sum_{k=1}^{4} w_k = 1$, and the relative distance between the weights should be preserved by the transformation.

One possibility for such a transformation is to let $a$ be the lower bound of the interval on which our observed log likelihoods lie, order the log likelihoods in ascending order, and specify that $[\log(p(Y_T \mid M_i)) - a] / [\log(p(Y_T \mid M_j)) - a] = w_{i,T} / w_{j,T}$ for $i = 1, 2, 3$, $j = 2, 3, 4$, with $\sum_{k=1}^{4} w_{k,T} = 1$. For marginal model likelihoods (alternatively, we could have used logs of predictive likelihoods), the set of weights

$$
w_{k,T} = \frac{\log(p(Y_T \mid M_k)) - a}{\sum_{j=1}^{4} \left( \log(p(Y_T \mid M_j)) - a \right)} \tag{15}
$$

solves the linear system and satisfies the desired properties for weights stated above. Now the only tricky part is to choose $a$ appropriately (see note 12). We take $a = \log(p(Y_T \mid M_1)) - s$, where $s$ is the standard deviation of the log marginal model (predictive) likelihoods from their mean.
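A short sketch of the transformation in (15) is given below. Since the log likelihoods are ordered in ascending order, $M_1$ is taken to be the model with the smallest log likelihood, so the code uses the minimum. Using the population standard deviation (`ddof=0`) for $s$ is an assumption, as the text does not say which estimator is meant; the function name and input values are illustrative.

```python
import numpy as np

def log_likelihood_weights(log_liks):
    """Linear transformation of log likelihoods into weights, as in (15)."""
    ll = np.asarray(log_liks, dtype=float)
    a = ll.min() - ll.std()        # lower bound; population std is an assumption
    shifted = ll - a               # strictly positive unless all values are equal
    return shifted / shifted.sum()

# Made-up log likelihoods for four models, in ascending order.
ll = np.array([-1060.0, -1052.0, -1050.0, -1049.0])
w = log_likelihood_weights(ll)
```

Compared with exponentiating the log likelihoods, this linear mapping keeps every model's weight away from zero, which is consistent with the smoother weights the text is after.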

Figures 1.5 and 1.6 show the log predictive likelihood and log marginal model likelihood weights, respectively, for $T \in [120, 120 + M - h]$ and $M = 120$. They are more stable than the raw predictive likelihood and marginal model likelihood weights. Note that in Figure 1.6 the weights are the same for all four forecasting horizons, since log marginal model likelihood weights are independent of the forecasting horizon.

Finally, we average the weights over the training period. For the log marginal model likelihood combination, the weights are

$$
w^{C_8}_{k} = \frac{1}{M - h + 1} \sum_{T=120}^{120+M-h} \frac{\log(p(Y_T \mid M_k)) - a}{\sum_{j=1}^{4} \left( \log(p(Y_T \mid M_j)) - a \right)} \tag{16}
$$

For the log predictive likelihood combination, we have

$$
w^{C_9}_{k,h} = \frac{1}{M - h + 1} \sum_{T=120}^{120+M-h} \frac{\log(p(Y_T \mid M_k, F_{T-h})) - a}{\sum_{j=1}^{4} \left( \log(p(Y_T \mid M_j, F_{T-h})) - a \right)} \tag{17}
$$

Figure 1.5 Log predictive likelihood weights over the training period of 120 points
[Four panels: 1-, 12-, 24- and 36-month horizons. Horizontal axis: 1984 to 1992. Vertical axis: weight, 0 to 1. Series: NS, ES, FS, SS.]

Figure 1.6 Log marginal model likelihood weights over the training period of 120 points
[One panel: all horizons. Horizontal axis: 1983 to 1992. Vertical axis: weight, 0 to 1. Series: sMML NS, sMML ES, sMML FS, sMML SS.]

1.4 Evaluating model-combination schemes

We use two methods to evaluate the performance of the nine model-combination schemes described above. We call these approaches dynamic and static model averaging. For both we require the following ingredients: forecasts from individual models to be combined, a subset of the data to train the weights for model combinations, and the remainder of the data to evaluate the out-of-sample forecasts of different model combinations.

We generate individual forecasts for our models, $\hat{Z}^{M_k}_{T+h}$, $k = 1, \ldots, 4$, for $T \in [120, 416 - h]$, as described in Section 1.2.2, and set these aside. Next we take a subset of these forecasts of length $M$ to evaluate the predictive ability of the models and use this information to obtain the weights for model combinations. In Section 1.3 we refer to this as training the weights. The last observation used in the training period to evaluate individual forecasts is $120 + M$. Starting at this point, $T = 120 + M$, we can combine the models using their respective weights and evaluate the out-of-sample predictive ability of the combinations using the remainder of the sample. That is, we calculate the forecast error

$$
e^{C_j}_{T+h} = \frac{\left( Z_{T+h} - \hat{Z}^{C_j}_{T+h} \right)' \left( Z_{T+h} - \hat{Z}^{C_j}_{T+h} \right)}{N} \tag{18}
$$

for $j = 1, \ldots, 9$ model combinations at points $T \in [120 + M, 416 - h]$. Schematics with a graphic description of the dynamic and static forecasting approaches are found in Figures 1.7 and 1.8.
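The forecast error in (18) is a scaled sum of squared deviations between the realized and the combined forecast curves. In the sketch below, $N$ is assumed to be the number of tenors in the curve, which the passage does not state explicitly; the function name and the rates are illustrative.

```python
import numpy as np

def forecast_error(z_actual, z_forecast):
    """Sum of squared deviations across tenors, divided by N (assumed: number of tenors)."""
    diff = np.asarray(z_actual, dtype=float) - np.asarray(z_forecast, dtype=float)
    return float(diff @ diff) / diff.size

# Two-tenor example with made-up rates.
err = forecast_error([0.030, 0.040], [0.031, 0.038])
```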

The key difference between the two methods for evaluating the combinations is their treatment of the training period. In the dynamic approach, the parameters of the model averaging scheme are updated gradually as we move forward in time. In this way, the most recent information regarding the forecasting performance of the models is incorporated in the model-averaging algorithm. The static approach, however, involves only a single computation of the model-combination parameters. As we move through time, therefore, the parameters are not updated to incorporate the most recent forecasting performance. Such evaluation is not the typical approach used in the forecasting literature, but is nonetheless appropriate for examining the usefulness of a given model-combination scheme for simulation analysis, where one does not have the liberty of continuously updating one’s information set. We expect that with a limited training set, the static forecast combinations should underperform their dynamic counterparts.

1.4.1 Dynamic model averaging

The idea with dynamic model averaging is to use as much recent information as possible to train the weights for model combinations. We consequently update the training period as new information arrives: starting with $M = 120$, we increase the training period until we run out of data (the last value for $M$ is $416 - h$). The steps involved are given in Figure 1.7.

Figure 1.7 Dynamic model averaging. This schematic describes the steps involved in dynamic model averaging, whereby the parameters for each model-averaging algorithm are updated as new information becomes available.

[Timeline 1: Starting Data ($t_1$ to $t_s$), then Rolling Forecasts ($t_{s+1}, t_{s+2}, \ldots, t_T$). The data points $\{X_{t_1}, \ldots, X_{t_s}\}$ are used for the first forecasts. We continue to update the data set and perform new forecasts.]

[Timeline 2: Starting Data ($t_1$ to $t_s$), then Training Data ($t_s$ to $t_m$), followed by $t_{m+1}, t_{m+2}, \ldots, t_T$. The data points $\{X_{t_1}, \ldots, X_{t_s}\}$ are used for the first forecasts. Forecasts from the training periods are used to estimate model averaging parameters.]

0. Set $i = m$, $j = 1$, and $h = 1$.
1. Estimate $P_{C_j}(M_k \mid F_{t_i})$ for $k = 1, \ldots, n$.
2. Apply weights to $\{\hat{Z}^{M_k}_{t_{i+h}}, k = 1, \ldots, n\}$ to form $E_{C_j}(Z_{t_{i+h}} \mid F_{t_i})$.
3. Compute $e^{C_j}_{t_{i+h}} = Z_{t_{i+h}} - E_{C_j}(Z_{t_{i+h}} \mid F_{t_i})$.
4. Repeat steps 1 to 3 for $j = 2, \ldots, \kappa$ model-averaging approaches.
5. Repeat steps 1 to 4 for $i = m + 1, \ldots, T - h$.
6. Repeat steps 1 to 5 for $h = 2, \ldots, H$ forecasting horizons.

Figure 1.8 Static model averaging. This schematic describes the steps involved in static model averaging, whereby the parameters for each model-averaging algorithm are estimated only once with a fixed set of training data and are not updated as new information becomes available.

[Timeline 1: Starting Data ($t_1$ to $t_s$), then Rolling Forecasts ($t_{s+1}, t_{s+2}, \ldots, t_T$). The data points $\{X_{t_1}, \ldots, X_{t_s}\}$ are used for the first forecasts. We continue to update the data set and perform new forecasts.]

[Timeline 2: Starting Data ($t_1$ to $t_s$), then Training Data ($t_s$ to $t_m$), followed by the remaining points up to $t_T$. The data points $\{X_{t_1}, \ldots, X_{t_s}\}$ are used for the first forecasts. Forecasts from the training periods are used to estimate model averaging parameters.]

0. Estimate once $P_{C_j}(M_k \mid F_{t_m})$ for $k = 1, \ldots, n$. Note: $m$ is fixed.
1. Set $i = m$, $j = 1$, and $h = 1$.
2. Apply fixed weights to $\{\hat{Z}^{M_k}_{t_{i+h}}, k = 1, \ldots, n\}$ to form $E_{C_j}(Z_{t_{i+h}} \mid F_{t_i})$.
3. Compute $e^{C_j}_{t_{i+h}} = Z_{t_{i+h}} - E_{C_j}(Z_{t_{i+h}} \mid F_{t_i})$.
4. Repeat steps 2 to 3 for $j = 1, \ldots, \kappa$ model-averaging approaches.
5. Repeat steps 2 to 4 for $i = m + 1, \ldots, T - h$ observations.
6. Repeat steps 2 to 5 for $h = 2, \ldots, H$ forecasting horizons.
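The difference between the dynamic and static procedures of Figures 1.7 and 1.8 can be sketched as a single evaluation loop with a switch. This is a simplified illustration for one combination scheme, one horizon and a scalar target; the function names, the choice of unconstrained OLS as the weighting rule and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def ols_weights(F, y):
    """Unconstrained least-squares weights without an intercept."""
    beta, *_ = np.linalg.lstsq(F, y, rcond=None)
    return beta

def combination_errors(forecasts, actual, m, h, dynamic, fit_weights=ols_weights):
    """Out-of-sample errors of a forecast combination.

    forecasts : (T, K) array; row i holds the K model forecasts of actual[i],
                each made h periods earlier
    actual    : (T,) realized values
    m         : length of the initial training set
    dynamic   : True re-estimates the weights at every step, False keeps them fixed
    """
    weights = fit_weights(forecasts[:m], actual[:m])      # static: estimated once
    errors = []
    for i in range(m + h - 1, len(actual)):
        if dynamic:
            known = i - h + 1          # targets already realized when the forecast is made
            weights = fit_weights(forecasts[:known], actual[:known])
        errors.append(actual[i] - forecasts[i] @ weights)
    return np.array(errors)

# Synthetic data (not the chapter's): four noisy forecasters of one series.
rng = np.random.default_rng(1)
forecasts = rng.normal(size=(200, 4))
actual = forecasts @ np.array([0.4, 0.3, 0.2, 0.1]) + 0.05 * rng.normal(size=200)
static_err = combination_errors(forecasts, actual, m=120, h=12, dynamic=False)
dynamic_err = combination_errors(forecasts, actual, m=120, h=12, dynamic=True)
```

In the dynamic branch the training set grows with every step but only includes targets that were observable when the forecast was made, so both variants are evaluated on the same out-of-sample dates.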

Figure 1.9 shows the predictive performance of the frequentist combinations (C1 to C5) relative to the random walk, using a rolling window of 48 observations. With the exception of factor OLS, all combinations beat the random walk on average for a one-month horizon. As the horizon increases, the performance of the Inverse Error, Equal Weights and especially MARS combinations worsens (see note 13), while the factor OLS scheme improves significantly. Past the one-month horizon, the simple OLS scheme outperforms all other frequentist combinations, approaching the random walk at one- and two-year horizons, and beating the random walk for the entire out-of-sample evaluation period at the three-year horizon. An interesting result is that the predictive performance of Inverse Error and that of Equal Weights are almost identical in our setting.
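The relative performance measure plotted in these figures can be sketched as a rolling root mean squared error divided by that of the random walk, so that the random walk is scaled to one and values below one indicate better performance. The 48-observation window follows the text; the function name and the synthetic error series are assumptions for illustration.

```python
import numpy as np

def rolling_relative_rmse(err_model, err_rw, window=48):
    """Rolling RMSE of a scheme divided by the rolling RMSE of the random walk."""
    kernel = np.ones(window) / window
    mse_model = np.convolve(np.asarray(err_model, dtype=float) ** 2, kernel, mode="valid")
    mse_rw = np.convolve(np.asarray(err_rw, dtype=float) ** 2, kernel, mode="valid")
    return np.sqrt(mse_model / mse_rw)

rng = np.random.default_rng(3)
err_rw = rng.normal(size=100)          # made-up random-walk forecast errors
err_model = 0.5 * err_rw               # a scheme whose errors are half as large
ratio = rolling_relative_rmse(err_model, err_rw)
```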

Figure 1.9 Dynamic predictive performance for frequentist combinations relative to random walk
[Four panels: 1-, 12-, 24- and 36-month horizons. Horizontal axis: Time (yrs.), from 1998. Vertical axis: Rolling RMSE (bps.). Series: RW, EW, IE, sOLS, fOLS, MARS.]

Figure 1.10 shows the performance of the Bayesian model averaging schemes C6 and C7 relative to the random walk, as well as Equal Weights and simple OLS, for comparison with the frequentist combinations. We see that our Bayesian schemes do not beat the frequentist ones in the dynamic-evaluation approach.

Figure 1.11 compares Bayesian log combinations C8 and C9 to the random walk. The Equal Weights and simple OLS schemes are also displayed for reference. We observe that using weights based on the logs of marginal model and predictive likelihoods improves the performance of Bayesian schemes significantly: they beat the random walk and the simple OLS scheme at the one-month horizon and get close to the Equal Weights combination at longer horizons.

1.4.2 Static model averaging

We may not always be in a position to increase the training period as is done in the dynamic setting (see note 14). So we have to test how well the different combinations perform if we calculate the weights over a fixed training period and apply these weights to all remaining individual forecasts out of sample, without updating the training period. The steps for static model averaging are given in Figure 1.8.

Figure 1.10 Dynamic predictive performance for Bayesian combinations relative to random walk
[Four panels: 1-, 12-, 24- and 36-month horizons. Horizontal axis: Time (yrs.), from 1998. Vertical axis: Rolling RMSE (bps.). Series: RW, PL, MML, sOLS, EW.]

Figures 1.12 to 1.14 show the predictive performance of our nine combinations in the static model averaging setting. Compared with the corresponding figures from the dynamic setting, we see that Equal Weights, Inverse Error, and the Bayesian schemes are more robust to the training period than the other combinations (MARS, simple OLS, and factor OLS), in the sense that the predictive performance of the former combinations is quite similar in both dynamic and static settings and thus not very sensitive to the estimation period. The performance of the latter schemes (particularly MARS) deteriorates when we estimate the weights over a fixed training period. However, the performance of the combinations relative to each other is the same in both dynamic and static settings: Equal Weights and simple OLS are still the best frequentist schemes, and the Bayesian log likelihood schemes are close to Equal Weights. Finally, for horizons beyond one month, the simple OLS combination beats all other schemes and is only slightly worse than the random walk at long horizons.

Figure 1.11 Dynamic predictive performance for Bayesian log combinations relative to random walk
[Four panels: 1-, 12-, 24- and 36-month horizons. Horizontal axis: Time (yrs.), from 1998. Vertical axis: Rolling RMSE (bps.). Series: RW, PL, MML, sOLS, EW.]

Figure 1.12 Static predictive performance for frequentist combinations relative to random walk
[Four panels: 1-, 12-, 24- and 36-month horizons. Horizontal axis: from 1998. Series: RW, EW, IE, sOLS, fOLS, MARS.]

1.4.3 Best combinations vs. best individual models

Since the objective of this chapter is to answer the question of whether there is benefit from using combinations of models as opposed to a single best-performing model, it makes sense to address this question directly. From Figure 1.3, we see that the Nelson-Siegel model performs well for short horizons, and the Fourier Series model performs well for longer horizons. Figure 1.15 compares these two models, and the combination schemes that perform best in the static model averaging setting (Equal Weights, Log Predictive Likelihood, and simple OLS), to the random walk.

We can make the following observations. All of our best combinations beat the best individual models at the one-month horizon on average. As the length of the horizon increases, Equal Weights and Log Predictive Likelihood schemes outperform the Nelson-Siegel model, but not the Fourier Series model. On average, the simple OLS combination outperforms both individual models at all horizons. While it may be tempting to conclude that the simple OLS combination should be implemented instead of a single model, we are not ready to accept this conclusion. First, simple OLS is unconstrained, which means that the weights can be negative and they need not sum to one. The idea of assigning negative weights to particular forecasts may be difficult to accept for policymakers. Consequently, there may be practical obstacles to implementing this combination scheme. Also, forecasts with unconstrained OLS weights and no intercept (as is the case in our situation) may be biased, as pointed out in Diebold and Pauly (1987). Second, some preliminary testing results (not reported here) show that the simple OLS scheme is sensitive to the subset of data used for the training period and to the length of the training period, as can be expected with least squares estimation in a relatively small sample. Further analysis of this particular combination scheme, including hypothesis testing and forecast error analysis such as that done in Li and Tkacz (2004), is left for future work.

Figure 1.13 Static predictive performance for Bayesian combinations relative to random walk
[Four panels: 1-, 12-, 24- and 36-month horizons. Horizontal axis: from 1998. Series: RW, PL, MML, sOLS, EW.]

1.5 Final remarks

The main question of this chapter is whether or not one can combine multiple interest-rate models to create a single model that outperforms any one individual model. To this end, nine alternative model averaging techniques are considered, including choices from the frequentist and Bayesian literature as well as a few new alternatives. These approaches are compared, in the context of both a dynamic and a static forecasting exercise, with more than thirty years of monthly Canadian interest-rate and macroeconomic data. We do not conduct hypothesis tests in this chapter, so we do not claim any statistical improvements, but we can still make some observations regarding the predictive performance of the different model combinations.

The principal observation is that we find evidence of model combinations outperforming the best individual forecasts over the evaluation period. The degree of outperformance depends, however, on both the forecasting horizon and the type of model combination. At shorter forecasting horizons, for example, almost all model combinations outperform the best single forecast. As the forecasting horizon increases, however, only the simple OLS averaging scheme consistently outperforms the best single-model forecast. Indeed, the simple OLS approach also outperforms, on a number of occasions, the rather difficult random-walk forecasting benchmark; this is something that none of the individual forecasts achieve on a consistent basis. It is also clear that the simpler model combination approaches tend to outperform their more complex counterparts. Similarly to our results, Ravazzolo et al. (2007) find that the unconstrained OLS combination scheme (like our simple OLS scheme) and combinations with time-varying weights outperform more complex schemes. While this is consistent with the evidence in the literature that simpler schemes dominate their more complex counterparts, Stock and Watson (2004) note that it is difficult to explain such findings in the context of combining weights in a stationary environment.

Figure 1.14 Static predictive performance for Bayesian log combinations relative to random walk
[Four panels: 1-, 12-, 24- and 36-month horizons. Horizontal axis: from 1998. Series: RW, PL, MML, sOLS, EW.]

Even though the simple OLS combination scheme generally performs quite well, it does have the disadvantage of demonstrating some instability with respect to the training period selected for the determination of the model-combination parameters. We need to investigate the simple OLS combination scheme further and test its sensitivity to the training period (its length and the time over which the weights are trained). This type of analysis should also be done for other combination schemes, such as Log Predictive Likelihood, that have shown promise in our study. Another interesting direction is to investigate the predictive performance of the combination of the less stable simple OLS and the very stable, and generally well-performing, Equal Weights.

One more possibility for further investigation is to consider combinations that are based on time-varying weights. Ravazzolo et al. (2007) find that time-varying combinations perform well in terms of predictive ability as well as in an economic sense, based on the results of an investment exercise. Time-varying weights have the advantage that they may capture structural breaks by assigning varying weights to the combined models at different periods. However, we have to be careful about incorporating time-varying weights in the context of funds management, since we may not be at liberty to update the information set in operational activities.

Figure 1.15 Predictive performance of best individual models and best combinations relative to random walk, static setting
[Four panels: 1-, 12-, 24- and 36-month horizons. Horizontal axis: from 1998. Series: RW, PL(log), NS, FS, sOLS, EW.]

Acknowledgements

We would like to thank Scott Hendry, Greg Tkacz, Greg Bauer, Chris D’Souza, and Antonio Diez de los Rios from the Bank of Canada; Francesco Ravazzolo from the Norges Bank; Michiel de Pooter from the Econometric Institute, Erasmus University Rotterdam; and David Dickey from North Carolina State University. We retain any and all responsibility for errors, omissions, and inconsistencies that may appear in this work.

Notes

1. More complex mappings are considered by Leippold and Wu (2000) and Cairns (2004), among others.

2. If such outcomes occur, there are a number of possible solutions. For example, one could substitute for the arbitrage forecast the previous forecast or some combination of previous forecasts.

3. Using the state-space (Diebold et al. 2006) adaptation of the Nelson-Siegel model, De Pooter et al. (2007) account for the effects of macroeconomic variables in a similar manner.

4. De Pooter et al. (2007) discuss issues that arise in the Bayesian inference of affine models, whose parameters are highly nonlinear, similarly to our models.

5. While some may argue that such an assumption is not realistic, we feel that it is justified by the tangible benefits of greatly reduced estimation complexity and computational effort. We think that such benefits would not be outweighed by the advantages of introducing error into the observation equations to make the already stylized models more realistic.

6. The random walk is scaled to one. Consequently, values higher than one imply worse, and lower than one better, performance than the random walk. We opt for graphs with relative root mean squared forecast errors as opposed to the commonly reported tables with the same information, because we have found graphs easier to read.

7. The correlation between the forecast errors from the NS, SS, ES and FS models is shown in Bolder and Romanyuk (2008).

8. The difference between the two types of schemes is that ad-hoc combinations can be applied to forecasts generated in either a frequentist or a Bayesian setting, whereas Bayesian combination schemes should be applied to Bayesian forecasts.

9. This can be done with or without the intercept $\beta_{0,h}$ and/or forcing $\beta_{k,h}$ to add up to one. We have found (in studies unreported here) that unconstrained regression without an intercept works best in our case.

10. As with the simple OLS combination scheme, we can do this with or without an intercept or forcing the coefficients to add up to one, but we obtain better results for the specification with no intercept and no restrictions.

11. Model averaging based on predictive likelihood methods is not limited to the Bayesian framework. Kapetanios et al. (2006) use predictive likelihood, as opposed to the likelihood of observed data, to construct weights based on information criteria in a frequentist setting.

12. There are many ways to do this. We are not claiming that our suggested method is superior in any way; it is just a way to measure dispersion in the observed data.

13. The MARS result is not surprising: as shown in Sephton, the MARS scheme is very promising in-sample, but its out-of-sample performance is not entirely accurate.

14. For instance, as debt managers in a central bank, we may have to use weights calculated over some fixed period to calculate term-structure forecasts for the purposes of managing a foreign reserves portfolio or debt issuance for the next couple of years.

Bibliography

Andersson, M.K. and Karlsson, S. (2007). ‘Bayesian Forecast Combination for VAR Models’. Sveriges Riksbank Working Paper 216.

Ang, A., Dong, S. and Piazzesi, M. (2007). ‘No-Arbitrage Taylor Rules’. National Bureau of Economic Research Working Paper 13448.

Ang, A. and Piazzesi, M. (2003). ‘A No-Arbitrage Vector Autoregression of Term Structure Dynamics with Macroeconomic and Latent Variables’. Journal of Monetary Economics, 50, 745–787.

Bates, J.M. and Granger, C. W. J. (1969). ‘The Combination of Forecasts’. Operational Research Quarterly, 20(4), 451–468.

Bolder, D.J. (2007). ‘Term-Structure Dynamics for Risk Management: A Practitioner’s Perspective’. Bank of Canada Working Paper 2006–48.

Bolder, D.J. and Gusba, S. (2002). ‘Exponentials, Polynomials, and Fourier Series: More Yield Curve Modelling at the Bank of Canada’. Bank of Canada Working Paper 2002–29.

Bolder, D.J. and Liu, S. (2007). ‘Examining Simple Joint Macroeconomic and Term-Structure Models: A Practitioner’s Perspective’. Bank of Canada Working Paper 2007–49.

Bolder, D.J. and Romanyuk, Y. (2008). ‘Combining Canadian Interest-Rate Forecasts’. Bank of Canada Working Paper 2008–34.

Bolder, D.J. and Rubin, T. (2007). ‘Optimization in a Simulation Setting: Use of Function Approximation in Debt Strategy Analysis’. Bank of Canada Working Paper 2007–13.

Cairns, A.J.G. (2004). ‘A Family of Term-Structure Models for Long-Term Risk Management and Derivative Pricing’. Mathematical Finance, 14(3), 415–444.

Clyde, M. and George, E. I. (2004). ‘Model Uncertainty’. Statistical Science, 19(1), 81–94.

Dai, Q. and Singleton, K. J. (2000). ‘Specification Analysis of Affine Term Structure Models’. Journal of Finance, 55(5), 1943–78.


De Pooter, M., Ravazzolo, F. and van Dijk, D. (2007). ‘Predicting the Term Structure of Interest Rates: Incorporating Parameter Uncertainty, Model Uncertainty and Macroeconomic Information’. Tinbergen Institute Discussion Paper TI 2007028/4.

Diebold, F.X. and Li, C. (2003). ‘Forecasting the Term Structure of Government Bond Yields’. National Bureau of Economic Research Working Paper 10048.

Diebold, F.X., and Pauly, P. (1987). ‘Structural Change and the Combination of Forecasts’. Journal of Forecasting, 6, 21–40.

Diebold, F.X., Rudebusch, G. D. and Aruoba, S. B. (2006). ‘The Macroeconomy and the Yield Curve: A Dynamic Latent Factor Approach’. Journal of Econometrics, 131, 309–338.

Draper, D. (1995). ‘Assessment and Propagation of Model Uncertainty’. Journal of the Royal Statistical Society, Series B (Methodological), 57(1), 45–97.

Duffee, G.R. (2002). ‘Term Premia and Interest Rate Forecasts in Affine Models’. Journal of Finance, 57(1), 405–443.

Duffie, D., Filipovic, D. and Schachermayer, W. (2003). ‘Affine Processes and Applications in Finance’. Annals of Applied Probability, 13(3), 984–1053.

Eklund, J. and Karlsson, S. (2007). ‘Forecast Combination and Model Averaging Using Predictive Measures’. Econometric Reviews, 26(2–4), 329–363.

Fernandez, C., Ley, E. and Steel, M. F. J. (2001). ‘Benchmark Priors for Bayesian Model Averaging’. Journal of Econometrics, 100, 381–427.

Geweke, J. and Whiteman, C. (2006). ‘Bayesian Forecasting’. In Handbook of Economic Forecasting, Vol. 1, Elliott, G., Granger, C.W.J. and Timmermann, A. (Eds), North-Holland.

Hall, S.G. and Mitchell, J. (2007). ‘Combining Density Forecasts’. International Journal of Forecasting, 23, 1–13.

Hendry, D.F. and Clements, M. P. (2004). ‘Pooling of Forecasts’. Econometrics Journal, 7, 1–31.

Hoeting, J.A., Madigan, D., Raftery, A. E. and Volinsky, C. T. (1999). ‘Bayesian Model Averaging: A Tutorial’. Statistical Science, 14(4), 382–417.

Kadiyala, K.R. and Karlsson, S. (1997). ‘Numerical Methods for Estimation and Inference in Bayesian VAR-Models’. Journal of Applied Econometrics, 12, 99–132.

Kapetanios, G., Labhard, V. and Price, S. (2005). ‘Forecasting Using Bayesian and Information Theoretic Model Averaging: An Application to UK Inflation’. Bank of England Working Paper 268.

Kapetanios, G., Labhard, V. and Price, S. (2006). ‘Forecasting Using Predictive Likelihood Model Averaging’. Economics Letters, 91, 373–379.

Kass, R.E., and Raftery, A. E. (1995). ‘Bayes Factors’. Journal of the American Statistical Association, 90(430), 773–795.

Koop, G. and Potter, S. (2003). ‘Forecasting in Dynamic Factor Models Using Bayesian Model Averaging’. Econometrics Journal, 7, 550–565.

Leippold, M. and Wu, L. (2000). ‘Quadratic Term Structure Models’. Swiss Institute of Banking and Finance Working Paper.

Li, F. and Tkacz, G. (2004). ‘Combining Forecasts with Nonparametric Kernel Regressions’. Studies in Nonlinear Dynamics and Econometrics, 8(4), Article 2.

Litterman, R.B. (1986). ‘Forecasting with Bayesian Vector Autoregressions – Five Years of Experience’. Journal of Business and Economic Statistics, 4(1), 25–38.

Litterman, R.B. and Scheinkman, J. (1991). ‘Common Factors Affecting Bond Returns’. Journal of Fixed Income, 1, 54–61.


Madigan, D. and Raftery, A. E. (1994). ‘Model Selection and Accounting for Model Uncertainty in Graphical Models Using Occam’s Window’. Journal of the American Statistical Association, 89(428), 1535–1546.

Min, C. and Zellner, A. (1993). ‘Bayesian and Non-Bayesian Methods for Combining Models and Forecasts with Applications to Forecasting International Growth Rates’. Journal of Econometrics, 56, 89–118.


Index

ACR, see Adjusted for Credit Ratio (ACR)
ADF, see augmented Dickey-Fuller (ADF) test
adjustable rate mortgages (ARMs), 227
Adjusted for Credit Ratio (ACR), 117–18, 123, 128–31
Adjusted for Skewness Sharpe Ratio (ASSR), 115–16, 123, 127–31
agency guaranteed mortgage-backed securities, see mortgage-backed securities
aging populations, xxii–xxiii
AIC, see Akaike information criteria (AIC)
Akaike information criteria (AIC), 165, 166–8, 175
analysts’ expectations, yield curves and, 31–42
ARCH-GARCH-based models, 160
ARMs, see adjustable rate mortgages (ARMs)
Asian financial crisis, 158
asset accumulation, xxii–xxiv
asset allocation
  of long-term investors, 265
  strategic, see strategic asset allocation (SAA)
  tactical, 190–1
  time horizon and, 95–6
asset allocation problem, 66–7, 210–11
  for public investment funds, xxx–xxxi
  of savings and heritage funds, xxvi–xxvii
asset allocation return, 183–4
asset class modelling, xxxvii–xxxix
asset management companies, 158
asset management, stakeholders in, 179–80
asset returns, 189–90
ASSR, see Adjusted for Skewness Sharpe Ratio (ASSR)
augmented Dickey-Fuller (ADF) test, 68, 69, 70, 242–7
autocovariance, 335
autoregressive spectral estimators, 293–5
auto-spectrum, 284–5

BarCap Point risk management system, 160
Basel Accord, 159
Bayesian frameworks, 4, 6–7, 10–11, 33
benchmark yields, 31, 42, 160–1
Black & Scholes option pricing model, 154n11
Black-Litterman equation, 153
bond yield curves
  analyst’s views and, 31–42
  global, 38–41
  uncertainty matrix, 38
  US Treasury, 35–7, 58
bonds
  credit-risk, 52
  discount, 227
  fixed-rate, 227
  tilting between equities and, 194–7
Box, George, 202–3
Brownian Bridge, 325, 327–9, 334–5
buffer funds, xxv
buy and hold (BH) strategy, 208

Calmar Ratio (CR), 144, 148
Canadian interest rate forecasts, 3–27
  combined forecasts, 3–5, 10–17
  evaluation of combining, 18–24
  models, 5–10
Canadian term structure, of zero-coupon rates, 7, 8
capital preservation, xxv
central bank foreign exchange reserves, xxii
central bank reserves, xxv, xxviii
  estimates of, 159
  excess, xxviii, 164
  growth of, 158
  management of, xxv–xxvi


central banks
  asset allocation by, 66–7
  asset classes for, 164–5
  benchmarks for, 160
  distributions of, 164–70
  interest rate risk management for, 64–88
  investment horizon, 160
  strategic asset allocation for, 158–60, 170–2
  strategic policy, 65
  types of investments of, xxxi
Clayton copula, 160
cokurtosis, 271, 279
combined interest rate models, 10–17
  advantages of, 3–5
  dynamic model averaging, 18–21
  equal weights, 11–12
  evaluation of, 18–24
  factor OLS, 13–14
  inverse error, 12
  log likelihood weights, 16–17
  marginal model likelihood, 15
  MARS, 14
  predictive likelihood, 14–15
  simple OLS, 12–13
  vs. single models, 22–7
  static model averaging, 19, 21–2
commodity funds, xxii
commodity prices, xxii
commodity revenues, xxvi
commodity-exporting countries, savings and heritage funds in, xxvi–xxvii
conditional value-at-risk (CVaR), 184, 210
constant proportion portfolio insurance (CPPI) strategy, 209
convex risk measures, 210
copula functions, 160, 162–4, 169–70
Cornish-Fisher expansion, 269
correlation matrix, 279
coskewness, 271, 279
coupon return, 234–5
covariance model, 101
CR, see Calmar Ratio (CR)
credit risk, 95, 112, 113, 115–18, 132
credit spreads, 44–6, 48–9
credit-risk bonds, 52
credit-spread modelling, 45, 47
  see also spread-risk model
Custom Pan-Euro Treasury Index, 260
CVaR, see conditional value-at-risk (CVaR)

data
  estimating mixed frequency, 325–35
  missing, 326–7
Data Generating Process (DGP), 317
debt repayment, xxv
demographics shifts, xxii–xxiii
descriptive statistics, 278–9
development funds, xxix
Diebold-Li model, 33
disasters, 140–4
discount bonds, 227
distributions
  of central banks, 164–70
  Gaussian, 138, 140, 160, 164–5
  non-normal, 114, 302–3
  return, 268–9
  use of appropriate, 162–4
diversification, 135
dominance, 138
Dutch disease, xxvii
dynamic model averaging, 18–21
dynamic Nelson-Siegel model, 45, 47, 56

efficient frontier (EF), 120–9, 136–7, 272
emerging markets
  increase in reserves in, xxv
  savings and heritage funds in, xxvi–xxvii
  size of domestic markets in, xxxi
equities, tilting between bonds and, 194–7
estimation risk, 332–4
EUR securitized debt, 259–60
Euro-Aggregate Index, 259–60
European Central Bank, 249
event risk, 94, 96, 97
excess reserves, xxv, xxviii, 164
exchange rate risk, xxvii
exchange-traded funds (ETFs), 349–50
exit strategies, 97
expected return, 135–6
  MDD-adjusted, 150


expected return model, 101
expected shortfall (ES), 160
exponential-affine functions, 5
Exponential Spline (ES) model, 5–6
exponential weighted moving average (EWMA) volatility model, 160

Federal Home Loan Mortgage Corporation (FHLMC), 226
Federal National Mortgage Association (FNMA), 226
filtering, 288–91
filters, 285–7
5-asset frontier, 155n17
fixed income analysts, yield curves and, 31–42
fixed proportions (FP) strategy, 208
fixed-income investing, spread-risk model for strategic, 44–62
fixed-rate bonds, 227
fixed-weight strategy, 200
forecasting yields, 33–5
forecasts, 95, 97
  interest rate, see interest rate forecasts
foreign debt, xxv
foreign exchange reserves, xxii
foreign investments, xxxi
foreign reserves
  academic publications on, xxxii
  growth of, xxxii
Fourier Series (FS) model, 5–6, 23
frequency data, estimating mixed, 325–35
frequency domain, 283–8
  versus time domain, 284
  for time series modelling, 282–303
frequency models, 297–300
fund of funds, 207–20

G7 Treasury index, 257–9, 261–2
Gaussian distribution, 138, 140, 160, 164, 165
genetic algorithm (GA), 149–52
Global Aggregate index (GlobalAgg), 250, 256–60, 261–3
Global Multi-Factor Risk Model, 252
government holding management companies, xxviii, xxix
Government National Mortgage Association (GNMA), 226
Government Sponsored Enterprises, 227–8
Greenspan-Guidotti rule, 158
Gumbel copula, 160

hedge funds, 113, 114–15, 120, 122
heritage funds, xxvi–xxvii

implied volatility, 265, 266–8
independent component analysis (ICA), 41–2
inflation, 173–4
institutional issues, 179–80
integrated measure of performance, 115–18
interest rate forecasts, 182–3
  Canadian, 3–37
  combined, 10–17
  combined forecasts, 3–5
  evaluation of combining, 18–24
  model risk and, 3
  models, 5–10
interest rate modelling and forecasting, xxxiii–xxxiv
interest rate models, 5–10
  best-performing model, 22–4
  combination vs. single, 24–7
  combinations, 10–17
  forecasts of individual, 7–10
  performance of, 9–10
interest rate risk, dynamic management of, 64–88
interest rate volatility, 301–2
interest rates, mean reversion in, 67–71, 87–8
investment decisions, 178
investment grade credit and currency hedges, 197–8
investment horizon, 160
investment portfolios, monitoring of, 281–2
investment return, xxv
investment strategies
  basic, 207–10
  buy and hold (BH) strategy, 208
  of central banks, xxv–xxvi
  constant proportion portfolio insurance (CPPI) strategy, 209
  fixed proportions (FP) strategy, 208
  target date fund (TDF), 208–9


investment tranche, xxii, xxvi
investor views, 178–87

Kolmogorov-Smirnov (KS) test, 165, 166–8, 175
KPSS test, 68, 69, 70
kurtosis, 122, 165, 166–8, 266

leakage effect, 287–8
Lehman Brothers, 227
level-dependent strategies, 65, 73, 75–7
LIBOR/SWAP rates, 49, 51, 57, 60
linear filter G(L), 285–7
linear regression-based strategy, 73, 77–9
liquidity crises, 158
long volatility (LV), 278
long-run mean reversion, 189
long-term investors, volatility as asset class for, 265–76
long-term time series data, 309–12

Manipulation-proof Performance Measure (MPPM), 123, 127
marginal model likelihood, 15
market risk, 95, 112, 115–18
Markowitz model, 93–5, 114, 134–7
maximum drawdown (MDD), 134
  benefits of using, 153
  as measure of risk, 140–4
  portfolio optimization problem under, 144–9
maximum entropy spectral analysis, 292–5, 302–3
MBS Index, 250, 251–2
MDD, see maximum drawdown (MDD)
MDD-adjusted expected returns (MDDAER), 150–2
mean reversion
  in asset markets, 190
  in interest rates, 67–71, 87–8
  long-run, 189
mean-variance analysis, 94
mean-variance criteria (MVC), 135, 138–9, 144, 170
mean-variance dominance, 138–9
mean-variance model, 114
mean-variance optimization
  empirical study, 118–31
  hidden risks in, 112–32
  portfolio performance evaluation of, 123–9
mixed frequency data, estimating, 325–35
model averaging, 3–4
  Bayesian, 4
model risk, 3
modified VaR, 266, 269
momentum-based strategies, 73, 83–6
monotonicity, 142, 143, 155n14
Monte Carlo simulations, 193, 198–201, 204n2, 316–22, 329–34
mortgage-backed securities, 225–47
  attribution model for, 232–41
  comparing to Global Aggregate index, 256–60
  coupon return, 234–5
  historical performance, 228–32
  implications of market development in 2007–2008, 241–7
  introduction to, 225–6, 249–51
  investor considerations for, 228–32
  market depth and liquidity, 249–50
  paydown return, 238–41
  price return, 235–8
  quantitative portfolio strategy and, 249–64
  return forecasts, 242–7
  as strategic asset class, 226–32
  structure of, 227–8
  TBA proxy portfolio, 251–4
MSCI-Emerging Markets (MSCI-EM), 147
multi-objective optimization, 95, 98
multi-period mean-variance analysis, 94
Multiple Adaptive Regression Splines (MARS), 14, 28n13

negative carry, xxv
Nelson-Siegel (NS) model, 5–6, 23, 56, 72
  observation equation for, 47
New Zealand Superannuation Fund (NZSF), 189
Newton-Raphson type algorithms, 162
non-normal distributions, 114, 302–3
non-parametric spectral estimators, 292
Normal Inversion Gaussian, 160


oil revenues, xxvi
Omega function, 160, 170
optimal diversifications, between funds, 207–20
optimization, see portfolio optimization
ordinary least squares (OLS) regressions, 67–8
Ornstein Uhlenbeck Bridge, 335

parametric spectral estimators, 293
Pareto optimality, 98–9
paydown return, 238–41
pension fund management, case study, 212–17
pension reserve funds, xxii–xxiii, xxvii, xxviii
  accumulation phase of, xxvii
  interest rate mismatch, 67
  interest rate risk management for, 64–88
  investment horizon, 67
  types of investments of, xxxi
  withdrawal phase, xxix
pension reserves, xxii
perspective distortion, 309–11
policy benchmarks, 64
portfolio design goals, 180–1
portfolio optimization, xxix
  of fund of funds, 207–20
  Markowitz model, 93–5, 114, 134–7
  maximum drawdown and, 140–9
  process inputs, 182–4
  risk measurement and, 137–40
  scenario-dependent, 178–87
  using alternative performance measures, 130–1
  wealth creation-MDD, 147–52
portfolio optimization problem, 210–11
portfolio optimization techniques, xxxiv–xxxvii, 65, 114
  dynamic duration strategies, 72–88
  level-dependent strategies, 65, 73, 75–7
  mixed strategies, 85–6
  momentum-based strategies, 73, 83–6
  multi-objective optimization, 95, 98
  regression-based strategies, 65, 73, 77–80
  scoring strategies, 66, 73, 81–3, 85–6
  variable time horizon strategic asset allocation, 93–110
portfolio risk, 136
portfolio sampling, 114
portfolio selection, 95
positive homogeneity, 142, 143
Power Transfer Function (PTF), 285–7
premium bonds, 227
price return, 235–8
principal component analysis (PCA), 31, 41–2
probit regression model, 79–80
proportional exposure, 209
PTF, see Power Transfer Function (PTF)
public investment funds
  asset allocation for, xxx–xxxi
  balance sheet considerations, xxx
  largest, xxiii–xxiv
  objectives and liabilities, xxv–xxix
  policy objectives, xxx
  reputational considerations, xxxi
  types of, xxv–xxix

quantitative portfolio strategy, 249–64
quantitative techniques, xxxvii–xxxix, 179–80

random walk model, 9
rebalancing frequencies, 65
regime switching models, 160
regression-based strategies, 65, 73, 77–80
reserves, xxii–xxiv
  academic publications on, xxxii
  estimates of central bank, 159
  growth of, 158
reserves diversification, xxii
reserves investment corporations, xxii, xxviii
return distributions, 268–9
return volatility, 112
risk
  credit, 95, 112, 113, 115–18, 132
  estimation, 332–4
  event, 94, 96, 97, 268–9
  exchange rate, xxvii
  interest rate, 64–88
  market, 95, 112, 115–18
  market price of, 44–5


risk aversion, xxii
risk integration, 96, 112, 115–18
risk measurement, 137–40
risk measures, 140–4, 170
  conditional value-at-risk (CVaR), 184, 210
  convex, 210
  expected shortfall (ES), 160
  Omega function, 160, 170
  for strategic asset allocation, 158–76
  value at risk (VaR), 159
  volatility, 159
risk models, 46, 112, 139
risk preferences, time horizon and, 95–6
risk premiums, 112, 115
risk scenarios, 180–1, 184–6
risk-management mechanisms, 191
RiskMetrics, 160

safety first concept, 139
savings and spending rules, xxvi
savings funds, xxvi–xxvii
scenario-dependent portfolio optimization, 178–87
scoring strategies, 66, 73, 81–3, 85–6
Sharpe ratios, 117–18, 123, 125–6, 128
  differences between two, 353–6
  introduction to, 337–8
  statistical inference for, 337–56
  time aggregation and stochastic dominance, 338–9
Shiller data, 325
single factor spread model (SM1), 46, 48–51
skewness, 94, 113, 114–18, 121, 122, 125, 132, 266
social security funds, xxii, xxii–xxiii, xxvii, xxviii, xxix
  types of investments of, xxxi
sovereign wealth funds (SWFs)
  academic publications on, xxxii
  asset class universe, 164–70
  benchmarks for, 160–1
  creation of, 158
  definition, xxiii
  investment horizon, 160
  reserves estimates for, 159
  strategic asset allocation for, 158–60, 170, 172–4
spectral analysis, 305
  maximum entropy, 292–5
  techniques, 283–8, 302–3
spectral densities, 284–5
spectral windows, 292
spread movement, 183
spread-risk model
  data, 46–7
  dynamics for the factors, 55–6
  empirically founded, 48–51
  Nelson-Siegel model and, 47
  out-of-sample comparison, 56–61
  single factor, 46, 48–51
  for strategic fixed-income investors, 44–62
  two-factor, 46, 48–51
squared gain, 285
SSA, see strategic asset allocation (SAA)
stabilization funds, xxv, xxviii
stakeholders, 179–80
state dependencies, in time series modelling, 300–2
state-space (SS) model, 5–6
static model averaging, 19, 21–2
statistical inference
  under general conditions, 351–2
  general result, 340–2
  for Sharpe Ratio, 337–56
  temporal independence, 342–6
  under volatility clustering, 346–51
stochastic dominance, 339
stochastic interpolation, 325–35
  methodology, 327–9
  Monte Carlo simulation, 329–34
stochastic volatility, 346–51
strategic asset allocation (SAA), xxix–xxxiii
  appropriate distributions and, 162–4
  for central banks, 158–60, 170–2
  decision framework for, xxx
  institutional issues, 179–80
  interest rate risk and, 64–88
  methodology, 161–2
  of mortgage-backed securities, 225–47
  optimization problems, 94–5
  policy benchmarks, 64
  risk measures for, 158–76
  for sovereign wealth funds, 158–60, 170, 172–4
  strategic tilting around, 189–205
  time horizon and, 95–6, 281
  uncertainty and, 280
  using variable time horizon, 93–110


strategic asset allocation (SAA) – continued
  weakness of traditional approaches to, 94–6
strategic fixed-income investing, spread-risk model for, 44–62
strategic tilting, 189–205
  enhancing sustainability of, 201–3
  between equities and bonds, 194–7
  future directions, 203–4
  historical back-tests, 194–8
  introduction to, 189–91
  Monte Carlo analysis, 198–201
  overview of methodology, 191–3
  as package, 198
stress scenarios, 97, 139–40
subadditivity, 142, 143
swap spreads, 49, 59
SWFs, see sovereign wealth funds (SWFs)

tactical asset allocation (TAA), 190–1
target date fund (TDF), 208–9
TBAs (to-be-announced contracts)
  normalized tracking error performance, 254–6
  proxy performance record, 253–4
  replicating performance of MBS Index using, 251–2
term structure of risk and return, 281
3-asset frontier, 155n17
tilting, see strategic tilting
time aggregation, of Sharpe ratios, 338–9
time horizon, 97, 281
  for central banks, 160
  versus frequency domain, 284
  impact of, 95–6
  for SWFs, 160
time series analysis, 292–5
time series decomposition, 288–91
  filter requirements, 289–90
  zero correlation property, 291
  Zero Phase Frequency Filter, 290–1
time series models/modelling, 217–20, 280–323
  of complex dependencies, 322–3
  construction of, 280–1
  data used for, 309–16
  frequency domain methodology for, 282–323
  model analysis, 303–6
  model specification and estimation, 295–303
  Monte Carlo simulations, 316–22
  samples and observation frequencies, 309–12
  state dependencies, 300–2
  understanding data and model dynamics, 306–7
tracking error volatility (TEV), 254–6
transfer rules, xxvi
transition equations, 7
translation invariance, 142–3
trend model, 296–7
2-asset frontier, 155n17
two-factor spread model (SM2), 46, 48–51

uncertainty matrix, in yield curves, 38
unconditional forecasts, 34
unit root tests, 68–70
US dollars (USD), wealth accumulation in, xxvii
US mortgage-backed securities, 249–64
  see also mortgage-backed securities
US Treasury bond yield curve, 35–7, 58

value at risk (VaR), 140, 159, 266, 269
variable time horizon strategic asset allocation, 93–110
  data for, 101–2
  evolutionary algorithm, 99–100
  examples, 102–9
  modelling limitations, 101
  multi-objective optimization, 98
  set of objectives for, 96–7
variance decomposition, 305–6
variance ratio test, 70–1
vector autoregressive (VAR) model, 55–6, 280
vector equilibrium correction model, 214, 217–20
VIX index, 267–8
volatility, 115–18, 137–8, 159
  as asset class, 265–76
  efficient portfolio with, 272–4
  implied, 265, 266–8
  interest rate, 301–2
  portfolio construction and, 268–70
  stochastic, 346–51
volatility risk premium (VRP), 265, 268, 271, 278


wealth creation, 147–9
wealth creation-MDD optimization, 146–52
weighted average coupon (WAC), 234–5

yield curves
  analyst’s views and, 31–42
  global, 38–41
  Nelson-Siegel model, 72
  posterior distribution of, 33
  spread models, 48–55
  uncertainty matrix, 38
  US Treasury, 58
  US Treasury bond, 35–7
yield paths, 182–3

zero correlation property, 291, 295, 302
Zero Phase Frequency Filter, 290–1, 293
zero-coupon rates, 6, 7, 8, 9
